Age | Commit message (Collapse) | Author |
|
Merge cpufreq updates for 6.16-rc1:
- Refactor cpufreq_online(), add and use cpufreq policy locking guards,
use __free() in policy reference counting, and clean up core cpufreq
code on top of that (Rafael Wysocki).
- Fix boost handling on CPU suspend/resume and sysfs updates (Viresh
Kumar).
- Fix des_perf clamping with max_perf in amd_pstate_update() (Dhananjay
Ugwekar).
- Add offline, online and suspend callbacks to the amd-pstate driver,
rename and use the existing amd_pstate_epp callbacks in it (Dhananjay
Ugwekar).
- Add support for the "Requested CPU Min frequency" BIOS option to the
amd-pstate driver (Dhananjay Ugwekar).
- Reset amd-pstate driver mode after running selftests (Swapnil
Sapkal).
- Add helper for governor checks to the schedutil cpufreq governor and
move cpufreq-specific EAS checks to cpufreq (Rafael Wysocki).
- Populate the cpu_capacity sysfs entries from the intel_pstate driver
after registering asym capacity support (Ricardo Neri).
- Add support for enabling Energy-aware scheduling (EAS) to the
intel_pstate driver when operating in the passive mode on a hybrid
platform (Rafael Wysocki).
- Avoid shadowing ret in amd_pstate_ut_check_driver() (Nathan
Chancellor).
- Drop redundant cpus_read_lock() from store_local_boost() in the
cpufreq core (Seyediman Seyedarab).
- Replace sscanf() with kstrtouint() in the cpufreq code and use a
symbol instead of a raw number in it (Bowen Yu).
- Add support for autonomous CPU performance state selection to the
CPPC cpufreq driver (Lifeng Zheng).
* pm-cpufreq: (31 commits)
cpufreq: CPPC: Add support for autonomous selection
cpufreq: Update sscanf() to kstrtouint()
cpufreq: Replace magic number
cpufreq: drop redundant cpus_read_lock() from store_local_boost()
cpufreq/amd-pstate: Avoid shadowing ret in amd_pstate_ut_check_driver()
cpufreq: intel_pstate: Document hybrid processor support
cpufreq: intel_pstate: EAS: Increase cost for CPUs using L3 cache
cpufreq: intel_pstate: EAS support for hybrid platforms
cpufreq: Drop policy locking from cpufreq_policy_is_good_for_eas()
cpufreq: intel_pstate: Populate the cpu_capacity sysfs entries
arch_topology: Relocate cpu_scale to topology.[h|c]
cpufreq/sched: Move cpufreq-specific EAS checks to cpufreq
cpufreq/sched: schedutil: Add helper for governor checks
amd-pstate-ut: Reset amd-pstate driver mode after running selftests
cpufreq/amd-pstate: Add support for the "Requested CPU Min frequency" BIOS option
cpufreq/amd-pstate: Add offline, online and suspend callbacks for amd_pstate_driver
cpufreq: Force sync policy boost with global boost on sysfs update
cpufreq: Preserve policy's boost state after resume
cpufreq: Introduce policy_set_boost()
cpufreq: Don't unnecessarily call set_boost()
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull coredump updates from Christian Brauner:
"This adds support for sending coredumps over an AF_UNIX socket. It
also makes (implicit) use of the new SO_PEERPIDFD ability to hand out
pidfds for reaped peer tasks
The new coredump socket will allow userspace to not have to rely on
usermode helpers for processing coredumps and provides a saf way to
handle them instead of relying on super privileged coredumping helpers
This will also be significantly more lightweight since the kernel
doens't have to do a fork()+exec() for each crashing process to spawn
a usermodehelper. Instead the kernel just connects to the AF_UNIX
socket and userspace can process it concurrently however it sees fit.
Support for userspace is incoming starting with systemd-coredump
There's more work coming in that direction next cycle. The rest below
goes into some details and background
Coredumping currently supports two modes:
(1) Dumping directly into a file somewhere on the filesystem.
(2) Dumping into a pipe connected to a usermode helper process
spawned as a child of the system_unbound_wq or kthreadd
For simplicity I'm mostly ignoring (1). There's probably still some
users of (1) out there but processing coredumps in this way can be
considered adventurous especially in the face of set*id binaries
The most common option should be (2) by now. It works by allowing
userspace to put a string into /proc/sys/kernel/core_pattern like:
|/usr/lib/systemd/systemd-coredump %P %u %g %s %t %c %h
The "|" at the beginning indicates to the kernel that a pipe must be
used. The path following the pipe indicator is a path to a binary that
will be spawned as a usermode helper process. Any additional
parameters pass information about the task that is generating the
coredump to the binary that processes the coredump
In the example the core_pattern shown causes the kernel to spawn
systemd-coredump as a usermode helper. There's various conceptual
consequences of this (non-exhaustive list):
- systemd-coredump is spawned with file descriptor number 0 (stdin)
connected to the read-end of the pipe. All other file descriptors
are closed. That specifically includes 1 (stdout) and 2 (stderr).
This has already caused bugs because userspace assumed that this
cannot happen (Whether or not this is a sane assumption is
irrelevant)
- systemd-coredump will be spawned as a child of system_unbound_wq.
So it is not a child of any userspace process and specifically not
a child of PID 1. It cannot be waited upon and is in a weird hybrid
upcall which are difficult for userspace to control correctly
- systemd-coredump is spawned with full kernel privileges. This
necessitates all kinds of weird privilege dropping excercises in
userspace to make this safe
- A new usermode helper has to be spawned for each crashing process
This adds a new mode:
(3) Dumping into an AF_UNIX socket
Userspace can set /proc/sys/kernel/core_pattern to:
@/path/to/coredump.socket
The "@" at the beginning indicates to the kernel that an AF_UNIX
coredump socket will be used to process coredumps
The coredump socket must be located in the initial mount namespace.
When a task coredumps it opens a client socket in the initial network
namespace and connects to the coredump socket:
- The coredump server uses SO_PEERPIDFD to get a stable handle on the
connected crashing task. The retrieved pidfd will provide a stable
reference even if the crashing task gets SIGKILLed while generating
the coredump. That is a huge attack vector right now
- By setting core_pipe_limit non-zero userspace can guarantee that
the crashing task cannot be reaped behind it's back and thus
process all necessary information in /proc/<pid>. The SO_PEERPIDFD
can be used to detect whether /proc/<pid> still refers to the same
process
The core_pipe_limit isn't used to rate-limit connections to the
socket. This can simply be done via AF_UNIX socket directly
- The pidfd for the crashing task will contain information how the
task coredumps. The PIDFD_GET_INFO ioctl gained a new flag
PIDFD_INFO_COREDUMP which can be used to retreive the coredump
information
If the coredump gets a new coredump client connection the kernel
guarantees that PIDFD_INFO_COREDUMP information is available.
Currently the following information is provided in the new
@coredump_mask extension to struct pidfd_info:
* PIDFD_COREDUMPED is raised if the task did actually coredump
* PIDFD_COREDUMP_SKIP is raised if the task skipped coredumping
(e.g., undumpable)
* PIDFD_COREDUMP_USER is raised if this is a regular coredump and
doesn't need special care by the coredump server
* PIDFD_COREDUMP_ROOT is raised if the generated coredump should
be treated as sensitive and the coredump server should restrict
access to the generated coredump to sufficiently privileged
users"
* tag 'vfs-6.16-rc1.coredump' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
mips, net: ensure that SOCK_COREDUMP is defined
selftests/coredump: add tests for AF_UNIX coredumps
selftests/pidfd: add PIDFD_INFO_COREDUMP infrastructure
coredump: validate socket name as it is written
coredump: show supported coredump modes
pidfs, coredump: add PIDFD_INFO_COREDUMP
coredump: add coredump socket
coredump: reflow dump helpers a little
coredump: massage do_coredump()
coredump: massage format_corename()
|
|
Merge energy model management code updates for 6.16-rc1:
- Fix potential division-by-zero error in em_compute_costs() (Yaxiong
Tian).
- Fix typos in energy model documentation and example driver code (Moon
Hee Lee, Atul Kumar Pant).
- Rearrange the energy model management code and add a new function for
adjusting a CPU energy model after adjusting the capacity of the
given CPU to it (Rafael Wysocki).
* pm-em:
PM: EM: Introduce em_adjust_cpu_capacity()
PM: EM: Move CPU capacity check to em_adjust_new_capacity()
PM: EM: Documentation: Fix typos in example driver code
PM: EM: Documentation: fix typo in energy-model.rst
PM: EM: Fix potential division-by-zero error in em_compute_costs()
|
|
'acpi-docs'
Merge an ACPI resources management update, ACPI power management
updates, an ACPI platform profile driver fix and an ACPI documentation
update related to device properties for 6.16-rc1:
- Fix a typo for MECHREVO in irq1_edge_low_force_override[] (Mingcong
Bai).
- Add an LPS0 check() callback to the AMD pinctrl driver and fix up
config symbol dependencies in it (Mario Limonciello, Rafael Wysocki).
- Avoid initializing the ACPI platform profile driver on non-ACPI
platforms (Alexandre Ghiti).
- Document that references to ACPI data (non-device) nodes should use
string-only references in hierarchical data node packages (Sakari
Ailus).
* acpi-resource:
ACPI: resource: fix a typo for MECHREVO in irq1_edge_low_force_override[]
* acpi-pm:
pinctrl: amd: Fix hibernation support with CONFIG_SUSPEND unset
pinctrl: amd: Fix use of undeclared identifier 'pinctrl_amd_s2idle_dev_ops'
pinctrl: amd: Add an LPS0 check() callback
ACPI: Add missing prototype for non CONFIG_SUSPEND/CONFIG_X86 case
* acpi-platform-profile:
ACPI: platform_profile: Avoid initializing on non-ACPI platforms
* acpi-docs:
Documentation: ACPI: Use all-string data node references
|
|
Merge an ACPI PCI root driver update, ACPI battery driver updates, an
ACPI EC driver update and APEI updates for 6.16-rc1:
- Turn the acpi_pci_root_remap_iospace() fwnode_handle parameter into a
const pointer (Pei Xiao).
- Round battery capacity percengate in the ACPI battery driver to the
closest integer to avoid user confusion (shitao).
- Make the ACPI battery driver report the current as a negative number
to the power supply framework when the battery is discharging as
documented (Peter Marheine).
- Add TUXEDO InfinityBook Pro AMD Gen9 to the acpi_ec_no_wakeup[] list
to prevent spurious wakeups from suspend-to-idle (Werner Sembach).
- Convert the APEI EINJ driver to a faux device one (Sudeep Holla, Jon
Hunter).
- Remove redundant calls to einj_get_available_error_type() from the
APEI EINJ driver (Zaid Alali).
* acpi-pci:
ACPI: PCI: Constify fwnode_handle in acpi_pci_root_remap_iospace()
* acpi-battery:
ACPI: battery: negate current when discharging
ACPI: battery: Round capacity percengate to closest integer
* acpi-ec:
ACPI: EC: Add device to acpi_ec_no_wakeup[] qurik list
* acpi-apei:
ACPI: APEI: EINJ: Remove redundant calls to einj_get_available_error_type()
ACPI: APEI: EINJ: Fix probe error message
ACPI: APEI: EINJ: Transition to the faux device interface
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull pidfs updates from Christian Brauner:
"Features:
- Allow handing out pidfds for reaped tasks for AF_UNIX SO_PEERPIDFD
socket option
SO_PEERPIDFD is a socket option that allows to retrieve a pidfd for
the process that called connect() or listen(). This is heavily used
to safely authenticate clients in userspace avoiding security bugs
due to pid recycling races (dbus, polkit, systemd, etc.)
SO_PEERPIDFD currently doesn't support handing out pidfds if the
sk->sk_peer_pid thread-group leader has already been reaped. In
this case it currently returns EINVAL. Userspace still wants to get
a pidfd for a reaped process to have a stable handle it can pass
on. This is especially useful now that it is possible to retrieve
exit information through a pidfd via the PIDFD_GET_INFO ioctl()'s
PIDFD_INFO_EXIT flag
Another summary has been provided by David Rheinsberg:
> A pidfd can outlive the task it refers to, and thus user-space
> must already be prepared that the task underlying a pidfd is
> gone at the time they get their hands on the pidfd. For
> instance, resolving the pidfd to a PID via the fdinfo must be
> prepared to read `-1`.
>
> Despite user-space knowing that a pidfd might be stale, several
> kernel APIs currently add another layer that checks for this. In
> particular, SO_PEERPIDFD returns `EINVAL` if the peer-task was
> already reaped, but returns a stale pidfd if the task is reaped
> immediately after the respective alive-check.
>
> This has the unfortunate effect that user-space now has two ways
> to check for the exact same scenario: A syscall might return
> EINVAL/ESRCH/... *or* the pidfd might be stale, even though
> there is no particular reason to distinguish both cases. This
> also propagates through user-space APIs, which pass on pidfds.
> They must be prepared to pass on `-1` *or* the pidfd, because
> there is no guaranteed way to get a stale pidfd from the kernel.
>
> Userspace must already deal with a pidfd referring to a reaped
> task as the task may exit and get reaped at any time will there
> are still many pidfds referring to it
In order to allow handing out reaped pidfd SO_PEERPIDFD needs to
ensure that PIDFD_INFO_EXIT information is available whenever a
pidfd for a reaped task is created by PIDFD_INFO_EXIT. The uapi
promises that reaped pidfds are only handed out if it is guaranteed
that the caller sees the exit information:
TEST_F(pidfd_info, success_reaped)
{
struct pidfd_info info = {
.mask = PIDFD_INFO_CGROUPID | PIDFD_INFO_EXIT,
};
/*
* Process has already been reaped and PIDFD_INFO_EXIT been set.
* Verify that we can retrieve the exit status of the process.
*/
ASSERT_EQ(ioctl(self->child_pidfd4, PIDFD_GET_INFO, &info), 0);
ASSERT_FALSE(!!(info.mask & PIDFD_INFO_CREDS));
ASSERT_TRUE(!!(info.mask & PIDFD_INFO_EXIT));
ASSERT_TRUE(WIFEXITED(info.exit_code));
ASSERT_EQ(WEXITSTATUS(info.exit_code), 0);
}
To hand out pidfds for reaped processes we thus allocate a pidfs
entry for the relevant sk->sk_peer_pid at the time the
sk->sk_peer_pid is stashed and drop it when the socket is
destroyed. This guarantees that exit information will always be
recorded for the sk->sk_peer_pid task and we can hand out pidfds
for reaped processes
- Hand a pidfd to the coredump usermode helper process
Give userspace a way to instruct the kernel to install a pidfd for
the crashing process into the process started as a usermode helper.
There's still tricky race-windows that cannot be easily or
sometimes not closed at all by userspace. There's various ways like
looking at the start time of a process to make sure that the
usermode helper process is started after the crashing process but
it's all very very brittle and fraught with peril
The crashed-but-not-reaped process can be killed by userspace
before coredump processing programs like systemd-coredump have had
time to manually open a PIDFD from the PID the kernel provides
them, which means they can be tricked into reading from an
arbitrary process, and they run with full privileges as they are
usermode helper processes
Even if that specific race-window wouldn't exist it's still the
safest and cleanest way to let the kernel provide the pidfd
directly instead of requiring userspace to do it manually. In
parallel with this commit we already have systemd adding support
for this in [1]
When the usermode helper process is forked we install a pidfd file
descriptor three into the usermode helper's file descriptor table
so it's available to the exec'd program
Since usermode helpers are either children of the system_unbound_wq
workqueue or kthreadd we know that the file descriptor table is
empty and can thus always use three as the file descriptor number
Note, that we'll install a pidfd for the thread-group leader even
if a subthread is calling do_coredump(). We know that task linkage
hasn't been removed yet and even if this @current isn't the actual
thread-group leader we know that the thread-group leader cannot be
reaped until
@current has exited
- Allow telling when a task has not been found from finding the wrong
task when creating a pidfd
We currently report EINVAL whenever a struct pid has no tasked
attached anymore thereby conflating two concepts:
(1) The task has already been reaped
(2) The caller requested a pidfd for a thread-group leader but the
pid actually references a struct pid that isn't used as a
thread-group leader
This is causing issues for non-threaded workloads as in where they
expect ESRCH to be reported, not EINVAL
So allow userspace to reliably distinguish between (1) and (2)
- Make it possible to detect when a pidfs entry would outlive the
struct pid it pinned
- Add a range of new selftests
Cleanups:
- Remove unneeded NULL check from pidfd_prepare() for passed struct
pid
- Avoid pointless reference count bump during release_task()
Fixes:
- Various fixes to the pidfd and coredump selftests
- Fix error handling for replace_fd() when spawning coredump usermode
helper"
* tag 'vfs-6.16-rc1.pidfs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
pidfs: detect refcount bugs
coredump: hand a pidfd to the usermode coredump helper
coredump: fix error handling for replace_fd()
pidfs: move O_RDWR into pidfs_alloc_file()
selftests: coredump: Raise timeout to 2 minutes
selftests: coredump: Fix test failure for slow machines
selftests: coredump: Properly initialize pointer
net, pidfs: enable handing out pidfds for reaped sk->sk_peer_pid
pidfs: get rid of __pidfd_prepare()
net, pidfs: prepare for handing out pidfds for reaped sk->sk_peer_pid
pidfs: register pid in pidfs
net, pidfd: report EINVAL for ESRCH
release_task: kill the no longer needed get/put_pid(thread_pid)
pidfs: ensure consistent ENOENT/ESRCH reporting
exit: move wake_up_all() pidfd waiters into __unhash_process()
selftest/pidfd: add test for thread-group leader pidfd open for thread
pidfd: improve uapi when task isn't found
pidfd: remove unneeded NULL check from pidfd_prepare()
selftests/pidfd: adapt to recent changes
|
|
In `struct virtio_vsock_sock`, we maintain two counters:
- `rx_bytes`: used internally to track how many bytes have been read.
This supports mechanisms like .stream_has_data() and sock_rcvlowat().
- `fwd_cnt`: used for the credit mechanism to inform available receive
buffer space to the remote peer.
These counters are updated via virtio_transport_inc_rx_pkt() and
virtio_transport_dec_rx_pkt().
Since the beginning with commit 06a8fc78367d ("VSOCK: Introduce
virtio_vsock_common.ko"), we call virtio_transport_dec_rx_pkt() in
virtio_transport_stream_do_dequeue() only when we consume the entire
packet, so partial reads, do not update `rx_bytes` and `fwd_cnt`.
This is fine for `fwd_cnt`, because we still have space used for the
entire packet, and we don't want to update the credit for the other
peer until we free the space of the entire packet. However, this
causes `rx_bytes` to be stale on partial reads.
Previously, this didn’t cause issues because `rx_bytes` was used only by
.stream_has_data(), and any unread portion of a packet implied data was
still available. However, since commit 93b808876682
("virtio/vsock: fix logic which reduces credit update messages"), we now
rely on `rx_bytes` to determine if a credit update should be sent when
the data in the RX queue drops below SO_RCVLOWAT value.
This patch fixes the accounting by updating `rx_bytes` with the number
of bytes actually read, even on partial reads, while leaving `fwd_cnt`
untouched until the packet is fully consumed. Also introduce a new
`buf_used` counter to check that the remote peer is honoring the given
credit; this was previously done via `rx_bytes`.
Fixes: 93b808876682 ("virtio/vsock: fix logic which reduces credit update messages")
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Link: https://patch.msgid.link/20250521121705.196379-1-sgarzare@redhat.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull vfs mount updates from Christian Brauner:
"This contains minor mount updates for this cycle:
- mnt->mnt_devname can never be NULL so simplify the code handling
that case
- Add a comment about concurrent changes during statmount() and
listmount()
- Update the STATMOUNT_SUPPORTED macro
- Convert mount flags to an enum"
* tag 'vfs-6.16-rc1.mount' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
statmount: update STATMOUNT_SUPPORTED macro
fs: convert mount flags to enum
->mnt_devname is never NULL
mount: add a comment about concurrent changes with statmount()/listmount()
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next
Pablo Neira Ayuso says:
====================
Netfilter updates for net-next
The following batch contains Netfilter updates for net-next,
specifically 26 patches: 5 patches adding/updating selftests,
4 fixes, 3 PREEMPT_RT fixes, and 14 patches to enhance nf_tables):
1) Improve selftest coverage for pipapo 4 bit group format, from
Florian Westphal.
2) Fix incorrect dependencies when compiling a kernel without
legacy ip{6}tables support, also from Florian.
3) Two patches to fix nft_fib vrf issues, including selftest updates
to improve coverage, also from Florian Westphal.
4) Fix incorrect nesting in nft_tunnel's GENEVE support, from
Fernando F. Mancera.
5) Three patches to fix PREEMPT_RT issues with nf_dup infrastructure
and nft_inner to match in inner headers, from Sebastian Andrzej Siewior.
6) Integrate conntrack information into nft trace infrastructure,
from Florian Westphal.
7) A series of 13 patches to allow to specify wildcard netdevice in
netdev basechain and flowtables, eg.
table netdev filter {
chain ingress {
type filter hook ingress devices = { eth0, eth1, vlan* } priority 0; policy accept;
}
}
This also allows for runtime hook registration on NETDEV_{UN}REGISTER
event, from Phil Sutter.
netfilter pull request 25-05-23
* tag 'nf-next-25-05-23' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next: (26 commits)
selftests: netfilter: Torture nftables netdev hooks
netfilter: nf_tables: Add notifications for hook changes
netfilter: nf_tables: Support wildcard netdev hook specs
netfilter: nf_tables: Sort labels in nft_netdev_hook_alloc()
netfilter: nf_tables: Handle NETDEV_CHANGENAME events
netfilter: nf_tables: Wrap netdev notifiers
netfilter: nf_tables: Respect NETDEV_REGISTER events
netfilter: nf_tables: Prepare for handling NETDEV_REGISTER events
netfilter: nf_tables: Have a list of nf_hook_ops in nft_hook
netfilter: nf_tables: Pass nf_hook_ops to nft_unregister_flowtable_hook()
netfilter: nf_tables: Introduce nft_register_flowtable_ops()
netfilter: nf_tables: Introduce nft_hook_find_ops{,_rcu}()
netfilter: nf_tables: Introduce functions freeing nft_hook objects
netfilter: nf_tables: add packets conntrack state to debug trace info
netfilter: conntrack: make nf_conntrack_id callable without a module dependency
netfilter: nf_dup_netdev: Move the recursion counter struct netdev_xmit
netfilter: nft_inner: Use nested-BH locking for nft_pcpu_tun_ctx
netfilter: nf_dup{4, 6}: Move duplication check to task_struct
netfilter: nft_tunnel: fix geneve_opt dump
selftests: netfilter: nft_fib.sh: add type and oif tests with and without VRFs
...
====================
Link: https://patch.msgid.link/20250523132712.458507-1-pablo@netfilter.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Merge ACPI processor driver updates and ACPI CPPC library updates for
6.16-rc1:
- Clean up the initialization of CPU data structures in the ACPI
processor driver (Zhang Rui).
- Remove an obsolete comment regarding the C-states handling in the
ACPI processor driver (Giovanni Gherdovich).
- Simplify PCC shared memory region handling (Sudeep Holla).
- Rework and extend functions for reading CPPC register values and for
updating CPPC registers (Lifeng Zheng).
- Add three functions related to autonomous CPU performance state
selection to the CPPC library (Lifeng Zheng).
* acpi-processor:
ACPI: processor: idle: Remove redundant pr->power.count assignment
ACPI: processor: idle: Set pr->flags.power unconditionally
ACPI: processor: idle: Remove obsolete comment
* acpi-cppc:
ACPI: CPPC: Add three functions related to autonomous selection
ACPI: CPPC: Modify cppc_get_auto_sel_caps() to cppc_get_auto_sel()
ACPI: CPPC: Refactor register value get and set ABIs
ACPI: CPPC: Add cppc_set_reg_val()
ACPI: CPPC: Extract cppc_get_reg_val_in_pcc()
ACPI: CPPC: Rename cppc_get_perf() to cppc_get_reg_val()
ACPI: CPPC: Optimize cppc_get_perf()
ACPI: CPPC: Add IS_OPTIONAL_CPC_REG macro to judge if a cpc_reg is optional
ACPI: CPPC: Simplify PCC shared memory region handling
ACPI: PCC: Simplify PCC shared memory region handling
|
|
Merge updates related to the handling of static (data-only) ACPI tables
for 6.16-rc1:
- Add __nonstring annotations for unterminated strings in the static
ACPI tables parsing code (Kees Cook).
- Add support for parsing the MRRM ACPI table and sysfs files to
describe memory regions listed in it (Tony Luck, Anil Keshavamurthy).
- Remove an (explicitly) unused header file include from the VIOT ACPI
table parser file (Andy Shevchenko).
- Improve logging around acpi_initialize_tables() (Bartosz Szczepanek).
* acpi-tables:
ACPI: MRRM: Fix default max memory region
ACPI: tables: Improve logging around acpi_initialize_tables()
ACPI: VIOT: Remove (explicitly) unused header
ACPI: Add documentation for exposing MRRM data
ACPI: MRRM: Add /sys files to describe memory ranges
ACPI: MRRM: Minimal parse of ACPI MRRM table
ACPI: tables: Add __nonstring annotations for unterminated strings
|
|
Merge ACPICA updates, including two upstream releases 20241212 and
20250404, for 6.16-rc1:
- Fix two ACPICA SLAB cache leaks (Seunghun Han).
- Add EINJv2 get error type action and define Error Injection Actions
in hex values to avoid inconsistencies between the specification and
the code (Zaid Alali).
- Fix typo in comments for SRAT structures (Adam Lackorzynski).
- Prevent possible loss of data in ACPICA because of u32 to u8
conversions (Saket Dumbre).
- Fix reading FFixedHW operation regions in ACPICA (Daniil Tatianin).
- Add support for printing AML arguments when the ACPICA debug level is
ACPI_LV_TRACE_POINT (Mario Limonciello).
- Drop a stale comment about the file content from actbl2.h (Sudeep
Holla).
- Apply pack(1) to union aml_resource (Tamir Duberstein).
- Fix overflow check in the ACPICA version of vsnprintf() (gldrk).
- Interpret SIDP structures in DMAR added revision 3.4 of the VT-d
specification (Alexey Neyman).
- Add typedef and other definitions related to MRRM to ACPICA (Tony
Luck).
- Add definitions for RIMT to ACPICA (Sunil V L).
- Fix spelling mistake "Incremement" -> "Increment" in the ACPICA
utilities code (Colin Ian King).
- Add typedef and other definitions for ERDT to ACPICA (Tony Luck).
- Introduce ACPI_NONSTRING and use it (Kees Cook, Ahmed Salem).
- Rename structure and field names of the RAS2 table in actbl2.h (Shiju
Jose).
- Fix up whitespace in acpica/utcache.c (Zhe Qiao).
- Avoid sequence overread in a call to strncmp() in ap_get_table_length()
and replace strncpy() with memcpy() in ACPICA in some places (Ahmed
Salem).
- Update copyright year in all ACPICA files (Saket Dumbre).
* acpica: (30 commits)
ACPICA: Update copyright year
ACPICA: Logfile: Changes for version 20250404
ACPICA: Replace strncpy() with memcpy()
ACPICA: Apply ACPI_NONSTRING in more places
ACPICA: Avoid sequence overread in call to strncmp()
ACPICA: Adjust the position of code lines
ACPICA: actbl2.h: ACPI 6.5: RAS2: Rename structure and field names of the RAS2 table
ACPICA: Apply ACPI_NONSTRING
ACPICA: Introduce ACPI_NONSTRING
ACPICA: actbl2.h: ERDT: Add typedef and other definitions
ACPICA: infrastructure: Add new DMT_BUF types and shorten a long name
ACPICA: Utilities: Fix spelling mistake "Incremement" -> "Increment"
ACPICA: MRRM: Some cleanups
ACPICA: actbl2: Add definitions for RIMT
ACPICA: actbl2.h: MRRM: Add typedef and other definitions
ACPICA: infrastructure: Add new header and ACPI_DMT_BUF26 types
ACPICA: Interpret SIDP structures in DMAR
ACPICA: utilities: Fix overflow check in vsnprintf()
ACPICA: Apply pack(1) to union aml_resource
ACPICA: Drop stale comment about the header file content
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull vfs freezing updates from Christian Brauner:
"This contains various filesystem freezing related work for this cycle:
- Allow the power subsystem to support filesystem freeze for suspend
and hibernate.
Now all the pieces are in place to actually allow the power
subsystem to freeze/thaw filesystems during suspend/resume.
Filesystems are only frozen and thawed if the power subsystem does
actually own the freeze.
If the filesystem is already frozen by the time we've frozen all
userspace processes we don't care to freeze it again. That's
userspace's job once the process resumes. We only actually freeze
filesystems if we absolutely have to and we ignore other failures
to freeze.
We could bubble up errors and fail suspend/resume if the error
isn't EBUSY (aka it's already frozen) but I don't think that this
is worth it. Filesystem freezing during suspend/resume is
best-effort. If the user has 500 ext4 filesystems mounted and 4
fail to freeze for whatever reason then we simply skip them.
What we have now is already a big improvement and let's see how we
fare with it before making our lives even harder (and uglier) than
we have to.
- Allow efivars to support freeze and thaw
Allow efivarfs to partake to resync variable state during system
hibernation and suspend. Add freeze/thaw support.
This is a pretty straightforward implementation. We simply add
regular freeze/thaw support for both userspace and the kernel.
efivars is the first pseudofilesystem that adds support for
filesystem freezing and thawing.
The simplicity comes from the fact that we simply always resync
variable state after efivarfs has been frozen. It doesn't matter
whether that's because of suspend, userspace initiated freeze or
hibernation. Efivars is simple enough that it doesn't matter that
we walk all dentries. There are no directories and there aren't
insane amounts of entries and both freeze/thaw are already
heavy-handed operations. If userspace initiated a freeze/thaw cycle
they would need CAP_SYS_ADMIN in the initial user namespace (as
that's where efivarfs is mounted) so it can't be triggered by
random userspace. IOW, we really really don't care"
* tag 'vfs-6.16-rc1.super' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
f2fs: fix freezing filesystem during resize
kernfs: add warning about implementing freeze/thaw
efivarfs: support freeze/thaw
power: freeze filesystems during suspend/resume
libfs: export find_next_child()
super: add filesystem freezing helpers for suspend and hibernate
gfs2: pass through holder from the VFS for freeze/thaw
super: use common iterator (Part 2)
super: use a common iterator (Part 1)
super: skip dying superblocks early
super: simplify user_get_super()
super: remove pointless s_root checks
fs: allow all writers to be frozen
locking/percpu-rwsem: add freezable alternative to down_read
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next
Steffen Klassert says:
====================
1) Remove some unnecessary strscpy_pad() size arguments.
From Thorsten Blum.
2) Correct use of xso.real_dev on bonding offloads.
Patchset from Cosmin Ratiu.
3) Add hardware offload configuration to XFRM_MSG_MIGRATE.
From Chiachang Wang.
4) Refactor migration setup during cloning. This was
done after the clone was created. Now it is done
in the cloning function itself.
From Chiachang Wang.
5) Validate assignment of maximal possible SEQ number.
Prevent from setting to the maximum sequrnce number
as this would cause for traffic drop.
From Leon Romanovsky.
6) Prevent configuration of interface index when offload
is used. Hardware can't handle this case.i
From Leon Romanovsky.
7) Always use kfree_sensitive() for SA secret zeroization.
From Zilin Guan.
ipsec-next-2025-05-23
* tag 'ipsec-next-2025-05-23' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next:
xfrm: use kfree_sensitive() for SA secret zeroization
xfrm: prevent configuration of interface index when offload is used
xfrm: validate assignment of maximal possible SEQ number
xfrm: Refactor migration setup during the cloning process
xfrm: Migrate offload configuration
bonding: Fix multiple long standing offload races
bonding: Mark active offloaded xfrm_states
xfrm: Add explicit dev to .xdo_dev_state_{add,delete,free}
xfrm: Remove unneeded device check from validate_xmit_xfrm
xfrm: Use xdo.dev instead of xdo.real_dev
net/mlx5: Avoid using xso.real_dev unnecessarily
xfrm: Remove unnecessary strscpy_pad() size arguments
====================
Link: https://patch.msgid.link/20250523075611.3723340-1-steffen.klassert@secunet.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next
Marc Kleine-Budde says:
====================
pull-request: can-next 2025-05-22
this is a pull request of 22 patches for net-next/main.
The series by Biju Das contains 19 patches and adds RZ/G3E CANFD
support to the rcar_canfd driver.
The patch by Vincent Mailhol adds a struct data_bittiming_params to
group FD parameters as a preparation patch for CAN-XL support.
Felix Maurer's patch imports tst-filter from can-tests into the kernel
self tests and Vincent Mailhol adds support for physical CAN
interfaces.
linux-can-next-for-6.16-20250522
* tag 'linux-can-next-for-6.16-20250522' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next: (22 commits)
selftests: can: test_raw_filter.sh: add support of physical interfaces
selftests: can: Import tst-filter from can-tests
can: dev: add struct data_bittiming_params to group FD parameters
can: rcar_canfd: Add RZ/G3E support
can: rcar_canfd: Enhance multi_channel_irqs handling
can: rcar_canfd: Add external_clk variable to struct rcar_canfd_hw_info
can: rcar_canfd: Add sh variable to struct rcar_canfd_hw_info
can: rcar_canfd: Add struct rcanfd_regs variable to struct rcar_canfd_hw_info
can: rcar_canfd: Add shared_can_regs variable to struct rcar_canfd_hw_info
can: rcar_canfd: Add ch_interface_mode variable to struct rcar_canfd_hw_info
can: rcar_canfd: Add {nom,data}_bittiming variables to struct rcar_canfd_hw_info
can: rcar_canfd: Add max_cftml variable to struct rcar_canfd_hw_info
can: rcar_canfd: Add max_aflpn variable to struct rcar_canfd_hw_info
can: rcar_canfd: Add rnc_field_width variable to struct rcar_canfd_hw_info
can: rcar_canfd: Update RCANFD_GAFLCFG macro
can: rcar_canfd: Add rcar_canfd_setrnc()
can: rcar_canfd: Drop the mask operation in RCANFD_GAFLCFG_SETRNC macro
can: rcar_canfd: Update RCANFD_GERFL_ERR macro
can: rcar_canfd: Drop RCANFD_GAFLCFG_GETRNC macro
can: rcar_canfd: Use of_get_available_child_by_name()
...
====================
Link: https://patch.msgid.link/20250522084128.501049-1-mkl@pengutronix.de
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Send link events one after another otherwise new message
is overwriting the message which is being processed by PF.
Fixes: a88e0f936ba9 ("octeontx2: Detect the mbox up or down message via register")
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/1747823443-404-1-git-send-email-sbhatta@marvell.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull misc vfs updates from Christian Brauner:
"This contains the usual selections of misc updates for this cycle.
Features:
- Use folios for symlinks in the page cache
FUSE already uses folios for its symlinks. Mirror that conversion
in the generic code and the NFS code. That lets us get rid of a few
folio->page->folio conversions in this path, and some of the few
remaining users of read_cache_page() / read_mapping_page()
- Try and make a few filesystem operations killable on the VFS
inode->i_mutex level
- Add sysctl vfs_cache_pressure_denom for bulk file operations
Some workloads need to preserve more dentries than we currently
allow through out sysctl interface
A HDFS servers with 12 HDDs per server, on a HDFS datanode startup
involves scanning all files and caching their metadata (including
dentries and inodes) in memory. Each HDD contains approximately 2
million files, resulting in a total of ~20 million cached dentries
after initialization
To minimize dentry reclamation, they set vfs_cache_pressure to 1.
Despite this configuration, memory pressure conditions can still
trigger reclamation of up to 50% of cached dentries, reducing the
cache from 20 million to approximately 10 million entries. During
the subsequent cache rebuild period, any HDFS datanode restart
operation incurs substantial latency penalties until full cache
recovery completes
To maintain service stability, more dentries need to be preserved
during memory reclamation. The current minimum reclaim ratio (1/100
of total dentries) remains too aggressive for such workload. This
patch introduces vfs_cache_pressure_denom for more granular cache
pressure control
The configuration [vfs_cache_pressure=1,
vfs_cache_pressure_denom=10000] effectively maintains the full 20
million dentry cache under memory pressure, preventing datanode
restart performance degradation
- Avoid some jumps in inode_permission() using likely()/unlikely()
- Avid a memory access which is most likely a cache miss when
descending into devcgroup_inode_permission()
- Add fastpath predicts for stat() and fdput()
- Anonymous inodes currently don't come with a proper mode causing
issues in the kernel when we want to add useful VFS debug assert.
Fix that by giving them a proper mode and masking it off when we
report it to userspace which relies on them not having any mode
- Anonymous inodes currently allow to change inode attributes because
the VFS falls back to simple_setattr() if i_op->setattr isn't
implemented. This means the ownership and mode for every single
user of anon_inode_inode can be changed. Block that as it's either
useless or actively harmful. If specific ownership is needed the
respective subsystem should allocate anonymous inodes from their
own private superblock
- Raise SB_I_NODEV and SB_I_NOEXEC on the anonymous inode superblock
- Add proper tests for anonymous inode behavior
- Make it easy to detect proper anonymous inodes and to ensure that
we can detect them in codepaths such as readahead()
Cleanups:
- Port pidfs to the new anon_inode_{g,s}etattr() helpers
- Try to remove the uselib() system call
- Add unlikely branch hint return path for poll
- Add unlikely branch hint on return path for core_sys_select
- Don't allow signals to interrupt getdents copying for fuse
- Provide a size hint to dir_context for during readdir()
- Use writeback_iter directly in mpage_writepages
- Update compression and mtime descriptions in initramfs
documentation
- Update main netfs API document
- Remove useless plus one in super_cache_scan()
- Remove unnecessary NULL-check guards during setns()
- Add separate separate {get,put}_cgroup_ns no-op cases
Fixes:
- Fix typo in root= kernel parameter description
- Use KERN_INFO for infof()|info_plog()|infofc()
- Correct comments of fs_validate_description()
- Mark an unlikely if condition with unlikely() in
vfs_parse_monolithic_sep()
- Delete macro fsparam_u32hex()
- Remove unused and problematic validate_constant_table()
- Fix potential unsigned integer underflow in fs_name()
- Make file-nr output the total allocated file handles"
* tag 'vfs-6.16-rc1.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (43 commits)
fs: Pass a folio to page_put_link()
nfs: Use a folio in nfs_get_link()
fs: Convert __page_get_link() to use a folio
fs/read_write: make default_llseek() killable
fs/open: make do_truncate() killable
fs/open: make chmod_common() and chown_common() killable
include/linux/fs.h: add inode_lock_killable()
readdir: supply dir_context.count as readdir buffer size hint
vfs: Add sysctl vfs_cache_pressure_denom for bulk file operations
fuse: don't allow signals to interrupt getdents copying
Documentation: fix typo in root= kernel parameter description
include/cgroup: separate {get,put}_cgroup_ns no-op case
kernel/nsproxy: remove unnecessary guards
fs: use writeback_iter directly in mpage_writepages
fs: remove useless plus one in super_cache_scan()
fs: add S_ANON_INODE
fs: remove uselib() system call
device_cgroup: avoid access to ->i_rdev in the common case in devcgroup_inode_permission()
fs/fs_parse: Remove unused and problematic validate_constant_table()
fs: touch up predicts in inode_permission()
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull vfs mount api conversions from Christian Brauner:
"This converts the bfs and omfs filesystems to the new mount api"
* tag 'vfs-6.16-rc1.mount.api' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
omfs: convert to new mount API
bfs: convert bfs to use the new mount api
|
|
Jakub suggests:
> I have a different request :) Matt, once this ends up in net-next
> (end of this week) could you refactor it to use nlmsg_payload() ?
> It doesn't exist in net but this is exactly why it was added.
This refactors the additions to both mctp_dump_addrinfo(), and
mctp_rtm_getneigh() - two cases where we're calling nlh_data() on an
an incoming netlink message, without a prior nlmsg_parse().
For the neigh.c case, we cannot hit the failure where the nlh does not
contain a full ndmsg at present, as the core handler
(net/core/neighbour.c, neigh_get()) has already validated the size
through neigh_valid_req_get(), and would have failed the get operation
before the MCTP hander is called.
However, relying on that is a bit fragile, so apply the nlmsg_payload
refector here too.
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au>
Link: https://patch.msgid.link/20250521-mctp-nlmsg-payload-v2-1-e85df160c405@codeconstruct.com.au
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
'add-the-capability-to-consume-sram-for-hwfd-descriptor-queue-in-airoha_eth-driver'
Lorenzo Bianconi says:
====================
Add the capability to consume SRAM for hwfd descriptor queue in airoha_eth driver
In order to improve packet processing and packet forwarding
performances, EN7581 SoC supports consuming SRAM instead of DRAM for hw
forwarding descriptors queue. For downlink hw accelerated traffic
request to consume SRAM memory for hw forwarding descriptors queue.
Moreover, in some configurations QDMA blocks require a contiguous block
of system memory for hwfd buffers queue. Introduce the capability to
allocate hw buffers forwarding queue via the reserved-memory DTS
property instead of running dmam_alloc_coherent().
v2: https://lore.kernel.org/r/20250509-airopha-desc-sram-v2-0-9dc3d8076dfb@kernel.org
v1: https://lore.kernel.org/r/20250507-airopha-desc-sram-v1-0-d42037431bfa@kernel.org
====================
Link: https://patch.msgid.link/20250521-airopha-desc-sram-v3-0-a6e9b085b4f0@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
In order to improve packet processing and packet forwarding
performances, EN7581 SoC supports consuming SRAM instead of DRAM for
hw forwarding descriptors queue.
For downlink hw accelerated traffic request to consume SRAM memory
for hw forwarding descriptors queue.
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250521-airopha-desc-sram-v3-4-a6e9b085b4f0@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
In some configurations QDMA blocks require a contiguous block of
system memory for hwfd buffers queue. Introduce the capability to allocate
hw buffers forwarding queue via the reserved-memory DTS property instead of
running dmam_alloc_coherent().
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250521-airopha-desc-sram-v3-3-a6e9b085b4f0@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Since hfwd descriptor and buffer queues are allocated via
dmam_alloc_coherent() we do not need to store their references
in airoha_qdma struct. This patch does not introduce any logical changes,
just code clean-up.
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250521-airopha-desc-sram-v3-2-a6e9b085b4f0@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Introduce memory-region and memory-region-names properties for the
ethernet node available on EN7581 SoC in order to reserve system memory
for hw forwarding buffers queue used by the QDMA modules.
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Acked-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://patch.msgid.link/20250521-airopha-desc-sram-v3-1-a6e9b085b4f0@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Jiawen Wu says:
====================
Support phylink and link/gpio irqs for AML 25G/10G devices, and complete
PTP and SRIOV.
====================
Link: https://patch.msgid.link/
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Since .mac_link_up and .mac_link_down are changed for AML 25G/10G NICs,
the SR-IOV related function should be invoked in these new functions, to
bring VFs link up.
Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
Link: https://patch.msgid.link/BA8B302B7AAB6EA6+20250521064402.22348-10-jiawenwu@trustnetic.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Support PTP clock and 1PPS output signal for AML devices.
Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
Link: https://patch.msgid.link/F2F6E5E8899D2C20+20250521064402.22348-9-jiawenwu@trustnetic.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
The new added mailbox commands require a new released firmware version.
Otherwise, a lot of logs "Unknown FW command" would be printed. And the
devices may not work properly. So add the test command in the probe
function.
Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/18283F17BE0FA335+20250521064402.22348-8-jiawenwu@trustnetic.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
For AML 25G/10G devices, some of the information returned from
phylink_ethtool_ksettings_get() is not correct, since there is a
fixed-link mode. So add additional corrections.
Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/C94BF867617C544D+20250521064402.22348-7-jiawenwu@trustnetic.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
The driver needs to handle GPIO interrupts to identify SFP module and
configure PHY by sending mailbox messages to firmware.
Since the SFP module needs to wait for ready to get information when it
is inserted, workqueue is added to handle delayed tasks. And each SW-FW
interaction takes time to wait, so they are processed in the workqueue
instead of IRQ handler function.
Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/399624AF221E8E28+20250521064402.22348-6-jiawenwu@trustnetic.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
There is a new PHY attached to AML 25G/10G NIC, which is different from
SP 10G/1G NIC. But the PHY configuration is handed over to firmware, and
also I2C is controlled by firmware. So the different PHYLINK fixed-link
mode is added for these devices.
Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/987B973A5929CD48+20250521064402.22348-5-jiawenwu@trustnetic.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
For the following patches to support PHYLINK for AML 25G devices,
separate MAC type wx_mac_aml40 to maintain the driver of 40G devices.
Because 40G devices will complete support later, not now.
And this patch makes the 25G devices use some PHYLINK interfaces, but it
is not yet create PHYLINK and cannot be used on its own. It is just
preparation for the next patches.
Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/592B1A6920867D0C+20250521064402.22348-4-jiawenwu@trustnetic.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Most of the different code that requires MAC type in the common library
is due to NGBE only supports a few queues and pools, unlike TXGBE, which
supports 128 queues and 64 pools. This difference accounts for most of
the hardware configuration differences in the driver code. So add a flag
bit "WX_FLAG_MULTI_64_FUNC" for them to clean-up the driver code.
Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/C731132E124D75E5+20250521064402.22348-3-jiawenwu@trustnetic.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Since AML devices are going to reuse some definitions, remove the "SP"
qualifier from these definitions.
Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/8EF712EC14B8FF70+20250521064402.22348-2-jiawenwu@trustnetic.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull final writepage conversion from Christian Brauner:
"This converts vboxfs from ->writepage() to ->writepages().
This was the last user of the ->writepage() method. So remove
->writepage() completely and all references to it"
* tag 'vfs-6.16-rc1.writepage' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
fs: Remove aops->writepage
mm: Remove swap_writepage() and shmem_writepage()
ttm: Call shmem_writeout() from ttm_backup_backup_page()
i915: Use writeback_iter()
shmem: Add shmem_writeout()
writeback: Remove writeback_use_writepage()
migrate: Remove call to ->writepage
vboxsf: Convert to writepages
9p: Add a migrate_folio method
|
|
The KSZ9477 switch driver uses the XPCS driver to operate its SGMII
port. However there are some hardware bugs in the KSZ9477 SGMII
module so workarounds are needed. There was a proposal to update the
XPCS driver to accommodate KSZ9477, but the new code is not generic
enough to be used by other vendors. It is better to do all these
workarounds inside the KSZ9477 driver instead of modifying the XPCS
driver.
There are 3 hardware issues. The first is the MII_ADVERTISE register
needs to be write once after reset for the correct code word to be
sent. The XPCS driver disables auto-negotiation first before
configuring the SGMII/1000BASE-X mode and then enables it back. The
KSZ9477 driver then writes the MII_ADVERTISE register before enabling
auto-negotiation. In 1000BASE-X mode the MII_ADVERTISE register will
be set, so KSZ9477 driver does not need to write it.
The second issue is the MII_BMCR register needs to set the exact speed
and duplex mode when running in SGMII mode. During link polling the
KSZ9477 will check the speed and duplex mode are different from
previous ones and update the MII_BMCR register accordingly.
The last issue is 1000BASE-X mode does not work with auto-negotiation
on. The cause is the local port hardware does not know the link is up
and so network traffic is not forwarded. The workaround is to write 2
additional bits when 1000BASE-X mode is configured.
Note the SGMII interrupt in the port cannot be masked. As that
interrupt is not handled in the KSZ9477 driver the SGMII interrupt bit
will not be set even when the XPCS driver sets it.
Signed-off-by: Tristram Ha <tristram.ha@microchip.com>
Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Tested-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Link: https://patch.msgid.link/20250520230720.23425-1-Tristram.Ha@microchip.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull vfs directory lookup updates from Christian Brauner:
"This contains cleanups for the lookup_one*() family of helpers.
We expose a set of functions with names containing "lookup_one_len"
and others without the "_len". This difference has nothing to do with
"len". It's rater a historical accident that can be confusing.
The functions without "_len" take a "mnt_idmap" pointer. This is found
in the "vfsmount" and that is an important question when choosing
which to use: do you have a vfsmount, or are you "inside" the
filesystem. A related question is "is permission checking relevant
here?".
nfsd and cachefiles *do* have a vfsmount but *don't* use the non-_len
functions. They pass nop_mnt_idmap and refuse to work on filesystems
which have any other idmap.
This work changes nfsd and cachefile to use the lookup_one family of
functions and to explictily pass &nop_mnt_idmap which is consistent
with all other vfs interfaces used where &nop_mnt_idmap is explicitly
passed.
The remaining uses of the "_one" functions do not require permission
checks so these are renamed to be "_noperm" and the permission
checking is removed.
This series also changes these lookup function to take a qstr instead
of separate name and len. In many cases this simplifies the call"
* tag 'vfs-6.16-rc1.async.dir' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
VFS: change lookup_one_common and lookup_noperm_common to take a qstr
Use try_lookup_noperm() instead of d_hash_and_lookup() outside of VFS
VFS: rename lookup_one_len family to lookup_noperm and remove permission check
cachefiles: Use lookup_one() rather than lookup_one_len()
nfsd: Use lookup_one() rather than lookup_one_len()
VFS: improve interface for lookup_one functions
|
|
Syzkaller, courtesy of syzbot, identified an error (see report [1]) in
aqc111 driver, caused by incomplete sanitation of usb read calls'
results. This problem is quite similar to the one fixed in commit
920a9fa27e78 ("net: asix: add proper error handling of usb read errors").
For instance, usbnet_read_cmd() may read fewer than 'size' bytes,
even if the caller expected the full amount, and aqc111_read_cmd()
will not check its result properly. As [1] shows, this may lead
to MAC address in aqc111_bind() being only partly initialized,
triggering KMSAN warnings.
Fix the issue by verifying that the number of bytes read is
as expected and not less.
[1] Partial syzbot report:
BUG: KMSAN: uninit-value in is_valid_ether_addr include/linux/etherdevice.h:208 [inline]
BUG: KMSAN: uninit-value in usbnet_probe+0x2e57/0x4390 drivers/net/usb/usbnet.c:1830
is_valid_ether_addr include/linux/etherdevice.h:208 [inline]
usbnet_probe+0x2e57/0x4390 drivers/net/usb/usbnet.c:1830
usb_probe_interface+0xd01/0x1310 drivers/usb/core/driver.c:396
call_driver_probe drivers/base/dd.c:-1 [inline]
really_probe+0x4d1/0xd90 drivers/base/dd.c:658
__driver_probe_device+0x268/0x380 drivers/base/dd.c:800
...
Uninit was stored to memory at:
dev_addr_mod+0xb0/0x550 net/core/dev_addr_lists.c:582
__dev_addr_set include/linux/netdevice.h:4874 [inline]
eth_hw_addr_set include/linux/etherdevice.h:325 [inline]
aqc111_bind+0x35f/0x1150 drivers/net/usb/aqc111.c:717
usbnet_probe+0xbe6/0x4390 drivers/net/usb/usbnet.c:1772
usb_probe_interface+0xd01/0x1310 drivers/usb/core/driver.c:396
...
Uninit was stored to memory at:
ether_addr_copy include/linux/etherdevice.h:305 [inline]
aqc111_read_perm_mac drivers/net/usb/aqc111.c:663 [inline]
aqc111_bind+0x794/0x1150 drivers/net/usb/aqc111.c:713
usbnet_probe+0xbe6/0x4390 drivers/net/usb/usbnet.c:1772
usb_probe_interface+0xd01/0x1310 drivers/usb/core/driver.c:396
call_driver_probe drivers/base/dd.c:-1 [inline]
...
Local variable buf.i created at:
aqc111_read_perm_mac drivers/net/usb/aqc111.c:656 [inline]
aqc111_bind+0x221/0x1150 drivers/net/usb/aqc111.c:713
usbnet_probe+0xbe6/0x4390 drivers/net/usb/usbnet.c:1772
Reported-by: syzbot+3b6b9ff7b80430020c7b@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=3b6b9ff7b80430020c7b
Tested-by: syzbot+3b6b9ff7b80430020c7b@syzkaller.appspotmail.com
Fixes: df2d59a2ab6c ("net: usb: aqc111: Add support for getting and setting of MAC address")
Signed-off-by: Nikita Zhandarovich <n.zhandarovich@fintech.ru>
Link: https://patch.msgid.link/20250520113240.2369438-1-n.zhandarovich@fintech.ru
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
'struct thermal_zone_device_ops' could be left unmodified in this driver.
Constifying this structure moves some data to a read-only section, so
increases overall security, especially when the structure holds some
function pointers.
While at it, also constify a struct thermal_zone_params.
On a x86_64, with allmodconfig:
Before:
======
text data bss dec hex filename
26422 12584 512 39518 9a5e drivers/platform/x86/acerhdf.o
After:
=====
text data bss dec hex filename
26646 12360 512 39518 9a5e drivers/platform/x86/acerhdf.o
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Link: https://lore.kernel.org/r/e502fadf2c6b24fc4ec3a7880533f7ca68429720.1748177235.git.christophe.jaillet@wanadoo.fr
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
|
|
xfstests generic/482 tests the file system consistency after each
FUA operation. It fails when run on exfat.
exFAT clears the volume dirty flag with a FUA operation during sync.
Since s_lock is not held when data is being written to a file, sync
can be executed at the same time. When data is being written to a
file, the FAT chain is updated first, and then the file size is
updated. If sync is executed between updating them, the length of the
FAT chain may be inconsistent with the file size.
To avoid the situation where the file system is inconsistent but the
volume dirty flag is cleared, this commit moves the clearing of the
volume dirty flag from exfat_fs_sync() to exfat_put_super(), so that
the volume dirty flag is not cleared until unmounting. After the
move, there is no additional action during sync, so exfat_fs_sync()
can be deleted.
Reviewed-by: Sungjong Seo <sj1557.seo@samsung.com>
Signed-off-by: Yuezhang Mo <Yuezhang.Mo@sony.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
|
|
The double free could happen in the following path.
exfat_create_upcase_table()
exfat_create_upcase_table() : return error
exfat_free_upcase_table() : free ->vol_utbl
exfat_load_default_upcase_table : return error
exfat_kill_sb()
delayed_free()
exfat_free_upcase_table() <--------- double free
This patch set ->vol_util as NULL after freeing it.
Reported-by: Jianzhou Zhao <xnxc22xnxc22@qq.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
|
|
The doctest example uses a function only available for CONFIG_OF and so
the build with doc tests fails when it isn't enabled.
error[E0599]: no function or associated item named `from_of_cpumask`
found for struct `rust_doctest_kernel_alloc_kbox_rs_4::kernel::opp::Table`
in the current scope
Fix this by making the doctest depend on CONFIG_OF.
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202505260856.ZQWHW2xT-lkp@intel.com/
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Link: https://patch.msgid.link/a80bfedcb4d94531dc27d3b48062db5042078e88.1748237646.git.viresh.kumar@linaro.org
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm
Merge operating performance points (OPP) updates for 6.16 from Viresh
Kumar:
"- OPP: Add dev_pm_opp_set_level() (Praveen Talari).
- Introduce scope-based cleanup headers and mutex locking guards in OPP
core (Viresh Kumar).
- switch to use kmemdup_array() (Zhang Enpei)."
* tag 'opp-updates-6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm:
OPP: switch to use kmemdup_array()
OPP: Add dev_pm_opp_set_level()
OPP: Use mutex locking guards
OPP: Define and use scope-based cleanup helpers
OPP: Use scope-based OF cleanup helpers
OPP: Return opp_table from dev_pm_opp_get_opp_table_ref()
OPP: Return opp from dev_pm_opp_get()
OPP: Remove _get_opp_table_kref()
|
|
Lower the I2C2 bus clock frequency on the RZ/G3E SMARC SoM from 1MHz to
400kHz to improve compatibility with a wider range of I2C peripherals.
As the GreenPAK device is programmed to operate at 400kHz, the previous
1MHz setting was too aggressive, causing it to experience timing issues.
Fixes: f7a98e256ee3 ("arm64: dts: renesas: rzg3e-smarc-som: Add I2C2 device pincontrol")
Signed-off-by: John Madieu <john.madieu.xa@bp.renesas.com>
Reviewed-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://lore.kernel.org/20250518220812.1480696-1-john.madieu.xa@bp.renesas.com
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
|
|
neigh_connected_output()
Replace kfree_skb() used in neigh_resolve_output() and
neigh_connected_output() with kfree_skb_reason().
Following new skb drop reason is added:
/* failed to fill the device hard header */
SKB_DROP_REASON_NEIGH_HH_FILLFAIL
Signed-off-by: Qiu Yutan <qiu.yutan@zte.com.cn>
Signed-off-by: Jiang Kun <jiang.kun2@zte.com.cn>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Xu Xin <xu.xin16@zte.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Use prime 3 for length to make offset slowly drift away.
Signed-off-by: Stanislav Fomichev <stfomichev@gmail.com>
Acked-by: Mina Almasry <almasrymina@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add new -z argument to specify max IOV size. By default, use
single large IOV.
Signed-off-by: Stanislav Fomichev <stfomichev@gmail.com>
Reviewed-by: Mina Almasry <almasrymina@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
sendmsg() with a single iov becomes ITER_UBUF, sendmsg() with multiple
iovs becomes ITER_IOVEC. iter_iov_len does not return correct
value for UBUF, so teach to treat UBUF differently.
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Pavel Begunkov <asml.silence@gmail.com>
Cc: Mina Almasry <almasrymina@google.com>
Fixes: bd61848900bf ("net: devmem: Implement TX path")
Signed-off-by: Stanislav Fomichev <stfomichev@gmail.com>
Acked-by: Mina Almasry <almasrymina@google.com>
Reviewed-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
irq_fpu_usable() incorrectly returned true before the FPU is
initialized. The x86 CPU onlining code can call sha256() to checksum
AMD microcode images, before the FPU is initialized. Since sha256()
recently gained a kernel-mode FPU optimized code path, a crash occurred
in kernel_fpu_begin_mask() during hotplug CPU onlining.
(The crash did not occur during boot-time CPU onlining, since the
optimized sha256() code is not enabled until subsys_initcalls run.)
Fix this by making irq_fpu_usable() return false before fpu__init_cpu()
has run. To do this without adding any additional overhead to
irq_fpu_usable(), replace the existing per-CPU bool in_kernel_fpu with
kernel_fpu_allowed which tracks both initialization and usage rather
than just usage. The initial state is false; FPU initialization sets it
to true; kernel-mode FPU sections toggle it to false and then back to
true; and CPU offlining restores it to the initial state of false.
Fixes: 11d7956d526f ("crypto: x86/sha256 - implement library instead of shash")
Reported-by: Ayush Jain <Ayush.Jain3@amd.com>
Closes: https://lore.kernel.org/r/20250516112217.GBaCcf6Yoc6LkIIryP@fat_crate.local
Signed-off-by: Eric Biggers <ebiggers@google.com>
Tested-by: Ayush Jain <Ayush.Jain3@amd.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
|