Age | Commit message (Collapse) | Author |
|
There is a spelling mistake in a lpfc_printf_log message. Fix it.
Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Link: https://lore.kernel.org/r/20250227225046.660865-1-colin.i.king@gmail.com
Reviewed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Commit b35108a51cf7 ("jiffies: Define secs_to_jiffies()") introduced
secs_to_jiffies(). As the value here is a multiple of 1000, use
secs_to_jiffies() instead of msecs_to_jiffies() to avoid the multiplication
This is converted using scripts/coccinelle/misc/secs_to_jiffies.cocci with
the following Coccinelle rules:
@depends on patch@
expression E;
@@
-msecs_to_jiffies(E * 1000)
+secs_to_jiffies(E)
-msecs_to_jiffies(E * MSEC_PER_SEC)
+secs_to_jiffies(E)
While here, convert some timeouts that are denominated in seconds
manually.
[mkp: Fix compilation error]
Signed-off-by: Easwar Hariharan <eahariha@linux.microsoft.com>
Link: https://lore.kernel.org/r/20250225-converge-secs-to-jiffies-part-two-v3-2-a43967e36c88@linux.microsoft.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Building with W=1 shows a warning about sas_v2_acpi_match being unused when
CONFIG_OF is disabled:
drivers/scsi/hisi_sas/hisi_sas_v2_hw.c:3635:36: error: unused variable 'sas_v2_acpi_match' [-Werror,-Wunused-const-variable]
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20250225163637.4169300-1-arnd@kernel.org
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
./drivers/ufs/host/ufs-rockchip.c:268:70-75: WARNING: conversion to bool not needed here.
Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=19055
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Link: https://lore.kernel.org/r/20250226021157.77934-1-jiapeng.chong@linux.alibaba.com
Acked-by: Shawn Lin <shawn.lin@rock-chips.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
A positive value is for the number of clocks obtained if assigned.
Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com>
Link: https://lore.kernel.org/r/1740552733-182527-1-git-send-email-shawn.lin@rock-chips.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
There is a spelling mistake in a dev_err message. Fix it.
Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Link: https://lore.kernel.org/r/20250225101142.161474-1-colin.i.king@gmail.com
Acked-by: Shawn Lin <shawn.lin@rock-chips.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Add quirk for a copper SFP that identifies itself as "FS" "SFP-10GM-T".
It uses RollBall protocol to talk to the PHY and needs 4 sec wait before
probing the PHY.
Signed-off-by: Martin Schiller <ms@dev.tdt.de>
Link: https://patch.msgid.link/20250227071058.1520027-1-ms@dev.tdt.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Commit 6639498ed85f ("mptcp: cleanup mem accounting")
removed the implementation but leave declaration.
Signed-off-by: Yue Haibing <yuehaibing@huawei.com>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20250228095148.4003065-1-yuehaibing@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Geliang Tang says:
====================
add sock_kmemdup helper
While developing MPTCP BPF path manager [1], I found it's useful to
add a new sock_kmemdup() helper.
My use case is this:
In mptcp_userspace_pm_append_new_local_addr() function (see patch 3
in this patchset), it uses sock_kmalloc() to allocate an address
entry "e", then immediately duplicate the input "entry" to it:
'''
e = sock_kmalloc(sk, sizeof(*e), GFP_ATOMIC);
if (!e) {
ret = -ENOMEM;
goto append_err;
}
*e = *entry;
'''
When I implemented MPTCP BPF path manager, I needed to implement a
code similar to this in BPF.
The kfunc sock_kmalloc() can be easily invoked in BPF to allocate
an entry "e", but the code "*e = *entry;" that assigns "entry" to
"e" is not easy to implemented.
I had to implement such a "copy entry" helper in BPF:
'''
static void mptcp_pm_copy_addr(struct mptcp_addr_info *dst,
struct mptcp_addr_info *src)
{
dst->id = src->id;
dst->family = src->family;
dst->port = src->port;
if (src->family == AF_INET) {
dst->addr.s_addr = src->addr.s_addr;
} else if (src->family == AF_INET6) {
dst->addr6.s6_addr32[0] = src->addr6.s6_addr32[0];
dst->addr6.s6_addr32[1] = src->addr6.s6_addr32[1];
dst->addr6.s6_addr32[2] = src->addr6.s6_addr32[2];
dst->addr6.s6_addr32[3] = src->addr6.s6_addr32[3];
}
}
static void mptcp_pm_copy_entry(struct mptcp_pm_addr_entry *dst,
struct mptcp_pm_addr_entry *src)
{
mptcp_pm_copy_addr(&dst->addr, &src->addr);
dst->flags = src->flags;
dst->ifindex = src->ifindex;
}
'''
And add "write permission" for BPF to each field of mptcp_pm_addr_entry:
'''
@@ static int bpf_mptcp_pm_btf_struct_access(struct bpf_verifier_log *log,
case offsetof(struct mptcp_pm_addr_entry, addr.port):
end = offsetofend(struct mptcp_pm_addr_entry, addr.port);
break;
#if IS_ENABLED(CONFIG_MPTCP_IPV6)
case offsetof(struct mptcp_pm_addr_entry, addr.addr6.s6_addr32[0]):
end = offsetofend(struct mptcp_pm_addr_entry, addr.addr6.s6_addr32[0]);
break;
case offsetof(struct mptcp_pm_addr_entry, addr.addr6.s6_addr32[1]):
end = offsetofend(struct mptcp_pm_addr_entry, addr.addr6.s6_addr32[1]);
break;
case offsetof(struct mptcp_pm_addr_entry, addr.addr6.s6_addr32[2]):
end = offsetofend(struct mptcp_pm_addr_entry, addr.addr6.s6_addr32[2]);
break;
case offsetof(struct mptcp_pm_addr_entry, addr.addr6.s6_addr32[3]):
end = offsetofend(struct mptcp_pm_addr_entry, addr.addr6.s6_addr32[3]);
break;
#else
case offsetof(struct mptcp_pm_addr_entry, addr.addr.s_addr):
end = offsetofend(struct mptcp_pm_addr_entry, addr.addr.s_addr);
break;
#endif
'''
But if there's a sock_kmemdup() helper, it will become much simpler,
only need to call kfunc sock_kmemdup() instead in BPF.
So this patchset adds this new helper and uses it in several places.
[1]
https://lore.kernel.org/mptcp/cover.1738924875.git.tanggeliang@kylinos.cn/
====================
Link: https://patch.msgid.link/cover.1740735165.git.tanggeliang@kylinos.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Instead of using sock_kmalloc() to allocate an address
entry "e" and then immediately duplicate the input "entry"
to it, the newly added sock_kmemdup() helper can be used in
mptcp_userspace_pm_append_new_local_addr() to simplify the code.
More importantly, the code "*e = *entry;" that assigns "entry"
to "e" is not easy to implemented in BPF if we use the same code
to implement an append_new_local_addr() helper of a BFP path
manager. This patch avoids this type of memory assignment
operation.
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Acked-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/3e5a307aed213038a87e44ff93b5793229b16279.1740735165.git.tanggeliang@kylinos.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Instead of using sock_kmalloc() to allocate an ip_options and then
immediately duplicate another ip_options to the newly allocated one in
ipv6_dup_options(), mptcp_copy_ip_options() and sctp_v4_copy_ip_options(),
the newly added sock_kmemdup() helper can be used to simplify the code.
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Acked-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/91ae749d66600ec6fb679e0e518fda6acb5c3e6f.1740735165.git.tanggeliang@kylinos.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
This patch adds the sock version of kmemdup() helper, named sock_kmemdup(),
to duplicate the input "src" memory block using the socket's option memory
buffer.
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Acked-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/f828077394c7d1f3560123497348b438c875b510.1740735165.git.tanggeliang@kylinos.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Eric Dumazet says:
====================
tcp: misc changes
Minor changes, following recent changes in TCP stack.
====================
Link: https://patch.msgid.link/20250301201424.2046477-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Remove one indentation level.
Use max_t() and clamp() macros.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Jason Xing <kerneljasonxing@gmail.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://patch.msgid.link/20250301201424.2046477-7-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
After commit 8d52da23b6c6 ("tcp: Defer ts_recent changes
until req is owned"), req->ts_recent is not changed anymore.
It is set once in tcp_openreq_init(), bpf_sk_assign_tcp_reqsk()
or cookie_tcp_reqsk_alloc() before the req can be seen by other
cpus/threads.
This completes the revert of eba20811f326 ("tcp: annotate
data-races around tcp_rsk(req)->ts_recent").
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Wang Hai <wanghai38@huawei.com>
Reviewed-by: Jason Xing <kerneljasonxing@gmail.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://patch.msgid.link/20250301201424.2046477-6-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
tcp4_check_fraglist_gro(), tcp6_check_fraglist_gro(),
udp4_gro_lookup_skb() and udp6_gro_lookup_skb()
assume RCU is held so that the net structure does not disappear.
Use dev_net_rcu() instead of dev_net() to get LOCKDEP support.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://patch.msgid.link/20250301201424.2046477-5-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
TCP uses of dev_net() are under RCU protection, change them
to dev_net_rcu() to get LOCKDEP support.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://patch.msgid.link/20250301201424.2046477-4-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Use two existing drop reasons in tcp_check_req():
- TCP_RFC7323_PAWS
- TCP_OVERWINDOW
Add two new ones:
- TCP_RFC7323_TSECR (corresponds to LINUX_MIB_TSECRREJECTED)
- TCP_LISTEN_OVERFLOW (when a listener accept queue is full)
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Jason Xing <kerneljasonxing@gmail.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://patch.msgid.link/20250301201424.2046477-3-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
We want to add new drop reasons for packets dropped in 3WHS in the
following patches.
tcp_rcv_state_process() has to set reason to TCP_FASTOPEN,
because tcp_check_req() will conditionally overwrite the drop_reason.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Jason Xing <kerneljasonxing@gmail.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://patch.msgid.link/20250301201424.2046477-2-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Kuniyuki Iwashima says:
====================
ipv4: fib: Convert RTM_NEWROUTE and RTM_DELROUTE to per-netns RTNL.
Patch 1 is misc cleanup.
Patch 2 ~ 8 converts two fib_info hash tables to per-netns.
Patch 9 ~ 12 converts rtnl_lock() to rtnl_net_lcok().
v2: https://lore.kernel.org/20250226192556.21633-1-kuniyu@amazon.com
v1: https://lore.kernel.org/20250225182250.74650-1-kuniyu@amazon.com
====================
Link: https://patch.msgid.link/20250228042328.96624-1-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
We converted fib_info hash tables to per-netns one and now ready to
convert RTM_NEWROUTE and RTM_DELROUTE to per-netns RTNL.
Let's hold rtnl_net_lock() in inet_rtm_newroute() and inet_rtm_delroute().
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://patch.msgid.link/20250228042328.96624-13-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
fib_valid_key_len() is called in the beginning of fib_table_insert()
or fib_table_delete() to check if the prefix length is valid.
fib_table_insert() and fib_table_delete() are called from 3 paths
- ip_rt_ioctl()
- inet_rtm_newroute() / inet_rtm_delroute()
- fib_magic()
In the first ioctl() path, rtentry_to_fib_config() checks the prefix
length with bad_mask(). Also, fib_magic() always passes the correct
prefix: 32 or ifa->ifa_prefixlen, which is already validated.
Let's move fib_valid_key_len() to the rtnetlink path, rtm_to_fib_config().
While at it, 2 direct returns in rtm_to_fib_config() are changed to
goto to match other places in the same function
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://patch.msgid.link/20250228042328.96624-12-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
ioctl(SIOCADDRT/SIOCDELRT) calls ip_rt_ioctl() to add/remove a route in
the netns of the specified socket.
Let's hold rtnl_net_lock() there.
Note that rtentry_to_fib_config() can be called without rtnl_net_lock()
if we convert rtentry.dev handling to RCU later.
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://patch.msgid.link/20250228042328.96624-11-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
ip_fib_net_exit() requires RTNL and is called from fib_net_init()
and fib_net_exit_batch().
Let's hold rtnl_net_lock() before ip_fib_net_exit().
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://patch.msgid.link/20250228042328.96624-10-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
We will convert RTM_NEWROUTE and RTM_DELROUTE to per-netns RTNL.
Then, we need to have per-netns hash tables for struct fib_info.
Let's allocate the hash tables per netns.
fib_info_hash, fib_info_hash_bits, and fib_info_cnt are now moved
to struct netns_ipv4 and accessed with net->ipv4.fib_XXX.
Also, the netns checks are removed from fib_find_info_nh() and
fib_find_info().
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://patch.msgid.link/20250228042328.96624-9-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
When the number of struct fib_info exceeds the hash table size in
fib_create_info(), we try to allocate a new hash table with the
doubled size.
The allocation is done in fib_create_info(), and if successful, each
struct fib_info is moved to the new hash table by fib_info_hash_move().
Let's integrate the allocation and fib_info_hash_move() as
fib_info_hash_grow() to make the following change cleaner.
While at it, fib_info_hash_grow() is placed near other hash-table-specific
functions.
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://patch.msgid.link/20250228042328.96624-8-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
We will allocate the fib_info hash tables per netns.
There are 5 global variables for fib_info hash tables:
fib_info_hash, fib_info_laddrhash, fib_info_hash_size,
fib_info_hash_bits, fib_info_cnt.
However, fib_info_laddrhash and fib_info_hash_size can be
easily calculated from fib_info_hash and fib_info_hash_bits.
Let's remove fib_info_hash_size and use (1 << fib_info_hash_bits)
instead.
Now we need not pass the new hash table size to fib_info_hash_move().
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://patch.msgid.link/20250228042328.96624-7-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
We will allocate the fib_info hash tables per netns.
There are 5 global variables for fib_info hash tables:
fib_info_hash, fib_info_laddrhash, fib_info_hash_size,
fib_info_hash_bits, fib_info_cnt.
However, fib_info_laddrhash and fib_info_hash_size can be
easily calculated from fib_info_hash and fib_info_hash_bits.
Let's remove the fib_info_laddrhash pointer and instead use
fib_info_hash + (1 << fib_info_hash_bits).
While at it, fib_info_laddrhash_bucket() is moved near other
hash-table-specific functions.
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://patch.msgid.link/20250228042328.96624-6-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Every time fib_info_hashfn() returns a hash value, we fetch
&fib_info_hash[hash].
Let's return the hlist_head pointer from fib_info_hashfn() and
rename it to fib_info_hash_bucket() to match a similar function,
fib_info_laddrhash_bucket().
Note that we need to move the fib_info_hash assignment earlier in
fib_info_hash_move() to use fib_info_hash_bucket() in the for loop.
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://patch.msgid.link/20250228042328.96624-5-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
We will allocate fib_info_hash[] and fib_info_laddrhash[] for each netns.
Currently, fib_info_hash[] is allocated when the first route is added.
Let's move the first allocation to a new __net_init function.
Note that we must call fib4_semantics_exit() in fib_net_exit_batch()
because ->exit() is called earlier than ->exit_batch().
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://patch.msgid.link/20250228042328.96624-4-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Both fib_info_hash[] and fib_info_laddrhash[] are hash tables for
struct fib_info and are allocated by kvzmalloc() separately.
Let's replace the two kvzmalloc() calls with kvcalloc() to remove
the fib_info_laddrhash pointer later.
Note that fib_info_hash_alloc() allocates a new hash table based on
fib_info_hash_bits because we will remove fib_info_hash_size later.
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://patch.msgid.link/20250228042328.96624-3-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
net is available in fib_inetaddr_event(), let's use it.
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://patch.msgid.link/20250228042328.96624-2-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The Python lib based tests report that they are producing
"KTAP version 1", but really we aren't making use of any
KTAP features, like subtests. Our output is plain TAP.
Report TAP 13 instead of KTAP 1, this is what mptcp tests do,
and what NIPA knows how to parse best. For HW testing we need
precise subtest result tracking.
Acked-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20250228180007.83325-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Although there is nothing in NV that is fundamentally incompatible
with the lack of GICv3, there is no HW implementation without one,
at least on the virtual side (yes, even fruits have some form of
vGICv3).
We therefore make the decision to require GICv3, which will only
affect models such as QEMU. Booting with a GICv2 or something
even more exotic while asking for NV will result in KVM being
disabled.
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20250225172930.1850838-17-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
|
|
The VGIC maintenance IRQ signals various conditions about the LRs, when
the GIC's virtualization extension is used.
So far we didn't need it, but nested virtualization needs to know about
this interrupt, so add a userland interface to setup the IRQ number.
The architecture mandates that it must be a PPI, on top of that this code
only exports a per-device option, so the PPI is the same on all VCPUs.
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
[added some bits of documentation]
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20250225172930.1850838-16-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
|
|
Popular HW that is able to use NV also has a broken vgic implementation
that requires trapping.
On such HW, propagate the host trap bits into the guest's shadow
ICH_HCR_EL2 register, making sure we don't allow an L2 guest to bring
the system down.
This involves a bit of tweaking so that the emulation code correctly
poicks up the shadow state as needed, and to only partially sync
ICH_HCR_EL2 back with the guest state to capture EOIcount.
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20250225172930.1850838-15-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
|
|
We have so far made sure that L1 and L0 vgic contexts were
totally independent. There is however one spot of bother with
this approach, and that's in the GICv3 emulation code required by
our fruity friends.
The issue is that the emulation code needs to know how many LRs
are in flight. And while it is easy to reach the L0 version through
the vcpu pointer, doing so for the L1 is much more complicated,
as these structures are private to the nested code.
We could simply expose that structure and pick one or the other
depending on the context, but this seems extra complexity for not
much benefit.
Instead, just propagate the number of used LRs from the nested code
into the L0 context, and be done with it. Should this become a burden,
it can be easily rectified.
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20250225172930.1850838-14-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
|
|
Running an L2 guest with GICv4 enabled goes absolutely nowhere, and gets
into a vicious cycle of nested ERET followed by nested exception entry
into the L1.
When KVM does a put on a runnable vCPU, it marks the vPE as nonresident
but does not request a doorbell IRQ. Behind the scenes in the ITS
driver's view of the vCPU, its_vpe::pending_last gets set to true to
indicate that context is still runnable.
This comes to a head when doing the nested ERET into L2. The vPE doesn't
get scheduled on the redistributor as it is exclusively part of the L1's
VGIC context. kvm_vgic_vcpu_pending_irq() returns true because the vPE
appears runnable, and KVM does a nested exception entry into the L1
before L2 ever gets off the ground.
This issue can be papered over by requesting a doorbell IRQ when
descheduling a vPE as part of a nested ERET. KVM needs this anyway to
kick the vCPU out of the L2 when an IRQ becomes pending for the L1.
Link: https://lore.kernel.org/r/20240823212703.3576061-4-oliver.upton@linux.dev
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20250225172930.1850838-13-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
|
|
Forward exceptions due to WFI or WFE instructions to the virtual EL2 if
they are not coming from the virtual EL2 and virtual HCR_EL2.TWx is set.
Signed-off-by: Jintack Lim <jintack.lim@linaro.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20250225172930.1850838-12-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
|
|
Emulating the vGIC means emulating the dreaded Maintenance Interrupt.
This is a two-pronged problem:
- while running L2, getting an MI translates into an MI injected
in the L1 based on the state of the HW.
- while running L1, we must accurately reflect the state of the
MI line, based on the in-memory state.
The MI INTID is added to the distributor, as expected on any
virtualisation-capable implementation, and further patches
will allow its configuration.
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20250225172930.1850838-11-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
|
|
An interrupt being delivered to L1 while running L2 must result
in the correct exception being delivered to L1.
This means that if, on entry to L2, we found ourselves with pending
interrupts in the L1 distributor, we need to take immediate action.
This is done by posting a request which will prevent the entry in
L2, and deliver an IRQ exception to L1, forcing the switch.
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20250225172930.1850838-10-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
|
|
When entering a nested VM, we set up the hypervisor control interface
based on what the guest hypervisor has set. Especially, we investigate
each list register written by the guest hypervisor whether HW bit is
set. If so, we translate hw irq number from the guest's point of view
to the real hardware irq number if there is a mapping.
Co-developed-by: Jintack Lim <jintack@cs.columbia.edu>
Signed-off-by: Jintack Lim <jintack@cs.columbia.edu>
[Christoffer: Redesigned execution flow around vcpu load/put]
Co-developed-by: Christoffer Dall <christoffer.dall@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@arm.com>
[maz: Rewritten to support GICv3 instead of GICv2, NV2 support]
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20250225172930.1850838-9-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
|
|
As ICH_HCR_EL2 is a VNCR accessor when runnintg NV, add some
sanitising to what gets written. Crucially, mark TDIR as RES0
if the HW doesn't support it (unlikely, but hey...), as well
as anything GICv4 related, since we only expose a GICv3 to the
uest.
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20250225172930.1850838-8-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
|
|
Wire the handling of all GICv3 EL2 registers, and provide emulation
for all the non memory-backed registers (ICC_SRE_EL2, ICH_VTR_EL2,
ICH_MISR_EL2, ICH_ELRSR_EL2, and ICH_EISR_EL2).
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20250225172930.1850838-7-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
|
|
FEAT_NV2 comes with a bunch of register-to-memory redirection
involving the ICH_*_EL2 registers (LRs, APRs, VMCR, HCR).
Adds them to the vcpu_sysreg enumeration.
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20250225172930.1850838-6-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
|
|
In order for vgic_v3_load_nested to be able to observe which timer
interrupts have the HW bit set for the current context, the timers
must have been loaded in the new mode and the right timer mapped
to their corresponding HW IRQs.
At the moment, we load the GIC first, meaning that timer interrupts
injected to an L2 guest will never have the HW bit set (we see the
old configuration).
Swapping the two loads solves this particular problem.
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20250225172930.1850838-5-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
|
|
The ICH_MISR_EL2-related macros are missing a number of status
bits that we are about to handle. Take this opportunity to fully
describe the layout of that register as part of the automatic
generation infrastructure.
Reviewed-by: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20250225172930.1850838-4-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
|
|
The ICH_VTR_EL2-related macros are missing a number of config
bits that we are about to handle. Take this opportunity to fully
describe the layout of that register as part of the automatic
generation infrastructure.
This results in a bit of churn to repaint constants that are now
generated with a different format.
Reviewed-by: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20250225172930.1850838-3-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
|
|
The ICH_HCR_EL2-related macros are missing a number of control
bits that we are about to handle. Take this opportunity to fully
describe the layout of that register as part of the automatic
generation infrastructure.
This results in a bit of churn, unfortunately.
Reviewed-by: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20250225172930.1850838-2-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
|
|
Add missing "avdd-0v9-supply" and "avdd-1v8-supply" properties to the "hdmi"
node in the Pine64 RockPro64 board dtsi file. To achieve this, also add the
associated "vcca_0v9" regulator that produces the 0.9 V supply, [1][2] which
hasn't been defined previously in the board dtsi file.
This also eliminates the following warnings from the kernel log:
dwhdmi-rockchip ff940000.hdmi: supply avdd-0v9 not found, using dummy regulator
dwhdmi-rockchip ff940000.hdmi: supply avdd-1v8 not found, using dummy regulator
There are no functional changes to the way board works with these additions,
because the "vcc1v8_dvp" and "vcca_0v9" regulators are always enabled, [1][2]
but these additions improve the accuracy of hardware description.
These changes apply to the both supported hardware revisions of the Pine64
RockPro64, i.e. to the production-run revisions 2.0 and 2.1. [1][2]
[1] https://files.pine64.org/doc/rockpro64/rockpro64_v21-SCH.pdf
[2] https://files.pine64.org/doc/rockpro64/rockpro64_v20-SCH.pdf
Fixes: e4f3fb490967 ("arm64: dts: rockchip: add initial dts support for Rockpro64")
Cc: stable@vger.kernel.org
Suggested-by: Diederik de Haas <didi.debian@cknow.org>
Signed-off-by: Dragan Simic <dsimic@manjaro.org>
Tested-by: Diederik de Haas <didi.debian@cknow.org>
Link: https://lore.kernel.org/r/df3d7e8fe74ed5e727e085b18c395260537bb5ac.1740941097.git.dsimic@manjaro.org
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
|