|
Ido Schimmel says:
====================
ipv4: Two small fixes for bugs in IPv4 routing code.
A variation of the second bug was reported by an FRR 5.0 (released
06/18) user as this version was setting a table ID of 0 for the
default VRF, unlike iproute2 and newer FRR versions.
The first bug was discovered while fixing the second.
Both bugs are not regressions (never worked) and are not critical
in my opinion, so the fixes can be applied to net-next, if desired.
No regressions in other tests:
# ./fib_tests.sh
...
Tests passed: 191
Tests failed: 0
====================
Link: https://lore.kernel.org/r/20221204075045.3780097-1-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Cited commit added the table ID to the FIB info structure, but did not
properly initialize it when table ID 0 is used. This can lead to a route
in the default VRF with a preferred source address not being flushed
when the address is deleted.
Consider the following example:
# ip address add dev dummy1 192.0.2.1/28
# ip address add dev dummy1 192.0.2.17/28
# ip route add 198.51.100.0/24 via 192.0.2.2 src 192.0.2.17 metric 100
# ip route add table 0 198.51.100.0/24 via 192.0.2.2 src 192.0.2.17 metric 200
# ip route show 198.51.100.0/24
198.51.100.0/24 via 192.0.2.2 dev dummy1 src 192.0.2.17 metric 100
198.51.100.0/24 via 192.0.2.2 dev dummy1 src 192.0.2.17 metric 200
Both routes are installed in the default VRF, but they are using two
different FIB info structures. One with a metric of 100 and table ID of
254 (main) and one with a metric of 200 and table ID of 0. Therefore,
when the preferred source address is deleted from the default VRF,
the second route is not flushed:
# ip address del dev dummy1 192.0.2.17/28
# ip route show 198.51.100.0/24
198.51.100.0/24 via 192.0.2.2 dev dummy1 src 192.0.2.17 metric 200
Fix by storing a table ID of 254 instead of 0 in the route configuration
structure.
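A minimal sketch of the kind of normalization described, assuming the table
ID is fixed up where the route configuration is filled in (placement and the
helper name are illustrative, not the exact upstream hunk):
  #include <linux/rtnetlink.h>	/* RT_TABLE_MAIN == 254 */
  #include <linux/types.h>

  /* Illustrative only: treat a user-supplied table ID of 0 as an alias for
   * the main table so otherwise-identical routes share one FIB info. */
  static u32 fib_normalize_table_id(u32 requested_table)
  {
      return requested_table ? : RT_TABLE_MAIN;
  }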
Add a test case that fails before the fix:
# ./fib_tests.sh -t ipv4_del_addr
IPv4 delete address route tests
Regular FIB info
TEST: Route removed from VRF when source address deleted [ OK ]
TEST: Route in default VRF not removed [ OK ]
TEST: Route removed in default VRF when source address deleted [ OK ]
TEST: Route in VRF is not removed by address delete [ OK ]
Identical FIB info with different table ID
TEST: Route removed from VRF when source address deleted [ OK ]
TEST: Route in default VRF not removed [ OK ]
TEST: Route removed in default VRF when source address deleted [ OK ]
TEST: Route in VRF is not removed by address delete [ OK ]
Table ID 0
TEST: Route removed in default VRF when source address deleted [FAIL]
Tests passed: 8
Tests failed: 1
And passes after:
# ./fib_tests.sh -t ipv4_del_addr
IPv4 delete address route tests
Regular FIB info
TEST: Route removed from VRF when source address deleted [ OK ]
TEST: Route in default VRF not removed [ OK ]
TEST: Route removed in default VRF when source address deleted [ OK ]
TEST: Route in VRF is not removed by address delete [ OK ]
Identical FIB info with different table ID
TEST: Route removed from VRF when source address deleted [ OK ]
TEST: Route in default VRF not removed [ OK ]
TEST: Route removed in default VRF when source address deleted [ OK ]
TEST: Route in VRF is not removed by address delete [ OK ]
Table ID 0
TEST: Route removed in default VRF when source address deleted [ OK ]
Tests passed: 9
Tests failed: 0
Fixes: 5a56a0b3a45d ("net: Don't delete routes in different VRFs")
Reported-by: Donald Sharp <sharpd@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Cited commit added the table ID to the FIB info structure, but did not
prevent structures with different table IDs from being consolidated.
This can lead to routes being flushed from a VRF when an address is
deleted from a different VRF.
Fix by taking the table ID into account when looking for a matching FIB
info. This is already done for FIB info structures backed by a nexthop
object in fib_find_info_nh().
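A hedged sketch of the check described, assuming the table ID is compared
alongside the existing keys when searching for a matching FIB info (the
helper name is illustrative; the field names follow the description):
  #include <net/ip_fib.h>

  /* Illustrative only: two FIB infos may only be consolidated if they
   * belong to the same netns *and* the same table. */
  static bool fib_info_same_table(const struct fib_info *a,
                                  const struct fib_info *b)
  {
      return a->fib_net == b->fib_net && a->fib_tb_id == b->fib_tb_id;
  }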
Add test cases that fail before the fix:
# ./fib_tests.sh -t ipv4_del_addr
IPv4 delete address route tests
Regular FIB info
TEST: Route removed from VRF when source address deleted [ OK ]
TEST: Route in default VRF not removed [ OK ]
TEST: Route removed in default VRF when source address deleted [ OK ]
TEST: Route in VRF is not removed by address delete [ OK ]
Identical FIB info with different table ID
TEST: Route removed from VRF when source address deleted [FAIL]
TEST: Route in default VRF not removed [ OK ]
RTNETLINK answers: File exists
TEST: Route removed in default VRF when source address deleted [ OK ]
TEST: Route in VRF is not removed by address delete [FAIL]
Tests passed: 6
Tests failed: 2
And pass after:
# ./fib_tests.sh -t ipv4_del_addr
IPv4 delete address route tests
Regular FIB info
TEST: Route removed from VRF when source address deleted [ OK ]
TEST: Route in default VRF not removed [ OK ]
TEST: Route removed in default VRF when source address deleted [ OK ]
TEST: Route in VRF is not removed by address delete [ OK ]
Identical FIB info with different table ID
TEST: Route removed from VRF when source address deleted [ OK ]
TEST: Route in default VRF not removed [ OK ]
TEST: Route removed in default VRF when source address deleted [ OK ]
TEST: Route in VRF is not removed by address delete [ OK ]
Tests passed: 8
Tests failed: 0
Fixes: 5a56a0b3a45d ("net: Don't delete routes in different VRFs")
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Prior to the Fixes: commit, the initialization code went through the
same fec_enet_set_coalesce() function as used by ethtool, and that
function correctly checks whether the current variant has support for
irq coalescing.
Now that the initialization code instead calls fec_enet_itr_coal_set()
directly, that call needs to be guarded by a check for the
FEC_QUIRK_HAS_COALESCE bit.
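Roughly the following guard, as a hedged sketch that assumes the driver's
private data and quirk flag as named in fec.h (not the verbatim fix):
  /* Only program the ITR coalescing registers on variants that have them;
   * this is the same check fec_enet_set_coalesce() already performs. */
  static void fec_restart_coalesce_sketch(struct net_device *ndev)
  {
      struct fec_enet_private *fep = netdev_priv(ndev);

      if (fep->quirks & FEC_QUIRK_HAS_COALESCE)
          fec_enet_itr_coal_set(ndev);
  }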
Fixes: df727d4547de (net: fec: don't reset irq coalesce settings to defaults on "ip link up")
Reported-by: Greg Ungerer <gregungerer@westnet.com.au>
Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Link: https://lore.kernel.org/r/20221205204604.869853-1-linux@rasmusvillemoes.dk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
In commit 4d633d1b468b ("bonding: fix ICMPv6 header handling when receiving
IPv6 messages"), there is a copy/paste issue for the NA daddr. I found the
issue in my testing and fixed it in my local branch, but forgot to
re-format the patch and sent the wrong mail.
Fix it by reading the correct dest address.
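The error is of this shape, shown as an illustrative sketch (struct ipv6hdr
comes from <linux/ipv6.h>; the helper itself is made up for illustration):
  #include <linux/ipv6.h>

  /* Illustrative only: NA validation needs both addresses; the buggy
   * version copied the source address into both pointers. */
  static void bond_na_addrs_sketch(const struct ipv6hdr *ip6,
                                   const struct in6_addr **saddr,
                                   const struct in6_addr **daddr)
  {
      *saddr = &ip6->saddr;
      *daddr = &ip6->daddr;	/* was mistakenly &ip6->saddr */
  }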
Fixes: 4d633d1b468b ("bonding: fix ICMPv6 header handling when receiving IPv6 messages")
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Acked-by: Jonathan Toppins <jtoppins@redhat.com>
Link: https://lore.kernel.org/r/20221206032055.7517-1-liuhangbin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
When booting an arm 32-bit kernel with CONFIG_AHCI_DWC enabled on an
am57xx-evm board, and the clock references are unnamed in DT, strcmp()
is called with a NULL id and produces a NULL pointer dereference, see
the following oops:
[ 4.673950] Unable to handle kernel NULL pointer dereference at virtual address 00000000
[ 4.682098] [00000000] *pgd=00000000
[ 4.685699] Internal error: Oops: 5 [#1] SMP ARM
[ 4.690338] Modules linked in:
[ 4.693420] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 6.1.0-rc7 #1
[ 4.699615] Hardware name: Generic DRA74X (Flattened Device Tree)
[ 4.705749] PC is at strcmp+0x0/0x34
[ 4.709350] LR is at ahci_platform_find_clk+0x3c/0x5c
[ 4.714416] pc : [<c130c494>] lr : [<c0c230e0>] psr: 20000013
[ 4.720703] sp : f000dda8 ip : 00000001 fp : c29b1840
[ 4.725952] r10: 00000020 r9 : c1b23380 r8 : c1b23368
[ 4.731201] r7 : c1ab4cc4 r6 : 00000001 r5 : c3c66040 r4 : 00000000
[ 4.737762] r3 : 00000080 r2 : 00000080 r1 : c1ab4cc4 r0 : 00000000
[...]
[ 4.998870] strcmp from ahci_platform_find_clk+0x3c/0x5c
[ 5.004302] ahci_platform_find_clk from ahci_dwc_probe+0x1f0/0x54c
[ 5.010589] ahci_dwc_probe from platform_probe+0x64/0xc0
[ 5.016021] platform_probe from really_probe+0xe8/0x41c
[ 5.021362] really_probe from __driver_probe_device+0xa4/0x204
[ 5.027313] __driver_probe_device from driver_probe_device+0x38/0xc8
[ 5.033782] driver_probe_device from __driver_attach+0xb4/0x1ec
[ 5.039825] __driver_attach from bus_for_each_dev+0x78/0xb8
[ 5.045532] bus_for_each_dev from bus_add_driver+0x17c/0x220
[ 5.051300] bus_add_driver from driver_register+0x90/0x124
[ 5.056915] driver_register from do_one_initcall+0x48/0x1e8
[ 5.062591] do_one_initcall from kernel_init_freeable+0x1cc/0x234
[ 5.068817] kernel_init_freeable from kernel_init+0x20/0x13c
[ 5.074584] kernel_init from ret_from_fork+0x14/0x2c
[ 5.079681] Exception stack(0xf000dfb0 to 0xf000dff8)
[ 5.084747] dfa0: 00000000 00000000 00000000 00000000
[ 5.092956] dfc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[ 5.101165] dfe0: 00000000 00000000 00000000 00000000 00000013 00000000
[ 5.107818] Code: e5e32001 e3520000 1afffffb e12fff1e (e4d03001)
[ 5.114013] ---[ end trace 0000000000000000 ]---
Add an extra check in the if-statement for hpriv->clks[i].id being non-NULL.
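A sketch of the kind of check described, assuming the lookup loop iterates
hpriv->clks[] and matches on the DT clock name (not the exact upstream hunk):
  /* Illustrative only: skip unnamed clock references instead of handing a
   * NULL id to strcmp(), which is what triggered the oops above. */
  static struct clk *ahci_platform_find_clk_sketch(struct ahci_host_priv *hpriv,
                                                   const char *con_id)
  {
      int i;

      for (i = 0; i < hpriv->n_clks; i++) {
          if (hpriv->clks[i].id && !strcmp(hpriv->clks[i].id, con_id))
              return hpriv->clks[i].clk;
      }

      return NULL;
  }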
Fixes: 6ce73f3a6fc0 ("ata: libahci_platform: Add function returning a clock-handle by id")
Suggested-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Anders Roxell <anders.roxell@linaro.org>
Reviewed-by: Serge Semin <fancer.lancer@gmail.com>
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fix from Catalin Marinas:
"Revert the dropping of the cache invalidation from the arm64
arch_dma_prep_coherent() as it caused a regression in the
qcom_q6v5_mss remoteproc driver.
The driver is already buggy but the original arm64 change made
the problem obvious. The change will be re-introduced once the
driver is fixed"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
Revert "arm64: dma: Drop cache invalidation from arch_dma_prep_coherent()"
|
|
Pull kvm fixes from Paolo Bonzini:
"Unless anything comes from the ARM side, this should be the last pull
request for this release - and it's mostly documentation:
- Document the interaction between KVM_CAP_HALT_POLL and halt_poll_ns
- s390: fix multi-epoch extension in nested guests
- x86: fix uninitialized variable on nested triple fault"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: Document the interaction between KVM_CAP_HALT_POLL and halt_poll_ns
KVM: Move halt-polling documentation into common directory
KVM: x86: fix uninitialized variable use on KVM_REQ_TRIPLE_FAULT
KVM: s390: vsie: Fix the initialization of the epoch extension (epdx) field
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull xen fixes from Juergen Gross:
"Two zero-day fixes for the xen-netback driver (XSA-423 and XSA-424)"
* tag 'for-linus-xsa-6.1-rc9-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
xen/netback: don't call kfree_skb() with interrupts disabled
xen/netback: Ensure protocol headers don't fall in the non-linear area
|
|
This reverts commit c44094eee32f32f175aadc0efcac449d99b1bbf7.
Although the semantics of the DMA API require only a clean operation
here, it turns out that the Qualcomm 'qcom_q6v5_mss' remoteproc driver
(ab)uses the DMA API for transferring the modem firmware to the secure
world via calls to Trustzone [1].
Once the firmware buffer has changed hands, _any_ access from the
non-secure side (i.e. Linux) will be detected on the bus and result in a
full system reset [2]. Although this is possible even with this revert
in place (due to speculative reads via the cacheable linear alias of
memory), anecdotally the problem occurs considerably more frequently
when the lines have not been invalidated, assumedly due to some
micro-architectural interactions with the cache hierarchy.
Revert the offending change for now, along with a comment, so that the
Qualcomm developers have time to fix the driver [3] to use a firmware
buffer which does not have a cacheable alias in the linear map.
Link: https://lore.kernel.org/r/20221114110329.68413-1-manivannan.sadhasivam@linaro.org [1]
Link: https://lore.kernel.org/r/CAMi1Hd3H2k1J8hJ6e-Miy5+nVDNzv6qQ3nN-9929B0GbHJkXEg@mail.gmail.com/ [2]
Link: https://lore.kernel.org/r/20221206092152.GD15486@thinkpad [3]
Reported-by: Amit Pundir <amit.pundir@linaro.org>
Reported-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Cc: Thorsten Leemhuis <regressions@leemhuis.info>
Cc: Sibi Sankar <quic_sibis@quicinc.com>
Signed-off-by: Will Deacon <will@kernel.org>
Acked-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Link: https://lore.kernel.org/r/20221206103403.646-1-will@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
It is not allowed to call kfree_skb() from hardware interrupt
context or with interrupts being disabled. So remove kfree_skb()
from the spin_lock_irqsave() section and use the already existing
"drop" label in xenvif_start_xmit() for dropping the SKB. At the
same time replace the dev_kfree_skb() call there with a call of
dev_kfree_skb_any(), as xenvif_start_xmit() can be called with
disabled interrupts.
This is XSA-424 / CVE-2022-42328 / CVE-2022-42329.
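The locking rule being enforced, as a hedged sketch (queue and helper names
are illustrative; the real change uses the existing "drop" label in
xenvif_start_xmit()):
  #include <linux/skbuff.h>
  #include <linux/spinlock.h>

  /* Illustrative only: kfree_skb() must not run with interrupts disabled,
   * so unlink under the irqsave lock and free afterwards with the
   * interrupt-safe dev_kfree_skb_any(). */
  static void drop_queued_skb_sketch(struct sk_buff_head *rxq,
                                     struct sk_buff *skb)
  {
      unsigned long flags;

      spin_lock_irqsave(&rxq->lock, flags);
      __skb_unlink(skb, rxq);
      spin_unlock_irqrestore(&rxq->lock, flags);

      dev_kfree_skb_any(skb);
  }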
Fixes: be81992f9086 ("xen/netback: don't queue unlimited number of packages")
Reported-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
|
|
In some cases, the frontend may send a packet where the protocol headers
are spread across multiple slots. This would result in netback creating
an skb where the protocol headers spill over into the non-linear area.
Some drivers and NICs don't handle this properly resulting in an
interface reset or worse.
This issue was introduced by the removal of an unconditional skb pull in
the tx path to improve performance. Fix this without reintroducing the
pull by setting up grant copy ops for as many slots as needed to reach
the XEN_NETBACK_TX_COPY_LEN size. Adjust the rest of the code to handle
multiple copy operations per skb.
This is XSA-423 / CVE-2022-3643.
Fixes: 7e5d7753956b ("xen-netback: remove unconditional __pskb_pull_tail() in guest Tx path")
Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
Reviewed-by: Paul Durrant <paul@xen.org>
Signed-off-by: Juergen Gross <jgross@suse.com>
|
|
Fix a potential out-of-bounds (OOB) access if skb_linearize() fails in
tipc_link_proto_rcv().
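The pattern of the fix, as a hedged sketch (the label the real code jumps to
on failure is not shown here):
  #include <linux/skbuff.h>

  /* Illustrative only: skb_linearize() returns non-zero on failure, in
   * which case the protocol header must not be parsed from skb->data. */
  static bool tipc_linearize_ok_sketch(struct sk_buff *skb)
  {
      return skb_linearize(skb) == 0;
  }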
Fixes: 5cbb28a4bf65 ("tipc: linearize arriving NAME_DISTR and LINK_PROTO buffers")
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Link: https://lore.kernel.org/r/20221203094635.29024-1-yuehaibing@huawei.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
The skb is delivered to napi_gro_receive(), which may free it;
dereferencing the skb after this call may trigger a use-after-free.
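The usual fix for this pattern, shown as a hedged sketch (counter names are
illustrative): read everything needed from the skb before handing it to the
GRO layer.
  #include <linux/netdevice.h>
  #include <linux/skbuff.h>

  /* Illustrative only: cache skb->len before napi_gro_receive(), which may
   * free the skb, then update the stats from the cached value. */
  static void rx_deliver_sketch(struct napi_struct *napi, struct sk_buff *skb,
                                u64 *rx_bytes, u64 *rx_packets)
  {
      unsigned int len = skb->len;

      napi_gro_receive(napi, skb);

      *rx_bytes += len;	/* must not touch skb here */
      (*rx_packets)++;
  }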
Fixes: 57c5bc9ad7d7 ("net: hisilicon: add hix5hd2 mac driver")
Signed-off-by: Liu Jian <liujian56@huawei.com>
Link: https://lore.kernel.org/r/20221203094240.1240211-2-liujian56@huawei.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
W=1 build complains:
drivers/irqchip/irq-ti-sci-inta.c:177: warning: Function parameter or member 'vint_id' not described in 'ti_sci_inta_xlate_irq'
drivers/irqchip/irq-ti-sci-inta.c:177: warning: Excess function parameter 'irq' description in 'ti_sci_inta_xlate_irq'
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221121140048.596303869@linutronix.de
|
|
They are all part of the init sequence.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221121140048.534395323@linutronix.de
|
|
W=1 build complains:
drivers/irqchip/irq-gic-v2m.c:570:12: warning: no previous prototype for ‘gicv2m_init’ [-Wmissing-prototypes]
570 | int __init gicv2m_init(struct fwnode_handle *parent_handle,
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221121140048.470680255@linutronix.de
|
|
Assigning a void pointer which points to a struct to two different data
types only works by chance if the second type is the first member of the
struct.
Replace this works-by-chance code by using the primary struct pointer.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221121140048.344525618@linutronix.de
|
|
There is a warning report about an of_node refcount leak
while probing an mdio device:
OF: ERROR: memory leak, expected refcount 1 instead of 2,
of_node_get()/of_node_put() unbalanced - destroy cset entry:
attach overlay node /spi/soc@0/mdio@710700c0/ethernet@4
In of_mdiobus_register_device(), we increase the fwnode refcount
with fwnode_handle_get() before associating the of_node with the
mdio device, but it is never decreased in the normal path.
Therefore, mdio_device_release() needs to call fwnode_handle_put()
in addition, instead of calling kfree() directly.
After the above, just calling mdio_device_free() in the error handling
path of of_mdiobus_register_device() is enough to keep the
refcount balanced.
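The release path described, as a hedged sketch (assuming the release
callback previously just freed the device):
  #include <linux/mdio.h>
  #include <linux/property.h>
  #include <linux/slab.h>

  /* Illustrative only: drop the fwnode reference taken at registration
   * time before freeing the mdio device itself. */
  static void mdio_device_release_sketch(struct device *dev)
  {
      fwnode_handle_put(dev_fwnode(dev));
      kfree(to_mdio_device(dev));
  }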
Fixes: a9049e0c513c ("mdio: Add support for mdio drivers.")
Signed-off-by: Zeng Heng <zengheng4@huawei.com>
Reviewed-by: Yang Yingliang <yangyingliang@huawei.com>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://lore.kernel.org/r/20221203073441.3885317-1-zengheng4@huawei.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
The skb is delivered to napi_gro_receive(), which may free it;
dereferencing the skb after this call may trigger a use-after-free.
Fixes: 542ae60af24f ("net: hisilicon: Add Fast Ethernet MAC driver")
Signed-off-by: Liu Jian <liujian56@huawei.com>
Link: https://lore.kernel.org/r/20221203094240.1240211-1-liujian56@huawei.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Follow the advice of Documentation/filesystems/sysfs.rst: show() should
only use sysfs_emit() or sysfs_emit_at() when formatting the value to be
returned to user space.
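For reference, a minimal hedged example of the recommended pattern
(attribute name and value are made up):
  #include <linux/device.h>
  #include <linux/sysfs.h>

  /* Illustrative show() callback: format the value with sysfs_emit()
   * instead of sprintf()/snprintf() on the raw PAGE_SIZE buffer. */
  static ssize_t example_value_show(struct device *dev,
                                    struct device_attribute *attr, char *buf)
  {
      int value = 42;	/* stand-in for the real driver state */

      return sysfs_emit(buf, "%d\n", value);
  }
  static DEVICE_ATTR_RO(example_value);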
Signed-off-by: ye xingchen <ye.xingchen@zte.com.cn>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
nicvf_probe() doesn't destroy the workqueue when register_netdev()
fails. Add a destroy_workqueue() error handling case to fix this issue.
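The general shape of the error handling, as a hedged sketch (workqueue and
function names are illustrative):
  #include <linux/netdevice.h>
  #include <linux/workqueue.h>

  /* Illustrative only: destroy the workqueue created earlier in probe if
   * register_netdev() fails, instead of leaking it. */
  static int register_or_cleanup_sketch(struct net_device *ndev,
                                        struct workqueue_struct *wq)
  {
      int err;

      err = register_netdev(ndev);
      if (err)
          destroy_workqueue(wq);

      return err;
  }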
Fixes: 2ecbe4f4a027 ("net: thunderx: replace global nicvf_rx_mode_wq work queue for all VFs to private for each of them.")
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Link: https://lore.kernel.org/r/20221203094125.602812-1-liuyongqiang13@huawei.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
The skb is delivered to napi_gro_receive(), which may free it;
dereferencing the skb after this call may trigger a use-after-free.
Fixes: 1c59eb678cbd ("ravb: Fillup ravb_rx_gbeth() stub")
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://lore.kernel.org/r/20221203092941.10880-1-yuehaibing@huawei.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
mchp_sparx5_probe() doesn't destroy the workqueue created by
create_singlethread_workqueue() in sparx5_start() when later
initialization fails. Add destroy_workqueue() in the cleanup_ports case,
and also add it to mchp_sparx5_remove().
Fixes: b37a1bae742f ("net: sparx5: add mactable support")
Signed-off-by: Qiheng Lin <linqiheng@huawei.com>
Link: https://lore.kernel.org/r/20221203070259.19560-1-linqiheng@huawei.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Type I ERSPAN is based on the barebones IP + GRE encapsulation with no
extra ERSPAN header, so reporting an erspan version on a plain GRE
interface looks unreasonable. Fix this by separating the erspan and gre
fill info.
IPv6 GRE does not have this info as IPv6 only supports erspan version
1 and 2.
Reported-by: Jianlin Shi <jishi@redhat.com>
Fixes: f989d546a2d5 ("erspan: Add type I version 0 support.")
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Acked-by: William Tu <u9012063@gmail.com>
Link: https://lore.kernel.org/r/20221203032858.3130339-1-liuhangbin@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
When allocating ipc_mux->ul_adb.pp_qlt fails in ipc_mux_init(), ipc_mux
is not released.
Fixes: 1f52d7b62285 ("net: wwan: iosm: Enable M.2 7360 WWAN card support")
Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com>
Reviewed-by: M Chetan Kumar <m.chetan.kumar@intel.com>
Link: https://lore.kernel.org/r/20221203020903.383235-1-shaozhengchao@huawei.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
After calling napi_complete_done(), the NAPIF_STATE_SCHED bit may be
cleared, and another CPU can start napi thread and access per-CQ variable,
cq->work_done. If the other thread (for example, from busy_poll) sets
it to a value >= budget, this thread will continue to run when it should
stop, and cause memory corruption and panic.
To fix this issue, save the per-CQ work_done variable in a local variable
before napi_complete_done(), so it won't be corrupted by a possible
concurrent thread after napi_complete_done().
Also, add a flag bit to advertise to the NIC firmware: the NAPI work_done
variable race is fixed, so the driver is able to reliably support features
like busy_poll.
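A hedged sketch of the described ordering (structure and field names follow
the text above, not the driver source): sample the per-CQ counter into a
local before napi_complete_done(), so a concurrently scheduled poller cannot
change the value this thread acts on.
  #include <linux/minmax.h>
  #include <linux/netdevice.h>

  /* Illustrative only: after napi_complete_done() another CPU may re-enter
   * polling and modify the shared counter, so only the local copy is used. */
  static int cq_poll_sketch(struct napi_struct *napi, int budget,
                            int *cq_work_done)
  {
      int w = *cq_work_done;	/* local snapshot, safe to use below */

      if (w < budget && napi_complete_done(napi, w)) {
          /* re-arm the completion queue interrupt here */
      }

      return min(w, budget);
  }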
Cc: stable@vger.kernel.org
Fixes: e1b5683ff62e ("net: mana: Move NAPI from EQ to CQ")
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Link: https://lore.kernel.org/r/1670010190-28595-1-git-send-email-haiyangz@microsoft.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
In the dt-binding snps,dwmac.yaml, some properties under the
"snps,axi-config" node are named without the "axi_" prefix, but the
driver expects the prefix. Since the dt-binding has been there for a
long time, we'd better make the driver match the binding for
compatibility.
Fixes: afea03656add ("stmmac: rework DMA bus setting and introduce new platform AXI structure")
Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Link: https://lore.kernel.org/r/20221202161739.2203-1-jszhang@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
The node returned by of_get_parent() has its refcount incremented, so
of_node_put() needs to be called when we are finished using it. Add it
at the end of of_pinctrl_get().
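The reference pattern being restored, as a minimal sketch (the real change
sits at the end of of_pinctrl_get() in the rockchip driver; names here are
illustrative):
  #include <linux/of.h>

  /* Illustrative only: of_get_parent() returns the node with its refcount
   * incremented, so drop it with of_node_put() once it is no longer used. */
  static void use_parent_node_sketch(struct device_node *np)
  {
      struct device_node *parent = of_get_parent(np);

      if (!parent)
          return;

      /* ... look up data from the parent node here ... */

      of_node_put(parent);
  }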
Fixes: 936ee2675eee ("gpio/rockchip: add driver for rockchip gpio")
Signed-off-by: Wang Yufen <wangyufen@huawei.com>
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
|
|
A device might have a core quirk for NVME_QUIRK_IGNORE_DEV_SUBNQN
(such as Samsung X5) but it would still give a:
"missing or invalid SUBNQN field"
warning as core quirks are filled after calling nvme_init_subnqn. Fill
ctrl->quirks from struct core_quirks before calling nvme_init_subsystem
to fix this.
Tested on a Samsung X5.
Fixes: ab9e00cc72fa ("nvme: track subsystems")
Signed-off-by: Pankaj Raghav <p.raghav@samsung.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
|
|
While running under CONFIG_FORTIFY_SOURCE=y, syzkaller reported:
memcpy: detected field-spanning write (size 129) of single field "target->sensf_res" at net/nfc/nci/ntf.c:260 (size 18)
This appears to be a legitimate lack of bounds checking in
nci_add_new_protocol(). Add the missing checks.
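The missing check is of this general form, shown as a hedged sketch (lengths
and the error code are illustrative; the actual fix may clamp or reject
differently):
  #include <linux/errno.h>
  #include <linux/string.h>
  #include <linux/types.h>

  /* Illustrative only: never copy more SENSF_RES bytes than the fixed-size
   * destination field can hold. */
  static int copy_sensf_res_sketch(u8 *dst, size_t dst_size,
                                   const u8 *src, size_t src_len)
  {
      if (src_len > dst_size)
          return -EPROTO;	/* malformed notification */

      memcpy(dst, src, src_len);
      return 0;
  }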
Reported-by: syzbot+210e196cef4711b65139@syzkaller.appspotmail.com
Link: https://lore.kernel.org/lkml/0000000000001c590f05ee7b3ff4@google.com
Fixes: 019c4fbaa790 ("NFC: Add NCI multiple targets support")
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20221202214410.never.693-kees@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
PCI/IMS works like PCI/MSI-X in the remapping. Just add the feature flag,
but only when on real hardware.
Virtualized IOMMUs need additional support.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221124232327.140571546@linutronix.de
|
|
PCI/IMS works like PCI/MSI-X in the remapping. Just add the feature flag,
but only when on real hardware.
Virtualized IOMMUs need additional support, e.g. for PASID.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221124232327.081482253@linutronix.de
|
|
Enable IMS in the domain init and allocation mapping code, but do not
enable it on the vector domain as discussed in various threads on LKML.
The interrupt remap domains can expand this setting like they do with
PCI multi MSI.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221124232327.022658817@linutronix.de
|
|
Provide single vector allocation, which allocates the next free index in
the IMS space, and the corresponding free function to release it.
All allocated vectors are also released via pci_free_vectors(), which
also releases MSI/MSI-X vectors.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221124232326.961711347@linutronix.de
|
|
IMS (Interrupt Message Store) is a new specification which allows
implementation specific storage of MSI messages contrary to the
strict standard specified MSI and MSI-X message stores.
This requires new device specific interrupt domains to handle the
implementation defined storage which can be an array in device memory or
host/guest memory which is shared with hardware queues.
Add a function to create IMS domains for PCI devices. IMS domains are using
the new per device domain mechanism and are configured by the device driver
via a template. IMS domains are created as secondary device domains so they
work side by side with MSI[-X] on the same device.
The IMS domains have a few constraints:
- The index space is managed by the core code.
Device memory based IMS provides a storage array with a fixed size
which obviously requires an index. But there is no association between
index and functionality so the core can randomly allocate an index in
the array.
System memory based IMS does not have the concept of an index as the
storage is somewhere in memory. In that case the index is purely
software based to keep track of the allocations.
- There is no requirement for consecutive index ranges
This is currently a limitation of the MSI core and can be implemented
if there is a justified use case by changing the internal storage from
xarray to maple_tree. For now it's single vector allocation.
- The interrupt chip must provide the following callbacks:
- irq_mask()
- irq_unmask()
- irq_write_msi_msg()
- The interrupt chip must provide the following optional callbacks
when the irq_mask(), irq_unmask() and irq_write_msi_msg() callbacks
cannot operate directly on hardware, e.g. in the case that the
interrupt message store is in queue memory:
- irq_bus_lock()
- irq_bus_unlock()
These callbacks are invoked from preemptible task context and are
allowed to sleep. In this case the mandatory callbacks above just
store the information. The irq_bus_unlock() callback is supposed to
make the change effective before returning.
- Interrupt affinity setting is handled by the underlying parent
interrupt domain and communicated to the IMS domain via
irq_write_msi_msg(). IMS domains cannot have a irq_set_affinity()
callback. That's a reasonable restriction similar to the PCI/MSI
device domain implementations.
The domain is automatically destroyed when the PCI device is removed.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221124232326.904316841@linutronix.de
|
|
Provide the necessary constants for PCI/IMS support:
- A new bus token for MSI irqdomain identification
- A MSI feature flag for the MSI irqdomains to signal support
- A secondary domain id
The latter expands the device internal domain pointer storage array from 1
to 2 entries. That extra pointer is mostly unused today, but the
alternative solutions would not be free either and would introduce more
complexity all over the place. Trade the 8bytes for simplicity.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221124232326.846169830@linutronix.de
|
|
x86 MSI irqdomains can handle MSI-X allocation post MSI-X enable just out
of the box - on the vector domain and on the remapping domains.
Add the feature flag to the supported feature list.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221124232326.787373104@linutronix.de
|
|
MSI-X vectors can be allocated after the initial MSI-X enablement, but this
needs explicit support of the underlying interrupt domains.
Provide a function to query the ability and functions to allocate/free
individual vectors post-enable.
The allocation can either request a specific index in the MSI-X table or
with the index argument MSI_ANY_INDEX it allocates the next free vector.
The return value is a struct msi_map which on success contains both index
and the Linux interrupt number. In case of failure index is negative and
the Linux interrupt number is 0.
The allocation function is for a single MSI-X index at a time as that's
sufficient for the most urgent use case VFIO to get rid of the 'disable
MSI-X, reallocate, enable-MSI-X' cycle which is prone to lost interrupts
and redirections to the legacy and obviously unhandled INTx.
As single index allocation is also sufficient for the use cases Jason
Gunthorpe pointed out: Allocation of a MSI-X or IMS vector for a network
queue. See Link below.
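A hedged usage sketch based on the description above (the function names,
MSI_ANY_INDEX and struct msi_map are those introduced by this series; treat
the exact signatures as an assumption):
  #include <linux/msi_api.h>
  #include <linux/pci.h>

  /* Illustrative only: allocate one MSI-X vector at any free index after
   * MSI-X has already been enabled, then release it again. */
  static int demo_dynamic_msix(struct pci_dev *pdev)
  {
      struct msi_map map;

      map = pci_msix_alloc_irq_at(pdev, MSI_ANY_INDEX, NULL);
      if (map.index < 0)
          return map.index;	/* negative index carries the error */

      /* map.virq is the Linux interrupt number to request_irq() on */

      pci_msix_free_irq(pdev, map);
      return 0;
  }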
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/all/20211126232735.547996838@linutronix.de
Link: https://lore.kernel.org/r/20221124232326.731233614@linutronix.de
|
|
The setup of MSI descriptors for PCI/MSI-X interrupts depends partially on
the MSI index for which the descriptor is initialized.
Dynamic MSI-X vector allocation post MSI-X enablement allows to allocate
vectors at a given index or at any free index in the available table
range. The latter requires that the descriptor is initialized after the
MSI core has chosen an index.
Implement the prepare_desc() op in the PCI/MSI-X specific msi_domain_ops
which is invoked before the core interrupt descriptor and the associated
Linux interrupt number is allocated.
That callback is also provided for the upcoming PCI/IMS implementations so
the implementation specific interrupt domain can do their domain specific
initialization of the MSI descriptors.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221124232326.673658806@linutronix.de
|
|
The upcoming mechanism to allocate MSI-X vectors after enabling MSI-X needs
to share some of the MSI-X descriptor setup.
The regular descriptor setup on enable has the following code flow:
1) Allocate descriptor
2) Setup descriptor with PCI specific data
3) Insert descriptor
4) Allocate interrupts which in turn scans the inserted
descriptors
This cannot be easily changed because the PCI/MSI code needs to handle the
legacy architecture specific allocation model and the irq domain model
where quite some domains have the assumption that the above flow is how it
works.
Ideally the code flow should look like this:
1) Invoke allocation at the MSI core
2) MSI core allocates descriptor
3) MSI core calls back into the irq domain which fills in
the domain specific parts
This could be done for underlying parent MSI domains which support
post-enable allocation/free but that would create significantly different
code paths for MSI/MSI-X enable.
Though for dynamic allocation which wants to share the allocation code with
the upcoming PCI/IMS support it's the right thing to do.
Split the MSI-X descriptor setup into the preallocation part which just sets
the index and fills in the horrible hack of virtual IRQs and the real PCI
specific MSI-X setup part which solely depends on the index in the
descriptor. This allows to provide a common dynamic allocation interface at
the MSI core level for both PCI/MSI-X and PCI/IMS.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221124232326.616292598@linutronix.de
|
|
Provide a new MSI feature flag in preparation for dynamic MSI-X allocation
after the initial MSI-X enable has been done.
This needs to be an explicit MSI interrupt domain feature because quite
some implementations (both interrupt domains and legacy allocation mode)
have clear expectations that the allocation code is only invoked when MSI-X
is about to be enabled. They either talk to hypervisors or do some other
work and are not prepared to be invoked on an already MSI-X enabled device.
This is also explicit MSI-X only because rewriting the size of the MSI
entries is only possible when disabling MSI which in turn might cause lost
interrupts on the device.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221124232326.558843119@linutronix.de
|
|
For supporting post MSI-X enable allocations and for the upcoming PCI/IMS
support a separate interface is required which allows not only the
allocation of a specific index, but also the allocation of any, i.e. the
next free index. The latter is especially required for IMS because IMS
completely does away with index to functionality mappings which are
often found in MSI/MSI-X implementations.
But even with MSI-X there are devices where only the first few indices have
a fixed functionality and the rest is freely assignable by software,
e.g. to queues.
msi_domain_alloc_irq_at() is also different from the range based interfaces
as it always enforces that the MSI descriptor is allocated by the core code
and not preallocated by the caller like the PCI/MSI[-X] enable code path
does.
msi_domain_alloc_irq_at() can be invoked with the index argument set to
MSI_ANY_INDEX which makes the core code pick the next free index. The irq
domain can provide a prepare_desc() operation callback in its
msi_domain_ops to do domain specific post allocation initialization before
the actual Linux interrupt and the associated interrupt descriptor and
hierarchy allocations are conducted.
The function also takes an optional @icookie argument which is of type
union msi_instance_cookie. This cookie is not used by the core code and is
stored in the allocated msi_desc::data::icookie. The meaning of the cookie
is completely implementation defined. In case of IMS this might be a PASID
or a pointer to a device queue, but for the MSI core it's opaque and not
used in any way.
The function returns a struct msi_map which on success contains the
allocated index number and the Linux interrupt number so the caller can
spare the index to Linux interrupt number lookup.
On failure map::index contains the error code and map::virq is 0.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221124232326.501359457@linutronix.de
|
|
The existing MSI domain ops msi_prepare() and set_desc() turned out to be
unsuitable for implementing IMS support.
msi_prepare() does not operate on the MSI descriptors. set_desc() lacks
an irq_domain pointer and has a completely different purpose.
Introduce a prepare_desc() op which allows IMS implementations to amend an
MSI descriptor which was allocated by the core code, e.g. by adjusting the
iomem base or adding some data based on the allocated index. This is way
better than requiring that all IMS domain implementations preallocate the
MSI descriptor and then allocate the interrupt.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221124232326.444560717@linutronix.de
|
|
The upcoming support for PCI/IMS requires to store some information related
to the message handling in the MSI descriptor, e.g. PASID or a pointer to a
queue.
Provide a generic storage struct which maps over the existing PCI specific
storage which means the size of struct msi_desc is not getting bigger.
This storage struct has two elements:
1) msi_domain_cookie
2) msi_instance_cookie
The domain cookie is going to be used to store domain specific information,
e.g. iobase pointer, data pointer.
The instance cookie is going to be handed in when allocating an interrupt
on an IMS domain so the irq chip callbacks of the IMS domain have the
necessary per vector information available. It also comes in handy when
cleaning up the platform MSI code for wire to MSI bridges which need to
hand down the type information to the underlying interrupt domain.
For the core code the cookies are opaque and meaningless. It just stores
the instance cookie during an allocation through the upcoming interfaces
for IMS and wire to MSI bridges.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221124232326.385036043@linutronix.de
|
|
A simple struct to hold an MSI index / Linux interrupt number pair. It will
be returned from the dynamic vector allocation function and handed back to
the corresponding free() function.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221124232326.326410494@linutronix.de
|
|
and related code which is no longer required now that the interrupt remap
code has been converted to MSI parent domains.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221124232326.267353814@linutronix.de
|
|
Remove the global PCI/MSI irqdomain implementation and provide the required
MSI parent ops so the PCI/MSI code can detect the new parent and setup per
device domains.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221124232326.209212272@linutronix.de
|
|
Remove the global PCI/MSI irqdomain implementation and provide the required
MSI parent ops so the PCI/MSI code can detect the new parent and setup per
device domains.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221124232326.151226317@linutronix.de
|
|
The check for special MSI domains like VMD which prevents the interrupt
remapping code from overwriting device::msi::domain is no longer required and
has been replaced by an x86 specific version which is aware of MSI parent
domains.
Remove it.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221124232326.093093200@linutronix.de
|