Age | Commit message (Collapse) | Author |
|
This was a typo, we didn't actually want to return zero.
Fixes: a61acbbe9cf8 ("drm/msm: Track "seqno" fences by idr")
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Patchwork: https://patchwork.freedesktop.org/patch/491145/
Link: https://lore.kernel.org/r/20220624184528.4036837-1-robdclark@gmail.com
|
|
The commit e8aa7b17fe41 broke the 32-bit load-word unalignment exception
handler because it calculated the wrong amount of bits by which the value
should be shifted. This patch fixes it.
Signed-off-by: Helge Deller <deller@gmx.de>
Fixes: e8aa7b17fe41 ("parisc/unaligned: Rewrite inline assembly of emulate_ldw()")
Cc: stable@vger.kernel.org # v5.18
|
|
Pull virtio fixes from Michael Tsirkin:
"Fixes all over the place, most notably we are disabling
IRQ hardening (again!)"
* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
virtio_ring: make vring_create_virtqueue_split prettier
vhost-vdpa: call vhost_vdpa_cleanup during the release
virtio_mmio: Restore guest page size on resume
virtio_mmio: Add missing PM calls to freeze/restore
caif_virtio: fix race between virtio_device_ready() and ndo_open()
virtio-net: fix race between ndo_open() and virtio_device_ready()
virtio: disable notification hardening by default
virtio: Remove unnecessary variable assignments
virtio_ring : keep used_wrap_counter in vq->last_used_idx
vduse: Tie vduse mgmtdev and its device
vdpa/mlx5: Initialize CVQ vringh only once
vdpa/mlx5: Update Control VQ callback information
|
|
EXPORT_SYMBOL and __init is a bad combination because the .init.text
section is freed up after the initialization. Hence, modules cannot
use symbols annotated __init. The access to a freed symbol may end up
with kernel panic.
modpost used to detect it, but it had been broken for a decade.
Commit 28438794aba4 ("modpost: fix section mismatch check for exported
init/exit sections") fixed it so modpost started to warn it again, then
this showed up:
MODPOST vmlinux.symvers
WARNING: modpost: vmlinux.o(___ksymtab_gpl+tick_nohz_full_setup+0x0): Section mismatch in reference from the variable __ksymtab_tick_nohz_full_setup to the function .init.text:tick_nohz_full_setup()
The symbol tick_nohz_full_setup is exported and annotated __init
Fix this by removing the __init annotation of tick_nohz_full_setup or drop the export.
Drop the export because tick_nohz_full_setup() is only called from the
built-in code in kernel/sched/isolation.c.
Fixes: ae9e557b5be2 ("time: Export tick start/stop functions for rcutorture")
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Tested-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
When br_netfilter module is loaded, skbs may be diverted to the
ipv4/ipv6 hooks, just like as if we were routing.
Unfortunately, bridge filter hooks with priority 0 may be skipped
in this case.
Example:
1. an nftables bridge ruleset is loaded, with a prerouting
hook that has priority 0.
2. interface is added to the bridge.
3. no tcp packet is ever seen by the bridge prerouting hook.
4. flush the ruleset
5. load the bridge ruleset again.
6. tcp packets are processed as expected.
After 1) the only registered hook is the bridge prerouting hook, but its
not called yet because the bridge hasn't been brought up yet.
After 2), hook order is:
0 br_nf_pre_routing // br_netfilter internal hook
0 chain bridge f prerouting // nftables bridge ruleset
The packet is diverted to br_nf_pre_routing.
If call-iptables is off, the nftables bridge ruleset is called as expected.
But if its enabled, br_nf_hook_thresh() will skip it because it assumes
that all 0-priority hooks had been called previously in bridge context.
To avoid this, check for the br_nf_pre_routing hook itself, we need to
resume directly after it, even if this hook has a priority of 0.
Unfortunately, this still results in different packet flow.
With this fix, the eval order after in 3) is:
1. br_nf_pre_routing
2. ip(6)tables (if enabled)
3. nftables bridge
but after 5 its the much saner:
1. nftables bridge
2. br_nf_pre_routing
3. ip(6)tables (if enabled)
Unfortunately I don't see a solution here:
It would be possible to move br_nf_pre_routing to a higher priority
so that it will be called later in the pipeline, but this also impacts
ebtables evaluation order, and would still result in this very ordering
problem for all nftables-bridge hooks with the same priority as the
br_nf_pre_routing one.
Searching back through the git history I don't think this has
ever behaved in any other way, hence, no fixes-tag.
Reported-by: Radim Hrazdil <rhrazdil@redhat.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
When verdict is NF_STOLEN, the skb might have been freed.
When tracing is enabled, this can result in a use-after-free:
1. access to skb->nf_trace
2. access to skb->mark
3. computation of trace id
4. dump of packet payload
To avoid 1, keep a cached copy of skb->nf_trace in the
trace state struct.
Refresh this copy whenever verdict is != STOLEN.
Avoid 2 by skipping skb->mark access if verdict is STOLEN.
3 is avoided by precomputing the trace id.
Only dump the packet when verdict is not "STOLEN".
Reported-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
This patch fixes a race condition.
nft_rhash_update() might fail for two reasons:
- Element already exists in the hashtable.
- Another packet won race to insert an entry in the hashtable.
In both cases, new() has already bumped the counter via atomic_add_unless(),
therefore, decrement the set element counter.
Fixes: 22fe54d5fefc ("netfilter: nf_tables: add support for dynamic set updates")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Replace the deprecated ida_simple_{get,remove} with ida_{alloc,free}.
Link: https://lore.kernel.org/r/20220616055052.4559-1-liubo03@inspur.com
Signed-off-by: Bo Liu <liubo03@inspur.com>
[sudeep.holla: Replace ida_alloc_min with ida_alloc as suggested by Cristian]
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
|
|
For imported dma-buf objects we leave the object as cache_coherent = 0
across all platforms, which is reasonable given that have no clue what
the memory underneath is, and its not like the driver can ever manually
clflush the pages anyway (like with i915_gem_clflush_object) for such
objects. However on discrete we choose to treat cache_dirty = true as a
programmer error, leading to a warning. The simplest fix looks to be to
just change the ordering in cpu_write_needs_clflush to prevent ever
setting cache_dirty for dma-buf objects on discrete.
Fixes: d028a7690d87 ("drm/i915/dmabuf: Fix prime_mmap to work when using LMEM")
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/5266
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Reviewed-by: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220622155919.355081-1-matthew.auld@intel.com
(cherry picked from commit 563aaf4a928def2d36d1b3de0a4b515e2477b4da)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
|
|
Currently i915 disables d3cold for i915 pci dev.
This blocks D3 for i915 gfx pci upstream bridge (VSP).
Let's disable d3cold at gfx root port to make sure that
i915 gfx VSP can transition to D3 to save some power.
We don't need to disable/enable d3cold in rpm, s2idle
suspend/resume handlers. Disabling/Enabling d3cold at
gfx root port in probe/remove phase is sufficient.
Fixes: 1a085e23411d ("drm/i915: Disable D3Cold in s2idle and runtime pm")
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Anshuman Gupta <anshuman.gupta@intel.com>
Reviewed-by: Badal Nilawar <badal.nilawar@intel.com>
Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220616122249.5007-1-anshuman.gupta@intel.com
(cherry picked from commit 138c2fca6f408f397ea8fbbbf33203f244d96e01)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
|
|
Add missing else in set_proto_ctx_param() to fix coverity issue.
Addresses-Coverity: ("Unused value")
Fixes: d4433c7600f7 ("drm/i915/gem: Use the proto-context to handle create parameters (v5)")
Suggested-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: katrinzhou <katrinzhou@tencent.com>
[tursulin: fixup alignment]
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220621124926.615884-1-tvrtko.ursulin@linux.intel.com
(cherry picked from commit 7482a65664c16cc88eb84d2b545a1fed887378a1)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
|
|
commit 555dbf1a9aac ("nfsd: Replace use of rwsem with errseq_t")
incidentally broke translation of -EINVAL to nfserr_notsupp.
The patch restores that.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Signed-off-by: Alexey Khoroshilov <khoroshilov@ispras.ru>
Fixes: 555dbf1a9aac ("nfsd: Replace use of rwsem with errseq_t")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
The recent change to split reads into chunks has several problems:
1. If an SPI controller has no transfer size limit, max_chunk is
SIZE_MAX, and num_msgs becomes zero, causing no data to be read
into the buffer, and exposing the original contents of the buffer
to userspace,
2. If the requested read size is not a multiple of the maximum
transfer size, the last transfer reads too much data, overflowing
the buffer,
3. The loop logic differs from the write case.
Fix the above by:
1. Keeping track of the number of bytes that are still to be
transferred, instead of precalculating the number of messages and
keeping track of the number of bytes tranfered,
2. Calculating the transfer size of each individual message, taking
into account the number of bytes left,
3. Switching from a "while"-loop to a "do-while"-loop, and renaming
"msg_count" to "segment".
While at it, drop the superfluous cast from "unsigned int" to "unsigned
int", also from at25_ee_write(), where it was probably copied from.
Fixes: 0a35780c755ccec0 ("eeprom: at25: Split reads into chunks and cap write size")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://lore.kernel.org/r/7ae260778d2c08986348ea48ce02ef148100e088.1655817534.git.geert+renesas@glider.be
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
ideapad_dytc_v4_allow_table[]
The Ideapad 5 15ITL05 uses DYTC version 4 for platform-profile
control. This has been tested successfully with the ideapad-laptop
DYTC version 5 code; Add the Ideapad 5 15ITL05 to the
ideapad_dytc_v4_allow_table[].
Fixes: 599482c58ebd ("platform/x86: ideapad-laptop: Add platform support for Ideapad 5 Pro 16ACH6-82L5")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=213297
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Link: https://lore.kernel.org/r/20220627130850.313537-1-hdegoede@redhat.com
|
|
Add an allow_v4_dytc module parameter to allow users to easily test if
DYTC version 4 platform-profiles work on their laptop.
Fixes: 599482c58ebd ("platform/x86: ideapad-laptop: Add platform support for Ideapad 5 Pro 16ACH6-82L5")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=213297
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Link: https://lore.kernel.org/r/20220623115914.103001-1-hdegoede@redhat.com
|
|
Commit 30f8c74ca9b7 ("drm/vc4: Warn if some v3d code is run on BCM2711")
introduced a check in vc4_perfmon_get() that dereferences a pointer before
we checked whether that pointer is valid or not.
Let's rework that function a bit to do things in the proper order.
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Fixes: 30f8c74ca9b7 ("drm/vc4: Warn if some v3d code is run on BCM2711")
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
Reviewed-by: José Expósito <jose.exposito89@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220622080243.22119-1-maxime@cerno.tech
|
|
Add some spaces to vring_alloc_queue(make it look prettier).
Signed-off-by: Deming Wang <wangdeming@inspur.com>
Message-Id: <20220622192306.4371-1-wangdeming@inspur.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
Before commit 3d5698793897 ("vhost-vdpa: introduce asid based IOTLB")
we call vhost_vdpa_iotlb_free() during the release to clean all regions
mapped in the iotlb.
That commit removed vhost_vdpa_iotlb_free() and added vhost_vdpa_cleanup()
to do some cleanup, including deleting all mappings, but we forgot to call
it in vhost_vdpa_release().
This causes that if an application does not remove all mappings explicitly
(or it crashes), the mappings remain in the iotlb and subsequent
applications may fail if they map the same addresses.
Calling vhost_vdpa_cleanup() also fixes a memory leak since we are not
freeing `v->vdev.vqs` during the release from the same commit.
Since vhost_vdpa_cleanup() calls vhost_dev_cleanup() we can remove its
call from vhost_vdpa_release().
Fixes: 3d5698793897 ("vhost-vdpa: introduce asid based IOTLB")
Cc: gautam.dawar@xilinx.com
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Message-Id: <20220622151407.51232-1-sgarzare@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
|
|
Virtio devices might lose their state when the VMM is restarted
after a suspend to disk (hibernation) cycle. This means that the
guest page size register must be restored for the virtio_mmio legacy
interface, since otherwise the virtio queues are not functional.
This is particularly problematic for QEMU that currently still defaults
to using the legacy interface for virtio_mmio. Write the guest page
size register again in virtio_mmio_restore() to make legacy virtio_mmio
devices work correctly after hibernation.
Signed-off-by: Stephan Gerhold <stephan.gerhold@kernkonzept.com>
Message-Id: <20220621110621.3638025-3-stephan.gerhold@kernkonzept.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
Most virtio drivers provide freeze/restore callbacks to finish up
device usage before suspend and to reinitialize the virtio device after
resume. However, these callbacks are currently only called when using
virtio_pci. virtio_mmio does not have any PM ops defined.
This causes problems for example after suspend to disk (hibernation),
since the virtio devices might lose their state after the VMM is
restarted. Calling virtio_device_freeze()/restore() ensures that
the virtio devices are re-initialized correctly.
Fix this by implementing the dev_pm_ops for virtio_mmio,
similar to virtio_pci_common.
Signed-off-by: Stephan Gerhold <stephan.gerhold@kernkonzept.com>
Message-Id: <20220621110621.3638025-2-stephan.gerhold@kernkonzept.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
We currently depend on probe() calling virtio_device_ready() -
which happens after netdev
registration. Since ndo_open() can be called immediately
after register_netdev, this means there exists a race between
ndo_open() and virtio_device_ready(): the driver may start to use the
device (e.g. TX) before DRIVER_OK which violates the spec.
Fix this by switching to use register_netdevice() and protect the
virtio_device_ready() with rtnl_lock() to make sure ndo_open() can
only be called after virtio_device_ready().
Fixes: 0d2e1a2926b18 ("caif_virtio: Introduce caif over virtio")
Signed-off-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20220620051115.3142-3-jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
We currently call virtio_device_ready() after netdev
registration. Since ndo_open() can be called immediately
after register_netdev, this means there exists a race between
ndo_open() and virtio_device_ready(): the driver may start to use the
device before DRIVER_OK which violates the spec.
Fix this by switching to use register_netdevice() and protect the
virtio_device_ready() with rtnl_lock() to make sure ndo_open() can
only be called after virtio_device_ready().
Fixes: 4baf1e33d0842 ("virtio_net: enable VQs early")
Signed-off-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20220617072949.30734-1-jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
Some protocols check the response size with the expected value but optee
shared memory doesn't return such size whereas it is available in the
optee output buffer.
As an example, the base protocol compares the response size with the
expected result when requesting the list of protocol which triggers a
warning with optee shared memory:
arm-scmi firmware:scmi0: Malformed reply - real_sz:116 calc_sz:4 (loop_num_ret:4)
Save the output buffer length and use it when fetching the answer.
Link: https://lore.kernel.org/r/20220624074549.3298-1-vincent.guittot@linaro.org
Reviewed-by: Etienne Carriere <etienne.carriere@linaro.org>
Reviewed-by: Cristian Marussi <cristian.marussi@arm.com>
Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
|
|
Shuang Li reported a NULL pointer dereference crash:
[] BUG: kernel NULL pointer dereference, address: 0000000000000068
[] RIP: 0010:tipc_link_is_up+0x5/0x10 [tipc]
[] Call Trace:
[] <IRQ>
[] tipc_bcast_rcv+0xa2/0x190 [tipc]
[] tipc_node_bc_rcv+0x8b/0x200 [tipc]
[] tipc_rcv+0x3af/0x5b0 [tipc]
[] tipc_udp_recv+0xc7/0x1e0 [tipc]
It was caused by the 'l' passed into tipc_bcast_rcv() is NULL. When it
creates a node in tipc_node_check_dest(), after inserting the new node
into hashtable in tipc_node_create(), it creates the bc link. However,
there is a gap between this insert and bc link creation, a bc packet
may come in and get the node from the hashtable then try to dereference
its bc link, which is NULL.
This patch is to fix it by moving the bc link creation before inserting
into the hashtable.
Note that for a preliminary node becoming "real", the bc link creation
should also be called before it's rehashed, as we don't create it for
preliminary nodes.
Fixes: 4cbf8ac2fe5a ("tipc: enable creating a "preliminary" node")
Reported-by: Shuang Li <shuali@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Jon Maloy <jmaloy@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Recently added debug in commit f9aefd6b2aa3 ("net: warn if mac header
was not set") caught a bug in skb_tunnel_check_pmtu(), as shown
in this syzbot report [1].
In ndo_start_xmit() paths, there is really no need to use skb->mac_header,
because skb->data is supposed to point at it.
[1] WARNING: CPU: 1 PID: 8604 at include/linux/skbuff.h:2784 skb_mac_header_len include/linux/skbuff.h:2784 [inline]
WARNING: CPU: 1 PID: 8604 at include/linux/skbuff.h:2784 skb_tunnel_check_pmtu+0x5de/0x2f90 net/ipv4/ip_tunnel_core.c:413
Modules linked in:
CPU: 1 PID: 8604 Comm: syz-executor.3 Not tainted 5.19.0-rc2-syzkaller-00443-g8720bd951b8e #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:skb_mac_header_len include/linux/skbuff.h:2784 [inline]
RIP: 0010:skb_tunnel_check_pmtu+0x5de/0x2f90 net/ipv4/ip_tunnel_core.c:413
Code: 00 00 00 00 fc ff df 4c 89 fa 48 c1 ea 03 80 3c 02 00 0f 84 b9 fe ff ff 4c 89 ff e8 7c 0f d7 f9 e9 ac fe ff ff e8 c2 13 8a f9 <0f> 0b e9 28 fc ff ff e8 b6 13 8a f9 48 8b 54 24 70 48 b8 00 00 00
RSP: 0018:ffffc90002e4f520 EFLAGS: 00010212
RAX: 0000000000000324 RBX: ffff88804d5fd500 RCX: ffffc90005b52000
RDX: 0000000000040000 RSI: ffffffff87f05e3e RDI: 0000000000000003
RBP: ffffc90002e4f650 R08: 0000000000000003 R09: 000000000000ffff
R10: 000000000000ffff R11: 0000000000000000 R12: 000000000000ffff
R13: 0000000000000000 R14: 000000000000ffcd R15: 000000000000001f
FS: 00007f3babba9700(0000) GS:ffff8880b9b00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000020000080 CR3: 0000000075319000 CR4: 00000000003506e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<TASK>
geneve_xmit_skb drivers/net/geneve.c:927 [inline]
geneve_xmit+0xcf8/0x35d0 drivers/net/geneve.c:1107
__netdev_start_xmit include/linux/netdevice.h:4805 [inline]
netdev_start_xmit include/linux/netdevice.h:4819 [inline]
__dev_direct_xmit+0x500/0x730 net/core/dev.c:4309
dev_direct_xmit include/linux/netdevice.h:3007 [inline]
packet_direct_xmit+0x1b8/0x2c0 net/packet/af_packet.c:282
packet_snd net/packet/af_packet.c:3073 [inline]
packet_sendmsg+0x21f4/0x55d0 net/packet/af_packet.c:3104
sock_sendmsg_nosec net/socket.c:714 [inline]
sock_sendmsg+0xcf/0x120 net/socket.c:734
____sys_sendmsg+0x6eb/0x810 net/socket.c:2489
___sys_sendmsg+0xf3/0x170 net/socket.c:2543
__sys_sendmsg net/socket.c:2572 [inline]
__do_sys_sendmsg net/socket.c:2581 [inline]
__se_sys_sendmsg net/socket.c:2579 [inline]
__x64_sys_sendmsg+0x132/0x220 net/socket.c:2579
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x46/0xb0
RIP: 0033:0x7f3baaa89109
Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f3babba9168 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 00007f3baab9bf60 RCX: 00007f3baaa89109
RDX: 0000000000000000 RSI: 0000000020000a00 RDI: 0000000000000003
RBP: 00007f3baaae305d R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 00007ffe74f2543f R14: 00007f3babba9300 R15: 0000000000022000
</TASK>
Fixes: 4cb47a8644cc ("tunnels: PMTU discovery support for directly bridged IP packets")
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Some Allwinner SoCs have 2 pinctrls (PIO and R_PIO).
Previous implementation used absolute pin numbering and it was incorrect
for R_PIO pinctrl.
It's necessary to take into account the base pin number.
Fixes: 90be64e27621 ("pinctrl: sunxi: implement pin_config_set")
Signed-off-by: Andrei Lalaev <andrey.lalaev@gmail.com>
Reviewed-by: Samuel Holland <samuel@sholland.org>
Link: https://lore.kernel.org/r/20220525190423.410609-1-andrey.lalaev@gmail.com
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
|
|
BTC_NO indicates that hardware is not susceptible to Branch Type Confusion.
Zen3 CPUs don't suffer BTC.
Hypervisors are expected to synthesise BTC_NO when it is appropriate
given the migration pool, to prevent kernels using heuristics.
[ bp: Massage. ]
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
|
|
The whole MMIO/RETBLEED enumeration went overboard on steppings. Get
rid of all that and simply use ANY.
If a future stepping of these models would not be affected, it had
better set the relevant ARCH_CAP_$FOO_NO bit in
IA32_ARCH_CAPABILITIES.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
|
|
On VMX, there are some balanced returns between the time the guest's
SPEC_CTRL value is written, and the vmenter.
Balanced returns (matched by a preceding call) are usually ok, but it's
at least theoretically possible an NMI with a deep call stack could
empty the RSB before one of the returns.
For maximum paranoia, don't allow *any* returns (balanced or otherwise)
between the SPEC_CTRL write and the vmenter.
[ bp: Fix 32-bit build. ]
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
|
|
Prevent RSB underflow/poisoning attacks with RSB. While at it, add a
bunch of comments to attempt to document the current state of tribal
knowledge about RSB attacks and what exactly is being mitigated.
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
|
|
For legacy IBRS to work, the IBRS bit needs to be always re-written
after vmexit, even if it's already on.
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
|
|
On eIBRS systems, the returns in the vmexit return path from
__vmx_vcpu_run() to vmx_vcpu_run() are exposed to RSB poisoning attacks.
Fix that by moving the post-vmexit spec_ctrl handling to immediately
after the vmexit.
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
|
|
Convert __vmx_vcpu_run()'s 'launched' argument to 'flags', in
preparation for doing SPEC_CTRL handling immediately after vmexit, which
will need another flag.
This is much easier than adding a fourth argument, because this code
supports both 32-bit and 64-bit, and the fourth argument on 32-bit would
have to be pushed on the stack.
Note that __vmx_vcpu_run_flags() is called outside of the noinstr
critical section because it will soon start calling potentially
traceable functions.
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
|
|
Move the vmx_vm{enter,exit}() functionality into __vmx_vcpu_run(). This
will make it easier to do the spec_ctrl handling before the first RET.
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
|
|
Commit
c536ed2fffd5 ("objtool: Remove SAVE/RESTORE hints")
removed the save/restore unwind hints because they were no longer
needed. Now they're going to be needed again so re-add them.
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
|
|
This mask has been made redundant by kvm_spec_ctrl_test_value(). And it
doesn't even work when MSR interception is disabled, as the guest can
just write to SPEC_CTRL directly.
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
|
|
There's no need to recalculate the host value for every entry/exit.
Just use the cached value in spec_ctrl_current().
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
|
|
If the SMT state changes, SSBD might get accidentally disabled. Fix
that.
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
|
|
The firmware entry code may accidentally clear STIBP or SSBD. Fix that.
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
|
|
If a kernel is built with CONFIG_RETPOLINE=n, but the user still wants
to mitigate Spectre v2 using IBRS or eIBRS, the RSB filling will be
silently disabled.
There's nothing retpoline-specific about RSB buffer filling. Remove the
CONFIG_RETPOLINE guards around it.
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
|
|
Zen2 uarchs have an undocumented, unnamed, MSR that contains a chicken
bit for some speculation behaviour. It needs setting.
Note: very belatedly AMD released naming; it's now officially called
MSR_AMD64_DE_CFG2 and MSR_AMD64_DE_CFG2_SUPPRESS_NOBR_PRED_BIT
but shall remain the SPECTRAL CHICKEN.
Suggested-by: Andrew Cooper <Andrew.Cooper3@citrix.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
|
|
Since entry asm is tricky, add a validation pass that ensures the
retbleed mitigation has been done before the first actual RET
instruction.
Entry points are those that either have UNWIND_HINT_ENTRY, which acts
as UNWIND_HINT_EMPTY but marks the instruction as an entry point, or
those that have UWIND_HINT_IRET_REGS at +0.
This is basically a variant of validate_branch() that is
intra-function and it will simply follow all branches from marked
entry points and ensures that all paths lead to ANNOTATE_UNRET_END.
If a path hits RET or an indirection the path is a fail and will be
reported.
There are 3 ANNOTATE_UNRET_END instances:
- UNTRAIN_RET itself
- exception from-kernel; this path doesn't need UNTRAIN_RET
- all early exceptions; these also don't need UNTRAIN_RET
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
|
|
When booting with retbleed=auto, if the kernel wasn't built with
CONFIG_CC_HAS_RETURN_THUNK, the mitigation falls back to IBPB. Make
sure a warning is printed in that case. The IBPB fallback check is done
twice, but it really only needs to be done once.
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
|
|
jmp2ret mitigates the easy-to-attack case at relatively low overhead.
It mitigates the long speculation windows after a mispredicted RET, but
it does not mitigate the short speculation window from arbitrary
instruction boundaries.
On Zen2, there is a chicken bit which needs setting, which mitigates
"arbitrary instruction boundaries" down to just "basic block boundaries".
But there is no fix for the short speculation window on basic block
boundaries, other than to flush the entire BTB to evict all attacker
predictions.
On the spectrum of "fast & blurry" -> "safe", there is (on top of STIBP
or no-SMT):
1) Nothing System wide open
2) jmp2ret May stop a script kiddy
3) jmp2ret+chickenbit Raises the bar rather further
4) IBPB Only thing which can count as "safe".
Tentative numbers put IBPB-on-entry at a 2.5x hit on Zen2, and a 10x hit
on Zen1 according to lmbench.
[ bp: Fixup feature bit comments, document option, 32-bit build fix. ]
Suggested-by: Andrew Cooper <Andrew.Cooper3@citrix.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
|
|
Ensure the Xen entry also passes through UNTRAIN_RET.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
|
|
Native SYS{CALL,ENTER} entry points are called
entry_SYS{CALL,ENTER}_{64,compat}, make sure the Xen versions are
named consistently.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
|
|
Update retpoline validation with the new CONFIG_RETPOLINE requirement of
not having bare naked RET instructions.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
|
|
Having IBRS enabled while the SMT sibling is idle unnecessarily slows
down the running sibling. OTOH, disabling IBRS around idle takes two
MSR writes, which will increase the idle latency.
Therefore, only disable IBRS around deeper idle states. Shallow idle
states are bounded by the tick in duration, since NOHZ is not allowed
for them by virtue of their short target residency.
Only do this for mwait-driven idle, since that keeps interrupts disabled
across idle, which makes disabling IBRS vs IRQ-entry a non-issue.
Note: C6 is a random threshold, most importantly C1 probably shouldn't
disable IBRS, benchmarking needed.
Suggested-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
|
|
Skylake suffers from RSB underflow speculation issues; report this
vulnerability and it's mitigation (spectre_v2=ibrs).
[jpoimboe: cleanups, eibrs]
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
|
|
spectre_v2_user_select_mitigation()
retbleed will depend on spectre_v2, while spectre_v2_user depends on
retbleed. Break this cycle.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
|