Age | Commit message (Collapse) | Author |
|
Pull io_uring fixes from Jens Axboe:
"Two fixes for the max workers limit API that was introduced this
series: one fix for an issue with that code, and one fixing a linked
timeout regression in this series"
* tag 'io_uring-5.15-2021-10-22' of git://git.kernel.dk/linux-block:
io_uring: apply worker limits to previous users
io_uring: fix ltimeout unprep
io_uring: apply max_workers limit to all future users
io-wq: max_worker fixes
|
|
If CONFIG_BLOCK isn't set, then it's an empty struct anyway. Just make
it generally available, so we don't break the compile:
kernel/sched/core.c: In function ‘sched_submit_work’:
kernel/sched/core.c:6346:35: error: ‘struct task_struct’ has no member named ‘plug’
6346 | blk_flush_plug(tsk->plug, true);
| ^~
kernel/sched/core.c: In function ‘io_schedule_prepare’:
kernel/sched/core.c:8357:20: error: ‘struct task_struct’ has no member named ‘plug’
8357 | if (current->plug)
| ^~
kernel/sched/core.c:8358:39: error: ‘struct task_struct’ has no member named ‘plug’
8358 | blk_flush_plug(current->plug, true);
| ^~
Reported-by: Nathan Chancellor <nathan@kernel.org>
Fixes: 008f75a20e70 ("block: cleanup the flush plug helpers")
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
The current logic of requests with IOSQE_ASYNC is first queueing it to
io-worker, then execute it in a synchronous way. For unbound works like
pollable requests(e.g. read/write a socketfd), the io-worker may stuck
there waiting for events for a long time. And thus other works wait in
the list for a long time too.
Let's introduce a new way for unbound works (currently pollable
requests), with this a request will first be queued to io-worker, then
executed in a nonblock try rather than a synchronous way. Failure of
that leads it to arm poll stuff and then the worker can begin to handle
other works.
The detail process of this kind of requests is:
step1: original context:
queue it to io-worker
step2: io-worker context:
nonblock try(the old logic is a synchronous try here)
|
|--fail--> arm poll
|
|--(fail/ready)-->synchronous issue
|
|--(succeed)-->worker finish it's job, tw
take over the req
This works much better than the old IOSQE_ASYNC logic in cases where
unbound max_worker is relatively small. In this case, number of
io-worker eazily increments to max_worker, new worker cannot be created
and running workers stuck there handling old works in IOSQE_ASYNC mode.
In my 64-core machine, set unbound max_worker to 20, run echo-server,
turns out:
(arguments: register_file, connetion number is 1000, message size is 12
Byte)
original IOSQE_ASYNC: 76664.151 tps
after this patch: 166934.985 tps
Suggested-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Hao Xu <haoxu@linux.alibaba.com>
Link: https://lore.kernel.org/r/20211018133445.103438-1-haoxu@linux.alibaba.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
When enabling CONFIG_CGROUP_BPF, kmemleak can be observed by running
the command as below:
$mount -t cgroup -o none,name=foo cgroup cgroup/
$umount cgroup/
unreferenced object 0xc3585c40 (size 64):
comm "mount", pid 425, jiffies 4294959825 (age 31.990s)
hex dump (first 32 bytes):
01 00 00 80 84 8c 28 c0 00 00 00 00 00 00 00 00 ......(.........
00 00 00 00 00 00 00 00 6c 43 a0 c3 00 00 00 00 ........lC......
backtrace:
[<e95a2f9e>] cgroup_bpf_inherit+0x44/0x24c
[<1f03679c>] cgroup_setup_root+0x174/0x37c
[<ed4b0ac5>] cgroup1_get_tree+0x2c0/0x4a0
[<f85b12fd>] vfs_get_tree+0x24/0x108
[<f55aec5c>] path_mount+0x384/0x988
[<e2d5e9cd>] do_mount+0x64/0x9c
[<208c9cfe>] sys_mount+0xfc/0x1f4
[<06dd06e0>] ret_fast_syscall+0x0/0x48
[<a8308cb3>] 0xbeb4daa8
This is because that since the commit 2b0d3d3e4fcf ("percpu_ref: reduce
memory footprint of percpu_ref in fast path") root_cgrp->bpf.refcnt.data
is allocated by the function percpu_ref_init in cgroup_bpf_inherit which
is called by cgroup_setup_root when mounting, but not freed along with
root_cgrp when umounting. Adding cgroup_bpf_offline which calls
percpu_ref_kill to cgroup_kill_sb can free root_cgrp->bpf.refcnt.data in
umount path.
This patch also fixes the commit 4bfc0bb2c60e ("bpf: decouple the lifetime
of cgroup_bpf from cgroup itself"). A cgroup_bpf_offline is needed to do a
cleanup that frees the resources which are allocated by cgroup_bpf_inherit
in cgroup_setup_root.
And inside cgroup_bpf_offline, cgroup_get() is at the beginning and
cgroup_put is at the end of cgroup_bpf_release which is called by
cgroup_bpf_offline. So cgroup_bpf_offline can keep the balance of
cgroup's refcount.
Fixes: 2b0d3d3e4fcf ("percpu_ref: reduce memory footprint of percpu_ref in fast path")
Fixes: 4bfc0bb2c60e ("bpf: decouple the lifetime of cgroup_bpf from cgroup itself")
Signed-off-by: Quanyang Wang <quanyang.wang@windriver.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Roman Gushchin <guro@fb.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20211018075623.26884-1-quanyang.wang@windriver.com
|
|
1. The ufd in generic_map_update_batch() should be read from batch.map_fd;
2. A call to fdget() should be followed by a symmetric call to fdput().
Fixes: aa2e93b8e58e ("bpf: Add generic support for update and delete batch ops")
Signed-off-by: Xu Kuohai <xukuohai@huawei.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211019032934.1210517-1-xukuohai@huawei.com
|
|
Lorenz Bauer says:
====================
Fix some inconsistencies of bpf_jit_limit on non-x86 platforms.
I've dropped exposing bpf_jit_current since we couldn't agree on
file modes, correct names, etc.
====================
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Restrict bpf_jit_limit to the maximum supported by the arch's JIT.
Signed-off-by: Lorenz Bauer <lmb@cloudflare.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211014142554.53120-4-lmb@cloudflare.com
|
|
Expose the maximum amount of useable memory from the arm64 JIT.
Signed-off-by: Lorenz Bauer <lmb@cloudflare.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211014142554.53120-3-lmb@cloudflare.com
|
|
Expose the maximum amount of useable memory from the riscv JIT.
Signed-off-by: Lorenz Bauer <lmb@cloudflare.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Luke Nelson <luke.r.nels@gmail.com>
Acked-by: Björn Töpel <bjorn@kernel.org>
Link: https://lore.kernel.org/bpf/20211014142554.53120-2-lmb@cloudflare.com
|
|
On my box I see a bunch of ping/nettest processes hanging
around after fcntal-test.sh is done.
Clean those up before netns deletion.
Signed-off-by: Florian Westphal <fw@strlen.de>
Acked-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20211021140247.29691-1-fw@strlen.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse
Pull fuse fixes from Miklos Szeredi:
"Syzbot discovered a race in case of reusing the fuse sb (introduced in
this cycle).
Fix it by doing the s_fs_info initialization at the proper place"
* tag 'fuse-fixes-5.15-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse:
fuse: clean up error exits in fuse_fill_super()
fuse: always initialize sb->s_fs_info
fuse: clean up fuse_mount destruction
fuse: get rid of fuse_put_super()
fuse: check s_root when destroying sb
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux
Pull hyper-v fix from Wei Liu:
- Fix vmbus ARM64 build (Arnd Bergmann)
* tag 'hyperv-fixes-signed-20211022' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux:
hyperv/vmbus: include linux/bitops.h
|
|
Xin Long says:
====================
sctp: enhancements for the verification tag
This patchset is to address CVE-2021-3772:
A flaw was found in the Linux SCTP stack. A blind attacker may be able to
kill an existing SCTP association through invalid chunks if the attacker
knows the IP-addresses and port numbers being used and the attacker can
send packets with spoofed IP addresses.
This is caused by the missing VTAG verification for the received chunks
and the incorrect vtag for the ABORT used to reply to these invalid
chunks.
This patchset is to go over all processing functions for the received
chunks and do:
1. Make sure sctp_vtag_verify() is called firstly to verify the vtag from
the received chunk and discard this chunk if it fails. With some
exceptions:
a. sctp_sf_do_5_1B_init()/5_2_2_dupinit()/9_2_reshutack(), processing
INIT chunk, as sctphdr vtag is always 0 in INIT chunk.
b. sctp_sf_do_5_2_4_dupcook(), processing dupicate COOKIE_ECHO chunk,
as the vtag verification will be done by sctp_tietags_compare() and
then it takes right actions according to the return.
c. sctp_sf_shut_8_4_5(), processing SHUTDOWN_ACK chunk for cookie_wait
and cookie_echoed state, as RFC demand sending a SHUTDOWN_COMPLETE
even if the vtag verification failed.
d. sctp_sf_ootb(), called in many types of chunks for closed state or
no asoc, as the same reason to c.
2. Always use the vtag from the received INIT chunk to make the response
ABORT in sctp_ootb_pkt_new().
3. Fix the order for some checks and add some missing checks for the
received chunk.
This patch series has been tested with SCTP TAHI testing to make sure no
regression caused on protocol conformance.
====================
Link: https://lore.kernel.org/r/cover.1634730082.git.lucien.xin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
sctp_sf_ootb() is called when processing DATA chunk in closed state,
and many other places are also using it.
The vtag in the chunk's sctphdr should be verified, otherwise, as
later in chunk length check, it may send abort with the existent
asoc's vtag, which can be exploited by one to cook a malicious
chunk to terminate a SCTP asoc.
When fails to verify the vtag from the chunk, this patch sets asoc
to NULL, so that the abort will be made with the vtag from the
received chunk later.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
sctp_sf_do_8_5_1_E_sa() is called when processing SHUTDOWN_ACK chunk
in cookie_wait and cookie_echoed state.
The vtag in the chunk's sctphdr should be verified, otherwise, as
later in chunk length check, it may send abort with the existent
asoc's vtag, which can be exploited by one to cook a malicious
chunk to terminate a SCTP asoc.
Note that when fails to verify the vtag from SHUTDOWN-ACK chunk,
SHUTDOWN COMPLETE message will still be sent back to peer, but
with the vtag from SHUTDOWN-ACK chunk, as said in 5) of
rfc4960#section-8.4.
While at it, also remove the unnecessary chunk length check from
sctp_sf_shut_8_4_5(), as it's already done in both places where
it calls sctp_sf_shut_8_4_5().
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
sctp_sf_violation() is called when processing HEARTBEAT_ACK chunk
in cookie_wait state, and some other places are also using it.
The vtag in the chunk's sctphdr should be verified, otherwise, as
later in chunk length check, it may send abort with the existent
asoc's vtag, which can be exploited by one to cook a malicious
chunk to terminate a SCTP asoc.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
1. In closed state: in sctp_sf_do_5_1D_ce():
When asoc is NULL, making packet for abort will use chunk's vtag
in sctp_ootb_pkt_new(). But when asoc exists, vtag from the chunk
should be verified before using peer.i.init_tag to make packet
for abort in sctp_ootb_pkt_new(), and just discard it if vtag is
not correct.
2. In the other states: in sctp_sf_do_5_2_4_dupcook():
asoc always exists, but duplicate cookie_echo's vtag will be
handled by sctp_tietags_compare() and then take actions, so before
that we only verify the vtag for the abort sent for invalid chunk
length.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Currently INIT_ACK chunk in non-cookie_echoed state is processed in
sctp_sf_discard_chunk() to send an abort with the existent asoc's
vtag if the chunk length is not valid. But the vtag in the chunk's
sctphdr is not verified, which may be exploited by one to cook a
malicious chunk to terminal a SCTP asoc.
sctp_sf_discard_chunk() also is called in many other places to send
an abort, and most of those have this problem. This patch is to fix
it by sending abort with the existent asoc's vtag only if the vtag
from the chunk's sctphdr is verified in sctp_sf_discard_chunk().
Note on sctp_sf_do_9_1_abort() and sctp_sf_shutdown_pending_abort(),
the chunk length has been verified before sctp_sf_discard_chunk(),
so replace it with sctp_sf_discard(). On sctp_sf_do_asconf_ack() and
sctp_sf_do_asconf(), move the sctp_chunk_length_valid check ahead of
sctp_sf_discard_chunk(), then replace it with sctp_sf_discard().
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
This patch fixes the problems below:
1. In non-shutdown_ack_sent states: in sctp_sf_do_5_1B_init() and
sctp_sf_do_5_2_2_dupinit():
chunk length check should be done before any checks that may cause
to send abort, as making packet for abort will access the init_tag
from init_hdr in sctp_ootb_pkt_new().
2. In shutdown_ack_sent state: in sctp_sf_do_9_2_reshutack():
The same checks as does in sctp_sf_do_5_2_2_dupinit() is needed
for sctp_sf_do_9_2_reshutack().
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Currently Linux SCTP uses the verification tag of the existing SCTP
asoc when failing to process and sending the packet with the ABORT
chunk. This will result in the peer accepting the ABORT chunk and
removing the SCTP asoc. One could exploit this to terminate a SCTP
asoc.
This patch is to fix it by always using the initiate tag of the
received INIT chunk for the ABORT chunk to be sent.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Christoph Paasch reports [1] about incorrect skb->truesize
after skb_expand_head() call in ip6_xmit.
This may happen because of two reasons:
- skb_set_owner_w() for newly cloned skb is called too early,
before pskb_expand_head() where truesize is adjusted for (!skb-sk) case.
- pskb_expand_head() does not adjust truesize in (skb->sk) case.
In this case sk->sk_wmem_alloc should be adjusted too.
[1] https://lkml.org/lkml/2021/8/20/1082
Fixes: f1260ff15a71 ("skbuff: introduce skb_expand_head()")
Fixes: 2d85a1b31dde ("ipv6: ip6_finish_output2: set sk into newly allocated nskb")
Reported-by: Christoph Paasch <christoph.paasch@gmail.com>
Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/644330dd-477e-0462-83bf-9f514c41edd1@virtuozzo.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
On arm64 randconfig builds, hyperv sometimes fails with this
error:
In file included from drivers/hv/hv_trace.c:3:
In file included from drivers/hv/hyperv_vmbus.h:16:
In file included from arch/arm64/include/asm/sync_bitops.h:5:
arch/arm64/include/asm/bitops.h:11:2: error: only <linux/bitops.h> can be included directly
In file included from include/asm-generic/bitops/hweight.h:5:
include/asm-generic/bitops/arch_hweight.h:9:9: error: implicit declaration of function '__sw_hweight32' [-Werror,-Wimplicit-function-declaration]
include/asm-generic/bitops/atomic.h:17:7: error: implicit declaration of function 'BIT_WORD' [-Werror,-Wimplicit-function-declaration]
Include the correct header first.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20211018131929.2260087-1-arnd@kernel.org
Signed-off-by: Wei Liu <wei.liu@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI fixes from Rafael Wysocki:
"These fix two regressions, one related to ACPI power resources
management and one that broke ACPI tools compilation.
Specifics:
- Stop turning off unused ACPI power resources in an unknown state to
address a regression introduced during the 5.14 cycle (Rafael
Wysocki).
- Fix an ACPI tools build issue introduced recently when the minimal
stdarg.h was added (Miguel Bernal Marin)"
* tag 'acpi-5.15-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPI: PM: Do not turn off power resources in unknown state
ACPI: tools: fix compilation error
|
|
Pull more x86 kvm fixes from Paolo Bonzini:
- Cache coherency fix for SEV live migration
- Fix for instruction emulation with PKU
- fixes for rare delaying of interrupt delivery
- fix for SEV-ES buffer overflow
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: SEV-ES: go over the sev_pio_data buffer in multiple passes if needed
KVM: SEV-ES: keep INS functions together
KVM: x86: remove unnecessary arguments from complete_emulator_pio_in
KVM: x86: split the two parts of emulator_pio_in
KVM: SEV-ES: clean up kvm_sev_es_ins/outs
KVM: x86: leave vcpu->arch.pio.count alone in emulator_pio_in_out
KVM: SEV-ES: rename guest_ins_data to sev_pio_data
KVM: SEV: Flush cache on non-coherent systems before RECEIVE_UPDATE_DATA
KVM: MMU: Reset mmu->pkru_mask to avoid stale data
KVM: nVMX: promptly process interrupts delivered while in guest mode
KVM: x86: check for interrupts before deciding whether to exit the fast path
|
|
Merge a fix for a recent ACPI tools bild regresson.
* acpi-tools:
ACPI: tools: fix compilation error
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211
Johannes Berg says:
====================
Two small fixes:
* RCU misuse in scan processing in cfg80211
* missing size check for HE data in mac80211 mesh
* tag 'mac80211-for-net-2021-10-21' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211:
cfg80211: scan: fix RCU in cfg80211_add_nontrans_list()
mac80211: mesh: fix HE operation element length check
====================
Link: https://lore.kernel.org/r/20211021154351.134297-1-johannes@sipsolutions.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Currently AMD/Hygon do not populate l2c_id, this means that for SMT
enabled systems they report an L2 per thread. This is ofcourse not
true but was harmless so far.
However, since commit: 66558b730f25 ("sched: Add cluster scheduler
level for x86") the scheduler topology setup requires:
SMT <= L2 <= LLC
Which leads to noisy warnings and possibly weird behaviour on affected
chips.
Therefore change the topology generation such that if l2c_id is not
populated it follows the SMT topology, thereby satisfying the
constraint.
Fixes: 66558b730f25 ("sched: Add cluster scheduler level for x86")
Reported-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Tom Lendacky <thomas.lendacky@amd.com>
|
|
We should not reference the queue tagset in blk_mq_sched_tags_teardown()
(see function comment) for the blk-mq flags, so use the passed flags
instead.
This solves a use-after-free, similarly fixed earlier (and since broken
again) in commit f0c1c4d2864e ("blk-mq: fix use-after-free in
blk_mq_exit_sched").
Reported-by: Linux Kernel Functional Testing <lkft@linaro.org>
Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>
Tested-by: Anders Roxell <anders.roxell@linaro.org>
Fixes: e155b0c238b2 ("blk-mq: Use shared tags for shared sbitmap support")
Signed-off-by: John Garry <john.garry@huawei.com>
Link: https://lore.kernel.org/r/1634890340-15432-1-git-send-email-john.garry@huawei.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Shinichiro Kawasaki reports that there is a bug in a recent
req_bio_endio() patch causing problems with zonefs. As Shinichiro
suggested, inverse the condition in zone append path to resemble how it
was before: fail when it's not fully completed.
Fixes: 478eb72b815f3 ("block: optimise req_bio_endio()")
Reported-by: Shinichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/344ea4e334aace9148b41af5f2426da38c8aa65a.1634914228.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Get rid of the indirections and just provide a sync_bdevs
helper for the generic sync code.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20211019062530.2174626-8-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Use sync_blockdev_nowait instead of opencoding it.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Link: https://lore.kernel.org/r/20211019062530.2174626-7-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Use sync_blockdev_nowait instead of opencoding it.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Link: https://lore.kernel.org/r/20211019062530.2174626-6-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Use sync_blockdev instead of opencoding it.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Acked-by: David Sterba <dsterba@suse.com>
Link: https://lore.kernel.org/r/20211019062530.2174626-5-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Use sync_blockdev instead of opencoding it.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Link: https://lore.kernel.org/r/20211019062530.2174626-4-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Instead offer a new sync_blockdev_nowait helper for the !wait case.
This new helper is exported as it will grow modular callers in a bit.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20211019062530.2174626-3-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
There is no clear benefit in having this helper vs just open coding it.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Link: https://lore.kernel.org/r/20211019062530.2174626-2-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Support for cyrptoloop has been officially marked broken and deprecated
in favor of dm-crypt (which supports the same broken algorithms if
needed) in Linux 2.6.4 (released in March 2004), and support for it has
been entirely removed from losetup in util-linux 2.23 (released in April
2013). The XOR transfer has never been more than a toy to demonstrate
the transfer in the bad old times of crypto export restrictions.
Remove them as they have some nasty interactions with loop device life
times due to the iteration over all loop devices in
loop_unregister_transfer.
Suggested-by: Milan Broz <gmazyland@gmail.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20211019075639.2333969-1-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Export scsi_device_from_queue for use with pktcdvd and use that instead
of the otherwise unused QUEUE_FLAG_SCSI_PASSTHROUGH queue flag.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Link: https://lore.kernel.org/r/20211021060607.264371-8-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Entirely unused now.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Link: https://lore.kernel.org/r/20211021060607.264371-7-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Add a new helper that calls blk_get_request and initializes the
scsi_request to avoid the indirect call through ->.initialize_rq_fn.
Note that this makes the pktcdvd driver depend on the SCSI core, but
given that only SCSI devices support SCSI passthrough requests that
is not a functional change.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Link: https://lore.kernel.org/r/20211021060607.264371-6-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Directly initialize the bsg_job structure instead of relying on the
->.initialize_rq_fn indirection. This also removes the superflous
initialization of the second request used for BIDI requests.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Link: https://lore.kernel.org/r/20211021060607.264371-5-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Call the ->get_unique_id method to query the SCSI identifiers. This can
use the cached VPD page in the sd driver instead of sending a command
on every LAYOUTGET. It will also allow to support NVMe based volumes
if the draft for that ever takes off.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: J. Bruce Fields <bfields@redhat.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Link: https://lore.kernel.org/r/20211021060607.264371-4-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Add the method to query for a uniqueue ID of a given type by looking
it up in the cached device identification VPD page.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Link: https://lore.kernel.org/r/20211021060607.264371-3-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Add a method to query unique IDs from block devices. It will be used to
remove code that deeply pokes into SCSI internals in the NFS server.
The implementation in the sd driver itself is also much nicer as it can
use the cached VPD page instead of always sending a command as the
current NFS code does.
For now the interface is kept very minimal but could be easily
extended when other users like a block-layer sysfs interface for
uniquue IDs shows up.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Link: https://lore.kernel.org/r/20211021060607.264371-2-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
The PIO scratch buffer is larger than a single page, and therefore
it is not possible to copy it in a single step to vcpu->arch/pio_data.
Bound each call to emulator_pio_in/out to a single page; keep
track of how many I/O operations are left in vcpu->arch.sev_pio_count,
so that the operation can be restarted in the complete_userspace_io
callback.
For OUT, this means that the previous kvm_sev_es_outs implementation
becomes an iterator of the loop, and we can consume the sev_pio_data
buffer before leaving to userspace.
For IN, instead, consuming the buffer and decreasing sev_pio_count
is always done in the complete_userspace_io callback, because that
is when the memcpy is done into sev_pio_data.
Cc: stable@vger.kernel.org
Fixes: 7ed9abfe8e9f ("KVM: SVM: Support string IO operations for an SEV-ES guest")
Reported-by: Felix Wilhelm <fwilhelm@google.com>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Make the diff a little nicer when we actually get to fixing
the bug. No functional change intended.
Cc: stable@vger.kernel.org
Fixes: 7ed9abfe8e9f ("KVM: SVM: Support string IO operations for an SEV-ES guest")
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
complete_emulator_pio_in can expect that vcpu->arch.pio has been filled in,
and therefore does not need the size and count arguments. This makes things
nicer when the function is called directly from a complete_userspace_io
callback.
No functional change intended.
Cc: stable@vger.kernel.org
Fixes: 7ed9abfe8e9f ("KVM: SVM: Support string IO operations for an SEV-ES guest")
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
emulator_pio_in handles both the case where the data is pending in
vcpu->arch.pio.count, and the case where I/O has to be done via either
an in-kernel device or a userspace exit. For SEV-ES we would like
to split these, to identify clearly the moment at which the
sev_pio_data is consumed. To this end, create two different
functions: __emulator_pio_in fills in vcpu->arch.pio.count, while
complete_emulator_pio_in clears it and releases vcpu->arch.pio.data.
Because this patch has to be backported, things are left a bit messy.
kernel_pio() operates on vcpu->arch.pio, which leads to emulator_pio_in()
having with two calls to complete_emulator_pio_in(). It will be fixed
in the next release.
While at it, remove the unused void* val argument of emulator_pio_in_out.
The function currently hardcodes vcpu->arch.pio_data as the
source/destination buffer, which sucks but will be fixed after the more
severe SEV-ES buffer overflow.
No functional change intended.
Cc: stable@vger.kernel.org
Fixes: 7ed9abfe8e9f ("KVM: SVM: Support string IO operations for an SEV-ES guest")
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
A few very small cleanups to the functions, smushed together because
the patch is already very small like this:
- inline emulator_pio_in_emulated and emulator_pio_out_emulated,
since we already have the vCPU
- remove the data argument and pull setting vcpu->arch.sev_pio_data into
the caller
- remove unnecessary clearing of vcpu->arch.pio.count when
emulation is done by the kernel (and therefore vcpu->arch.pio.count
is already clear on exit from emulator_pio_in and emulator_pio_out).
No functional change intended.
Cc: stable@vger.kernel.org
Fixes: 7ed9abfe8e9f ("KVM: SVM: Support string IO operations for an SEV-ES guest")
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Currently emulator_pio_in clears vcpu->arch.pio.count twice if
emulator_pio_in_out performs kernel PIO. Move the clear into
emulator_pio_out where it is actually necessary.
No functional change intended.
Cc: stable@vger.kernel.org
Fixes: 7ed9abfe8e9f ("KVM: SVM: Support string IO operations for an SEV-ES guest")
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|