|
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
|
|
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
|
|
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
|
|
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
|
|
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
|
|
git://anongit.freedesktop.org/drm/drm-intel into drm-fixes
Two GEM context related fixes:
- to avoid a general protection failure when using perf/OA (Chris)
- to avoid kernel warnings on driver release (Janusz)
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/Yyt1CV+YIjKQZZMB@intel.com
|
|
request_queues are a block layer implementation detail that should not
leak into file systems. Change the fscrypt inline crypto code to
retrieve block devices instead of request_queues from the file system.
As part of that, clean up the interaction with multi-device file systems
by returning both the number of devices and the actual device array in a
single method call.
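For reference, a minimal sketch of the reworked method, with its shape
inferred from the description above (treat it as an assumption, not a
verbatim copy of the patch):

  /* in struct fscrypt_operations: return the filesystem's block
   * devices and their count in a single call */
  struct block_device **(*get_devices)(struct super_block *sb,
                                       unsigned int *num_devs);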
Signed-off-by: Christoph Hellwig <hch@lst.de>
[ebiggers: bug fixes and minor tweaks]
Signed-off-by: Eric Biggers <ebiggers@google.com>
Link: https://lore.kernel.org/r/20220901193208.138056-4-ebiggers@kernel.org
|
|
Now that the fscrypt_master_key lifetime has been reworked to not be
subject to the quirks of the keyrings subsystem, blk_crypto_evict_key()
no longer gets called after the filesystem has already been unmounted.
Therefore, there is no longer any need to hold extra references to the
filesystem's request_queue(s). (And these references didn't always do
their intended job anyway, as pinning a request_queue doesn't
necessarily pin the corresponding blk_crypto_profile.)
Stop taking these extra references. Instead, just pass the super_block
to fscrypt_destroy_inline_crypt_key(), and use it to get the list of
block devices the key needs to be evicted from.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Link: https://lore.kernel.org/r/20220901193208.138056-3-ebiggers@kernel.org
|
|
The approach of fs/crypto/ internally managing the fscrypt_master_key
structs as the payloads of "struct key" objects contained in a
"struct key" keyring has outlived its usefulness. The original idea was
to simplify the code by reusing code from the keyrings subsystem.
However, several issues have arisen that can't easily be resolved:
- When a master key struct is destroyed, blk_crypto_evict_key() must be
called on any per-mode keys embedded in it. (This started being the
case when inline encryption support was added.) Yet, the keyrings
subsystem can arbitrarily delay the destruction of keys, even past the
time the filesystem was unmounted. Therefore, currently there is no
easy way to call blk_crypto_evict_key() when a master key is
destroyed. Currently, this is worked around by holding an extra
reference to the filesystem's request_queue(s). But it was overlooked
that the request_queue reference is *not* guaranteed to pin the
corresponding blk_crypto_profile too; for device-mapper devices that
support inline crypto, it doesn't. This can cause a use-after-free.
- When the last inode that was using an incompletely-removed master key
is evicted, the master key removal is completed by removing the key
struct from the keyring. Currently this is done via key_invalidate().
Yet, key_invalidate() takes the key semaphore. This can deadlock when
called from the shrinker, since in fscrypt_ioctl_add_key(), memory is
allocated with GFP_KERNEL under the same semaphore.
- More generally, the fact that the keyrings subsystem can arbitrarily
delay the destruction of keys (via garbage collection delay, or via
random processes getting temporary key references) is undesirable, as
it means we can't strictly guarantee that all secrets are ever wiped.
- Doing the master key lookups via the keyrings subsystem results in the
key_permission LSM hook being called. fscrypt doesn't want this, as
all access control for encrypted files is designed to happen via the
files themselves, like any other files. The workaround which SELinux
users are using is to change their SELinux policy to grant key search
access to all domains. This works, but it is an odd extra step that
shouldn't really have to be done.
The fix for all these issues is to change the implementation to what I
should have done originally: don't use the keyrings subsystem to keep
track of the filesystem's fscrypt_master_key structs. Instead, just
store them in a regular kernel data structure, and rework the reference
counting, locking, and lifetime accordingly. Retain support for
RCU-mode key lookups by using a hash table. Replace fscrypt_sb_free()
with fscrypt_sb_delete(), which releases the keys synchronously and runs
a bit earlier during unmount, so that block devices are still available.
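A hedged sketch of what such an RCU-mode hash table lookup can look like
(structure, field, and helper names here are assumptions made for
illustration):

  static struct fscrypt_master_key *
  fscrypt_find_master_key(struct super_block *sb,
                          const struct fscrypt_key_specifier *mk_spec)
  {
          struct fscrypt_master_key *mk;
          struct hlist_head *bucket = fscrypt_mk_hash_bucket(sb, mk_spec);

          rcu_read_lock();
          hlist_for_each_entry_rcu(mk, bucket, mk_node) {
                  if (!memcmp(&mk->mk_spec, mk_spec, sizeof(*mk_spec)) &&
                      refcount_inc_not_zero(&mk->mk_struct_refs))
                          break;
          }
          rcu_read_unlock();
          return mk; /* NULL if no live matching key was found */
  }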
A side effect of this patch is that neither the master keys themselves
nor the filesystem keyrings will be listed in /proc/keys anymore.
("Master key users" and the master key users keyrings will still be
listed.) However, this was mostly an implementation detail, and it was
intended just for debugging purposes. I don't know of anyone using it.
This patch does *not* change how "master key users" (->mk_users) works;
that still uses the keyrings subsystem. That is still needed for key
quotas, and changing that isn't necessary to solve the issues listed
above. If we decide to change that too, it would be a separate patch.
I've marked this as fixing the original commit that added the fscrypt
keyring, but as noted above the most important issue that this patch
fixes wasn't introduced until the addition of inline encryption support.
Fixes: 22d94f493bfb ("fscrypt: add FS_IOC_ADD_ENCRYPTION_KEY ioctl")
Signed-off-by: Eric Biggers <ebiggers@google.com>
Link: https://lore.kernel.org/r/20220901193208.138056-2-ebiggers@kernel.org
|
|
The .data.rel.ro.local section has the same semantics as .data.rel.ro
here, so include it in the .rodata section of the decompressor.
Additionally since the .printk_index section isn't usable outside of
the core kernel, discard it in the decompressor. Avoids these warnings:
arm-linux-gnueabi-ld: warning: orphan section `.data.rel.ro.local' from `arch/arm/boot/compressed/fdt_rw.o' being placed in section `.data.rel.ro.local'
arm-linux-gnueabi-ld: warning: orphan section `.printk_index' from `arch/arm/boot/compressed/fdt_rw.o' being placed in section `.printk_index'
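A sketch of the corresponding fragment of the decompressor's
vmlinux.lds.S (the exact surrounding context is assumed):

  /DISCARD/ : {
          *(.printk_index)
  }
  .rodata : {
          *(.rodata)
          *(.rodata.*)
          *(.data.rel.ro)
          *(.data.rel.ro.local)
  }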
Reported-by: kernel test robot <lkp@intel.com>
Link: https://lore.kernel.org/linux-mm/202209080545.qMIVj7YM-lkp@intel.com
Cc: Russell King <linux@armlinux.org.uk>
Cc: linux-arm-kernel@lists.infradead.org
Signed-off-by: Kees Cook <keescook@chromium.org>
|
|
Andrii Nakryiko says:
====================
Add three more critical features to the veristat tool, which make it
sufficient for practical work on the BPF verifier:
- CSV output, which allows easier programmatic post-processing of stats;
- building upon CSV output, veristat now supports comparison mode, in which
two previously captured CSV outputs from veristat are compared with each
other in a convenient form;
- flexible allow/deny filtering using globs for BPF object files and
programs, making it possible to narrow down the target BPF programs to be
verified.
See individual patches for more details and examples.
v1->v2:
- split out double-free fix into patch #1 (Yonghong);
- fixed typo in verbose flag (Quentin);
- baseline and comparison stats were reversed in output table, fixed that.
====================
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Add -f (--filter) argument which accepts glob-based filters for
narrowing down what BPF object files and programs within them should be
processed by veristat. This filtering applies both to comparison and
main (verification) mode.
A filter can take one of two forms:
- file (object) filter: 'strobemeta*'; in this case all the programs
within matching files are implicitly allowed (or denied, depending on
whether it's a positive or negative rule, see below);
- file and prog filter: 'strobemeta*/*unroll*' will further filter
programs within matching files to only allow those program names that
match the '*unroll*' glob.
As mentioned, filters can be positive (allowlisting) and negative
(denylisting). Negative filters should start with '!': '!strobemeta*'
will deny any file whose basename starts with "strobemeta".
Further, one extra special syntax is supported for more convenient use
in practice. Instead of specifying rules on the command line, veristat
allows specifying a file that contains the rules, both positive and
negative, one filter per line. This is achieved with -f @<filepath>,
where <filepath> points to a text file containing rules (negative and
positive rules can be mixed). For convenience, empty lines and lines
starting with '#' are ignored. This feature is useful for keeping a
pre-canned list of object files and program names that are tested
repeatedly, allowing a list of rules to be checked in and quickly
specified on the command line.
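For illustration, a hypothetical rules file (object and program names
invented) could look like this:

  # allow everything from strobemeta objects
  strobemeta*
  # from pyperf objects, allow only the unrolled variants
  pyperf*/*unroll*
  # deny this object entirely; negative rules win over positive ones
  !loop*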
As a demonstration (and a shortcut for the near future), create a small
list of "interesting" BPF object files from selftests/bpf and commit it
as veristat.cfg. It currently includes 73 programs, most of which are
the most complex and largest BPF programs in selftests, as judged by
total verified instruction count and total number of verifier states.
If a program matches both positive and negative filters, the negative
filter takes precedence (denylisting is stronger than allowlisting). If
no allow filter is specified, veristat implicitly assumes a '*/*' rule. If
no deny rule is specified, veristat (logically) assumes no negative
filters.
Also note that -f (just like -e and -s) can be specified multiple times
and their effect is cumulative.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20220921164254.3630690-5-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Add ability to compare and contrast two veristat runs, previously
recorded with veristat using CSV output format.
When veristat is called with -C (--compare) flag, veristat expects
exactly two input files specified, both should be in CSV format.
The expectation is that these are outputs from previous veristat runs, but
as long as column names and formats match, it should just work. The first
CSV file is designated as the "baseline" and the second one as the
comparison (experiment) data set. Establishing the baseline matters later
when calculating difference percentages, see below.
Veristat parses these two CSV files and "reconstructs" verifier stats
(it could be just a subset of all possible stats). File and program
names are mandatory as they are used as the join key (these two "stats"
are designated as "key stats" in the code).
Veristat currently enforces that the set of stats recorded in both CSVs
matches exactly, down to the exact order. This is just a simplifying
condition which can be lifted with a bit of additional pre-processing to
reorder stat specs internally, which I didn't bother doing yet.
For all the non-key stats, veristat will output three columns: one for
baseline data, one for comparison data, and one with an absolute and
relative percentage difference. If either baseline or comparison values
are missing (that is, respective CSV file doesn't have a row with
*exactly* matching file and program name), those values are assumed to
be empty or zero. In such cases relative percentages are forced to +100%
or -100% output, for consistency with the typical case.
Veristat's -e (--emit) and -s (--sort) specs still apply, so even if CSV
contains lots of stats, user can request to compare only a subset of
them (and specify the desired column order as well). Similarly, both CSV
and human-readable table outputs are honored. Note that input is currently
always expected to be CSV.
Here's an example shell session, recording data for the biosnoop tool on
two different kernels and comparing them afterwards, with the comparison
output in table format.
# on slightly older production kernel
$ sudo ./veristat biosnoop_bpf.o
File Program Verdict Duration (us) Total insns Total states Peak states
-------------- ------------------------ ------- ------------- ----------- ------------ -----------
biosnoop_bpf.o blk_account_io_merge_bio success 37 24 1 1
biosnoop_bpf.o blk_account_io_start failure 0 0 0 0
biosnoop_bpf.o block_rq_complete success 76 104 6 6
biosnoop_bpf.o block_rq_insert success 83 85 7 7
biosnoop_bpf.o block_rq_issue success 79 85 7 7
-------------- ------------------------ ------- ------------- ----------- ------------ -----------
Done. Processed 1 object files, 5 programs.
$ sudo ./veristat ~/local/tmp/fbcode-bpf-objs/biosnoop_bpf.o -o csv > baseline.csv
$ cat baseline.csv
file_name,prog_name,verdict,duration,total_insns,total_states,peak_states
biosnoop_bpf.o,blk_account_io_merge_bio,success,36,24,1,1
biosnoop_bpf.o,blk_account_io_start,failure,0,0,0,0
biosnoop_bpf.o,block_rq_complete,success,82,104,6,6
biosnoop_bpf.o,block_rq_insert,success,78,85,7,7
biosnoop_bpf.o,block_rq_issue,success,74,85,7,7
# on latest bpf-next kernel
$ sudo ./veristat biosnoop_bpf.o
File Program Verdict Duration (us) Total insns Total states Peak states
-------------- ------------------------ ------- ------------- ----------- ------------ -----------
biosnoop_bpf.o blk_account_io_merge_bio success 31 24 1 1
biosnoop_bpf.o blk_account_io_start failure 0 0 0 0
biosnoop_bpf.o block_rq_complete success 76 104 6 6
biosnoop_bpf.o block_rq_insert success 83 91 7 7
biosnoop_bpf.o block_rq_issue success 74 91 7 7
-------------- ------------------------ ------- ------------- ----------- ------------ -----------
Done. Processed 1 object files, 5 programs.
$ sudo ./veristat biosnoop_bpf.o -o csv > comparison.csv
$ cat comparison.csv
file_name,prog_name,verdict,duration,total_insns,total_states,peak_states
biosnoop_bpf.o,blk_account_io_merge_bio,success,71,24,1,1
biosnoop_bpf.o,blk_account_io_start,failure,0,0,0,0
biosnoop_bpf.o,block_rq_complete,success,82,104,6,6
biosnoop_bpf.o,block_rq_insert,success,83,91,7,7
biosnoop_bpf.o,block_rq_issue,success,87,91,7,7
# now let's compare with human-readable output (note that sudo is not needed)
# we also ignore verification duration in this case to shorten the output
$ ./veristat -C baseline.csv comparison.csv -e file,prog,verdict,insns
File Program Verdict (A) Verdict (B) Verdict (DIFF) Total insns (A) Total insns (B) Total insns (DIFF)
-------------- ------------------------ ----------- ----------- -------------- --------------- --------------- ------------------
biosnoop_bpf.o blk_account_io_merge_bio success success MATCH 24 24 +0 (+0.00%)
biosnoop_bpf.o blk_account_io_start failure failure MATCH 0 0 +0 (+100.00%)
biosnoop_bpf.o block_rq_complete success success MATCH 104 104 +0 (+0.00%)
biosnoop_bpf.o block_rq_insert success success MATCH 85 91 +6 (+7.06%)
biosnoop_bpf.o block_rq_issue success success MATCH 85 91 +6 (+7.06%)
-------------- ------------------------ ----------- ----------- -------------- --------------- --------------- ------------------
While not a particularly exciting example (it turned out to be kind of hard
to quickly find one with a significant difference caused just by a kernel
version bump), it should demonstrate the main features.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20220921164254.3630690-4-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Teach veristat to output results as CSV table for easier programmatic
processing. Change what was --output/-o argument to now be --emit/-e.
And then use --output-format/-o <fmt> to specify output format.
Currently "table" and "csv" is supported, table being default.
For CSV output mode veristat is using spec identifiers as column names.
E.g., instead of "Total states" veristat uses "total_states" as a CSV
header name.
Internally veristat recognizes three formats, one of them
(RESFMT_TABLE_CALCLEN) being a special format instructing veristat to
calculate column widths for table output. This felt a bit cleaner and
more uniform than creating separate functions just for this.
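A sketch of that internal enum, with names taken from the description
above (exact order and members assumed):

  enum resfmt {
          RESFMT_TABLE,
          RESFMT_TABLE_CALCLEN, /* dry run: only compute column widths */
          RESFMT_CSV,
  };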
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20220921164254.3630690-3-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
bpf_object__close(obj) is called twice for BPF object files with a single
BPF program in them. This causes a crash. Fix this by not calling
bpf_object__close() unnecessarily.
Fixes: c8bc5e050976 ("selftests/bpf: Add veristat tool for mass-verifying BPF object files")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20220921164254.3630690-2-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Lorenzo Bianconi says:
====================
Introduce the bpf_ct_set_nat_info kfunc helper in order to set source and
destination nat addresses/ports in a newly allocated ct entry not yet
inserted into the connection tracking table.
Introduce support for per-parameter trusted args.
Changes since v2:
- use int instead of a pointer for port in bpf_ct_set_nat_info signature
- modify the KF_TRUSTED_ARGS definition in order to apply the referenced
pointer constraint just to PTR_TO_BTF_ID
- drop patch 2/4
Changes since v1:
- enable CONFIG_NF_NAT in tools/testing/selftests/bpf/config
Kumar Kartikeya Dwivedi (1):
bpf: Tweak definition of KF_TRUSTED_ARGS
====================
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Introduce selftests for the bpf_ct_set_nat_info kfunc used to set the
source or destination nat addresses/ports.
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Link: https://lore.kernel.org/r/803e33294e247744d466943105879414344d3235.1663778601.git.lorenzo@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Introduce the bpf_ct_set_nat_info kfunc helper in order to set source and
destination nat addresses/ports in a newly allocated ct entry not yet
inserted into the connection tracking table.
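A sketch of the kernel-side signature implied by this description (the
parameter list is an assumption): an allocated-but-not-yet-inserted
entry, an address, an optional port, and the NAT manipulation type:

  int bpf_ct_set_nat_info(struct nf_conn___init *nfct,
                          union nf_inet_addr *addr, int port,
                          enum nf_nat_manip_type manip);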
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Link: https://lore.kernel.org/r/9567db2fdfa5bebe7b7cc5870f7a34549418b4fc.1663778601.git.lorenzo@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Instead of forcing all arguments to be referenced pointers with non-zero
reg->ref_obj_id, tweak the definition of KF_TRUSTED_ARGS to mean that
only PTR_TO_BTF_ID (and socket types translated to PTR_TO_BTF_ID) have
that constraint, and require their offset to be set to 0.
The rest of the pointer types are also accommodated in this definition of
trusted pointers, but with more relaxed rules regarding offsets.
The inherent meaning of setting this flag is that all kfunc pointer
arguments have a guaranteed lifetime, and kernel object pointers
(PTR_TO_BTF_ID, PTR_TO_CTX) are passed in their unmodified form (with
offset 0). In general, this is not true for PTR_TO_BTF_ID as it can be
obtained using pointer walks.
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Link: https://lore.kernel.org/r/cdede0043c47ed7a357f0a915d16f9ce06a1d589.1663778601.git.lorenzo@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Using an rbtree for sorting groups by average fragment size is relatively
expensive (needs an rbtree update on every block freeing or allocation)
and leads to wide spreading of allocations because the selection of the
block group is very sensitive both to changes in free space and the number
of blocks allocated. Furthermore, selecting the group with the best
matching average fragment size is not necessary anyway, even more so
because the variability of fragment sizes within a group is likely large,
so the average is not telling much. We just need a group with a large
enough average fragment size so that we have a high probability of finding
a large enough free extent, and we don't want the average fragment size to
be too big so that we are likely to find a free extent only somewhat
larger than what we need.
So instead of maintaining an rbtree of groups sorted by fragment size,
keep bins (lists) of groups where the average fragment size is in the
interval [2^i, 2^(i+1)). This structure requires fewer updates on block
allocation / freeing, generally avoids chaotic spreading of allocations
into block groups, and is still able to quickly (even faster than the
rbtree) provide a block group which is likely to have a suitably sized
free space extent.
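A minimal sketch of the binning rule (helper name assumed): a group
whose average fragment size lies in [2^i, 2^(i+1)) goes into list i,
with fls() locating the most significant set bit:

  static int mb_avg_fragment_size_order(unsigned int avg_frag_size)
  {
          if (!avg_frag_size)
                  return -1;
          return fls(avg_frag_size) - 1;
  }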
This patch reduces the number of block groups used when untarring an
archive with medium sized files (size somewhat above 64k, which is the
default mballoc limit for avoiding locality group preallocation) to about
half, and thus improves write speeds on eMMC flash significantly.
Fixes: 196e402adf2e ("ext4: improve cr 0 / cr 1 group scanning")
CC: stable@kernel.org
Reported-and-tested-by: Stefan Wahren <stefan.wahren@i2se.com>
Tested-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Link: https://lore.kernel.org/all/0d81a7c2-46b7-6010-62a4-3e6cfc1628d6@i2se.com/
Link: https://lore.kernel.org/r/20220908092136.11770-5-jack@suse.cz
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
|
Currently we don't use any preallocation when a file is already closed
when allocating blocks (from the writeback code when converting delayed
allocation). However for small files, using locality group preallocation
is actually desirable as it is not specific to a particular file.
Rather, it is a method to pack small files together to reduce
fragmentation, and for that the fact that the file is closed is actually
an even stronger hint that the file would benefit from packing. So change
the logic to allow locality group preallocation in this case.
Fixes: 196e402adf2e ("ext4: improve cr 0 / cr 1 group scanning")
CC: stable@kernel.org
Reported-and-tested-by: Stefan Wahren <stefan.wahren@i2se.com>
Tested-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/all/0d81a7c2-46b7-6010-62a4-3e6cfc1628d6@i2se.com/
Link: https://lore.kernel.org/r/20220908092136.11770-4-jack@suse.cz
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
|
Currently the Orlov inode allocator searches for free inodes for a
directory only in flex block groups with at most inodes_per_group/16
more directory inodes than the average per flex block group. However
with the growing size of flex block groups this becomes unnecessarily
strict. Scale the allowed difference from the average directory count
per flex block group with the flex block group size, as we do with
other metrics.
Tested-by: Stefan Wahren <stefan.wahren@i2se.com>
Tested-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
Cc: stable@kernel.org
Link: https://lore.kernel.org/all/0d81a7c2-46b7-6010-62a4-3e6cfc1628d6@i2se.com/
Signed-off-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20220908092136.11770-3-jack@suse.cz
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
|
mb_set_largest_free_order() updates the lists containing groups with the
largest chunk of free space of a given order. The way it updates them
leads to the group always being moved to the tail of the list. Thus
allocations looking for free space of a given order effectively end up
cycling through all groups (and, due to how the lists are initialized, in
last-to-first order). This spreads allocations among block groups, which
reduces performance on rotating disks or low-end flash media. Change
mb_set_largest_free_order() to update the lists only if the order of the
largest free chunk in the group changed.
Fixes: 196e402adf2e ("ext4: improve cr 0 / cr 1 group scanning")
CC: stable@kernel.org
Reported-and-tested-by: Stefan Wahren <stefan.wahren@i2se.com>
Tested-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/all/0d81a7c2-46b7-6010-62a4-3e6cfc1628d6@i2se.com/
Link: https://lore.kernel.org/r/20220908092136.11770-2-jack@suse.cz
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
|
One of the side-effects of mb_optimize_scan was that the optimized
functions to select the next group to try were called even before we
tried the goal group. As a result we no longer allocate files close to
their corresponding inodes, and we don't try to expand the currently
allocated extent in the same group. This results in a reaim regression
with the workfile.disk workload of up to 8% with many clients on my test
machine:
baseline mb_optimize_scan
Hmean disk-1 2114.16 ( 0.00%) 2099.37 ( -0.70%)
Hmean disk-41 87794.43 ( 0.00%) 83787.47 * -4.56%*
Hmean disk-81 148170.73 ( 0.00%) 135527.05 * -8.53%*
Hmean disk-121 177506.11 ( 0.00%) 166284.93 * -6.32%*
Hmean disk-161 220951.51 ( 0.00%) 207563.39 * -6.06%*
Hmean disk-201 208722.74 ( 0.00%) 203235.59 ( -2.63%)
Hmean disk-241 222051.60 ( 0.00%) 217705.51 ( -1.96%)
Hmean disk-281 252244.17 ( 0.00%) 241132.72 * -4.41%*
Hmean disk-321 255844.84 ( 0.00%) 245412.84 * -4.08%*
This is also causing a huge regression (time increased by a factor of 5 or
so) when untarring an archive with lots of small files on some eMMC
storage cards.
Fix the problem by making sure we try goal group first.
Fixes: 196e402adf2e ("ext4: improve cr 0 / cr 1 group scanning")
CC: stable@kernel.org
Reported-and-tested-by: Stefan Wahren <stefan.wahren@i2se.com>
Tested-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Link: https://lore.kernel.org/all/20220727105123.ckwrhbilzrxqpt24@quack3/
Link: https://lore.kernel.org/all/0d81a7c2-46b7-6010-62a4-3e6cfc1628d6@i2se.com/
Signed-off-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20220908092136.11770-1-jack@suse.cz
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next
Florian Westphal says:
====================
netfilter patches for net-next
Remove GPL license copypastry in uapi files; those have SPDX tags.
From Christophe Jaillet.
Remove unused variable in rpfilter, from Guillaume Nault.
Rework gc resched delay computation in conntrack, from Antoine Tenart.
* 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next:
netfilter: rpfilter: Remove unused variable 'ret'.
headers: Remove some left-over license text in include/uapi/linux/netfilter/
netfilter: conntrack: revisit the gc initial rescheduling bias
netfilter: conntrack: fix the gc rescheduling delay
====================
Link: https://lore.kernel.org/r/20220921095000.29569-1-fw@strlen.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue
Tony Nguyen says:
====================
Intel Wired LAN Driver Updates 2022-09-20 (ice)
Michal re-sets TC configuration when changing number of queues.
Mateusz moves the check and call for link-down-on-close to the specific
path for downing/closing the interface.
* '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
ice: Fix interface being down after reset with link-down-on-close flag on
ice: config netdev tc before setting queues number
====================
Link: https://lore.kernel.org/r/20220920205344.1860934-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Usually it's not necessary to declare static functions if the symbols are
defined in the right order. Moving the definition of tsi_eth_driver down
in the compilation unit allows dropping two such declarations.
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Link: https://lore.kernel.org/r/20220919131515.885361-1-u.kleine-koenig@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
We've met a problem: when there is a vlan tag inside GRE encapsulation,
the match of num_of_vlans fails. It is caused by the vlan tag inside the
GRE payload being counted into num_of_vlans, which is not expected.
One example packet is like this:
Ethernet II, Src: Broadcom_68:56:07 (00:10:18:68:56:07)
Dst: Broadcom_68:56:08 (00:10:18:68:56:08)
802.1Q Virtual LAN, PRI: 0, DEI: 0, ID: 100
Internet Protocol Version 4, Src: 192.168.1.4, Dst: 192.168.1.200
Generic Routing Encapsulation (Transparent Ethernet bridging)
Ethernet II, Src: Broadcom_68:58:07 (00:10:18:68:58:07)
Dst: Broadcom_68:58:08 (00:10:18:68:58:08)
802.1Q Virtual LAN, PRI: 0, DEI: 0, ID: 200
...
It should match the (num_of_vlans 1) rule, but it matches the
(num_of_vlans 2) rule. The vlan tags inside the GRE or other
tunnel-encapsulated payload should not be counted in num_of_vlans.
The fix is to stop counting vlan tags once the encapsulation bit is set.
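A hedged sketch of that guard in __skb_flow_dissect() (close to the
idea described above, not necessarily the verbatim diff): only count a
vlan tag if no encapsulation has been dissected yet:

  if (dissector_uses_key(flow_dissector,
                         FLOW_DISSECTOR_KEY_NUM_OF_VLANS) &&
      !(key_control->flags & FLOW_DIS_ENCAPSULATION)) {
          struct flow_dissector_key_num_of_vlans *key_nvs;

          key_nvs = skb_flow_dissector_target(flow_dissector,
                                              FLOW_DISSECTOR_KEY_NUM_OF_VLANS,
                                              target_container);
          key_nvs->num_of_vlans++;
  }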
Fixes: 34951fcf26c5 ("flow_dissector: Add number of vlan tags dissector")
Signed-off-by: Qingqing Yang <qingqing.yang@broadcom.com>
Reviewed-by: Boris Sukholitko <boris.sukholitko@broadcom.com>
Link: https://lore.kernel.org/r/20220919074808.136640-1-qingqing.yang@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Added by commit e5cf1baf92cb ("act_mirred: use TC_ACT_REINSERT when
possible"), but no longer useful.
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Link: https://lore.kernel.org/r/20220919130627.3551233-1-jhs@mojatatu.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Colin Foster says:
====================
clean up ocelot_reset() routine
ocelot_reset() will soon be exported to a common library to be used by
the ocelot_ext system. This will make error values from regmap calls
possible, so they must be checked. Additionally, readx_poll_timeout()
can be substituted for the custom loop, as a simple cleanup.
I don't have hardware to verify this set directly, but there shouldn't
be any functional changes.
====================
Link: https://lore.kernel.org/r/20220917175127.161504-1-colin.foster@in-advantage.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The ocelot_reset() function utilizes regmap_field_write() but wasn't
checking return values. While this won't cause issues for the current MMIO
regmaps, it could be an issue for externally controlled interfaces.
Add checks for these return values.
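A sketch of the resulting pattern (field names assumed from the ocelot
regfield table):

  err = regmap_field_write(ocelot->regfields[SYS_RESET_CFG_MEM_INIT], 1);
  if (err)
          return err;

  err = regmap_field_write(ocelot->regfields[SYS_RESET_CFG_MEM_ENA], 1);
  if (err)
          return err;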
Signed-off-by: Colin Foster <colin.foster@in-advantage.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Clean up the reset code by utilizing readx_poll_timeout instead of a custom
loop.
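A sketch of the new polling shape (the status helper and the timeout
constants are assumptions; error handling of the regmap read is elided):

  static u32 ocelot_mem_init_status(struct ocelot *ocelot)
  {
          u32 val;

          regmap_field_read(ocelot->regfields[SYS_RESET_CFG_MEM_INIT],
                            &val);
          return val;
  }

  /* in ocelot_reset(), replacing the hand-rolled loop: */
  err = readx_poll_timeout(ocelot_mem_init_status, ocelot, val,
                           val == 0, MEM_INIT_SLEEP_US,
                           MEM_INIT_TIMEOUT_US);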
Signed-off-by: Colin Foster <colin.foster@in-advantage.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Haoyue Xu says:
====================
net: ll_temac: Cleanup for clearing static warnings
Most static warnings are detected by checkpatch.pl, mainly about:
(1) #1: About the comments.
(2) #2: About function name in a string.
(3) #3: About the open parenthesis.
(4) #4: About the else branch.
(5) #6: About trailing statements.
(6) #5,#7: About blank lines and spaces.
====================
Link: https://lore.kernel.org/r/20220917103843.526877-1-xuhaoyue1@hisilicon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Clean up some static warnings about indentation.
Signed-off-by: huangjunxian <huangjunxian6@hisilicon.com>
Signed-off-by: Haoyue Xu <xuhaoyue1@hisilicon.com>
Reviewed-by: Harini Katakam <harini.katakam@amd.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Clean up some static warnings about trailing statements.
Signed-off-by: huangjunxian <huangjunxian6@hisilicon.com>
Signed-off-by: Haoyue Xu <xuhaoyue1@hisilicon.com>
Reviewed-by: Harini Katakam <harini.katakam@amd.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Clean up some static warnings about missing spaces around '='.
Signed-off-by: huangjunxian <huangjunxian6@hisilicon.com>
Signed-off-by: Haoyue Xu <xuhaoyue1@hisilicon.com>
Reviewed-by: Harini Katakam <harini.katakam@amd.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Clean up some static warnings about unnecessary else branches.
Signed-off-by: huangjunxian <huangjunxian6@hisilicon.com>
Signed-off-by: Haoyue Xu <xuhaoyue1@hisilicon.com>
Reviewed-by: Harini Katakam <harini.katakam@amd.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Clean up some static warnings about open parenthesis alignment.
Signed-off-by: huangjunxian <huangjunxian6@hisilicon.com>
Signed-off-by: Haoyue Xu <xuhaoyue1@hisilicon.com>
Reviewed-by: Harini Katakam <harini.katakam@amd.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
As Checkpatch.pl warns, prefer using '"%s...", __func__'
to using 'temac_device_reset', this function's name, in a string.
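A one-line sketch of the preferred form (the dev_dbg() context is
assumed):

  dev_dbg(&ndev->dev, "%s()\n", __func__);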
Signed-off-by: Haoyue Xu <xuhaoyue1@hisilicon.com>
Reviewed-by: Harini Katakam <harini.katakam@amd.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Clean up some static warnings about block comments.
Signed-off-by: huangjunxian <huangjunxian6@hisilicon.com>
Signed-off-by: Haoyue Xu <xuhaoyue1@hisilicon.com>
Reviewed-by: Harini Katakam <harini.katakam@amd.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Add missing __init/__exit annotations to module init/exit funcs.
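A sketch of the annotation pattern (function and driver names invented
for illustration):

  static int __init foo_init(void)
  {
          return platform_driver_register(&foo_driver);
  }
  module_init(foo_init);

  static void __exit foo_exit(void)
  {
          platform_driver_unregister(&foo_driver);
  }
  module_exit(foo_exit);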
Signed-off-by: Xiu Jianfeng <xiujianfeng@huawei.com>
Link: https://lore.kernel.org/r/20220917083535.20040-1-xiujianfeng@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Add missing __init/__exit annotations to module init/exit funcs.
Signed-off-by: Xiu Jianfeng <xiujianfeng@huawei.com>
Link: https://lore.kernel.org/r/20220917082118.7971-1-xiujianfeng@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
such an expression:

  if (err)
          return err;
  return 0;

can be simplified to:

  return err;
Signed-off-by: William Dean <williamsukatube@163.com>
Link: https://lore.kernel.org/r/20220917063556.2673-1-williamsukatube@163.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The input parameter 'features' of am65_cpsw_nuss_common_open() is unused
now, so remove it.
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Link: https://lore.kernel.org/r/20220917020451.63417-1-huangguangbin2@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
For a non-preallocated hash map on an RT kernel, a regular spinlock
instead of a raw spinlock is used for the bucket lock. The reason is that
on an RT kernel memory allocation is forbidden in atomic context and a
regular spinlock is sleepable under RT.
Now that the hash map has been fully converted to use bpf_mem_alloc, there
will be no synchronous memory allocation for a non-preallocated hash map,
so it is safe to always use a raw spinlock for the bucket lock on RT. So
remove the usage of htab_use_raw_lock() and update the comments
accordingly.
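A minimal sketch of the simplified locking (the real htab_lock_bucket()
also takes the map and hash to guard against per-CPU re-entrancy, which
is elided here):

  static void htab_lock_bucket(struct bucket *b, unsigned long *pflags)
  {
          unsigned long flags;

          raw_spin_lock_irqsave(&b->raw_lock, flags);
          *pflags = flags;
  }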
Signed-off-by: Hou Tao <houtao1@huawei.com>
Link: https://lore.kernel.org/r/20220921073826.2365800-1-houtao@huaweicloud.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
We got a report from syzbot [1] about warnings caused by a bpf program
attached to the contention_begin raw tracepoint triggering the same
tracepoint by using the bpf_trace_printk helper, which takes the
trace_printk_lock lock.
Call Trace:
<TASK>
? trace_event_raw_event_bpf_trace_printk+0x5f/0x90
bpf_trace_printk+0x2b/0xe0
bpf_prog_a9aec6167c091eef_prog+0x1f/0x24
bpf_trace_run2+0x26/0x90
native_queued_spin_lock_slowpath+0x1c6/0x2b0
_raw_spin_lock_irqsave+0x44/0x50
bpf_trace_printk+0x3f/0xe0
bpf_prog_a9aec6167c091eef_prog+0x1f/0x24
bpf_trace_run2+0x26/0x90
native_queued_spin_lock_slowpath+0x1c6/0x2b0
_raw_spin_lock_irqsave+0x44/0x50
bpf_trace_printk+0x3f/0xe0
bpf_prog_a9aec6167c091eef_prog+0x1f/0x24
bpf_trace_run2+0x26/0x90
native_queued_spin_lock_slowpath+0x1c6/0x2b0
_raw_spin_lock_irqsave+0x44/0x50
bpf_trace_printk+0x3f/0xe0
bpf_prog_a9aec6167c091eef_prog+0x1f/0x24
bpf_trace_run2+0x26/0x90
native_queued_spin_lock_slowpath+0x1c6/0x2b0
_raw_spin_lock_irqsave+0x44/0x50
__unfreeze_partials+0x5b/0x160
...
This can be reproduced by attaching a bpf program as a raw tracepoint on
the contention_begin tracepoint. The bpf prog calls the bpf_trace_printk
helper. Then by running perf bench the spin lock code is forced to
take the slow path and call the contention_begin tracepoint.
Fix this by skipping execution of the bpf program if it's already
running, using the bpf prog 'active' field, which is currently used by
trampoline programs for the same reason.
Move bpf_prog_inc_misses_counter() to syscall.c because trampoline.c is
compiled in only for the CONFIG_BPF_JIT option.
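A hedged sketch of the guard described above (close to, but not
necessarily verbatim, the actual change): bail out if this program is
already active on the current CPU:

  static __always_inline void
  __bpf_trace_run(struct bpf_prog *prog, u64 *args)
  {
          cant_sleep();
          if (unlikely(this_cpu_inc_return(*(prog->active)) != 1)) {
                  bpf_prog_inc_misses_counter(prog);
                  goto out;
          }
          rcu_read_lock();
          (void) bpf_prog_run(prog, args);
          rcu_read_unlock();
  out:
          this_cpu_dec(*(prog->active));
  }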
Reviewed-by: Stanislav Fomichev <sdf@google.com>
Reported-by: syzbot+2251879aa068ad9c960d@syzkaller.appspotmail.com
[1] https://lore.kernel.org/bpf/YxhFe3EwqchC%2FfYf@krava/T/#t
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20220916071914.7156-1-jolsa@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Roberto Sassu says:
====================
One of the desirable features in security is the ability to restrict import
of data to a given system based on data authenticity. If data import can be
restricted, it would be possible to enforce a system-wide policy based on
the signing keys the system owner trusts.
This feature is widely used in the kernel. For example, if the restriction
is enabled, kernel modules can be plugged in only if they are signed with a
key whose public part is in the primary or secondary keyring.
For eBPF, it can be useful as well. For example, it might be useful to
authenticate data an eBPF program makes security decisions on.
After a discussion in the eBPF mailing list, it was decided that the stated
goal should be accomplished by introducing four new kfuncs:
bpf_lookup_user_key() and bpf_lookup_system_key(), for retrieving a keyring
with keys trusted for signature verification, respectively from its serial
and from a pre-determined ID; bpf_key_put(), to release the reference
obtained with the former two kfuncs; and bpf_verify_pkcs7_signature(), for
verifying PKCS#7 signatures.
Other than the key serial, bpf_lookup_user_key() also accepts key lookup
flags, that influence the behavior of the lookup. bpf_lookup_system_key()
accepts pre-determined IDs defined in include/linux/verification.h.
bpf_key_put() accepts the new bpf_key structure, introduced to tell whether
the other structure member, a key pointer, is valid or not. The reason is
that verify_pkcs7_signature() also accepts invalid pointers, set with the
pre-determined ID, to select a system-defined keyring. key_put() must be
called only for valid key pointers.
Since the two key lookup functions allocate memory and one increments a key
reference count, they must be used in conjunction with bpf_key_put(). The
latter must be called only if the lookup functions returned a non-NULL
pointer. The verifier denies the execution of eBPF programs that don't
respect this rule.
The two key lookup functions should be used as alternatives, depending on
the use case. While bpf_lookup_user_key() provides great flexibility, it
seems suboptimal in terms of security guarantees: even if the eBPF program
is assumed to be trusted, the serial used to obtain the key pointer might
come from untrusted user space, which need not choose one that the system
administrator approves of for enforcing a mandatory policy.
bpf_lookup_system_key() instead provides much stronger guarantees,
especially if the pre-determined ID is not passed by user space but is
hardcoded in the eBPF program, and that program is signed. In this case,
bpf_verify_pkcs7_signature() will always perform signature verification
with a key that the system administrator approves, i.e. the primary,
secondary or platform keyring.
Nevertheless, key permission checks need to be done accurately. Since
bpf_lookup_user_key() cannot determine how a key will be used by other
kfuncs, it has to defer the permission check to the actual kfunc using the
key. It does it by calling lookup_user_key() with KEY_DEFER_PERM_CHECK as
needed permission. Later, bpf_verify_pkcs7_signature(), if called,
completes the permission check by calling key_validate(). It does not need
to call key_task_permission() with permission KEY_NEED_SEARCH, as it is
already done elsewhere by the key subsystem. Future kfuncs using the
bpf_key structure need to implement the proper checks as well.
Finally, the last kfunc, bpf_verify_pkcs7_signature(), accepts the data and
signature to verify as eBPF dynamic pointers, to minimize the number of
kfunc parameters, and the keyring with keys for signature verification as a
bpf_key structure, returned by one of the two key lookup functions.
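For orientation, sketched declarations of the four kfuncs as described
in this cover letter (shapes inferred from the text above, so treat
them as assumptions):

  struct bpf_key *bpf_lookup_user_key(u32 serial, u64 flags);
  struct bpf_key *bpf_lookup_system_key(u64 id);
  void bpf_key_put(struct bpf_key *bkey);
  int bpf_verify_pkcs7_signature(struct bpf_dynptr_kern *data_ptr,
                                 struct bpf_dynptr_kern *sig_ptr,
                                 struct bpf_key *trusted_keyring);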
bpf_lookup_user_key() and bpf_verify_pkcs7_signature() can be called only
from sleepable programs, because of memory allocation and crypto
operations. For example, the lsm.s/bpf attach point is suitable;
fexit/array_map_update_elem is not.
The correctness of implementation of the new kfuncs and of their usage is
checked with the introduced tests.
The patch set includes a patch from another author (dependency) for sake of
completeness. It is organized as follows.
Patch 1 from KP Singh allows kfuncs to be used by LSM programs. Patch 2
exports the bpf_dynptr definition through BTF. Patch 3 splits
is_dynptr_reg_valid_init() and introduces is_dynptr_type_expected(), to
know more precisely the cause of a negative result of a dynamic pointer
check. Patch 4 allows dynamic pointers to be used as kfunc parameters.
Patch 5 exports bpf_dynptr_get_size(), to obtain the real size of data
carried by a dynamic pointer. Patch 6 makes available for new eBPF kfuncs
and programs some key-related definitions. Patch 7 introduces the
bpf_lookup_*_key() and bpf_key_put() kfuncs. Patch 8 introduces the
bpf_verify_pkcs7_signature() kfunc. Patch 9 changes the testing kernel
configuration to compile everything as built-in. Finally, patches 10-13
introduce the tests.
Changelog
v17:
- Remove unnecessary typedefs in test_verify_pkcs7_sig.c (suggested by KP)
- Add patch to export bpf_dynptr through BTF (reported by KP)
- Rename u{8,16,32,64} variables to __u{8,16,32,64} in the tests, for
consistency with other eBPF programs (suggested by Yonghong)
v16:
- Remove comments in include/linux/key.h for KEY_LOOKUP_*
- Change kmalloc() flag from GFP_ATOMIC to GFP_KERNEL in
bpf_lookup_user_key(), as the kfunc needs anyway to be sleepable
(suggested by Kumar)
- Test passing a dynamic pointer with NULL data to
bpf_verify_pkcs7_signature() (suggested by Kumar)
v15:
- Add kfunc_dynptr_param test to deny list for s390x
v14:
- Explain that is_dynptr_type_expected() will be useful also for BTF
(suggested by Joanne)
- Rename KEY_LOOKUP_FLAGS_ALL to KEY_LOOKUP_ALL (suggested by Jarkko)
- Swap declaration of spi and dynptr_type in is_dynptr_type_expected()
(suggested by Joanne)
- Reimplement kfunc dynptr tests with a regular eBPF program instead of
executing them with test_verifier (suggested by Joanne)
- Make key lookup flags as enum so that they are automatically exported
through BTF (suggested by Alexei)
v13:
- Split is_dynptr_reg_valid_init() and introduce is_dynptr_type_expected()
to see if the dynamic pointer type passed as argument to a kfunc is
supported (suggested by Kumar)
- Add forward declaration of struct key in include/linux/bpf.h (suggested
by Song)
- Declare mask for key lookup flags, remove key_lookup_flags_check()
(suggested by Jarkko and KP)
- Allow only certain dynamic pointer types (currently, local) to be passed
as argument to kfuncs (suggested by Kumar)
- For each dynamic pointer parameter in kfunc, additionally check if the
passed pointer is to the stack (suggested by Kumar)
- Split the validity/initialization and dynamic pointer type check also in
the verifier, and adjust the expected error message in the test (a test
for an unexpected dynptr type passed to a helper cannot be added due to
missing suitable helpers, but this case has been tested manually)
- Add verifier tests to check the dynamic pointers passed as argument to
kfuncs (suggested by Kumar)
v12:
- Put lookup_key and verify_pkcs7_sig tests in deny list for s390x (JIT
does not support calling kernel function)
v11:
- Move stringify_struct() macro to include/linux/btf.h (suggested by
Daniel)
- Change kernel configuration options in
tools/testing/selftests/bpf/config* from =m to =y
v10:
- Introduce key_lookup_flags_check() and system_keyring_id_check() inline
functions to check parameters (suggested by KP)
- Fix descriptions and comment of key-related kfuncs (suggested by KP)
- Register kfunc set only once (suggested by Alexei)
- Move needed kernel options to the architecture-independent configuration
for testing
v9:
- Drop patch to introduce KF_SLEEPABLE kfunc flag (already merged)
- Rename valid_ptr member of bpf_key to has_ref (suggested by Daniel)
- Check dynamic pointers in kfunc definition with bpf_dynptr_kern struct
definition instead of string, to detect structure renames (suggested by
Daniel)
- Explicitly say that we permit initialized dynamic pointers in kfunc
definition (suggested by Daniel)
- Remove noinline __weak from kfuncs definition (reported by Daniel)
- Simplify key lookup flags check in bpf_lookup_user_key() (suggested by
Daniel)
- Explain the reason for deferring key permission check (suggested by
Daniel)
- Allocate memory with GFP_ATOMIC in bpf_lookup_system_key(), and remove
KF_SLEEPABLE kfunc flag from kfunc declaration (suggested by Daniel)
- Define only one kfunc set and remove the loop for registration
(suggested by Alexei)
v8:
- Define the new bpf_key structure to carry the key pointer and whether
that pointer is valid or not (suggested by Daniel)
- Drop patch to mark a kfunc parameter with the __maybe_null suffix
- Improve documentation of kfuncs
- Introduce bpf_lookup_system_key() to obtain a key pointer suitable for
verify_pkcs7_signature() (suggested by Daniel)
- Use the new kfunc registration API
- Drop patch to test the __maybe_null suffix
- Add tests for bpf_lookup_system_key()
v7:
- Add support for using dynamic and NULL pointers in kfunc (suggested by
Alexei)
- Add new kfunc-related tests
v6:
- Switch back to key lookup helpers + signature verification (until v5),
and defer permission check from bpf_lookup_user_key() to
bpf_verify_pkcs7_signature()
- Add additional key lookup test to illustrate the usage of the
KEY_LOOKUP_CREATE flag and validate the flags (suggested by Daniel)
- Make description of flags of bpf_lookup_user_key() more user-friendly
(suggested by Daniel)
- Fix validation of flags parameter in bpf_lookup_user_key() (reported by
Daniel)
- Rename bpf_verify_pkcs7_signature() keyring-related parameters to
user_keyring and system_keyring to make their purpose more clear
- Accept keyring-related parameters of bpf_verify_pkcs7_signature() as
alternatives (suggested by KP)
- Replace unsigned long type with u64 in helper declaration (suggested by
Daniel)
- Extend the bpf_verify_pkcs7_signature() test by calling the helper
without data, by ensuring that the helper enforces the keyring-related
parameters as alternatives, by ensuring that the helper rejects
inaccessible and expired keyrings, and by checking all system keyrings
- Move bpf_lookup_user_key() and bpf_key_put() usage tests to
ref_tracking.c (suggested by John)
- Call bpf_lookup_user_key() and bpf_key_put() only in sleepable programs
v5:
- Move KEY_LOOKUP_ to include/linux/key.h
for validation of bpf_verify_pkcs7_signature() parameter
- Remove bpf_lookup_user_key() and bpf_key_put() helpers, and the
corresponding tests
- Replace struct key parameter of bpf_verify_pkcs7_signature() with the
keyring serial and lookup flags
- Call lookup_user_key() and key_put() in bpf_verify_pkcs7_signature()
code, to ensure that the retrieved key is used according to the
permission requested at lookup time
- Clarified keyring precedence in the description of
bpf_verify_pkcs7_signature() (suggested by John)
- Remove newline in the second argument of ASSERT_
- Fix helper prototype regular expression in bpf_doc.py
v4:
- Remove bpf_request_key_by_id(), don't return an invalid pointer that
other helpers can use
- Pass the keyring ID (without ULONG_MAX, suggested by Alexei) to
bpf_verify_pkcs7_signature()
- Introduce bpf_lookup_user_key() and bpf_key_put() helpers (suggested by
Alexei)
- Add lookup_key_norelease test, to ensure that the verifier blocks eBPF
programs which don't decrement the key reference count
- Parse raw PKCS#7 signature instead of module-style signature in the
verify_pkcs7_signature test (suggested by Alexei)
- Parse kernel module in user space and pass raw PKCS#7 signature to the
eBPF program for signature verification
v3:
- Rename bpf_verify_signature() back to bpf_verify_pkcs7_signature() to
avoid managing different parameters for each signature verification
function in one helper (suggested by Daniel)
- Use dynamic pointers and export bpf_dynptr_get_size() (suggested by
Alexei)
- Introduce bpf_request_key_by_id() to give more flexibility to the caller
of bpf_verify_pkcs7_signature() to retrieve the appropriate keyring
(suggested by Alexei)
- Fix test by reordering the gcc command line, always compile sign-file
- Improve helper support check mechanism in the test
v2:
- Rename bpf_verify_pkcs7_signature() to a more generic
bpf_verify_signature() and pass the signature type (suggested by KP)
- Move the helper and prototype declaration under #ifdef so that user
space can probe for support for the helper (suggested by Daniel)
- Describe better the keyring types (suggested by Daniel)
- Include linux/bpf.h instead of vmlinux.h to avoid implicit or
redeclaration
- Make the test selfcontained (suggested by Alexei)
v1:
- Don't define new map flag but introduce simple wrapper of
verify_pkcs7_signature() (suggested by Alexei and KP)
====================
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Add tests to ensure that only supported dynamic pointer types are accepted,
that the passed argument is actually a dynamic pointer, that the passed
argument is a pointer to the stack, and that bpf_verify_pkcs7_signature()
correctly handles dynamic pointers with data set to NULL.
The tests are currently in the deny list for s390x (JIT does not support
calling kernel function).
Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20220920075951.929132-14-roberto.sassu@huaweicloud.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Perform several tests to ensure the correct implementation of the
bpf_verify_pkcs7_signature() kfunc.
Do the tests with data signed with a generated testing key (by using
sign-file from scripts/) and with the tcp_bic.ko kernel module if it is
found in the system. The test does not fail if tcp_bic.ko is not found.
First, perform an unsuccessful signature verification without data.
Second, perform a successful signature verification with the session
keyring and a new one created for testing.
Then, ensure that permission and validation checks are done properly on
the keyring provided to bpf_verify_pkcs7_signature(), even though those
checks were deferred at the time the keyring was retrieved with
bpf_lookup_user_key().
The tests expect to encounter an error if the Search permission is removed
from the keyring, or the keyring is expired.
Finally, perform a successful and unsuccessful signature verification with
the keyrings with pre-determined IDs (the last test fails because the key
is not in the platform keyring).
The test is currently in the deny list for s390x (JIT does not support
calling kernel function).
Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Link: https://lore.kernel.org/r/20220920075951.929132-13-roberto.sassu@huaweicloud.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Add a test to ensure that bpf_lookup_user_key() creates a referenced
special keyring when the KEY_LOOKUP_CREATE flag is passed to this function.
Ensure that the kfunc rejects invalid flags.
Ensure that a keyring can be obtained from bpf_lookup_system_key() when one
of the pre-determined keyring IDs is provided.
The test is currently in the deny list for s390x (JIT does not support
calling kernel functions).
Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Link: https://lore.kernel.org/r/20220920075951.929132-12-roberto.sassu@huaweicloud.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|