Age | Commit message (Collapse) | Author |
|
Pull smack update from Casey Schaufler:
"A single change to remove a pointless assignment"
* tag 'Smack-for-5.19' of https://github.com/cschaufler/smack-next:
smack: Remove redundant assignments
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux
Pull Landlock updates from Mickaël Salaün:
- improve the path_rename LSM hook implementations for RENAME_EXCHANGE;
- fix a too-restrictive filesystem control for a rare corner case;
- set the nested sandbox limitation to 16 layers;
- add a new LANDLOCK_ACCESS_FS_REFER access right to properly handle
file reparenting (i.e. full rename and link support);
- add new tests and documentation;
- format code with clang-format to make it easier to maintain and
contribute.
* tag 'landlock-5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux: (30 commits)
landlock: Explain how to support Landlock
landlock: Add design choices documentation for filesystem access rights
landlock: Document good practices about filesystem policies
landlock: Document LANDLOCK_ACCESS_FS_REFER and ABI versioning
samples/landlock: Add support for file reparenting
selftests/landlock: Add 11 new test suites dedicated to file reparenting
landlock: Add support for file reparenting with LANDLOCK_ACCESS_FS_REFER
LSM: Remove double path_rename hook calls for RENAME_EXCHANGE
landlock: Move filesystem helpers and add a new one
landlock: Fix same-layer rule unions
landlock: Create find_rule() from unmask_layers()
landlock: Reduce the maximum number of layers to 16
landlock: Define access_mask_t to enforce a consistent access mask size
selftests/landlock: Test landlock_create_ruleset(2) argument check ordering
landlock: Change landlock_restrict_self(2) check ordering
landlock: Change landlock_add_rule(2) argument check ordering
selftests/landlock: Add tests for O_PATH
selftests/landlock: Fully test file rename with "remove" access
selftests/landlock: Extend access right tests to directories
selftests/landlock: Add tests for unknown access rights
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux
Pull selinux updates from Paul Moore:
"We've got twelve patches queued for v5.19, with most being fairly
minor. The highlights are below:
- The checkreqprot and runtime disable knobs have been deprecated for
some time with no active users that we can find. In an effort to
move things along we are adding a pause when the knobs are used to
help make the deprecation more noticeable in case anyone is still
using these hacks in the shadows.
- We've added the anonymous inode class name to the AVC audit records
when anonymous inodes are involved. This should make writing policy
easier when anonymous inodes are involved.
- More constification work. This is fairly straightforward and the
source of most of the diffstat.
- The usual minor cleanups: remove unnecessary assignments, assorted
style/checkpatch fixes, kdoc fixes, macro while-loop
encapsulations, #include tweaks, etc"
* tag 'selinux-pr-20220523' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux:
security: declare member holding string literal const
selinux: log anon inode class name
selinux: declare data arrays const
selinux: fix indentation level of mls_ops block
selinux: include necessary headers in headers
selinux: avoid extra semicolon
selinux: update parameter documentation
selinux: resolve checkpatch errors
selinux: don't sleep when CONFIG_SECURITY_SELINUX_CHECKREQPROT_VALUE is true
selinux: checkreqprot is deprecated, add some ssleep() discomfort
selinux: runtime disable is deprecated, add some ssleep() discomfort
selinux: Remove redundant assignments
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull execve updates from Kees Cook:
- Fix binfmt_flat GOT handling for riscv (Niklas Cassel)
- Remove unused/broken binfmt_flat shared library and coredump code
(Eric W. Biederman)
* tag 'execve-v5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
binfmt_flat: Remove shared library support
binfmt_flat: Drop vestiges of coredump support
binfmt_flat: do not stop relocating GOT entries prematurely on riscv
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull seccomp updates from Kees Cook:
- Rework USER_NOTIF notification ordering and kill logic (Sargun
Dhillon)
- Improved PTRACE_O_SUSPEND_SECCOMP selftest (Jann Horn)
- Gracefully handle failed unshare() in selftests (Yang Guang)
- Spelling fix (Colin Ian King)
* tag 'seccomp-v5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
selftests/seccomp: Fix spelling mistake "Coud" -> "Could"
selftests/seccomp: Add test for wait killable notifier
selftests/seccomp: Refactor get_proc_stat to split out file reading code
seccomp: Add wait_killable semantic to seccomp user notifier
selftests/seccomp: Ensure that notifications come in FIFO order
seccomp: Use FIFO semantics to order notifications
selftests/seccomp: Add SKIP for failed unshare()
selftests/seccomp: Test PTRACE_O_SUSPEND_SECCOMP without CAP_SYS_ADMIN
|
|
Make the test_dummy_encryption mount option require that the encrypt
feature flag be already enabled on the filesystem, rather than
automatically enabling it. Practically, this means that "-O encrypt"
will need to be included in MKFS_OPTIONS when running xfstests with the
test_dummy_encryption mount option. (ext4/053 also needs an update.)
Moreover, as long as the preconditions for test_dummy_encryption are
being tightened anyway, take the opportunity to start rejecting it when
!CONFIG_FS_ENCRYPTION rather than ignoring it.
The motivation for requiring the encrypt feature flag is that:
- Having the filesystem auto-enable feature flags is problematic, as it
bypasses the usual sanity checks. The specific issue which came up
recently is that in kernel versions where ext4 supports casefold but
not encrypt+casefold (v5.1 through v5.10), the kernel will happily add
the encrypt flag to a filesystem that has the casefold flag, making it
unmountable -- but only for subsequent mounts, not the initial one.
This confused the casefold support detection in xfstests, causing
generic/556 to fail rather than be skipped.
- The xfstests-bld test runners (kvm-xfstests et al.) already use the
required mkfs flag, so they will not be affected by this change. Only
users of test_dummy_encryption alone will be affected. But, this
option has always been for testing only, so it should be fine to
require that the few users of this option update their test scripts.
- f2fs already requires it (for its equivalent feature flag).
Signed-off-by: Eric Biggers <ebiggers@google.com>
Reviewed-by: Gabriel Krisman Bertazi <krisman@collabora.com>
Link: https://lore.kernel.org/r/20220519204437.61645-1-ebiggers@kernel.org
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
|
Hulk Robot reported a BUG_ON:
==================================================================
kernel BUG at fs/ext4/extents_status.c:199!
[...]
RIP: 0010:ext4_es_end fs/ext4/extents_status.c:199 [inline]
RIP: 0010:__es_tree_search+0x1e0/0x260 fs/ext4/extents_status.c:217
[...]
Call Trace:
ext4_es_cache_extent+0x109/0x340 fs/ext4/extents_status.c:766
ext4_cache_extents+0x239/0x2e0 fs/ext4/extents.c:561
ext4_find_extent+0x6b7/0xa20 fs/ext4/extents.c:964
ext4_ext_map_blocks+0x16b/0x4b70 fs/ext4/extents.c:4384
ext4_map_blocks+0xe26/0x19f0 fs/ext4/inode.c:567
ext4_getblk+0x320/0x4c0 fs/ext4/inode.c:980
ext4_bread+0x2d/0x170 fs/ext4/inode.c:1031
ext4_quota_read+0x248/0x320 fs/ext4/super.c:6257
v2_read_header+0x78/0x110 fs/quota/quota_v2.c:63
v2_check_quota_file+0x76/0x230 fs/quota/quota_v2.c:82
vfs_load_quota_inode+0x5d1/0x1530 fs/quota/dquot.c:2368
dquot_enable+0x28a/0x330 fs/quota/dquot.c:2490
ext4_quota_enable fs/ext4/super.c:6137 [inline]
ext4_enable_quotas+0x5d7/0x960 fs/ext4/super.c:6163
ext4_fill_super+0xa7c9/0xdc00 fs/ext4/super.c:4754
mount_bdev+0x2e9/0x3b0 fs/super.c:1158
mount_fs+0x4b/0x1e4 fs/super.c:1261
[...]
==================================================================
Above issue may happen as follows:
-------------------------------------
ext4_fill_super
ext4_enable_quotas
ext4_quota_enable
ext4_iget
__ext4_iget
ext4_ext_check_inode
ext4_ext_check
__ext4_ext_check
ext4_valid_extent_entries
Check for overlapping extents does't take effect
dquot_enable
vfs_load_quota_inode
v2_check_quota_file
v2_read_header
ext4_quota_read
ext4_bread
ext4_getblk
ext4_map_blocks
ext4_ext_map_blocks
ext4_find_extent
ext4_cache_extents
ext4_es_cache_extent
ext4_es_cache_extent
__es_tree_search
ext4_es_end
BUG_ON(es->es_lblk + es->es_len < es->es_lblk)
The error ext4 extents is as follows:
0af3 0300 0400 0000 00000000 extent_header
00000000 0100 0000 12000000 extent1
00000000 0100 0000 18000000 extent2
02000000 0400 0000 14000000 extent3
In the ext4_valid_extent_entries function,
if prev is 0, no error is returned even if lblock<=prev.
This was intended to skip the check on the first extent, but
in the error image above, prev=0+1-1=0 when checking the second extent,
so even though lblock<=prev, the function does not return an error.
As a result, bug_ON occurs in __es_tree_search and the system panics.
To solve this problem, we only need to check that:
1. The lblock of the first extent is not less than 0.
2. The lblock of the next extent is not less than
the next block of the previous extent.
The same applies to extent_idx.
Cc: stable@kernel.org
Fixes: 5946d089379a ("ext4: check for overlapping extents in ext4_valid_extent_entries()")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Baokun Li <libaokun1@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20220518120816.1541863-1-libaokun1@huawei.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
|
A maliciously corrupted filesystem can contain cycles in the h-tree
stored inside a directory. That can easily lead to the kernel corrupting
tree nodes that were already verified under its hands while doing a node
split and consequently accessing unallocated memory. Fix the problem by
verifying traversed block numbers are unique.
Cc: stable@vger.kernel.org
Signed-off-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20220518093332.13986-2-jack@suse.cz
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
|
Before splitting a directory block verify its directory entries are sane
so that the splitting code does not access memory it should not.
Cc: stable@vger.kernel.org
Signed-off-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20220518093332.13986-1-jack@suse.cz
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
|
The EXT4_FC_REPLAY bit in sbi->s_mount_state is used to indicate that
we are in the middle of replay the fast commit journal. This was
actually a mistake, since the sbi->s_mount_info is initialized from
es->s_state. Arguably s_mount_state is misleadingly named, but the
name is historical --- s_mount_state and s_state dates back to ext2.
What should have been used is the ext4_{set,clear,test}_mount_flag()
inline functions, which sets EXT4_MF_* bits in sbi->s_mount_flags.
The problem with using EXT4_FC_REPLAY is that a maliciously corrupted
superblock could result in EXT4_FC_REPLAY getting set in
s_mount_state. This bypasses some sanity checks, and this can trigger
a BUG() in ext4_es_cache_extent(). As a easy-to-backport-fix, filter
out the EXT4_FC_REPLAY bit for now. We should eventually transition
away from EXT4_FC_REPLAY to something like EXT4_MF_REPLAY.
Cc: stable@kernel.org
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Link: https://lore.kernel.org/r/20220420192312.1655305-1-phind.uet@gmail.com
Link: https://lore.kernel.org/r/20220517174028.942119-1-tytso@mit.edu
Reported-by: syzbot+c7358a3cd05ee786eb31@syzkaller.appspotmail.com
|
|
This adds caching of the directory entries for a cached directory while we keep
a lease on the directory.
Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Reviewed-by: Enzo Matsumiya <ematsumiya@suse.de>
Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
Before this patch, function bh_get used block_map to figure out the
block it needed to read in from the quota_change file. This patch
changes it to use iomap directly to make it more efficient.
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
|
Before this patch, functions gfs2_qa_get and _put used the i_rw_mutex to
prevent simultaneous access to its i_qadata. But i_rw_mutex is now used
for many other things, including iomap_begin and end, which causes a
conflict according to lockdep. We cannot just remove the lock since
simultaneous opens (gfs2_open -> gfs2_open_common -> gfs2_qa_get) can
then stomp on each others values for i_qadata.
This patch solves the conflict by using the i_lock spin_lock in the inode
to prevent simultaneous access.
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
|
The bug that 27ca8273f ("gfs2: Make sure FITRIM minlen is rounded up to
fs block size") fixes was a little confusing as the user saw
"Input/output error" which masked the -EINVAL that sb_issue_discard()
returned.
sb_issue_discard() can fail for various reasons, so we should return its
return value from gfs2_rgrp_send_discards() to avoid all errors being
reported as IO errors.
This improves error reporting for FITRIM and makes no difference to the
-o discard code path because the return value from
gfs2_rgrp_send_discards() gets thrown away in that case (and the option
switches off). Presumably that's why it was ok to just return -EIO in
the past, before FITRIM was implemented.
Tested with xfstests.
Signed-off-by: Andrew Price <anprice@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
|
Clang's structure layout randomization feature gets upset when it sees
struct address_space (which is randomized) cast to struct gfs2_glock.
This is due to seeing the mapping pointer as being treated as an array
of gfs2_glock, rather than "something else, before struct address_space":
In file included from fs/gfs2/acl.c:23:
fs/gfs2/meta_io.h:44:12: error: casting from randomized structure pointer type 'struct address_space *' to 'struct gfs2_glock *'
return (((struct gfs2_glock *)mapping) - 1)->gl_name.ln_sbd;
^
Replace the instances of open-coded pointer math with container_of()
usage, and update the allocator to match.
Some cleanups and conversion of gfs2_glock_get() and
gfs2_glock_dealloc() by Andreas.
Reported-by: kernel test robot <lkp@intel.com>
Link: https://lore.kernel.org/lkml/202205041550.naKxwCBj-lkp@intel.com
Cc: Bob Peterson <rpeterso@redhat.com>
Cc: Andreas Gruenbacher <agruenba@redhat.com>
Cc: Bill Wendling <morbo@google.com>
Cc: cluster-devel@redhat.com
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
|
Add some comments explaining the oddities of partial direct I/O reads
and writes.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull kernel hardening updates from Kees Cook:
- usercopy hardening expanded to check other allocation types (Matthew
Wilcox, Yuanzheng Song)
- arm64 stackleak behavioral improvements (Mark Rutland)
- arm64 CFI code gen improvement (Sami Tolvanen)
- LoadPin LSM block dev API adjustment (Christoph Hellwig)
- Clang randstruct support (Bill Wendling, Kees Cook)
* tag 'kernel-hardening-v5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: (34 commits)
loadpin: stop using bdevname
mm: usercopy: move the virt_addr_valid() below the is_vmalloc_addr()
gcc-plugins: randstruct: Remove cast exception handling
af_unix: Silence randstruct GCC plugin warning
niu: Silence randstruct warnings
big_keys: Use struct for internal payload
gcc-plugins: Change all version strings match kernel
randomize_kstack: Improve docs on requirements/rationale
lkdtm/stackleak: fix CONFIG_GCC_PLUGIN_STACKLEAK=n
arm64: entry: use stackleak_erase_on_task_stack()
stackleak: add on/off stack variants
lkdtm/stackleak: check stack boundaries
lkdtm/stackleak: prevent unexpected stack usage
lkdtm/stackleak: rework boundary management
lkdtm/stackleak: avoid spurious failure
stackleak: rework poison scanning
stackleak: rework stack high bound handling
stackleak: clarify variable names
stackleak: rework stack low bound handling
stackleak: remove redundant check
...
|
|
git://git.kernel.org/pub/scm/fs/fscrypt/fscrypt
Pull fsverity updates from Eric Biggers:
"A couple small cleanups for fs/verity/"
* tag 'fsverity-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/fscrypt:
fs-verity: Use struct_size() helper in enable_verity()
fs-verity: remove unused parameter desc_size in fsverity_create_info()
|
|
Pull fscrypt updates from Eric Biggers:
"Some cleanups for fs/crypto/:
- Split up the misleadingly-named FS_CRYPTO_BLOCK_SIZE constant.
- Consistently report the encryption implementation that is being
used.
- Add helper functions for the test_dummy_encryption mount option
that work properly with the new mount API. ext4 and f2fs will use
these"
* tag 'fscrypt-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/fscrypt:
fscrypt: add new helper functions for test_dummy_encryption
fscrypt: factor out fscrypt_policy_to_key_spec()
fscrypt: log when starting to use inline encryption
fscrypt: split up FS_CRYPTO_BLOCK_SIZE
|
|
After allowing channels to reconnect in parallel, it now
becomes important to take care that multiple processes do not
call negotiate/session setup in parallel on the same channel.
This change avoids that by marking a channel as "in_reconnect".
During session setup if the channel in question has this flag
set, we return immediately.
Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
ses->status today shares statusEnum with server->tcpStatus.
This has been confusing, and tcon->status has deviated to use
a new enum. Follow suit and use new enum for ses_status as well.
Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
Recent changes to multichannel to allow channel reconnects to
work in parallel and independent of each other did so by
making use of tcpStatus for the connection, and status for the
session. However, this did not take into account the multiuser
scenario, where same connection is used by multiple connections.
However, tcpStatus should be tracked only till the end of
negotiate exchange, and not used for session setup. This change
fixes this.
Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/crng/random
Pull random number generator updates from Jason Donenfeld:
"These updates continue to refine the work began in 5.17 and 5.18 of
modernizing the RNG's crypto and streamlining and documenting its
code.
New for 5.19, the updates aim to improve entropy collection methods
and make some initial decisions regarding the "premature next" problem
and our threat model. The cloc utility now reports that random.c is
931 lines of code and 466 lines of comments, not that basic metrics
like that mean all that much, but at the very least it tells you that
this is very much a manageable driver now.
Here's a summary of the various updates:
- The random_get_entropy() function now always returns something at
least minimally useful. This is the primary entropy source in most
collectors, which in the best case expands to something like RDTSC,
but prior to this change, in the worst case it would just return 0,
contributing nothing. For 5.19, additional architectures are wired
up, and architectures that are entirely missing a cycle counter now
have a generic fallback path, which uses the highest resolution
clock available from the timekeeping subsystem.
Some of those clocks can actually be quite good, despite the CPU
not having a cycle counter of its own, and going off-core for a
stamp is generally thought to increase jitter, something positive
from the perspective of entropy gathering. Done very early on in
the development cycle, this has been sitting in next getting some
testing for a while now and has relevant acks from the archs, so it
should be pretty well tested and fine, but is nonetheless the thing
I'll be keeping my eye on most closely.
- Of particular note with the random_get_entropy() improvements is
MIPS, which, on CPUs that lack the c0 count register, will now
combine the high-speed but short-cycle c0 random register with the
lower-speed but long-cycle generic fallback path.
- With random_get_entropy() now always returning something useful,
the interrupt handler now collects entropy in a consistent
construction.
- Rather than comparing two samples of random_get_entropy() for the
jitter dance, the algorithm now tests many samples, and uses the
amount of differing ones to determine whether or not jitter entropy
is usable and how laborious it must be. The problem with comparing
only two samples was that if the cycle counter was extremely slow,
but just so happened to be on the cusp of a change, the slowness
wouldn't be detected. Taking many samples fixes that to some
degree.
This, combined with the other improvements to random_get_entropy(),
should make future unification of /dev/random and /dev/urandom
maybe more possible. At the very least, were we to attempt it again
today (we're not), it wouldn't break any of Guenter's test rigs
that broke when we tried it with 5.18. So, not today, but perhaps
down the road, that's something we can revisit.
- We attempt to reseed the RNG immediately upon waking up from system
suspend or hibernation, making use of the various timestamps about
suspend time and such available, as well as the usual inputs such
as RDRAND when available.
- Batched randomness now falls back to ordinary randomness before the
RNG is initialized. This provides more consistent guarantees to the
types of random numbers being returned by the various accessors.
- The "pre-init injection" code is now gone for good. I suspect you
in particular will be happy to read that, as I recall you
expressing your distaste for it a few months ago. Instead, to avoid
a "premature first" issue, while still allowing for maximal amount
of entropy availability during system boot, the first 128 bits of
estimated entropy are used immediately as it arrives, with the next
128 bits being buffered. And, as before, after the RNG has been
fully initialized, it winds up reseeding anyway a few seconds later
in most cases. This resulted in a pretty big simplification of the
initialization code and let us remove various ad-hoc mechanisms
like the ugly crng_pre_init_inject().
- The RNG no longer pretends to handle the "premature next" security
model, something that various academics and other RNG designs have
tried to care about in the past. After an interesting mailing list
thread, these issues are thought to be a) mainly academic and not
practical at all, and b) actively harming the real security of the
RNG by delaying new entropy additions after a potential compromise,
making a potentially bad situation even worse. As well, in the
first place, our RNG never even properly handled the premature next
issue, so removing an incomplete solution to a fake problem was
particularly nice.
This allowed for numerous other simplifications in the code, which
is a lot cleaner as a consequence. If you didn't see it before,
https://lore.kernel.org/lkml/YmlMGx6+uigkGiZ0@zx2c4.com/ may be a
thread worth skimming through.
- While the interrupt handler received a separate code path years ago
that avoids locks by using per-cpu data structures and a faster
mixing algorithm, in order to reduce interrupt latency, input and
disk events that are triggered in hardirq handlers were still
hitting locks and more expensive algorithms. Those are now
redirected to use the faster per-cpu data structures.
- Rather than having the fake-crypto almost-siphash-based random32
implementation be used right and left, and in many places where
cryptographically secure randomness is desirable, the batched
entropy code is now fast enough to replace that.
- As usual, numerous code quality and documentation cleanups. For
example, the initialization state machine now uses enum symbolic
constants instead of just hard coding numbers everywhere.
- Since the RNG initializes once, and then is always initialized
thereafter, a pretty heavy amount of code used during that
initialization is never used again. It is now completely cordoned
off using static branches and it winds up in the .text.unlikely
section so that it doesn't reduce cache compactness after the RNG
is ready.
- A variety of functions meant for waiting on the RNG to be
initialized were only used by vsprintf, and in not a particularly
optimal way. Replacing that usage with a more ordinary setup made
it possible to remove those functions.
- A cleanup of how we warn userspace about the use of uninitialized
/dev/urandom and uninitialized get_random_bytes() usage.
Interestingly, with the change you merged for 5.18 that attempts to
use jitter (but does not block if it can't), the majority of users
should never see those warnings for /dev/urandom at all now, and
the one for in-kernel usage is mainly a debug thing.
- The file_operations struct for /dev/[u]random now implements
.read_iter and .write_iter instead of .read and .write, allowing it
to also implement .splice_read and .splice_write, which makes
splice(2) work again after it was broken here (and in many other
places in the tree) during the set_fs() removal. This was a bit of
a last minute arrival from Jens that hasn't had as much time to
bake, so I'll be keeping my eye on this as well, but it seems
fairly ordinary. Unfortunately, read_iter() is around 3% slower
than read() in my tests, which I'm not thrilled about. But Jens and
Al, spurred by this observation, seem to be making progress in
removing the bottlenecks on the iter paths in the VFS layer in
general, which should remove the performance gap for all drivers.
- Assorted other bug fixes, cleanups, and optimizations.
- A small SipHash cleanup"
* tag 'random-5.19-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random: (49 commits)
random: check for signals after page of pool writes
random: wire up fops->splice_{read,write}_iter()
random: convert to using fops->write_iter()
random: convert to using fops->read_iter()
random: unify batched entropy implementations
random: move randomize_page() into mm where it belongs
random: remove mostly unused async readiness notifier
random: remove get_random_bytes_arch() and add rng_has_arch_random()
random: move initialization functions out of hot pages
random: make consistent use of buf and len
random: use proper return types on get_random_{int,long}_wait()
random: remove extern from functions in header
random: use static branch for crng_ready()
random: credit architectural init the exact amount
random: handle latent entropy and command line from random_init()
random: use proper jiffies comparison macro
random: remove ratelimiting for in-kernel unseeded randomness
random: move initialization out of reseeding hot path
random: avoid initializing twice in credit race
random: use symbolic constants for crng_init states
...
|
|
Jonathan Lemon says:
====================
ptp: ocp: various updates
Collection of cleanups and updates to the timecard.
====================
Link: https://lore.kernel.org/r/20220519212153.450437-1-jonathan.lemon@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Right now it's possible to flash any kind of binary via devlink and
break the card easily. This diff adds an optional header check when
installing the firmware.
Signed-off-by: Vadim Fedorenko <vadfed@fb.com>
Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The NTL timecard design has a PPS1 selector which selects the
the PPS source automatically, according to Section 1.9 of the
documentation.
If there is a SMA PPS input detected:
- send signal to MAC and PPS slave selector.
If there is a MAC PPS input detected:
- send GNSS1 to the MAC
- send MAC to the PPS slave
If there is a GNSS1 input detected:
- send GNSS1 to the MAC
- send GNSS1 to the PPS slave.MAC
Change the debugfs summary so it reflects the correct mapping,
for assistance in debugging. No functional change.
Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Create an .init function for the op vector, and a corresponding
wrapper function, for different sma mapping setups.
Add a default_fcn to the sma information, and use it when displaying
information for pins which have fixed functions.
Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Move the SMA get and set functions into an operations vector for
different boards.
Create wrappers for the accessor functions.
Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The ocp selectors are all constant, so label them as such.
Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Group the sma input/output tables together and select the correct
group from the bp information. This allows adding new groups with
different sma mappings.
Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Preparse the firmware image information into loader/tag/version,
and set the fw capabilities based on the tag/version.
Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Celestica is producing card with their own vendor id and device id.
Add these ids to driver to support this card.
Signed-off-by: Vadim Fedorenko <vadfed@fb.com>
Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
These #ifdefs are not required, so remove them.
Suggested-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Use 'resource_size_t' instead of 'unsigned long' when computing the
pci start address, for the benefit of 32-bit platforms.
Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
This reverts commit 8c3b8dc5cc9bf6d273ebe18b16e2d6882bcfb36d.
Some rollback issue will be fixed in other patches in the future.
Link: https://lore.kernel.org/all/20220523055056.2078994-1-liuyacan@corp.netease.com/
Fixes: 8c3b8dc5cc9b ("net/smc: fix listen processing for SMC-Rv2")
Signed-off-by: liuyacan <liuyacan@corp.netease.com>
Link: https://lore.kernel.org/r/20220524090230.2140302-1-liuyacan@corp.netease.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
KGDB and KDB allow read and write access to kernel memory, and thus
should be restricted during lockdown. An attacker with access to a
serial port (for example, via a hypervisor console, which some cloud
vendors provide over the network) could trigger the debugger so it is
important that the debugger respect the lockdown mode when/if it is
triggered.
Fix this by integrating lockdown into kdb's existing permissions
mechanism. Unfortunately kgdb does not have any permissions mechanism
(although it certainly could be added later) so, for now, kgdb is simply
and brutally disabled by immediately exiting the gdb stub without taking
any action.
For lockdowns established early in the boot (e.g. the normal case) then
this should be fine but on systems where kgdb has set breakpoints before
the lockdown is enacted than "bad things" will happen.
CVE: CVE-2022-21499
Co-developed-by: Stephen Brennan <stephen.s.brennan@oracle.com>
Signed-off-by: Stephen Brennan <stephen.s.brennan@oracle.com>
Reviewed-by: Douglas Anderson <dianders@chromium.org>
Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Clang does not support this option so the build fails:
error: unknown warning option '-Wno-dangling-pointer' [-Werror,-Wunknown-warning-option]
Use cc-disable-warning so that the option is only added when it is
supported.
Fixes: bd1d129daa3e ("wifi: ath6k: silence false positive -Wno-dangling-pointer warning on GCC 12")
Reported-by: "kernelci.org bot" <bot@kernelci.org>
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Tom Rix <trix@redhat.com>
Link: https://lore.kernel.org/r/20220524145655.869822-1-nathan@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler updates from Ingo Molnar:
- Updates to scheduler metrics:
- PELT fixes & enhancements
- PSI fixes & enhancements
- Refactor cpu_util_without()
- Updates to instrumentation/debugging:
- Remove sched_trace_*() helper functions - can be done via debug
info
- Fix double update_rq_clock() warnings
- Introduce & use "preemption model accessors" to simplify some of the
Kconfig complexity.
- Make softirq handling RT-safe.
- Misc smaller fixes & cleanups.
* tag 'sched-core-2022-05-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
topology: Remove unused cpu_cluster_mask()
sched: Reverse sched_class layout
sched/deadline: Remove superfluous rq clock update in push_dl_task()
sched/core: Avoid obvious double update_rq_clock warning
smp: Make softirq handling RT safe in flush_smp_call_function_queue()
smp: Rename flush_smp_call_function_from_idle()
sched: Fix missing prototype warnings
sched/fair: Remove cfs_rq_tg_path()
sched/fair: Remove sched_trace_*() helper functions
sched/fair: Refactor cpu_util_without()
sched/fair: Revise comment about lb decision matrix
sched/psi: report zeroes for CPU full at the system level
sched/fair: Delete useless condition in tg_unthrottle_up()
sched/fair: Fix cfs_rq_clock_pelt() for throttled cfs_rq
sched/fair: Move calculate of avg_load to a better location
mailmap: Update my email address to @redhat.com
MAINTAINERS: Add myself as scheduler topology reviewer
psi: Fix trigger being fired unexpectedly at initial
ftrace: Use preemption model accessors for trace header printout
kcsan: Use preemption model accessors
|
|
One of the concessions we made to get our driver upstream was to remove
the diagnostic packet support. There is however still some cruft that was
left over. Remove it.
Link: https://lore.kernel.org/r/20220520183727.48973.93587.stgit@awfm-01.cornelisnetworks.com
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
There is no need to have separate user and kernel software versions. There
is a single software that the kernel is compatible with.
Also remove the notion of a "kernel type" that is long since deprecated.
Link: https://lore.kernel.org/r/20220520183722.48973.60262.stgit@awfm-01.cornelisnetworks.com
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
Driver versions have long been forbidden in the RDMA subsystem. We removed
most of the code relating to them and have been very strict about not
allowing. However there is some leftover versioning that we do not
need. Get rid of that.
Link: https://lore.kernel.org/r/20220520183717.48973.17418.stgit@awfm-01.cornelisnetworks.com
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
When multiplying of different types, an overflow is possible even when
storing the result in a larger type. This is because the conversion is
done after the multiplication. So arithmetic overflow and thus in
incorrect value is possible.
Correct an instance of this in the inter packet delay calculation. Fix by
ensuring one of the operands is u64 which will promote the other to u64 as
well ensuring no overflow.
Cc: stable@vger.kernel.org
Fixes: 7724105686e7 ("IB/hfi1: add driver files")
Link: https://lore.kernel.org/r/20220520183712.48973.29855.stgit@awfm-01.cornelisnetworks.com
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
If the hfi1 module is loaded with HFI1_CAP_SDMA off, a call to
hfi1_write_iter() will dereference a NULL pointer and panic. A typical
stack frame is:
sdma_select_user_engine [hfi1]
hfi1_user_sdma_process_request [hfi1]
hfi1_write_iter [hfi1]
do_iter_readv_writev
do_iter_write
vfs_writev
do_writev
do_syscall_64
The fix is to test for SDMA in hfi1_write_iter() and fail the I/O with
EINVAL.
Link: https://lore.kernel.org/r/20220520183706.48973.79803.stgit@awfm-01.cornelisnetworks.com
Signed-off-by: Douglas Miller <doug.miller@cornelisnetworks.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf events updates from Ingo Molnar:
"Platform PMU changes:
- x86/intel:
- Add new Intel Alder Lake and Raptor Lake support
- x86/amd:
- AMD Zen4 IBS extensions support
- Add AMD PerfMonV2 support
- Add AMD Fam19h Branch Sampling support
Generic changes:
- signal: Deliver SIGTRAP on perf event asynchronously if blocked
Perf instrumentation can be driven via SIGTRAP, but this causes a
problem when SIGTRAP is blocked by a task & terminate the task.
Allow user-space to request these signals asynchronously (after
they get unblocked) & also give the information to the signal
handler when this happens:
"To give user space the ability to clearly distinguish
synchronous from asynchronous signals, introduce
siginfo_t::si_perf_flags and TRAP_PERF_FLAG_ASYNC (opted for
flags in case more binary information is required in future).
The resolution to the problem is then to (a) no longer force the
signal (avoiding the terminations), but (b) tell user space via
si_perf_flags if the signal was synchronous or not, so that such
signals can be handled differently (e.g. let user space decide
to ignore or consider the data imprecise). "
- Unify/standardize the /sys/devices/cpu/events/* output format.
- Misc fixes & cleanups"
* tag 'perf-core-2022-05-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (32 commits)
perf/x86/amd/core: Fix reloading events for SVM
perf/x86/amd: Run AMD BRS code only on supported hw
perf/x86/amd: Fix AMD BRS period adjustment
perf/x86/amd: Remove unused variable 'hwc'
perf/ibs: Fix comment
perf/amd/ibs: Advertise zen4_ibs_extensions as pmu capability attribute
perf/amd/ibs: Add support for L3 miss filtering
perf/amd/ibs: Use ->is_visible callback for dynamic attributes
perf/amd/ibs: Cascade pmu init functions' return value
perf/x86/uncore: Add new Alder Lake and Raptor Lake support
perf/x86/uncore: Clean up uncore_pci_ids[]
perf/x86/cstate: Add new Alder Lake and Raptor Lake support
perf/x86/msr: Add new Alder Lake and Raptor Lake support
perf/x86: Add new Alder Lake and Raptor Lake support
perf/amd/ibs: Use interrupt regs ip for stack unwinding
perf/x86/amd/core: Add PerfMonV2 overflow handling
perf/x86/amd/core: Add PerfMonV2 counter control
perf/x86/amd/core: Detect available counters
perf/x86/amd/core: Detect PerfMonV2 support
x86/msr: Add PerfCntrGlobal* registers
...
|
|
If there is a failure during probe of hfi1 before the sdma_map_lock is
initialized, the call to hfi1_free_devdata() will attempt to use a lock
that has not been initialized. If the locking correctness validator is on
then an INFO message and stack trace resembling the following may be seen:
INFO: trying to register non-static key.
The code is fine but needs lockdep annotation, or maybe
you didn't initialize this object before use?
turning off the locking correctness validator.
Call Trace:
register_lock_class+0x11b/0x880
__lock_acquire+0xf3/0x7930
lock_acquire+0xff/0x2d0
_raw_spin_lock_irq+0x46/0x60
sdma_clean+0x42a/0x660 [hfi1]
hfi1_free_devdata+0x3a7/0x420 [hfi1]
init_one+0x867/0x11a0 [hfi1]
pci_device_probe+0x40e/0x8d0
The use of sdma_map_lock in sdma_clean() is for freeing the sdma_map
memory, and sdma_map is not allocated/initialized until after
sdma_map_lock has been initialized. This code only needs to be run if
sdma_map is not NULL, and so checking for that condition will avoid trying
to use the lock before it is initialized.
Fixes: 473291b3ea0e ("IB/hfi1: Fix for early release of sdma context")
Fixes: 7724105686e7 ("IB/hfi1: add driver files")
Link: https://lore.kernel.org/r/20220520183701.48973.72434.stgit@awfm-01.cornelisnetworks.com
Reported-by: Zheyu Ma <zheyuma97@gmail.com>
Signed-off-by: Douglas Miller <doug.miller@cornelisnetworks.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull objtool updates from Ingo Molnar:
- Comprehensive interface overhaul:
=================================
Objtool's interface has some issues:
- Several features are done unconditionally, without any way to
turn them off. Some of them might be surprising. This makes
objtool tricky to use, and prevents porting individual features
to other arches.
- The config dependencies are too coarse-grained. Objtool
enablement is tied to CONFIG_STACK_VALIDATION, but it has several
other features independent of that.
- The objtool subcmds ("check" and "orc") are clumsy: "check" is
really a subset of "orc", so it has all the same options.
The subcmd model has never really worked for objtool, as it only
has a single purpose: "do some combination of things on an object
file".
- The '--lto' and '--vmlinux' options are nonsensical and have
surprising behavior.
Overhaul the interface:
- get rid of subcmds
- make all features individually selectable
- remove and/or clarify confusing/obsolete options
- update the documentation
- fix some bugs found along the way
- Fix x32 regression
- Fix Kbuild cleanup bugs
- Add scripts/objdump-func helper script to disassemble a single
function from an object file.
- Rewrite scripts/faddr2line to be section-aware, by basing it on
'readelf', moving it away from 'nm', which doesn't handle multiple
sections well, which can result in decoding failure.
- Rewrite & fix symbol handling - which had a number of bugs wrt.
object files that don't have global symbols - which is rare but
possible. Also fix a bunch of symbol handling bugs found along the
way.
* tag 'objtool-core-2022-05-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (23 commits)
objtool: Fix objtool regression on x32 systems
objtool: Fix symbol creation
scripts/faddr2line: Fix overlapping text section failures
scripts: Create objdump-func helper script
objtool: Remove libsubcmd.a when make clean
objtool: Remove inat-tables.c when make clean
objtool: Update documentation
objtool: Remove --lto and --vmlinux in favor of --link
objtool: Add HAVE_NOINSTR_VALIDATION
objtool: Rename "VMLINUX_VALIDATION" -> "NOINSTR_VALIDATION"
objtool: Make noinstr hacks optional
objtool: Make jump label hack optional
objtool: Make static call annotation optional
objtool: Make stack validation frame-pointer-specific
objtool: Add CONFIG_OBJTOOL
objtool: Extricate sls from stack validation
objtool: Rework ibt and extricate from stack validation
objtool: Make stack validation optional
objtool: Add option to print section addresses
objtool: Don't print parentheses in function addresses
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking updates from Ingo Molnar:
- rwsem cleanups & optimizations/fixes:
- Conditionally wake waiters in reader/writer slowpaths
- Always try to wake waiters in out_nolock path
- Add try_cmpxchg64() implementation, with arch optimizations - and use
it to micro-optimize sched_clock_{local,remote}()
- Various force-inlining fixes to address objdump instrumentation-check
warnings
- Add lock contention tracepoints:
lock:contention_begin
lock:contention_end
- Misc smaller fixes & cleanups
* tag 'locking-core-2022-05-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched/clock: Use try_cmpxchg64 in sched_clock_{local,remote}
locking/atomic/x86: Introduce arch_try_cmpxchg64
locking/atomic: Add generic try_cmpxchg64 support
futex: Remove a PREEMPT_RT_FULL reference.
locking/qrwlock: Change "queue rwlock" to "queued rwlock"
lockdep: Delete local_irq_enable_in_hardirq()
locking/mutex: Make contention tracepoints more consistent wrt adaptive spinning
locking: Apply contention tracepoints in the slow path
locking: Add lock contention tracepoints
locking/rwsem: Always try to wake waiters in out_nolock path
locking/rwsem: Conditionally wake waiters in reader/writer slowpaths
locking/rwsem: No need to check for handoff bit if wait queue empty
lockdep: Fix -Wunused-parameter for _THIS_IP_
x86/mm: Force-inline __phys_addr_nodebug()
x86/kvm/svm: Force-inline GHCB accessors
task_stack, x86/cea: Force-inline stack helpers
|
|
The commit in the Fixes tag has shuffled some code.
Now 'mcg_num' is incremented before the kzalloc(). So if the memory
allocation fails, this increment must be undone.
Fixes: a926a903b7dc ("RDMA/rxe: Do not call dev_mc_add/del() under a spinlock")
Link: https://lore.kernel.org/r/fe137cd8b1f17593243aa73d59c18ea71ab9ee36.1653225896.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
Following patches have dependencies.
Resolve the merge conflict in
drivers/net/ethernet/mellanox/mlx5/core/main.c by keeping the new names
for the fs functions following linux-next:
https://lore.kernel.org/r/20220519113529.226bc3e2@canb.auug.org.au/
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
Commit 61f60bac8c05 ("gcc-plugins: Change all version strings match
kernel") broke parallel builds.
Instead of adding the dependency between GCC plugins and utsrelease.h,
let's use KERNELVERSION, which does not require any build artifact.
Another reason why I want to avoid utsrelease.h is because it depends
on CONFIG_LOCALVERSION(_AUTO) and localversion* files.
(include/generated/utsrelease.h depends on include/config/kernel.release,
which is generated by scripts/setlocalversion)
I want to keep host tools independent of the kernel configuration.
There is no good reason to rebuild GCC plugins just because of
CONFIG_LOCALVERSION being changed.
We just want to associate the plugin versions with the kernel source
version. KERNELVERSION should be enough for our purpose.
Fixes: 61f60bac8c05 ("gcc-plugins: Change all version strings match kernel")
Reported-by: kernel test robot <lkp@intel.com>
Link: https://lore.kernel.org/linux-mm/202205230239.EZxeZ3Fv-lkp@intel.com
Reported-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20220524135541.1453693-1-masahiroy@kernel.org
|