Age | Commit message (Collapse) | Author |
|
corruption
This patch adds an extra checks in ext4_mb_mark_bb() function
to make sure we mark & report error if we were to mark/clear any
of the critical FS metadata specific bitmaps (&bail out) to prevent
from any accidental corruption.
Suggested-by: Jan Kara <jack@suse.cz>
Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/53cbb6f2573db162a57f935365050d8b1df202ee.1644992610.git.riteshh@linux.ibm.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
|
Currently ext4_mb_clear_bb() & ext4_group_add_blocks() only checks
whether the given block ranges (which is to be freed) belongs to any FS
metadata blocks or not, of the block's respective block group.
But to detect any FS error early, it is better to add more strict
checkings in those functions which checks whether the given blocks
belongs to any critical FS metadata or not within system-zone.
Suggested-by: Jan Kara <jack@suse.cz>
Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/ddd9143d064774e32d6364a99667817c6e8bfdc0.1644992610.git.riteshh@linux.ibm.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
|
This API will be needed at places where we don't have an inode
for e.g. while freeing blocks in ext4_group_add_blocks()
Suggested-by: Jan Kara <jack@suse.cz>
Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com>
Link: https://lore.kernel.org/r/dd34a236543ad5ae7123eeebe0cb69e6bdd44f34.1644992610.git.riteshh@linux.ibm.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
|
We don't need the return value of mb_test_and_clear_bits() in ext4_mb_mark_bb()
So simply use mb_clear_bits() instead.
Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/a971935306dafb124da0193c7dad1aa485210b62.1644992610.git.riteshh@linux.ibm.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
|
ext4_set_bits() should actually be mb_set_bits() for uniform API naming
convention.
This is via below cmd -
grep -nr "ext4_set_bits" fs/ext4/ | cut -d ":" -f 1 | xargs sed -i 's/ext4_set_bits/mb_set_bits/g'
Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/f1f6ece1405b76a7a987e9145d1adfaf71e30695.1644992610.git.riteshh@linux.ibm.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
|
Instead of open coding it, use in_range() function instead.
Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/8e5526ef14150778871ac7c937c8993c6a09cd3e.1644992610.git.riteshh@linux.ibm.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
|
ext4_free_blocks() function became too long and confusing, this patch
just pulls out the ext4_mb_clear_bb() function logic from it
which clears the block bitmap and frees it.
No functionality change in this patch
Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/22c30fbb26ba409cf8aa5f0c7912970272c459e8.1644992610.git.riteshh@linux.ibm.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
|
In case of flex_bg feature (which is by default enabled), extents for
any given inode might span across blocks from two different block group.
ext4_mb_mark_bb() only reads the buffer_head of block bitmap once for the
starting block group, but it fails to read it again when the extent length
boundary overflows to another block group. Then in this below loop it
accesses memory beyond the block group bitmap buffer_head and results
into a data abort.
for (i = 0; i < clen; i++)
if (!mb_test_bit(blkoff + i, bitmap_bh->b_data) == !state)
already++;
This patch adds this functionality for checking block group boundary in
ext4_mb_mark_bb() and update the buffer_head(bitmap_bh) for every different
block group.
w/o this patch, I was easily able to hit a data access abort using Power platform.
<...>
[ 74.327662] EXT4-fs error (device loop3): ext4_mb_generate_buddy:1141: group 11, block bitmap and bg descriptor inconsistent: 21248 vs 23294 free clusters
[ 74.533214] EXT4-fs (loop3): shut down requested (2)
[ 74.536705] Aborting journal on device loop3-8.
[ 74.702705] BUG: Unable to handle kernel data access on read at 0xc00000005e980000
[ 74.703727] Faulting instruction address: 0xc0000000007bffb8
cpu 0xd: Vector: 300 (Data Access) at [c000000015db7060]
pc: c0000000007bffb8: ext4_mb_mark_bb+0x198/0x5a0
lr: c0000000007bfeec: ext4_mb_mark_bb+0xcc/0x5a0
sp: c000000015db7300
msr: 800000000280b033
dar: c00000005e980000
dsisr: 40000000
current = 0xc000000027af6880
paca = 0xc00000003ffd5200 irqmask: 0x03 irq_happened: 0x01
pid = 5167, comm = mount
<...>
enter ? for help
[c000000015db7380] c000000000782708 ext4_ext_clear_bb+0x378/0x410
[c000000015db7400] c000000000813f14 ext4_fc_replay+0x1794/0x2000
[c000000015db7580] c000000000833f7c do_one_pass+0xe9c/0x12a0
[c000000015db7710] c000000000834504 jbd2_journal_recover+0x184/0x2d0
[c000000015db77c0] c000000000841398 jbd2_journal_load+0x188/0x4a0
[c000000015db7880] c000000000804de8 ext4_fill_super+0x2638/0x3e10
[c000000015db7a40] c0000000005f8404 get_tree_bdev+0x2b4/0x350
[c000000015db7ae0] c0000000007ef058 ext4_get_tree+0x28/0x40
[c000000015db7b00] c0000000005f6344 vfs_get_tree+0x44/0x100
[c000000015db7b70] c00000000063c408 path_mount+0xdd8/0xe70
[c000000015db7c40] c00000000063c8f0 sys_mount+0x450/0x550
[c000000015db7d50] c000000000035770 system_call_exception+0x4a0/0x4e0
[c000000015db7e10] c00000000000c74c system_call_common+0xec/0x250
Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/2609bc8f66fc15870616ee416a18a3d392a209c4.1644992609.git.riteshh@linux.ibm.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
|
ext4_mb_mark_bb() currently wrongly calculates cluster len (clen) and
flex_group->free_clusters. This patch fixes that.
Identified based on code review of ext4_mb_mark_bb() function.
Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/a0b035d536bafa88110b74456853774b64c8ac40.1644992609.git.riteshh@linux.ibm.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
|
CONFIG_JBD2_DEBUG and jbd2_journal_enable_debug knobs were added in
update_t_max_wait(), since earlier it used to take a spinlock for
updating t_max_wait, which could cause a bottleneck while starting a
txn (start_this_handle()).
Since in previous patch, we have killed t_handle_lock completely, we
could get rid of this debug config and knob to let t_max_wait be
updated by default again.
Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/ad7319a601fd501079310747ce87d908e0944763.1644992076.git.riteshh@linux.ibm.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
|
This patch kills t_handle_lock transaction spinlock completely from
jbd2.
To explain the reasoning, currently there were three sites at which
this spinlock was used.
1. jbd2_journal_wait_updates()
a. Based on careful code review it can be seen that, we don't need this
lock here. This is since we wait for any currently ongoing updates
based on a atomic variable t_updates. And we anyway don't take any
t_handle_lock while in stop_this_handle().
i.e.
write_lock(&journal->j_state_lock()
jbd2_journal_wait_updates() stop_this_handle()
while (atomic_read(txn->t_updates) { |
DEFINE_WAIT(wait); |
prepare_to_wait(); |
if (atomic_read(txn->t_updates) if (atomic_dec_and_test(txn->t_updates))
write_unlock(&journal->j_state_lock);
schedule(); wake_up()
write_lock(&journal->j_state_lock);
finish_wait();
}
txn->t_state = T_COMMIT
write_unlock(&journal->j_state_lock);
b. Also note that between atomic_inc(&txn->t_updates) in
start_this_handle() and jbd2_journal_wait_updates(), the
synchronization happens via read_lock(journal->j_state_lock) in
start_this_handle();
2. jbd2_journal_extend()
a. jbd2_journal_extend() is called with the handle of each process from
task_struct. So no lock required in updating member fields of handle_t
b. For member fields of h_transaction, all updates happens only via
atomic APIs (which is also within read_lock()).
So, no need of this transaction spinlock.
3. update_t_max_wait()
Based on Jan suggestion, this can be carefully removed using atomic
cmpxchg API.
Note that there can be several processes which are waiting for a new
transaction to be allocated and started. For doing this only one
process will succeed in taking write_lock() and allocating a new txn.
After that all of the process will be updating the t_max_wait (max
transaction wait time). This can be done via below method w/o taking
any locks using atomic cmpxchg.
For more details refer [1]
new = get_new_val();
old = READ_ONCE(ptr->max_val);
while (old < new)
old = cmpxchg(&ptr->max_val, old, new);
[1]: https://lwn.net/Articles/849237/
Suggested-by: Jan Kara <jack@suse.cz>
Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/d89e599658b4a1f3893a48c6feded200073037fc.1644992076.git.riteshh@linux.ibm.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
|
jbd2_journal_wait_updates() is called with j_state_lock held. But if
there is a commit in progress, then this transaction might get committed
and freed via jbd2_journal_commit_transaction() ->
jbd2_journal_free_transaction(), when we release j_state_lock.
So check for journal->j_running_transaction everytime we release and
acquire j_state_lock to avoid use-after-free issue.
Link: https://lore.kernel.org/r/948c2fed518ae739db6a8f7f83f1d58b504f87d0.1644497105.git.ritesh.list@gmail.com
Fixes: 4f98186848707f53 ("jbd2: refactor wait logic for transaction updates into a common function")
Cc: stable@kernel.org
Reported-and-tested-by: syzbot+afa2ca5171d93e44b348@syzkaller.appspotmail.com
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
|
After commit 6e47a3cc68fc ("ext4: get rid of super block and sbi from
handle_mount_ops()") the 'abort' options stopped working. This is
because we're using ctx_set_mount_flags() helper that's expecting an
argument with the appropriate bit set, but instead got
EXT4_MF_FS_ABORTED which is a bit position. ext4_set_mount_flag() is
using set_bit() while ctx_set_mount_flags() was using bitwise OR.
Create a separate helper ctx_set_mount_flag() to handle setting the
mount_flags correctly.
While we're at it clean up the EXT4_SET_CTX macros so that we're only
creating helpers that we actually use to avoid warnings.
Fixes: 6e47a3cc68fc ("ext4: get rid of super block and sbi from handle_mount_ops()")
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Cc: Ye Bin <yebin10@huawei.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Tested-by: Gabriel Krisman Bertazi <krisman@collabora.com>
Link: https://lore.kernel.org/r/20220201131345.77591-1-lczerner@redhat.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
|
To follow the existing per-arch conventions replace open-coded uses
of asm "sp" as "current_stack_pointer". This will let it be used in
non-arch places (like HARDENED_USERCOPY).
Cc: Chris Zankel <chris@zankel.net>
Cc: Marc Zyngier <maz@kernel.org>
Cc: linux-xtensa@linux-xtensa.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Acked-by: Max Filippov <jcmvbkbc@gmail.com>
Link: https://lore.kernel.org/lkml/CAMo8BfJFJE-n3=AF+pb9_6oF3gzxX7a+7aBrASHjjNX5byqDqw@mail.gmail.com
|
|
One of the things that CONFIG_HARDENED_USERCOPY sanity-checks is whether
an object that is about to be copied to/from userspace is overlapping
the stack at all. If it is, it performs a number of inexpensive
bounds checks. One of the finer-grained checks is whether an object
crosses stack frames within the stack region. Doing this on x86 with
CONFIG_FRAME_POINTER was cheap/easy. Doing it with ORC was deemed too
heavy, and was left out (a while ago), leaving the courser whole-stack
check.
The LKDTM tests USERCOPY_STACK_FRAME_TO and USERCOPY_STACK_FRAME_FROM
try to exercise these cross-frame cases to validate the defense is
working. They have been failing ever since ORC was added (which was
expected). While Muhammad was investigating various LKDTM failures[1],
he asked me for additional details on them, and I realized that when
exact stack frame boundary checking is not available (i.e. everything
except x86 with FRAME_POINTER), it could check if a stack object is at
least "current depth valid", in the sense that any object within the
stack region but not between start-of-stack and current_stack_pointer
should be considered unavailable (i.e. its lifetime is from a call no
longer present on the stack).
Introduce ARCH_HAS_CURRENT_STACK_POINTER to track which architectures
have actually implemented the common global register alias.
Additionally report usercopy bounds checking failures with an offset
from current_stack_pointer, which may assist with diagnosing failures.
The LKDTM USERCOPY_STACK_FRAME_TO and USERCOPY_STACK_FRAME_FROM tests
(once slightly adjusted in a separate patch) pass again with this fixed.
[1] https://github.com/kernelci/kernelci-project/issues/84
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: linux-mm@kvack.org
Reported-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
---
v1: https://lore.kernel.org/lkml/20220216201449.2087956-1-keescook@chromium.org
v2: https://lore.kernel.org/lkml/20220224060342.1855457-1-keescook@chromium.org
v3: https://lore.kernel.org/lkml/20220225173345.3358109-1-keescook@chromium.org
v4: - improve commit log (akpm)
|
|
When a trace instance creation fails, tools are printing:
Could not enable -> osnoiser <- tracer for tracing
Print the actual (and correct) name of the tracer it fails to enable.
Link: https://lkml.kernel.org/r/53ef0582605af91eca14b19dba9fc9febb95d4f9.1645206561.git.bristot@kernel.org
Fixes: b1696371d865 ("rtla: Helper functions for rtla")
Cc: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
|
|
The variable that stores the parsed command line arguments are not
being free()d at the rtla osnoise top exit path.
Free params variable before exiting.
Link: https://lkml.kernel.org/r/0be31d8259c7c53b98a39769d60cfeecd8421785.1645206561.git.bristot@kernel.org
Fixes: 1eceb2fc2ca5 ("rtla/osnoise: Add osnoise top mode")
Cc: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
|
|
Currently, --entries uses -e as the short version in the hist mode of
timerlat and osnoise tools. But as -e is already used to enable events
on trace sessions by other tools, thus let's keep it available for the
same usage for all rtla tools.
Make -E the short version of --entries for hist mode on all tools.
Note: rtla was merged in this merge window, so rtla was not released yet.
Link: https://lkml.kernel.org/r/5dbf0cbe7364d3a05e708926b41a097c59a02b1e.1645206561.git.bristot@kernel.org
Cc: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
|
|
CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS is required to test
direct tramp.
Link: https://lkml.kernel.org/r/bdc7e594e13b0891c1d61bc8d56c94b1890eaed7.1640017960.git.christophe.leroy@csgroup.eu
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
|
|
Al Viro brought it to my attention that the dentries may not be filled
when the parse_options() is called, causing the call to set_gid() to
possibly crash. It should only be called if parse_options() succeeds
totally anyway.
He suggested the logical place to do the update is in apply_options().
Link: https://lore.kernel.org/all/20220225165219.737025658@goodmis.org/
Link: https://lkml.kernel.org/r/20220225153426.1c4cab6b@gandalf.local.home
Cc: stable@vger.kernel.org
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Reported-by: Al Viro <viro@zeniv.linux.org.uk>
Fixes: 48b27b6b5191 ("tracefs: Set all files to the same group ownership as the mount option")
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can
Marc Kleine-Budde says:
====================
pull-request: can 2022-02-25
The first 2 patches are by Vincent Mailhol and fix the error handling
of the ndo_open callbacks of the etas_es58x and the gs_usb CAN USB
drivers.
The last patch is by Lad Prabhakar and fixes a small race condition in
the rcar_canfd's rcar_canfd_channel_probe() function.
* tag 'linux-can-fixes-for-5.17-20220225' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can:
can: rcar_canfd: rcar_canfd_channel_probe(): register the CAN device when fully ready
can: gs_usb: change active_channels's type from atomic_t to u8
can: etas_es58x: change opened_channel_cnt's type from atomic_t to u8
====================
Link: https://lore.kernel.org/r/20220225165622.3231809-1-mkl@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.infradead.org/users/hch/configfs
Pull configfs fix from Christoph Hellwig:
- fix a race in configfs_{,un}register_subsystem (ChenXiaoSong)
* tag 'configfs-5.17-2022-02-25' of git://git.infradead.org/users/hch/configfs:
configfs: fix a race in configfs_{,un}register_subsystem()
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs fixes from David Sterba:
"This is a hopefully last batch of fixes for defrag that got broken in
5.16, all stable material.
The remaining reported problem is excessive IO with autodefrag due to
various conditions in the defrag code not met or missing"
* tag 'for-5.17-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
btrfs: reduce extent threshold for autodefrag
btrfs: autodefrag: only scan one inode once
btrfs: defrag: don't use merged extent map for their generation check
btrfs: defrag: bring back the old file extent search behavior
btrfs: defrag: remove an ambiguous condition for rejection
btrfs: defrag: don't defrag extents which are already at max capacity
btrfs: defrag: don't try to merge regular extents with preallocated extents
btrfs: defrag: allow defrag_one_cluster() to skip large extent which is not a target
btrfs: prevent copying too big compressed lzo segment
|
|
Pull rdma fixes from Jason Gunthorpe:
- Older "does not even boot" regression in qib from July
- Bug fixes for error unwind in rtrs
- Avoid a deadlock syzkaller found in srp
- Fix another UAF syzkaller found in cma
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
RDMA/cma: Do not change route.addr.src_addr outside state checks
RDMA/ib_srp: Fix a deadlock
RDMA/rtrs-clt: Move free_permit from free_clt to rtrs_clt_close
RDMA/rtrs-clt: Fix possible double free in error case
IB/qib: Fix duplicate sysfs directory name
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux
Pull gpio fixes from Bartosz Golaszewski:
- fix an bug generating spurious interrupts in gpio-rockchip
- fix a race condition in gpiod_to_irq() called by GPIO consumers
* tag 'gpio-fixes-for-v5.17-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
gpio: Return EPROBE_DEFER if gc->to_irq is NULL
gpio: rockchip: Reset int_bothedge when changing trigger
|
|
If the state is not idle then resolve_prepare_src() should immediately
fail and no change to global state should happen. However, it
unconditionally overwrites the src_addr trying to build a temporary any
address.
For instance if the state is already RDMA_CM_LISTEN then this will corrupt
the src_addr and would cause the test in cma_cancel_operation():
if (cma_any_addr(cma_src_addr(id_priv)) && !id_priv->cma_dev)
Which would manifest as this trace from syzkaller:
BUG: KASAN: use-after-free in __list_add_valid+0x93/0xa0 lib/list_debug.c:26
Read of size 8 at addr ffff8881546491e0 by task syz-executor.1/32204
CPU: 1 PID: 32204 Comm: syz-executor.1 Not tainted 5.12.0-rc8-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:79 [inline]
dump_stack+0x141/0x1d7 lib/dump_stack.c:120
print_address_description.constprop.0.cold+0x5b/0x2f8 mm/kasan/report.c:232
__kasan_report mm/kasan/report.c:399 [inline]
kasan_report.cold+0x7c/0xd8 mm/kasan/report.c:416
__list_add_valid+0x93/0xa0 lib/list_debug.c:26
__list_add include/linux/list.h:67 [inline]
list_add_tail include/linux/list.h:100 [inline]
cma_listen_on_all drivers/infiniband/core/cma.c:2557 [inline]
rdma_listen+0x787/0xe00 drivers/infiniband/core/cma.c:3751
ucma_listen+0x16a/0x210 drivers/infiniband/core/ucma.c:1102
ucma_write+0x259/0x350 drivers/infiniband/core/ucma.c:1732
vfs_write+0x28e/0xa30 fs/read_write.c:603
ksys_write+0x1ee/0x250 fs/read_write.c:658
do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
entry_SYSCALL_64_after_hwframe+0x44/0xae
This is indicating that an rdma_id_private was destroyed without doing
cma_cancel_listens().
Instead of trying to re-use the src_addr memory to indirectly create an
any address derived from the dst build one explicitly on the stack and
bind to that as any other normal flow would do. rdma_bind_addr() will copy
it over the src_addr once it knows the state is valid.
This is similar to commit bc0bdc5afaa7 ("RDMA/cma: Do not change
route.addr.src_addr.ss_family")
Link: https://lore.kernel.org/r/0-v2-e975c8fd9ef2+11e-syz_cma_srcaddr_jgg@nvidia.com
Cc: stable@vger.kernel.org
Fixes: 732d41c545bb ("RDMA/cma: Make the locking for automatic state transition more clear")
Reported-by: syzbot+c94a3675a626f6333d74@syzkaller.appspotmail.com
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi
Pull spi fixes from Mark Brown:
"A few small driver specific fixes"
* tag 'spi-fix-v5.17-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
spi: rockchip: terminate dma transmission when slave abort
spi: rockchip: Fix error in getting num-cs property
spi: spi-zynq-qspi: Fix a NULL pointer dereference in zynq_qspi_exec_mem_op()
|
|
These ioctls are equivalent to fcntl(fd, F_SETFD, flags), which SELinux
always allows too. Furthermore, a failed FIOCLEX could result in a file
descriptor being leaked to a process that should not have access to it.
As this patch removes access controls, a policy capability needs to be
enabled in policy to always allow these ioctls.
Based-on-patch-by: Demi Marie Obenour <demiobenour@gmail.com>
Signed-off-by: Richard Haines <richard_c_haines@btinternet.com>
[PM: subject line tweak]
Signed-off-by: Paul Moore <paul@paul-moore.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator
Pull regulator fixes from Mark Brown:
"A series of fixes for the da9121 driver"
* tag 'regulator-fix-v5.17-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
regulator: da9121: Remove surplus DA9141 parameters
regulator: da9121: Fix DA914x voltage value
regulator: da9121: Fix DA914x current values
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap
Pull regmap fix from Mark Brown:
"A fix for interrupt controllers which require the explicit
acknowledgement of interrupts using a different register to the one
where interrupts are reported.
Urgent for the few devices this affects"
* tag 'regmap-fix-v5.17-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap:
regmap-irq: Update interrupt clear register for proper reset
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull thermal control fix from Rafael Wysocki:
"Fix a memory leak in the int340x thermal driver's ACPI notify handler
(Chuansheng Liu)"
* tag 'thermal-5.17-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
thermal: int340x: fix memory leak in int3400_notify()
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management fixes from Rafael Wysocki:
"Fix the throttle IRQ handling during cpufreq initialization on
Qualcomm platforms (Bjorn Andersson)"
* tag 'pm-5.17-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
cpufreq: qcom-hw: Delay enabling throttle_irq
cpufreq: Reintroduce ready() callback
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char/misc driver fixes from Greg KH:
"Here are a few small driver fixes for 5.17-rc6 for reported issues.
The majority of these are IIO fixes for small things, and the other
two are a mvmem and mtd core conflict fix.
All of these have been in linux-next with no reported issues"
* tag 'char-misc-5.17-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
mtd: core: Fix a conflict between MTD and NVMEM on wp-gpios property
nvmem: core: Fix a conflict between MTD and NVMEM on wp-gpios property
iio: imu: st_lsm6dsx: wait for settling time in st_lsm6dsx_read_oneshot
iio: Fix error handling for PM
iio: addac: ad74413r: correct comparator gpio getters mask usage
iio: addac: ad74413r: use ngpio size when iterating over mask
iio: addac: ad74413r: Do not reference negative array offsets
iio: adc: men_z188_adc: Fix a resource leak in an error handling path
iio: frequency: admv1013: remove the always true condition
iio: accel: fxls8962af: add padding to regmap for SPI
iio:imu:adis16480: fix buffering for devices with no burst mode
iio: adc: ad7124: fix mask used for setting AIN_BUFP & AIN_BUFM bits
iio: adc: tsc2046: fix memory corruption by preventing array overflow
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core fix from Greg KH:
"Here is a single driver core fix for 5.17-rc6. It resolves a reported
problem when the DMA map of a device is not properly released.
It has been in linux-next with no reported problems"
* tag 'driver-core-5.17-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
driver core: Free DMA range map when device is released
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging
Pull staging driver fix from Greg KH:
"Here is a single staging driver fix for 5.17-rc6.
It resolves a reported problem in the fbtft fb_st7789v.c driver that
could cause the display to be flipped in cold weather.
It has been in linux-next with no reported problems"
* tag 'staging-5.17-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
staging: fbtft: fb_st7789v: reset display before initialization
|
|
For some reason, the Microsoft Surface Go 3 uses the standard ACPI
interface for battery information, but does not use the standard PNP0C0A
HID. Instead it uses MSHW0146 as identifier. Add that ID to the driver
as this seems to work well.
Additionally, the power state is not updated immediately after the AC
has been (un-)plugged, so add the respective quirk for that.
Signed-off-by: Maximilian Luz <luzmaximilian@gmail.com>
Cc: All applicable <stable@vger.kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Add documentation for the newly added attributes:
fine_grain_control
fan_speed_rpm
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Add additional attributes, which helps in implementing algorithm in
the user space to optimize fan control. These attributes are presented
in the same directory as the existing performance state attributes.
Additional attributes:
1. Support of fine grain control
Publish support of presence of fine grain control so that fan speed
can be tuned correctly. This attribute is called "fine_grain_control".
2. fan speed
Publish the actual fan rpm in sysfs. Knowing fan rpm is helpful to
reduce noise level and use passive control instead. Also fan performance
may not be same over time, so the same control value may not be enough
to run the fan at a speed. So a feedback value of speed is helpful. This
sysfs attribute is called "fan_speed_rpm".
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
When _FIF object specifies support for fine grain control, then fan speed
can be set from 0 to 100% with the recommended minimum "step size" via
_FSL object. Here the control value doesn't need to match any value from
_FPS object.
Currently we have a simple solution implemented which just pick maximum
control value from _FPS to display the actual state, but this is not
optimal when there is a big window between two control values in
_FPS. Also there is no way to set to any speed which doesn't match
control values in _FPS. The system firmware can start the fan at speed
which doesn't match any control value.
To support fine grain control (when supported) via thermal sysfs:
- cooling device max state is not _FPS state count but it will be
100 / _FIF.step_size
Step size can be from 1 to 9.
- cooling device current state is _FST.control / _FIF.step_size
- cooling device set state will set the control value
cdev.curr_state * _FIF.step_size plus any adjustment for 100%.
By the spec, when control value do not sum to 100% because of
_FIF.step_size, OSPM may select an appropriate ending Level increment
to reach 100%.
There is no rounding during calculation. For example if step size
is 6:
thermal sysfs cooling device max_state = 100/6 = 16
So user can set any value from 0-16.
If the system boots with a _FST.control which is not multiples
of step_size, the thermal sysfs cur_state will be based on the
range. For example for step size = 6:
_FST.control thermal sysfs cur_state
------------------------------------------------
0-5 0
6-11 1
..
..
90-95 15
96-100 16
While setting the _FST.control, the compensation will be at
the last step for cur_state = 16, which will set the _FST.control
to 100.
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
We don't need u64 to store the information about _FIF. There are two
booleans (fine_grain_ctrl and low_speed_notification) and one field
step_size which can take value from 1-9. There are no internal users
of revision field. So convert all fields to u8, by not directly
extracting the _FIF info the struct. Use an intermediate buffer to
extract and assign.
This will help to do u32 math using these fields. No functional
changes are expected.
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Move the functionality of creation of sysfs attributes under acpi device
to a new file fan_attr.c. This cleans up the core fan code, which just
use thermal sysfs interface. The original fan.c is renamed to
fan_core.c.
No functional changes are expected.
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
When user get/set cur_state fails, it should be some negative error
value instead of whatever returned by acpi_evaluate_object() or from
acpi_execute_simple_method(). The return value from these apis is
some positive values greater than 0. For example if AE_NOT_FOUND
is returned it will be "5".
In other ACPI drivers, -ENODEV is returned when ACPI_FAILURE(status)
is true. Do the same thing here for thermal sysfs callbacks for
get and set for failures.
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
Pull tty/serial driver fixes from Greg KH:
"Here are some small n_gsm and sc16is7xx serial driver fixes for
5.17-rc6.
The n_gsm fixes are from Siemens as it seems they are using the line
discipline and fixing up a number of issues they found in their
testing. The sc16is7xx serial driver fix is for a reported problem
with that chip.
All of these have been in linux-next with no reported problems"
* tag 'tty-5.17-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
sc16is7xx: Fix for incorrect data being transmitted
tty: n_gsm: fix deadlock in gsmtty_open()
tty: n_gsm: fix wrong modem processing in convergence layer type 2
tty: n_gsm: fix wrong tty control line for flow control
tty: n_gsm: fix NULL pointer access due to DLCI release
tty: n_gsm: fix proper link termination after failed open
tty: n_gsm: fix encoding of command/response bit
tty: n_gsm: fix encoding of control signal octet bit DV
|
|
The setup of __IAVF_RESETTING state in watchdog task had no
effect and could lead to slow resets in the driver as
the task for __IAVF_RESETTING state only requeues watchdog.
Till now the __IAVF_RESETTING was interpreted by reset task
as running state which could lead to errors with allocating
and resources disposal.
Make watchdog_task queue the reset task when it's necessary.
Do not update the state to __IAVF_RESETTING so the reset task
knows exactly what is the current state of the adapter.
Fixes: 898ef1cb1cb2 ("iavf: Combine init and watchdog state machines")
Signed-off-by: Slawomir Laba <slawomirx.laba@intel.com>
Signed-off-by: Phani Burra <phani.r.burra@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
The driver was queueing reset_task regardless of the netdev
state.
Do not queue the reset task in iavf_change_mtu if netdev
is not running.
Fixes: fdd4044ffdc8 ("iavf: Remove timer for work triggering, use delaying work instead")
Signed-off-by: Slawomir Laba <slawomirx.laba@intel.com>
Signed-off-by: Phani Burra <phani.r.burra@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
There exists a missing mutex_unlock call on crit_lock in
iavf_reset_task call path.
Unlock the crit_lock before returning from reset task.
Fixes: 5ac49f3c2702 ("iavf: use mutexes for locking of critical sections")
Signed-off-by: Slawomir Laba <slawomirx.laba@intel.com>
Signed-off-by: Phani Burra <phani.r.burra@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
When iavf_init_version_check sends VIRTCHNL_OP_GET_VF_RESOURCES
message, the driver will wait for the response after requeueing
the watchdog task in iavf_init_get_resources call stack. The
logic is implemented this way that iavf_init_get_resources has
to be called in order to allocate adapter->vf_res. It is polling
for the AQ response in iavf_get_vf_config function. Expect a
call trace from kernel when adminq_task worker handles this
message first. adapter->vf_res will be NULL in
iavf_virtchnl_completion.
Make the watchdog task not queue the adminq_task if the init
process is not finished yet.
Fixes: 898ef1cb1cb2 ("iavf: Combine init and watchdog state machines")
Signed-off-by: Slawomir Laba <slawomirx.laba@intel.com>
Signed-off-by: Phani Burra <phani.r.burra@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
iavf_virtchnl_completion is called under crit_lock but when
the code for VIRTCHNL_OP_GET_OFFLOAD_VLAN_V2_CAPS is called,
this lock is released in order to obtain rtnl_lock to avoid
ABBA deadlock with unregister_netdev.
Along with the new way iavf_remove behaves, there exist
many risks related to the lock release and attmepts to regrab
it. The driver faces crashes related to races between
unregister_netdev and netdev_update_features. Yet another
risk is that the driver could already obtain the crit_lock
in order to destroy it and iavf_virtchnl_completion could
crash or block forever.
Make iavf_virtchnl_completion never relock crit_lock in it's
call paths.
Extract rtnl_lock locking logic to the driver for
unregister_netdev in order to set the netdev_registered flag
inside the lock.
Introduce a new flag that will inform adminq_task to perform
the code from VIRTCHNL_OP_GET_OFFLOAD_VLAN_V2_CAPS right after
it finishes processing messages. Guard this code with remove
flags so it's never called when the driver is in remove state.
Fixes: 5951a2b9812d ("iavf: Fix VLAN feature flags after VFR")
Signed-off-by: Slawomir Laba <slawomirx.laba@intel.com>
Signed-off-by: Phani Burra <phani.r.burra@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
When init states of the adapter work, the errors like lack
of communication with the PF might hop in. If such events
occur the driver restores previous states in order to retry
initialization in a proper way. When remove task kicks in,
this situation could lead to races with unregistering the
netdevice as well as resources cleanup. With the commit
introducing the waiting in remove for init to complete,
this problem turns into an endless waiting if init never
recovers from errors.
Introduce __IAVF_IN_REMOVE_TASK bit to indicate that the
remove thread has started.
Make __IAVF_COMM_FAILED adapter state respect the
__IAVF_IN_REMOVE_TASK bit and set the __IAVF_INIT_FAILED
state and return without any action instead of trying to
recover.
Make __IAVF_INIT_FAILED adapter state respect the
__IAVF_IN_REMOVE_TASK bit and return without any further
actions.
Make the loop in the remove handler break when adapter has
__IAVF_INIT_FAILED state set.
Fixes: 898ef1cb1cb2 ("iavf: Combine init and watchdog state machines")
Signed-off-by: Slawomir Laba <slawomirx.laba@intel.com>
Signed-off-by: Phani Burra <phani.r.burra@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
There exist races when port is being configured and remove is
triggered.
unregister_netdev is not and can't be called under crit_lock
mutex since it is calling ndo_stop -> iavf_close which requires
this lock. Depending on init state the netdev could be still
unregistered so unregister_netdev never cleans up, when shortly
after that the device could become registered.
Make iavf_remove wait until port finishes initialization.
All critical state changes are atomic (under crit_lock).
Crashes that come from iavf_reset_interrupt_capability and
iavf_free_traffic_irqs should now be solved in a graceful
manner.
Fixes: 605ca7c5c6707 ("iavf: Fix kernel BUG in free_msi_irqs")
Signed-off-by: Slawomir Laba <slawomirx.laba@intel.com>
Signed-off-by: Phani Burra <phani.r.burra@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|