Age | Commit message (Collapse) | Author |
|
git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-queue
Jeff Kirsher says:
====================
Intel Wired LAN Driver Updates 2020-06-16
This series contains fixes to e1000 and e1000e.
Chen fixes an e1000e issue where systems could be waken via WoL, even
though the user has disabled the wakeup bit via sysfs.
Vaibhav Gupta updates the e1000 driver to clean up the legacy Power
Management hooks.
Arnd Bergmann cleans up the inconsistent use CONFIG_PM_SLEEP
preprocessor tags, which also resolves the compiler warnings about the
possibility of unused structure.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Verify that setns() reports EINVAL when an fd is passed that refers to an
open file but the file is not a file descriptor useable to interact with
namespaces.
Cc: Jan Stancek <jstancek@redhat.com>
Cc: Cyril Hrubis <chrubis@suse.cz>
Link: https://lore.kernel.org/lkml/20200615085836.GR12456@shao2-debian
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
|
|
The CONFIG_PM_SLEEP #ifdef checks in this file are inconsistent,
leading to a warning about sometimes unused function:
drivers/net/ethernet/intel/e1000e/netdev.c:137:13: error: unused function 'e1000e_check_me' [-Werror,-Wunused-function]
Rather than adding more #ifdefs, just remove them completely
and mark the PM functions as __maybe_unused to let the compiler
work it out on it own.
Fixes: e086ba2fccda ("e1000e: disable s0ix entry and exit flows for ME systems")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
|
With legacy PM hooks, it was the responsibility of a driver to manage PCI
states and also the device's power state. The generic approach is to let PCI
core handle the work.
e1000_suspend() calls __e1000_shutdown() to perform intermediate tasks.
__e1000_shutdown() modifies the value of "wake" (device should be wakeup
enabled or not), responsible for controlling the flow of legacy PM.
Since, PCI core has no idea about the value of "wake", new code for generic
PM may produce unexpected results. Thus, use "device_set_wakeup_enable()"
to wakeup-enable the device accordingly.
Signed-off-by: Vaibhav Gupta <vaibhavgupta40@gmail.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
|
Currently the system will be woken up via WOL(Wake On LAN) even if the
device wakeup ability has been disabled via sysfs:
cat /sys/devices/pci0000:00/0000:00:1f.6/power/wakeup
disabled
The system should not be woken up if the user has explicitly
disabled the wake up ability for this device.
This patch clears the WOL ability of this network device if the
user has disabled the wake up ability in sysfs.
Fixes: bc7f75fa9788 ("[E1000E]: New pci-express e1000 driver")
Reported-by: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: <Stable@vger.kernel.org>
Signed-off-by: Chen Yu <yu.c.chen@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
|
The LTP testsuite reported a regression where users would now see EBADF
returned instead of EINVAL when an fd was passed that referred to an open
file but the file was not a nsfd. Fix this by continuing to report EINVAL.
Reported-by: kernel test robot <rong.a.chen@intel.com>
Cc: Jan Stancek <jstancek@redhat.com>
Cc: Cyril Hrubis <chrubis@suse.cz>
Link: https://lore.kernel.org/lkml/20200615085836.GR12456@shao2-debian
Fixes: 303cc571d107 ("nsproxy: attach to namespaces via pidfds")
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
|
|
Without a MODULE_DEVICE_TABLE the attributes are missing that create
an alias for auto-loading the module in userspace via hotplug.
Signed-off-by: Tim Harvey <tharvey@gateworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Fix AFS's silly rename by the following means:
(1) Set the destination directory in afs_do_silly_rename() so as to avoid
misbehaviour and indicate that the directory data version will
increment by 1 so as to avoid warnings about unexpected changes in the
DV. Also indicate that the ctime should be updated to avoid xfstest
grumbling.
(2) Note when the server indicates that a directory changed more than we
expected (AFS_OPERATION_DIR_CONFLICT), indicating a conflict with a
third party change, checking on successful completion of unlink and
rename.
The problem is that the FS.RemoveFile RPC op doesn't report the status
of the unlinked file, though YFS.RemoveFile2 does. This can be
mitigated by the assumption that if the directory DV cranked by
exactly 1, we can be sure we removed one link from the file; further,
ordinarily in AFS, files cannot be hardlinked across directories, so
if we reduce nlink to 0, the file is deleted.
However, if the directory DV jumps by more than 1, we cannot know if a
third party intervened by adding or removing a link on the file we
just removed a link from.
The same also goes for any vnode that is at the destination of the
FS.Rename RPC op.
(3) Make afs_vnode_commit_status() apply the nlink drop inside the cb_lock
section along with the other attribute updates if ->op_unlinked is set
on the descriptor for the appropriate vnode.
(4) Issue a follow up status fetch to the unlinked file in the event of a
third party conflict that makes it impossible for us to know if we
actually deleted the file or not.
(5) Provide a flag, AFS_VNODE_SILLY_DELETED, to make afs_getattr() lie to
the user about the nlink of a silly deleted file so that it appears as
0, not 1.
Found with the generic/035 and generic/084 xfstests.
Fixes: e49c7b2f6de7 ("afs: Build an abstraction around an "operation" concept")
Reported-by: Marc Dionne <marc.dionne@auristor.com>
Signed-off-by: David Howells <dhowells@redhat.com>
|
|
The port's headroom buffers are used to store packets while they
traverse the device's pipeline and also to store packets that are egress
mirrored.
On Spectrum-3, ports with eight lanes use two headroom buffers between
which the configured headroom size is split.
In order to prevent packet loss, multiply the calculated headroom size
by two for 8x ports.
Fixes: da382875c616 ("mlxsw: spectrum: Extend to support Spectrum-3 ASIC")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Code to initialize the conf structure while gathering the configuration
of the device was missing.
Fixes: 571912c69f0e ("net: UDP tunnel encapsulation module for tunnelling different protocols like MPLS, IP, NSH etc.")
Signed-off-by: Martin <martin.varghese@nokia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The remove function does not destroy all
BM Pools when per cpu pool is active.
When reloading the mvpp2 as a module the BM Pools
are still active in hardware and due to the bug
have twice the size now old + new.
This eventually leads to a kernel crash.
v2:
* add Fixes tag
Fixes: 7d04b0b13b11 ("mvpp2: percpu buffers")
Signed-off-by: Sven Auhagen <sven.auhagen@voleatech.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Back in 2013, we made a change that broke fast retransmit
for non SACK flows.
Indeed, for these flows, a sender needs to receive three duplicate
ACK before starting fast retransmit. Sending ACK with different
receive window do not count.
Even if enabling SACK is strongly recommended these days,
there still are some cases where it has to be disabled.
Not increasing the window seems better than having to
rely on RTO.
After the fix, following packetdrill test gives :
// Initialize connection
0 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
+0 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
+0 bind(3, ..., ...) = 0
+0 listen(3, 1) = 0
+0 < S 0:0(0) win 32792 <mss 1000,nop,wscale 7>
+0 > S. 0:0(0) ack 1 <mss 1460,nop,wscale 8>
+0 < . 1:1(0) ack 1 win 514
+0 accept(3, ..., ...) = 4
+0 < . 1:1001(1000) ack 1 win 514
// Quick ack
+0 > . 1:1(0) ack 1001 win 264
+0 < . 2001:3001(1000) ack 1 win 514
// DUPACK : Normally we should not change the window
+0 > . 1:1(0) ack 1001 win 264
+0 < . 3001:4001(1000) ack 1 win 514
// DUPACK : Normally we should not change the window
+0 > . 1:1(0) ack 1001 win 264
+0 < . 4001:5001(1000) ack 1 win 514
// DUPACK : Normally we should not change the window
+0 > . 1:1(0) ack 1001 win 264
+0 < . 1001:2001(1000) ack 1 win 514
// Hole is repaired.
+0 > . 1:1(0) ack 5001 win 272
Fixes: 4e4f1fc22681 ("tcp: properly increase rcv_ssthresh for ofo packets")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Venkat Venkatsubra <venkat.x.venkatsubra@oracle.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Unable to handle kernel NULL pointer dereference at virtual address 00000918
pgd = (ptrval)
[00000918] *pgd=00000000
Internal error: Oops: 5 [#1] SMP ARM
Modules linked in:
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.7.0-15001-gfa384b50b96b-dirty #514
Hardware name: ST-Ericsson Ux5x0 platform (Device Tree Support)
PC is at mcde_display_enable+0x78/0x7c0
LR is at mcde_display_enable+0x78/0x7c0
Fix this by using to_mcde() as in other functions.
Fixes: fd7ee85cfe7b ("drm/mcde: Don't use drm_device->dev_private")
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Sam Ravnborg <sam@ravnborg.org>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20200613223027.4189309-2-linus.walleij@linaro.org
|
|
The following bug appeared in the MCDE driver/display
initialization during the recent merge window.
First the place we call drm_fbdev_generic_setup() in the
wrong place: this needs to be called AFTER calling
drm_dev_register() else we get this splat:
------------[ cut here ]------------
WARNING: CPU: 0 PID: 1 at ../drivers/gpu/drm/drm_fb_helper.c:2198 drm_fbdev_generic_setup+0x164/0x1a8
mcde a0350000.mcde: Device has not been registered.
Modules linked in:
Hardware name: ST-Ericsson Ux5x0 platform (Device Tree Support)
[<c010e704>] (unwind_backtrace) from [<c010a86c>] (show_stack+0x10/0x14)
[<c010a86c>] (show_stack) from [<c0414f38>] (dump_stack+0x9c/0xb0)
[<c0414f38>] (dump_stack) from [<c0121c8c>] (__warn+0xb8/0xd0)
[<c0121c8c>] (__warn) from [<c0121d18>] (warn_slowpath_fmt+0x74/0xb8)
[<c0121d18>] (warn_slowpath_fmt) from [<c04b154c>] (drm_fbdev_generic_setup+0x164/0x1a8)
[<c04b154c>] (drm_fbdev_generic_setup) from [<c04ed278>] (mcde_drm_bind+0xc4/0x160)
[<c04ed278>] (mcde_drm_bind) from [<c04f06b8>] (try_to_bring_up_master+0x15c/0x1a4)
(...)
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Sam Ravnborg <sam@ravnborg.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20200613223027.4189309-1-linus.walleij@linaro.org
|
|
The "ie_len" variable is in the 0-255 range and it comes from the
network. If it's over NDIS_802_11_LENGTH_RATES_EX (16) then that will
lead to memory corruption.
Fixes: 554c0a3abf21 ("staging: Add rtl8723bs sdio wifi driver")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Cc: stable <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20200603101958.GA1845750@mwanda
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
The function hif_scan() return the timeout for the completion of the
scan request. It is the only function from hif_tx.c that return another
thing than just an error code. This behavior is not coherent with the
rest of file. Worse, if value returned is positive, the caller can't
make say if it is a timeout or the value returned by the hardware.
Uniformize API with other HIF functions, only return the error code and
pass timeout with parameters.
Signed-off-by: Jérôme Pouiller <jerome.pouiller@silabs.com>
Link: https://lore.kernel.org/r/20200529121256.1045521-1-Jerome.Pouiller@silabs.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
It is guarantee that the loop will stop at first iteration. So drop the
loop.
Fixes: 6bf418c50f98a ("staging: wfx: change the way to choose frame to send")
Signed-off-by: Jérôme Pouiller <jerome.pouiller@silabs.com>
Link: https://lore.kernel.org/r/20200529121603.1050891-2-Jerome.Pouiller@silabs.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
In order to work properly all the queues of the device must be filled (the
device chooses itself the queue to use depending of AC parameters and
other things). It is the job of wfx_tx_queues_get_skb() to choose which
queue must be filled. However, the sorting algorithm was inverted, so it
prioritized the already filled queue! Consequently, the AC priorities was
badly broken.
Fixes: 6bf418c50f98a ("staging: wfx: change the way to choose frame to send")
Signed-off-by: Jérôme Pouiller <jerome.pouiller@silabs.com>
Link: https://lore.kernel.org/r/20200529121603.1050891-1-Jerome.Pouiller@silabs.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd
Pull MFD fix from Lee Jones:
"Fix NULL pointer dereference in mt6360 driver"
* tag 'mfd-fixes-5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd:
mfd: mt6360: Fix register driver NULL pointer by adding driver name
|
|
When I squashed the 'allnoconfig' compiler warning about the
set_sve_default_vl() function being defined but not used in commit
1e570f512cbd ("arm64/sve: Eliminate data races on sve_default_vl"), I
accidentally broke the build for configs where ARM64_SVE is enabled, but
SYSCTL is not.
Fix this by only compiling the SVE sysctl support if both CONFIG_SVE=y
and CONFIG_SYSCTL=y.
Cc: Dave Martin <Dave.Martin@arm.com>
Reported-by: Qian Cai <cai@lca.pw>
Link: https://lore.kernel.org/r/20200616131808.GA1040@lca.pw
Signed-off-by: Will Deacon <will@kernel.org>
|
|
In btrfs_ioctl_get_subvol_info(), there is a classic case where kzalloc()
was incorrectly paired with kzfree(). According to David Sterba, there
isn't any sensitive information in the subvol_info that needs to be
cleared before freeing. So kzfree() isn't really needed, use kfree()
instead.
Signed-off-by: Waiman Long <longman@redhat.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
A RWF_NOWAIT write is not supposed to wait on filesystem locks that can be
held for a long time or for ongoing IO to complete.
However when calling check_can_nocow(), if the inode has prealloc extents
or has the NOCOW flag set, we can block on extent (file range) locks
through the call to btrfs_lock_and_flush_ordered_range(). Such lock can
take a significant amount of time to be available. For example, a fiemap
task may be running, and iterating through the entire file range checking
all extents and doing backref walking to determine if they are shared,
or a readpage operation may be in progress.
Also at btrfs_lock_and_flush_ordered_range(), called by check_can_nocow(),
after locking the file range we wait for any existing ordered extent that
is in progress to complete. Another operation that can take a significant
amount of time and defeat the purpose of RWF_NOWAIT.
So fix this by trying to lock the file range and if it's currently locked
return -EAGAIN to user space. If we are able to lock the file range without
waiting and there is an ordered extent in the range, return -EAGAIN as
well, instead of waiting for it to complete. Finally, don't bother trying
to lock the snapshot lock of the root when attempting a RWF_NOWAIT write,
as that is only important for buffered writes.
Fixes: edf064e7c6fec3 ("btrfs: nowait aio support")
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
If we attempt to do a RWF_NOWAIT write against a file range for which we
can only do NOCOW for a part of it, due to the existence of holes or
shared extents for example, we proceed with the write as if it were
possible to NOCOW the whole range.
Example:
$ mkfs.btrfs -f /dev/sdb
$ mount /dev/sdb /mnt
$ touch /mnt/sdj/bar
$ chattr +C /mnt/sdj/bar
$ xfs_io -d -c "pwrite -S 0xab -b 256K 0 256K" /mnt/bar
wrote 262144/262144 bytes at offset 0
256 KiB, 1 ops; 0.0003 sec (694.444 MiB/sec and 2777.7778 ops/sec)
$ xfs_io -c "fpunch 64K 64K" /mnt/bar
$ sync
$ xfs_io -d -c "pwrite -N -V 1 -b 128K -S 0xfe 0 128K" /mnt/bar
wrote 131072/131072 bytes at offset 0
128 KiB, 1 ops; 0.0007 sec (160.051 MiB/sec and 1280.4097 ops/sec)
This last write should fail with -EAGAIN since the file range from 64K to
128K is a hole. On xfs it fails, as expected, but on ext4 it currently
succeeds because apparently it is expensive to check if there are extents
allocated for the whole range, but I'll check with the ext4 people.
Fix the issue by checking if check_can_nocow() returns a number of
NOCOW'able bytes smaller then the requested number of bytes, and if it
does return -EAGAIN.
Fixes: edf064e7c6fec3 ("btrfs: nowait aio support")
CC: stable@vger.kernel.org # 4.14+
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
If we attempt to write to prealloc extent located after eof using a
RWF_NOWAIT write, we always fail with -EAGAIN.
We do actually check if we have an allocated extent for the write at
the start of btrfs_file_write_iter() through a call to check_can_nocow(),
but later when we go into the actual direct IO write path we simply
return -EAGAIN if the write starts at or beyond EOF.
Trivial to reproduce:
$ mkfs.btrfs -f /dev/sdb
$ mount /dev/sdb /mnt
$ touch /mnt/foo
$ chattr +C /mnt/foo
$ xfs_io -d -c "pwrite -S 0xab 0 64K" /mnt/foo
wrote 65536/65536 bytes at offset 0
64 KiB, 16 ops; 0.0004 sec (135.575 MiB/sec and 34707.1584 ops/sec)
$ xfs_io -c "falloc -k 64K 1M" /mnt/foo
$ xfs_io -d -c "pwrite -N -V 1 -S 0xfe -b 64K 64K 64K" /mnt/foo
pwrite: Resource temporarily unavailable
On xfs and ext4 the write succeeds, as expected.
Fix this by removing the wrong check at btrfs_direct_IO().
Fixes: edf064e7c6fec3 ("btrfs: nowait aio support")
CC: stable@vger.kernel.org # 4.14+
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
If we do a successful RWF_NOWAIT write we end up locking the snapshot lock
of the inode, through a call to check_can_nocow(), but we never unlock it.
This means the next attempt to create a snapshot on the subvolume will
hang forever.
Trivial reproducer:
$ mkfs.btrfs -f /dev/sdb
$ mount /dev/sdb /mnt
$ touch /mnt/foobar
$ chattr +C /mnt/foobar
$ xfs_io -d -c "pwrite -S 0xab 0 64K" /mnt/foobar
$ xfs_io -d -c "pwrite -N -V 1 -S 0xfe 0 64K" /mnt/foobar
$ btrfs subvolume snapshot -r /mnt /mnt/snap
--> hangs
Fix this by unlocking the snapshot lock if check_can_nocow() returned
success.
Fixes: edf064e7c6fec3 ("btrfs: nowait aio support")
CC: stable@vger.kernel.org # 4.14+
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
This brings back an optimization that commit e678934cbe5f02 ("btrfs:
Remove unnecessary check from join_running_log_trans") removed, but in
a different form. So it's almost equivalent to a revert.
That commit removed an optimization where we avoid locking a root's
log_mutex when there is no log tree created in the current transaction.
The affected code path is triggered through unlink operations.
That commit was based on the assumption that the optimization was not
necessary because we used to have the following checks when the patch
was authored:
int btrfs_del_dir_entries_in_log(...)
{
(...)
if (dir->logged_trans < trans->transid)
return 0;
ret = join_running_log_trans(root);
(...)
}
int btrfs_del_inode_ref_in_log(...)
{
(...)
if (inode->logged_trans < trans->transid)
return 0;
ret = join_running_log_trans(root);
(...)
}
However before that patch was merged, another patch was merged first which
replaced those checks because they were buggy.
That other patch corresponds to commit 803f0f64d17769 ("Btrfs: fix fsync
not persisting dentry deletions due to inode evictions"). The assumption
that if the logged_trans field of an inode had a smaller value then the
current transaction's generation (transid) meant that the inode was not
logged in the current transaction was only correct if the inode was not
evicted and reloaded in the current transaction. So the corresponding bug
fix changed those checks and replaced them with the following helper
function:
static bool inode_logged(struct btrfs_trans_handle *trans,
struct btrfs_inode *inode)
{
if (inode->logged_trans == trans->transid)
return true;
if (inode->last_trans == trans->transid &&
test_bit(BTRFS_INODE_NEEDS_FULL_SYNC, &inode->runtime_flags) &&
!test_bit(BTRFS_FS_LOG_RECOVERING, &trans->fs_info->flags))
return true;
return false;
}
So if we have a subvolume without a log tree in the current transaction
(because we had no fsyncs), every time we unlink an inode we can end up
trying to lock the log_mutex of the root through join_running_log_trans()
twice, once for the inode being unlinked (by btrfs_del_inode_ref_in_log())
and once for the parent directory (with btrfs_del_dir_entries_in_log()).
This means if we have several unlink operations happening in parallel for
inodes in the same subvolume, and the those inodes and/or their parent
inode were changed in the current transaction, we end up having a lot of
contention on the log_mutex.
The test robots from intel reported a -30.7% performance regression for
a REAIM test after commit e678934cbe5f02 ("btrfs: Remove unnecessary check
from join_running_log_trans").
So just bring back the optimization to join_running_log_trans() where we
check first if a log root exists before trying to lock the log_mutex. This
is done by checking for a bit that is set on the root when a log tree is
created and removed when a log tree is freed (at transaction commit time).
Commit e678934cbe5f02 ("btrfs: Remove unnecessary check from
join_running_log_trans") was merged in the 5.4 merge window while commit
803f0f64d17769 ("Btrfs: fix fsync not persisting dentry deletions due to
inode evictions") was merged in the 5.3 merge window. But the first
commit was actually authored before the second commit (May 23 2019 vs
June 19 2019).
Reported-by: kernel test robot <rong.a.chen@intel.com>
Link: https://lore.kernel.org/lkml/20200611090233.GL12456@shao2-debian/
Fixes: e678934cbe5f02 ("btrfs: Remove unnecessary check from join_running_log_trans")
CC: stable@vger.kernel.org # 5.4+
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
When balance and scrub are running in parallel it is possible to end up
with an underflow of the bytes_may_use counter of the data space_info
object, which triggers a warning like the following:
[134243.793196] BTRFS info (device sdc): relocating block group 1104150528 flags data
[134243.806891] ------------[ cut here ]------------
[134243.807561] WARNING: CPU: 1 PID: 26884 at fs/btrfs/space-info.h:125 btrfs_add_reserved_bytes+0x1da/0x280 [btrfs]
[134243.808819] Modules linked in: btrfs blake2b_generic xor (...)
[134243.815779] CPU: 1 PID: 26884 Comm: kworker/u8:8 Tainted: G W 5.6.0-rc7-btrfs-next-58 #5
[134243.816944] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014
[134243.818389] Workqueue: writeback wb_workfn (flush-btrfs-108483)
[134243.819186] RIP: 0010:btrfs_add_reserved_bytes+0x1da/0x280 [btrfs]
[134243.819963] Code: 0b f2 85 (...)
[134243.822271] RSP: 0018:ffffa4160aae7510 EFLAGS: 00010287
[134243.822929] RAX: 000000000000c000 RBX: ffff96159a8c1000 RCX: 0000000000000000
[134243.823816] RDX: 0000000000008000 RSI: 0000000000000000 RDI: ffff96158067a810
[134243.824742] RBP: ffff96158067a800 R08: 0000000000000001 R09: 0000000000000000
[134243.825636] R10: ffff961501432a40 R11: 0000000000000000 R12: 000000000000c000
[134243.826532] R13: 0000000000000001 R14: ffffffffffff4000 R15: ffff96158067a810
[134243.827432] FS: 0000000000000000(0000) GS:ffff9615baa00000(0000) knlGS:0000000000000000
[134243.828451] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[134243.829184] CR2: 000055bd7e414000 CR3: 00000001077be004 CR4: 00000000003606e0
[134243.830083] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[134243.830975] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[134243.831867] Call Trace:
[134243.832211] find_free_extent+0x4a0/0x16c0 [btrfs]
[134243.832846] btrfs_reserve_extent+0x91/0x180 [btrfs]
[134243.833487] cow_file_range+0x12d/0x490 [btrfs]
[134243.834080] fallback_to_cow+0x82/0x1b0 [btrfs]
[134243.834689] ? release_extent_buffer+0x121/0x170 [btrfs]
[134243.835370] run_delalloc_nocow+0x33f/0xa30 [btrfs]
[134243.836032] btrfs_run_delalloc_range+0x1ea/0x6d0 [btrfs]
[134243.836725] ? find_lock_delalloc_range+0x221/0x250 [btrfs]
[134243.837450] writepage_delalloc+0xe8/0x150 [btrfs]
[134243.838059] __extent_writepage+0xe8/0x4c0 [btrfs]
[134243.838674] extent_write_cache_pages+0x237/0x530 [btrfs]
[134243.839364] extent_writepages+0x44/0xa0 [btrfs]
[134243.839946] do_writepages+0x23/0x80
[134243.840401] __writeback_single_inode+0x59/0x700
[134243.841006] writeback_sb_inodes+0x267/0x5f0
[134243.841548] __writeback_inodes_wb+0x87/0xe0
[134243.842091] wb_writeback+0x382/0x590
[134243.842574] ? wb_workfn+0x4a2/0x6c0
[134243.843030] wb_workfn+0x4a2/0x6c0
[134243.843468] process_one_work+0x26d/0x6a0
[134243.843978] worker_thread+0x4f/0x3e0
[134243.844452] ? process_one_work+0x6a0/0x6a0
[134243.844981] kthread+0x103/0x140
[134243.845400] ? kthread_create_worker_on_cpu+0x70/0x70
[134243.846030] ret_from_fork+0x3a/0x50
[134243.846494] irq event stamp: 0
[134243.846892] hardirqs last enabled at (0): [<0000000000000000>] 0x0
[134243.847682] hardirqs last disabled at (0): [<ffffffffb2abdedf>] copy_process+0x74f/0x2020
[134243.848687] softirqs last enabled at (0): [<ffffffffb2abdedf>] copy_process+0x74f/0x2020
[134243.849913] softirqs last disabled at (0): [<0000000000000000>] 0x0
[134243.850698] ---[ end trace bd7c03622e0b0a96 ]---
[134243.851335] ------------[ cut here ]------------
When relocating a data block group, for each extent allocated in the
block group we preallocate another extent with the same size for the
data relocation inode (we do it at prealloc_file_extent_cluster()).
We reserve space by calling btrfs_check_data_free_space(), which ends
up incrementing the data space_info's bytes_may_use counter, and
then call btrfs_prealloc_file_range() to allocate the extent, which
always decrements the bytes_may_use counter by the same amount.
The expectation is that writeback of the data relocation inode always
follows a NOCOW path, by writing into the preallocated extents. However,
when starting writeback we might end up falling back into the COW path,
because the block group that contains the preallocated extent was turned
into RO mode by a scrub running in parallel. The COW path then calls the
extent allocator which ends up calling btrfs_add_reserved_bytes(), and
this function decrements the bytes_may_use counter of the data space_info
object by an amount corresponding to the size of the allocated extent,
despite we haven't previously incremented it. When the counter currently
has a value smaller then the allocated extent we reset the counter to 0
and emit a warning, otherwise we just decrement it and slowly mess up
with this counter which is crucial for space reservation, the end result
can be granting reserved space to tasks when there isn't really enough
free space, and having the tasks fail later in critical places where
error handling consists of a transaction abort or hitting a BUG_ON().
Fix this by making sure that if we fallback to the COW path for a data
relocation inode, we increment the bytes_may_use counter of the data
space_info object. The COW path will then decrement it at
btrfs_add_reserved_bytes() on success or through its error handling part
by a call to extent_clear_unlock_delalloc() (which ends up calling
btrfs_clear_delalloc_extent() that does the decrement operation) in case
of an error.
Test case btrfs/061 from fstests could sporadically trigger this.
CC: stable@vger.kernel.org # 4.4+
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
When running relocation of a data block group while scrub is running in
parallel, it is possible that the relocation will fail and abort the
current transaction with an -EINVAL error:
[134243.988595] BTRFS info (device sdc): found 14 extents, stage: move data extents
[134243.999871] ------------[ cut here ]------------
[134244.000741] BTRFS: Transaction aborted (error -22)
[134244.001692] WARNING: CPU: 0 PID: 26954 at fs/btrfs/ctree.c:1071 __btrfs_cow_block+0x6a7/0x790 [btrfs]
[134244.003380] Modules linked in: btrfs blake2b_generic xor raid6_pq (...)
[134244.012577] CPU: 0 PID: 26954 Comm: btrfs Tainted: G W 5.6.0-rc7-btrfs-next-58 #5
[134244.014162] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014
[134244.016184] RIP: 0010:__btrfs_cow_block+0x6a7/0x790 [btrfs]
[134244.017151] Code: 48 c7 c7 (...)
[134244.020549] RSP: 0018:ffffa41607863888 EFLAGS: 00010286
[134244.021515] RAX: 0000000000000000 RBX: ffff9614bdfe09c8 RCX: 0000000000000000
[134244.022822] RDX: 0000000000000001 RSI: ffffffffb3d63980 RDI: 0000000000000001
[134244.024124] RBP: ffff961589e8c000 R08: 0000000000000000 R09: 0000000000000001
[134244.025424] R10: ffffffffc0ae5955 R11: 0000000000000000 R12: ffff9614bd530d08
[134244.026725] R13: ffff9614ced41b88 R14: ffff9614bdfe2a48 R15: 0000000000000000
[134244.028024] FS: 00007f29b63c08c0(0000) GS:ffff9615ba600000(0000) knlGS:0000000000000000
[134244.029491] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[134244.030560] CR2: 00007f4eb339b000 CR3: 0000000130d6e006 CR4: 00000000003606f0
[134244.031997] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[134244.033153] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[134244.034484] Call Trace:
[134244.034984] btrfs_cow_block+0x12b/0x2b0 [btrfs]
[134244.035859] do_relocation+0x30b/0x790 [btrfs]
[134244.036681] ? do_raw_spin_unlock+0x49/0xc0
[134244.037460] ? _raw_spin_unlock+0x29/0x40
[134244.038235] relocate_tree_blocks+0x37b/0x730 [btrfs]
[134244.039245] relocate_block_group+0x388/0x770 [btrfs]
[134244.040228] btrfs_relocate_block_group+0x161/0x2e0 [btrfs]
[134244.041323] btrfs_relocate_chunk+0x36/0x110 [btrfs]
[134244.041345] btrfs_balance+0xc06/0x1860 [btrfs]
[134244.043382] ? btrfs_ioctl_balance+0x27c/0x310 [btrfs]
[134244.045586] btrfs_ioctl_balance+0x1ed/0x310 [btrfs]
[134244.045611] btrfs_ioctl+0x1880/0x3760 [btrfs]
[134244.049043] ? do_raw_spin_unlock+0x49/0xc0
[134244.049838] ? _raw_spin_unlock+0x29/0x40
[134244.050587] ? __handle_mm_fault+0x11b3/0x14b0
[134244.051417] ? ksys_ioctl+0x92/0xb0
[134244.052070] ksys_ioctl+0x92/0xb0
[134244.052701] ? trace_hardirqs_off_thunk+0x1a/0x1c
[134244.053511] __x64_sys_ioctl+0x16/0x20
[134244.054206] do_syscall_64+0x5c/0x280
[134244.054891] entry_SYSCALL_64_after_hwframe+0x49/0xbe
[134244.055819] RIP: 0033:0x7f29b51c9dd7
[134244.056491] Code: 00 00 00 (...)
[134244.059767] RSP: 002b:00007ffcccc1dd08 EFLAGS: 00000202 ORIG_RAX: 0000000000000010
[134244.061168] RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 00007f29b51c9dd7
[134244.062474] RDX: 00007ffcccc1dda0 RSI: 00000000c4009420 RDI: 0000000000000003
[134244.063771] RBP: 0000000000000003 R08: 00005565cea4b000 R09: 0000000000000000
[134244.065032] R10: 0000000000000541 R11: 0000000000000202 R12: 00007ffcccc2060a
[134244.066327] R13: 00007ffcccc1dda0 R14: 0000000000000002 R15: 00007ffcccc1dec0
[134244.067626] irq event stamp: 0
[134244.068202] hardirqs last enabled at (0): [<0000000000000000>] 0x0
[134244.069351] hardirqs last disabled at (0): [<ffffffffb2abdedf>] copy_process+0x74f/0x2020
[134244.070909] softirqs last enabled at (0): [<ffffffffb2abdedf>] copy_process+0x74f/0x2020
[134244.072392] softirqs last disabled at (0): [<0000000000000000>] 0x0
[134244.073432] ---[ end trace bd7c03622e0b0a99 ]---
The -EINVAL error comes from the following chain of function calls:
__btrfs_cow_block() <-- aborts the transaction
btrfs_reloc_cow_block()
replace_file_extents()
get_new_location() <-- returns -EINVAL
When relocating a data block group, for each allocated extent of the block
group, we preallocate another extent (at prealloc_file_extent_cluster()),
associated with the data relocation inode, and then dirty all its pages.
These preallocated extents have, and must have, the same size that extents
from the data block group being relocated have.
Later before we start the relocation stage that updates pointers (bytenr
field of file extent items) to point to the the new extents, we trigger
writeback for the data relocation inode. The expectation is that writeback
will write the pages to the previously preallocated extents, that it
follows the NOCOW path. That is generally the case, however, if a scrub
is running it may have turned the block group that contains those extents
into RO mode, in which case writeback falls back to the COW path.
However in the COW path instead of allocating exactly one extent with the
expected size, the allocator may end up allocating several smaller extents
due to free space fragmentation - because we tell it at cow_file_range()
that the minimum allocation size can match the filesystem's sector size.
This later breaks the relocation's expectation that an extent associated
to a file extent item in the data relocation inode has the same size as
the respective extent pointed by a file extent item in another tree - in
this case the extent to which the relocation inode poins to is smaller,
causing relocation.c:get_new_location() to return -EINVAL.
For example, if we are relocating a data block group X that has a logical
address of X and the block group has an extent allocated at the logical
address X + 128KiB with a size of 64KiB:
1) At prealloc_file_extent_cluster() we allocate an extent for the data
relocation inode with a size of 64KiB and associate it to the file
offset 128KiB (X + 128KiB - X) of the data relocation inode. This
preallocated extent was allocated at block group Z;
2) A scrub running in parallel turns block group Z into RO mode and
starts scrubing its extents;
3) Relocation triggers writeback for the data relocation inode;
4) When running delalloc (btrfs_run_delalloc_range()), we try first the
NOCOW path because the data relocation inode has BTRFS_INODE_PREALLOC
set in its flags. However, because block group Z is in RO mode, the
NOCOW path (run_delalloc_nocow()) falls back into the COW path, by
calling cow_file_range();
5) At cow_file_range(), in the first iteration of the while loop we call
btrfs_reserve_extent() to allocate a 64KiB extent and pass it a minimum
allocation size of 4KiB (fs_info->sectorsize). Due to free space
fragmentation, btrfs_reserve_extent() ends up allocating two extents
of 32KiB each, each one on a different iteration of that while loop;
6) Writeback of the data relocation inode completes;
7) Relocation proceeds and ends up at relocation.c:replace_file_extents(),
with a leaf which has a file extent item that points to the data extent
from block group X, that has a logical address (bytenr) of X + 128KiB
and a size of 64KiB. Then it calls get_new_location(), which does a
lookup in the data relocation tree for a file extent item starting at
offset 128KiB (X + 128KiB - X) and belonging to the data relocation
inode. It finds a corresponding file extent item, however that item
points to an extent that has a size of 32KiB, which doesn't match the
expected size of 64KiB, resuling in -EINVAL being returned from this
function and propagated up to __btrfs_cow_block(), which aborts the
current transaction.
To fix this make sure that at cow_file_range() when we call the allocator
we pass it a minimum allocation size corresponding the desired extent size
if the inode belongs to the data relocation tree, otherwise pass it the
filesystem's sector size as the minimum allocation size.
CC: stable@vger.kernel.org # 4.4+
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
There is a race between block group removal and block group creation
when the removal is completed by a task running fitrim or scrub. When
this happens we end up failing the block group creation with an error
-EEXIST since we attempt to insert a duplicate block group item key
in the extent tree. That results in a transaction abort.
The race happens like this:
1) Task A is doing a fitrim, and at btrfs_trim_block_group() it freezes
block group X with btrfs_freeze_block_group() (until very recently
that was named btrfs_get_block_group_trimming());
2) Task B starts removing block group X, either because it's now unused
or due to relocation for example. So at btrfs_remove_block_group(),
while holding the chunk mutex and the block group's lock, it sets
the 'removed' flag of the block group and it sets the local variable
'remove_em' to false, because the block group is currently frozen
(its 'frozen' counter is > 0, until very recently this counter was
named 'trimming');
3) Task B unlocks the block group and the chunk mutex;
4) Task A is done trimming the block group and unfreezes the block group
by calling btrfs_unfreeze_block_group() (until very recently this was
named btrfs_put_block_group_trimming()). In this function we lock the
block group and set the local variable 'cleanup' to true because we
were able to decrement the block group's 'frozen' counter down to 0 and
the flag 'removed' is set in the block group.
Since 'cleanup' is set to true, it locks the chunk mutex and removes
the extent mapping representing the block group from the mapping tree;
5) Task C allocates a new block group Y and it picks up the logical address
that block group X had as the logical address for Y, because X was the
block group with the highest logical address and now the second block
group with the highest logical address, the last in the fs mapping tree,
ends at an offset corresponding to block group X's logical address (this
logical address selection is done at volumes.c:find_next_chunk()).
At this point the new block group Y does not have yet its item added
to the extent tree (nor the corresponding device extent items and
chunk item in the device and chunk trees). The new group Y is added to
the list of pending block groups in the transaction handle;
6) Before task B proceeds to removing the block group item for block
group X from the extent tree, which has a key matching:
(X logical offset, BTRFS_BLOCK_GROUP_ITEM_KEY, length)
task C while ending its transaction handle calls
btrfs_create_pending_block_groups(), which finds block group Y and
tries to insert the block group item for Y into the exten tree, which
fails with -EEXIST since logical offset is the same that X had and
task B hasn't yet deleted the key from the extent tree.
This failure results in a transaction abort, producing a stack like
the following:
------------[ cut here ]------------
BTRFS: Transaction aborted (error -17)
WARNING: CPU: 2 PID: 19736 at fs/btrfs/block-group.c:2074 btrfs_create_pending_block_groups+0x1eb/0x260 [btrfs]
Modules linked in: btrfs blake2b_generic xor raid6_pq (...)
CPU: 2 PID: 19736 Comm: fsstress Tainted: G W 5.6.0-rc7-btrfs-next-58 #5
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014
RIP: 0010:btrfs_create_pending_block_groups+0x1eb/0x260 [btrfs]
Code: ff ff ff 48 8b 55 50 f0 48 (...)
RSP: 0018:ffffa4160a1c7d58 EFLAGS: 00010286
RAX: 0000000000000000 RBX: ffff961581909d98 RCX: 0000000000000000
RDX: 0000000000000001 RSI: ffffffffb3d63990 RDI: 0000000000000001
RBP: ffff9614f3356a58 R08: 0000000000000000 R09: 0000000000000001
R10: ffff9615b65b0040 R11: 0000000000000000 R12: ffff961581909c10
R13: ffff9615b0c32000 R14: ffff9614f3356ab0 R15: ffff9614be779000
FS: 00007f2ce2841e80(0000) GS:ffff9615bae00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000555f18780000 CR3: 0000000131d34005 CR4: 00000000003606e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
btrfs_start_dirty_block_groups+0x398/0x4e0 [btrfs]
btrfs_commit_transaction+0xd0/0xc50 [btrfs]
? btrfs_attach_transaction_barrier+0x1e/0x50 [btrfs]
? __ia32_sys_fdatasync+0x20/0x20
iterate_supers+0xdb/0x180
ksys_sync+0x60/0xb0
__ia32_sys_sync+0xa/0x10
do_syscall_64+0x5c/0x280
entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x7f2ce1d4d5b7
Code: 83 c4 08 48 3d 01 (...)
RSP: 002b:00007ffd8b558c58 EFLAGS: 00000202 ORIG_RAX: 00000000000000a2
RAX: ffffffffffffffda RBX: 000000000000002c RCX: 00007f2ce1d4d5b7
RDX: 00000000ffffffff RSI: 00000000186ba07b RDI: 000000000000002c
RBP: 0000555f17b9e520 R08: 0000000000000012 R09: 000000000000ce00
R10: 0000000000000078 R11: 0000000000000202 R12: 0000000000000032
R13: 0000000051eb851f R14: 00007ffd8b558cd0 R15: 0000555f1798ec20
irq event stamp: 0
hardirqs last enabled at (0): [<0000000000000000>] 0x0
hardirqs last disabled at (0): [<ffffffffb2abdedf>] copy_process+0x74f/0x2020
softirqs last enabled at (0): [<ffffffffb2abdedf>] copy_process+0x74f/0x2020
softirqs last disabled at (0): [<0000000000000000>] 0x0
---[ end trace bd7c03622e0b0a9c ]---
Fix this simply by making btrfs_remove_block_group() remove the block
group's item from the extent tree before it flags the block group as
removed. Also make the free space deletion from the free space tree
before flagging the block group as removed, to avoid a similar race
with adding and removing free space entries for the free space tree.
Fixes: 04216820fe83d5 ("Btrfs: fix race between fs trimming and block group remove/allocation")
CC: stable@vger.kernel.org # 4.4+
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
When removing a block group, if we fail to delete the block group's item
from the extent tree, we jump to the 'out' label and end up decrementing
the block group's reference count once only (by 1), resulting in a counter
leak because the block group at that point was already removed from the
block group cache rbtree - so we have to decrement the reference count
twice, once for the rbtree and once for our lookup at the start of the
function.
There is a second bug where if removing the free space tree entries (the
call to remove_block_group_free_space()) fails we end up jumping to the
'out_put_group' label but end up decrementing the reference count only
once, when we should have done it twice, since we have already removed
the block group from the block group cache rbtree. This happens because
the reference count decrement for the rbtree reference happens after
attempting to remove the free space tree entries, which is far away from
the place where we remove the block group from the rbtree.
To make things less error prone, decrement the reference count for the
rbtree immediately after removing the block group from it. This also
eleminates the need for two different exit labels on error, renaming
'out_put_label' to just 'out' and removing the old 'out'.
Fixes: f6033c5e333238 ("btrfs: fix block group leak when removing fails")
CC: stable@vger.kernel.org # 4.4+
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
As of commit 4dc55525b095 ("drm: plane: Verify that no or all planes
have a zpos property") a warning is emitted if there's a mix of planes
with and without a zpos property.
On Tegra, cursor planes are always composited on top of all other
planes, which is why they never had a zpos property attached to them.
However, since the composition order is fixed, this is trivial to
remedy by simply attaching an immutable zpos property to them.
v3: do not hardcode zpos for overlay planes used as cursor (Dmitry)
v2: hardcode cursor plane zpos to 255 instead of 0 (Ville)
Reported-by: Jonathan Hunter <jonathanh@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
|
|
Currently when a host1x device driver is unregistered, it is not
detached from the host1x controller, which means that the device
will stay around and when the driver is registered again, it may
bind to the old, stale device rather than the new one that was
created from scratch upon driver registration. This in turn can
cause various weird crashes within the driver core because it is
confronted with a device that was already deleted.
Fix this by detaching the driver from the host1x controller when
it is unregistered. This ensures that the deleted device also is
no longer present in the device list that drivers will bind to.
Reported-by: Sowjanya Komatineni <skomatineni@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Tested-by: Sowjanya Komatineni <skomatineni@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
|
|
Silence documentation build warnings by adding kernel-doc fields.
./include/linux/host1x.h:69: warning: Function parameter or member 'parent' not described in 'host1x_client'
./include/linux/host1x.h:69: warning: Function parameter or member 'usecount' not described in 'host1x_client'
./include/linux/host1x.h:69: warning: Function parameter or member 'lock' not described in 'host1x_client'
Signed-off-by: Colton Lewis <colton.w.lewis@protonmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
|
|
Add ":README" suffix support for the requires list, so that
the testcase can list up the required string for README file
to the requires list.
Note that the required string is treated as a fixed string,
instead of regular expression. Also, the testcase can specify
a string containing spaces with quotes. E.g.
# requires: "place: [<module>:]<symbol>":README
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Reviewed-by: Tom Zanussi <zanussi@kernel.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
|
|
Add ":tracer" suffix support for the requires list, so that
the testcase can list up the required tracer (e.g. function)
to the requires list.
For example, if the testcase requires function_graph tracer,
it can write requires list as below instead of checking
available_tracers.
# requires: function_graph:tracer
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Reviewed-by: Tom Zanussi <zanussi@kernel.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
|
|
Since check_filter_file() is basically checking the filter
tracefs file, we can convert it into requires list.
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Reviewed-by: Tom Zanussi <zanussi@kernel.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
|
|
Convert the required tracefs interface checking code with
requires: list.
Fixed merge conflicts in trigger-hist.tc and trigger-trace-marker-hist.tc
Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Reviewed-by: Tom Zanussi <zanussi@kernel.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
|
|
In blkdev_get() we call __blkdev_get() to do some internal jobs and if
there is some errors in __blkdev_get(), the bdput() is called which
means we have released the refcount of the bdev (actually the refcount of
the bdev inode). This means we cannot access bdev after that point. But
acctually bdev is still accessed in blkdev_get() after calling
__blkdev_get(). This results in use-after-free if the refcount is the
last one we released in __blkdev_get(). Let's take a look at the
following scenerio:
CPU0 CPU1 CPU2
blkdev_open blkdev_open Remove disk
bd_acquire
blkdev_get
__blkdev_get del_gendisk
bdev_unhash_inode
bd_acquire bdev_get_gendisk
bd_forget failed because of unhashed
bdput
bdput (the last one)
bdev_evict_inode
access bdev => use after free
[ 459.350216] BUG: KASAN: use-after-free in __lock_acquire+0x24c1/0x31b0
[ 459.351190] Read of size 8 at addr ffff88806c815a80 by task syz-executor.0/20132
[ 459.352347]
[ 459.352594] CPU: 0 PID: 20132 Comm: syz-executor.0 Not tainted 4.19.90 #2
[ 459.353628] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014
[ 459.354947] Call Trace:
[ 459.355337] dump_stack+0x111/0x19e
[ 459.355879] ? __lock_acquire+0x24c1/0x31b0
[ 459.356523] print_address_description+0x60/0x223
[ 459.357248] ? __lock_acquire+0x24c1/0x31b0
[ 459.357887] kasan_report.cold+0xae/0x2d8
[ 459.358503] __lock_acquire+0x24c1/0x31b0
[ 459.359120] ? _raw_spin_unlock_irq+0x24/0x40
[ 459.359784] ? lockdep_hardirqs_on+0x37b/0x580
[ 459.360465] ? _raw_spin_unlock_irq+0x24/0x40
[ 459.361123] ? finish_task_switch+0x125/0x600
[ 459.361812] ? finish_task_switch+0xee/0x600
[ 459.362471] ? mark_held_locks+0xf0/0xf0
[ 459.363108] ? __schedule+0x96f/0x21d0
[ 459.363716] lock_acquire+0x111/0x320
[ 459.364285] ? blkdev_get+0xce/0xbe0
[ 459.364846] ? blkdev_get+0xce/0xbe0
[ 459.365390] __mutex_lock+0xf9/0x12a0
[ 459.365948] ? blkdev_get+0xce/0xbe0
[ 459.366493] ? bdev_evict_inode+0x1f0/0x1f0
[ 459.367130] ? blkdev_get+0xce/0xbe0
[ 459.367678] ? destroy_inode+0xbc/0x110
[ 459.368261] ? mutex_trylock+0x1a0/0x1a0
[ 459.368867] ? __blkdev_get+0x3e6/0x1280
[ 459.369463] ? bdev_disk_changed+0x1d0/0x1d0
[ 459.370114] ? blkdev_get+0xce/0xbe0
[ 459.370656] blkdev_get+0xce/0xbe0
[ 459.371178] ? find_held_lock+0x2c/0x110
[ 459.371774] ? __blkdev_get+0x1280/0x1280
[ 459.372383] ? lock_downgrade+0x680/0x680
[ 459.373002] ? lock_acquire+0x111/0x320
[ 459.373587] ? bd_acquire+0x21/0x2c0
[ 459.374134] ? do_raw_spin_unlock+0x4f/0x250
[ 459.374780] blkdev_open+0x202/0x290
[ 459.375325] do_dentry_open+0x49e/0x1050
[ 459.375924] ? blkdev_get_by_dev+0x70/0x70
[ 459.376543] ? __x64_sys_fchdir+0x1f0/0x1f0
[ 459.377192] ? inode_permission+0xbe/0x3a0
[ 459.377818] path_openat+0x148c/0x3f50
[ 459.378392] ? kmem_cache_alloc+0xd5/0x280
[ 459.379016] ? entry_SYSCALL_64_after_hwframe+0x49/0xbe
[ 459.379802] ? path_lookupat.isra.0+0x900/0x900
[ 459.380489] ? __lock_is_held+0xad/0x140
[ 459.381093] do_filp_open+0x1a1/0x280
[ 459.381654] ? may_open_dev+0xf0/0xf0
[ 459.382214] ? find_held_lock+0x2c/0x110
[ 459.382816] ? lock_downgrade+0x680/0x680
[ 459.383425] ? __lock_is_held+0xad/0x140
[ 459.384024] ? do_raw_spin_unlock+0x4f/0x250
[ 459.384668] ? _raw_spin_unlock+0x1f/0x30
[ 459.385280] ? __alloc_fd+0x448/0x560
[ 459.385841] do_sys_open+0x3c3/0x500
[ 459.386386] ? filp_open+0x70/0x70
[ 459.386911] ? trace_hardirqs_on_thunk+0x1a/0x1c
[ 459.387610] ? trace_hardirqs_off_caller+0x55/0x1c0
[ 459.388342] ? do_syscall_64+0x1a/0x520
[ 459.388930] do_syscall_64+0xc3/0x520
[ 459.389490] entry_SYSCALL_64_after_hwframe+0x49/0xbe
[ 459.390248] RIP: 0033:0x416211
[ 459.390720] Code: 75 14 b8 02 00 00 00 0f 05 48 3d 01 f0 ff ff 0f 83
04 19 00 00 c3 48 83 ec 08 e8 0a fa ff ff 48 89 04 24 b8 02 00 00 00 0f
05 <48> 8b 3c 24 48 89 c2 e8 53 fa ff ff 48 89 d0 48 83 c4 08 48 3d
01
[ 459.393483] RSP: 002b:00007fe45dfe9a60 EFLAGS: 00000293 ORIG_RAX: 0000000000000002
[ 459.394610] RAX: ffffffffffffffda RBX: 00007fe45dfea6d4 RCX: 0000000000416211
[ 459.395678] RDX: 00007fe45dfe9b0a RSI: 0000000000000002 RDI: 00007fe45dfe9b00
[ 459.396758] RBP: 000000000076bf20 R08: 0000000000000000 R09: 000000000000000a
[ 459.397930] R10: 0000000000000075 R11: 0000000000000293 R12: 00000000ffffffff
[ 459.399022] R13: 0000000000000bd9 R14: 00000000004cdb80 R15: 000000000076bf2c
[ 459.400168]
[ 459.400430] Allocated by task 20132:
[ 459.401038] kasan_kmalloc+0xbf/0xe0
[ 459.401652] kmem_cache_alloc+0xd5/0x280
[ 459.402330] bdev_alloc_inode+0x18/0x40
[ 459.402970] alloc_inode+0x5f/0x180
[ 459.403510] iget5_locked+0x57/0xd0
[ 459.404095] bdget+0x94/0x4e0
[ 459.404607] bd_acquire+0xfa/0x2c0
[ 459.405113] blkdev_open+0x110/0x290
[ 459.405702] do_dentry_open+0x49e/0x1050
[ 459.406340] path_openat+0x148c/0x3f50
[ 459.406926] do_filp_open+0x1a1/0x280
[ 459.407471] do_sys_open+0x3c3/0x500
[ 459.408010] do_syscall_64+0xc3/0x520
[ 459.408572] entry_SYSCALL_64_after_hwframe+0x49/0xbe
[ 459.409415]
[ 459.409679] Freed by task 1262:
[ 459.410212] __kasan_slab_free+0x129/0x170
[ 459.410919] kmem_cache_free+0xb2/0x2a0
[ 459.411564] rcu_process_callbacks+0xbb2/0x2320
[ 459.412318] __do_softirq+0x225/0x8ac
Fix this by delaying bdput() to the end of blkdev_get() which means we
have finished accessing bdev.
Fixes: 77ea887e433a ("implement in-kernel gendisk events handling")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Jason Yan <yanaijie@huawei.com>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dan Carpenter <dan.carpenter@oracle.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Ming Lei <ming.lei@redhat.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
|
|
Linux 5.8-rc1
|
|
Commit cca98e9f8b5e ("mm: enforce that vmap can't map pages executable")
introduced 'pgprot_nx(prot)' for arm64 but collided silently with the
BTI support during the merge window, which endeavours to clear the GP
bit for non-executable kernel mappings in set_memory_nx().
For consistency between the two APIs, clear the GP bit in pgprot_nx().
Acked-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20200615154642.3579-1-will@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Since commit cd28d1d6e52e ("net: phy: at803x: Disable phy delay for
RGMII mode") the networking is broken on the BeagleBone AI which has
the AR8035 PHY for Gigabit Ethernet [0]. The fix is to switch from
phy-mode = "rgmii" to phy-mode = "rgmii-rxid".
Note: Grygorii made a similar DT fix for other AM57xx boards with a
different phy in commit 820f8a870f65 ("ARM: dts: am57xx: fix networking
on boards with ksz9031 phy").
[0] https://git.io/Jf7PX
Fixes: 520557d4854b ("ARM: dts: am5729: beaglebone-ai: adding device tree")
Cc: Vinod Koul <vkoul@kernel.org>
Reviewed-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: Robert Nelson <robertcnelson@gmail.com>
Signed-off-by: Drew Fustini <drew@beagleboard.org>
Signed-off-by: Tony Lindgren <tony@atomide.com>
|
|
I accidentally flipped the system timer to use system clock instead of
the 32k source clock.
Fixes: 14b1925a7219 ("ARM: dts: Configure system timers for omap4")
Signed-off-by: Tony Lindgren <tony@atomide.com>
|
|
While testing the recent suspend and resume regressions I noticed that
duovero can still end up losing edge gpio interrupts on runtime
suspend. This causes NFSroot easily stopping working after resume on
duovero.
Let's fix the issue by using gpio level interrupts for smsc as then
the gpio interrupt state is seen by the gpio controller on resume.
Fixes: 731b409878a3 ("ARM: dts: Configure duovero for to allow core retention during idle")
Signed-off-by: Tony Lindgren <tony@atomide.com>
|
|
AM3358 pin mcasp0_aclkr (ZCZ ball B13) [0] is routed to P1.31 header [1]
Mode 4 of this pin is mmc0_sdwp (SD Write Protect). A signal connected
to P1.31 may accidentally trigger mmc0 write protection. To avoid this
situation, do not put mcasp0_aclkr in mode 4 (mmc0_sdwp) by default.
[0] http://www.ti.com/lit/ds/symlink/am3358.pdf
[1] https://github.com/beagleboard/pocketbeagle/wiki/System-Reference-Manual#531_Expansion_Headers
Fixes: 047905376a16 (ARM: dts: Add am335x-pocketbeagle)
Signed-off-by: Robert Nelson <robertcnelson@gmail.com>
Signed-off-by: Drew Fustini <drew@beagleboard.org>
Signed-off-by: Tony Lindgren <tony@atomide.com>
|
|
afs_vnode_commit_status() is only ever called if op->error is 0, so remove
the op->error checks from the function.
Fixes: e49c7b2f6de7 ("afs: Build an abstraction around an "operation" concept")
Signed-off-by: David Howells <dhowells@redhat.com>
|
|
afs_check_for_remote_deletion() checks to see if error ENOENT is returned
by the server in response to an operation and, if so, marks the primary
vnode as having been deleted as the FID is no longer valid.
However, it's being called from the operation success functions, where no
abort has happened - and if an inline abort is recorded, it's handled by
afs_vnode_commit_status().
Fix this by actually calling the operation aborted method if provided and
having that point to afs_check_for_remote_deletion().
Fixes: e49c7b2f6de7 ("afs: Build an abstraction around an "operation" concept")
Signed-off-by: David Howells <dhowells@redhat.com>
|
|
Remove afs_operation::abort_code as it's read but never set. Use
ac.abort_code instead.
Signed-off-by: David Howells <dhowells@redhat.com>
|
|
Fix yfs_fs_fetch_status() to honour the vnode selector in
op->fetch_status.which as does afs_fs_fetch_status() that allows
afs_do_lookup() to use this as an alternative to the InlineBulkStatus RPC
call if not implemented by the server.
This doesn't matter in the current code as YFS servers always implement
InlineBulkStatus, but a subsequent will call it on YFS servers too in some
circumstances.
Fixes: e49c7b2f6de7 ("afs: Build an abstraction around an "operation" concept")
Signed-off-by: David Howells <dhowells@redhat.com>
|
|
Remove yfs_fs_fetch_file_status() as it's no longer used.
Signed-off-by: David Howells <dhowells@redhat.com>
|