Age | Commit message (Collapse) | Author |
|
Since we're releasing all existing nodes/edges, other than cleanup the
mess after error, "release" is a more proper naming here.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Also add comment explaining the cleanup progress, to differ it from
btrfs_backref_drop_node().
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
With extra comment for drop_backref_node() as it has some similarity
with remove_backref_node(), thus we need extra comment explaining the
difference.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Structure tree_entry provides a very simple rb_tree which only uses
bytenr as search index.
That tree_entry is used in 3 structures: backref_node, mapping_node and
tree_block.
Since we're going to make backref_node independnt from relocation, it's
a good time to extract the tree_entry into rb_simple_node, and export it
into misc.h.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
These 3 structures are the main part of btrfs backref cache, move them
to backref.h to build the basis for later reuse.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Those three structures are the main elements of backref cache. Add the
"btrfs_" prefix for later export.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
This patch will also add some comment for the cleanup.
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
After handle_one_tree_backref(), all newly added (not cached) edges and
nodes have the following features:
- Only backref_edge::list[LOWER] is linked.
This means, we can only iterate from botton to top, not the other
direction.
- Newly added nodes are not added to cache rb_tree yet
So to finish the backref cache, we still need to finish the links and
add all nodes into backref cache rb_tree.
This patch will refactor the existing code into finish_upper_links(),
add more comments of each branch, and why we need to do all the work.
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
build_backref_tree() uses "goto again;" to implement a breadth-first
search to build backref cache.
This patch will extract most of its work into a wrapper,
handle_one_tree_block(), and use a do {} while() loop to implement the
same thing.
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Bytenr and level are essential parameters for backref_node, thus it
makes sense to initialize them at allocation time.
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Since backref_edge is used to connect upper and lower backref nodes, and
needs to access both nodes, some code can look pretty nasty:
list_add_tail(&edge->list[LOWER], &cur->upper);
The above code will link @cur to the LOWER side of the edge, while both
"LOWER" and "upper" words show up. This can sometimes be very confusing
for reader to grasp.
This patch introduces a new wrapper, link_backref_edge(), to handle the
linking behavior. Which also has extra ASSERT() to ensure caller won't
pass wrong nodes.
Also, this updates the comment of related lists of backref_node and
backref_edge, to make it more clear that each list points to what.
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
The processing of indirect tree backref (TREE_BLOCK_REF) is the most
complex work.
We need to grab the fs root, do a tree search to locate all its parent
nodes, link all needed edges, and put all uncached edges to pending edge
list.
This is definitely worth a helper function.
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
For BTRFS_SHARED_BLOCK_REF_KEY, its processing is straightforward, as we
now the parent node bytenr directly.
If the parent is already cached, or a root, call it a day.
If the parent is not cached, add it pending list.
This patch will just refactor this part into its own function,
handle_direct_tree_backref() and add some comment explaining the
@ref_key parameter.
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
find_reloc_root() searches reloc_control::reloc_root_tree to find the
reloc root. This behavior is only useful for relocation backref cache.
For the incoming more generic purpose backref cache, we don't care
about who owns the reloc root, but only care if it's a reloc root.
So this patch makes the following modifications to make the reloc root
search more specific to relocation backref:
- Add backref_node::is_reloc_root
This will be an extra indicator for generic purposed backref cache.
User doesn't need to read root key from backref_node::root to
determine if it's a reloc root.
Also for reloc tree root, it's useless and will be queued to useless
list.
- Add backref_cache::is_reloc
This will allow backref cache code to do different behavior for
generic purpose backref cache and relocation backref cache.
- Pass fs_info to find_reloc_root()
- Export find_reloc_root()
So backref.c can utilize this function.
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Add this member so that we can grab fs_info without the help from
reloc_control.
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
These two new members will act the same as the existing local lists,
@useless and @list in build_backref_tree().
Currently build_backref_tree() is only executed serially, thus moving
such local list into backref_cache is still safe.
Also since we're here, use list_first_entry() to replace a lot of
list_entry() calls after !list_empty().
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
These two functions are weirdly named, mark_block_processed() in fact
just marks a range dirty unconditionally, while __mark_block_processed()
does extra check before doing the marking.
This patch will open code old mark_block_processed, and rename
__mark_block_processed() to remove the "__" prefix.
Since we're here, also kill the forward declaration, which could also
kill in_block_group() with in_range() macro.
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
In the core function of relocation, build_backref_tree, it needs to
iterate all backref items of one tree block.
Use btrfs_backref_iter infrastructure to do the loop and make the code
more readable.
The backref items look would be much more easier to read:
ret = btrfs_backref_iter_start(iter, cur->bytenr);
for (; ret == 0; ret = btrfs_backref_iter_next(iter)) {
/* The really important work */
}
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
This function will go to the next inline/keyed backref for
btrfs_backref_iter infrastructure.
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Due to the complex nature of btrfs extent tree, when we want to iterate
all backrefs of one extent, this involves quite a lot of work, like
searching the EXTENT_ITEM/METADATA_ITEM, iteration through inline and keyed
backrefs.
Normally this would result in a complex code, something like:
btrfs_search_slot()
/* Ensure we are at EXTENT_ITEM/METADATA_ITEM */
while (1) { /* Loop for extent tree items */
while (ptr < end) { /* Loop for inlined items */
/* Real work here */
}
next:
ret = btrfs_next_item()
/* Ensure we're still at keyed item for specified bytenr */
}
The idea of btrfs_backref_iter is to avoid such complex and hard to
read code structure, but something like the following:
iter = btrfs_backref_iter_alloc();
ret = btrfs_backref_iter_start(iter, bytenr);
if (ret < 0)
goto out;
for (; ; ret = btrfs_backref_iter_next(iter)) {
/* Real work here */
}
out:
btrfs_backref_iter_free(iter);
This patch is just the skeleton + btrfs_backref_iter_start() code.
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Sparse reports a warning at btrfs_tree_lock()
warning: context imbalance in btrfs_tree_lock() - wrong count at exit
The root cause is the missing annotation at btrfs_tree_lock()
Add the missing __acquires(&eb->lock) annotation
Signed-off-by: Jules Irenge <jbi.octave@gmail.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Sparse reports a warning at btrfs_lock_cluster()
warning: context imbalance in btrfs_lock_cluster()
- wrong count
The root cause is the missing annotation at btrfs_lock_cluster()
Add the missing __acquires(&cluster->refill_lock) annotation.
Signed-off-by: Jules Irenge <jbi.octave@gmail.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
https://github.com/Broadcom/stblinux into arm/fixes
This pull request contains Broadcom ARM-based SoCs Device Tree fixes for
5.7, please pull the following:
- Vincent fixes the polarity of the ACT LED on the Raspberry Pi Zero W
board
- Hamish fixes the ARM PPI interrupts sensitivy for the Hurricane 2
SoCs
* tag 'arm-soc/for-5.7/devicetree-fixes-part2-v2' of https://github.com/Broadcom/stblinux:
ARM: dts: bcm: HR2: Fix PPI interrupt types
ARM: dts: bcm2835-rpi-zero-w: Fix led polarity
Link: https://lore.kernel.org/r/20200524203714.17035-1-f.fainelli@gmail.com
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
|
|
Propagate the error code returned by devm_platform_ioremap_resource()
out of probe() instead of overwriting it.
Fixes: 72d8cb715477 ("drivers: gpio: bcm-kona: use devm_platform_ioremap_resource()")
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
[Bartosz: tweaked the commit message]
Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
|
|
When call function devm_platform_ioremap_resource(), we should use IS_ERR()
to check the return value and return PTR_ERR() if failed.
Fixes: 542c25b7a209 ("drivers: gpio: pxa: use devm_platform_ioremap_resource()")
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
|
|
mutex_lock() can sleep, don't call mutex_lock() while holding spin_lock.
Fixes: bc0ae0e737f5 ("gpio: add driver for Mellanox BlueField 2 GPIO controller")
Signed-off-by: Axel Lin <axel.lin@ingics.com>
Acked-by: asmaa@mellanox.com
Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
|
|
Remove unused PLATFORM_POWER_LIMIT MSR local definition from file
intel_rapl_common.c. This was missed while splitting old RAPL code
intel_rapl.c file into two new files intel_rapl_msr.c and
intel_rapl_common.c as per the commit 3382388d7148
("intel_rapl: abstract RAPL common code"). Currently, this #define
entry is being used only in intel_rapl_msr.c file and local definition
present in this file.
Signed-off-by: Sumeet Pawnikar <sumeet.r.pawnikar@intel.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
The current codebase makes use of the zero-length array language
extension to the C90 standard, but the preferred mechanism to declare
variable-length types such as these ones is a flexible array member[1][2],
introduced in C99:
struct foo {
int stuff;
struct boo array[];
};
By making use of the mechanism above, we will get a compiler warning
in case the flexible array does not occur last in the structure, which
will help us prevent some kind of undefined behavior bugs from being
inadvertently introduced[3] to the codebase from now on.
Also, notice that, dynamic memory allocations won't be affected by
this change:
"Flexible array members have incomplete type, and so the sizeof operator
may not be applied. As a quirk of the original implementation of
zero-length arrays, sizeof evaluates to zero."[1]
This issue was found with the help of Coccinelle.
[1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html
[2] https://github.com/KSPP/linux/issues/21
[3] commit 76497732932f ("cxgb3/l2t: Fix undefined behaviour")
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Link: https://lore.kernel.org/r/20200521185707.GA3661@embeddedor
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
|
|
The ptr is a pointer to userspace memory. So we need annotate it with
__user otherwise we may get sparse warnings like:
drivers/vhost/vhost.c:1603:13: sparse: sparse: incorrect type in initializer (different address spaces) @@ expected void const *__gu_ptr @@ got unsigned int [noderef] [usertypvoid const *__gu_ptr @@
drivers/vhost/vhost.c:1603:13: sparse: expected void const *__gu_ptr
drivers/vhost/vhost.c:1603:13: sparse: got unsigned int [noderef] [usertype] <asn:1> *idxp
Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Link: https://lore.kernel.org/r/20200520065750.8401-1-jasowang@redhat.com
Fixes: 7124330dabe5b3cb ("m68k/uaccess: Revive 64-bit get_user()")
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
|
|
On a Quadra 900/950, the ISM IOP IRQ output pin is connected to an
edge-triggered input on VIA2. It is theoretically possible that this
signal could fail to produce the expected VIA2 interrupt.
The two IOP interrupt flags can be asserted in any order but the logic
in iop_ism_irq() does not allow for that. In particular, INT0 can be
asserted right after INT0 is checked and before INT1 is cleared.
Such an interrupt would produce no new edge and VIA2 would detect no
further interrupts from the IOP. Avoid this by looping over the INT0/1
handlers so an edge can be produced.
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Tested-by: Stan Johnson <userm57@yahoo.com>
Cc: Joshua Thompson <funaho@jurai.org>
Link: https://lore.kernel.org/r/bfbb71db52c5e162d3afa25a28fc5d535ca87138.1589949122.git.fthain@telegraphics.com.au
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
|
|
This code path was tested on a Quadra 950 a long time ago and the
comment isn't needed.
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Cc: Joshua Thompson <funaho@jurai.org>
Link: https://lore.kernel.org/r/10dff3e7c17d363a4b239aae7b3ebab32bef3547.1589949122.git.fthain@telegraphics.com.au
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
|
|
There is no VIA2 chip on the Mac IIfx, so don't call via_flush_cache().
This avoids a boot crash which appeared in v5.4.
printk: console [ttyS0] enabled
printk: bootconsole [debug0] disabled
printk: bootconsole [debug0] disabled
Calibrating delay loop... 9.61 BogoMIPS (lpj=48064)
pid_max: default: 32768 minimum: 301
Mount-cache hash table entries: 1024 (order: 0, 4096 bytes, linear)
Mountpoint-cache hash table entries: 1024 (order: 0, 4096 bytes, linear)
devtmpfs: initialized
random: get_random_u32 called from bucket_table_alloc.isra.27+0x68/0x194 with crng_init=0
clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 19112604462750000 ns
futex hash table entries: 256 (order: -1, 3072 bytes, linear)
NET: Registered protocol family 16
Data read fault at 0x00000000 in Super Data (pc=0x8a6a)
BAD KERNEL BUSERR
Oops: 00000000
Modules linked in:
PC: [<00008a6a>] via_flush_cache+0x12/0x2c
SR: 2700 SP: 01c1fe3c a2: 01c24000
d0: 00001119 d1: 0000000c d2: 00012000 d3: 0000000f
d4: 01c06840 d5: 00033b92 a0: 00000000 a1: 00000000
Process swapper (pid: 1, task=01c24000)
Frame format=B ssw=0755 isc=0200 isb=fff7 daddr=00000000 dobuf=01c1fed0
baddr=00008a6e dibuf=0000004e ver=f
Stack from 01c1fec4:
01c1fed0 00007d7e 00010080 01c1fedc 0000792e 00000001 01c1fef4 00006b40
01c80000 00040000 00000006 00000003 01c1ff1c 004a545e 004ff200 00040000
00000000 00000003 01c06840 00033b92 004a5410 004b6c88 01c1ff84 000021e2
00000073 00000003 01c06840 00033b92 0038507a 004bb094 004b6ca8 004b6c88
004b6ca4 004b6c88 000021ae 00020002 00000000 01c0685d 00000000 01c1ffb4
0049f938 00409c85 01c06840 0045bd40 00000073 00000002 00000002 00000000
Call Trace: [<00007d7e>] mac_cache_card_flush+0x12/0x1c
[<00010080>] fix_dnrm+0x2/0x18
[<0000792e>] cache_push+0x46/0x5a
[<00006b40>] arch_dma_prep_coherent+0x60/0x6e
[<00040000>] switched_to_dl+0x76/0xd0
[<004a545e>] dma_atomic_pool_init+0x4e/0x188
[<00040000>] switched_to_dl+0x76/0xd0
[<00033b92>] parse_args+0x0/0x370
[<004a5410>] dma_atomic_pool_init+0x0/0x188
[<000021e2>] do_one_initcall+0x34/0x1be
[<00033b92>] parse_args+0x0/0x370
[<0038507a>] strcpy+0x0/0x1e
[<000021ae>] do_one_initcall+0x0/0x1be
[<00020002>] do_proc_dointvec_conv+0x54/0x74
[<0049f938>] kernel_init_freeable+0x126/0x190
[<0049f94c>] kernel_init_freeable+0x13a/0x190
[<004a5410>] dma_atomic_pool_init+0x0/0x188
[<00041798>] complete+0x0/0x3c
[<000b9b0c>] kfree+0x0/0x20a
[<0038df98>] schedule+0x0/0xd0
[<0038d604>] kernel_init+0x0/0xda
[<0038d610>] kernel_init+0xc/0xda
[<0038d604>] kernel_init+0x0/0xda
[<00002d38>] ret_from_kernel_thread+0xc/0x14
Code: 0000 2079 0048 10da 2279 0048 10c8 d3c8 <1011> 0200 fff7 1280 d1f9 0048 10c8 1010 0000 0008 1080 4e5e 4e75 4e56 0000 2039
Disabling lock debugging due to kernel taint
Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
Thanks to Stan Johnson for capturing the console log and running git
bisect.
Git bisect said commit 8e3a68fb55e0 ("dma-mapping: make
dma_atomic_pool_init self-contained") is the first "bad" commit. I don't
know why. Perhaps mach_l2_flush first became reachable with that commit.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Reported-and-tested-by: Stan Johnson <userm57@yahoo.com>
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Cc: Joshua Thompson <funaho@jurai.org>
Link: https://lore.kernel.org/r/b8bbeef197d6b3898e82ed0d231ad08f575a4b34.1589949122.git.fthain@telegraphics.com.au
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
|
|
rpm_suspend() simple bails out when conditions are wrong. But this is not
immediately obvious from the code. Make it clear what we do when conditions
are wrong in rpm_suspend().
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
pr_cont_once() does not make sense; at least emitting module name using
pr_fmt() into middle of a line (after e.g. pr_info_once()) does not make
sense. Let's remove unused pr_cont_once().
Link: https://lore.kernel.org/r/20200524153243.11690-1-penguin-kernel@I-love.SAKURA.ne.jp
Cc: Joe Perches <joe@perches.com>
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Acked-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
|
|
The data structure member “rpmb->md” was passed to a call of the function
“mmc_blk_put” after a call of the function “put_device”. Reorder these
function calls to keep the data accesses consistent.
Fixes: 1c87f7357849 ("mmc: block: Fix bug when removing RPMB chardev ")
Signed-off-by: Peng Hao <richard.peng@oppo.com>
Cc: stable@vger.kernel.org
[Uffe: Fixed up mangled patch and updated commit message]
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux
Pull cpupower utility updates for v5.8-rc1 from Shuah Khan:
"This cpupower update for Linux 5.8-rc1 consists of a single
patch to fix coccicheck unneeded semicolon warning."
* tag 'linux-cpupower-5.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux:
cpupower: Remove unneeded semicolon
|
|
Fixes bitmask for HE opration's default PE duration.
Fixes: daa5b83513a7 ("mac80211: update HE operation fields to D3.0")
Signed-off-by: Pradeep Kumar Chitrapu <pradeepc@codeaurora.org>
Link: https://lore.kernel.org/r/20200506102430.5153-1-pradeepc@codeaurora.org
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
On a non-forwarding 802.11s link between two fairly busy
neighboring nodes (iperf with -P 16 at ~850MBit/s TCP;
1733.3 MBit/s VHT-MCS 9 80MHz short GI VHT-NSS 4), so with
frequent PREQ retries, usually after around 30-40 seconds the
following crash would occur:
[ 1110.822428] Unable to handle kernel read from unreadable memory at virtual address 00000000
[ 1110.830786] Mem abort info:
[ 1110.833573] Exception class = IABT (current EL), IL = 32 bits
[ 1110.839494] SET = 0, FnV = 0
[ 1110.842546] EA = 0, S1PTW = 0
[ 1110.845678] user pgtable: 4k pages, 48-bit VAs, pgd = ffff800076386000
[ 1110.852204] [0000000000000000] *pgd=00000000f6322003, *pud=00000000f62de003, *pmd=0000000000000000
[ 1110.861167] Internal error: Oops: 86000004 [#1] PREEMPT SMP
[ 1110.866730] Modules linked in: pppoe ppp_async batman_adv ath10k_pci ath10k_core ath pppox ppp_generic nf_conntrack_ipv6 mac80211 iptable_nat ipt_REJECT ipt_MASQUERADE cfg80211 xt_time xt_tcpudp xt_state xt_nat xt_multiport xt_mark xt_mac xt_limit xt_conntrack xt_comment xt_TCPMSS xt_REDIRECT xt_LOG xt_FLOWOFFLOAD slhc nf_reject_ipv4 nf_nat_redirect nf_nat_masquerade_ipv4 nf_conntrack_ipv4 nf_nat_ipv4 nf_nat nf_log_ipv4 nf_flow_table_hw nf_flow_table nf_defrag_ipv6 nf_defrag_ipv4 nf_conntrack_rtcache nf_conntrack iptable_mangle iptable_filter ip_tables crc_ccitt compat nf_log_ipv6 nf_log_common ip6table_mangle ip6table_filter ip6_tables ip6t_REJECT x_tables nf_reject_ipv6 usb_storage xhci_plat_hcd xhci_pci xhci_hcd dwc3 usbcore usb_common
[ 1110.932190] Process swapper/3 (pid: 0, stack limit = 0xffff0000090c8000)
[ 1110.938884] CPU: 3 PID: 0 Comm: swapper/3 Not tainted 4.14.162 #0
[ 1110.944965] Hardware name: LS1043A RGW Board (DT)
[ 1110.949658] task: ffff8000787a81c0 task.stack: ffff0000090c8000
[ 1110.955568] PC is at 0x0
[ 1110.958097] LR is at call_timer_fn.isra.27+0x24/0x78
[ 1110.963055] pc : [<0000000000000000>] lr : [<ffff0000080ff29c>] pstate: 00400145
[ 1110.970440] sp : ffff00000801be10
[ 1110.973744] x29: ffff00000801be10 x28: ffff000008bf7018
[ 1110.979047] x27: ffff000008bf87c8 x26: ffff000008c160c0
[ 1110.984352] x25: 0000000000000000 x24: 0000000000000000
[ 1110.989657] x23: dead000000000200 x22: 0000000000000000
[ 1110.994959] x21: 0000000000000000 x20: 0000000000000101
[ 1111.000262] x19: ffff8000787a81c0 x18: 0000000000000000
[ 1111.005565] x17: ffff0000089167b0 x16: 0000000000000058
[ 1111.010868] x15: ffff0000089167b0 x14: 0000000000000000
[ 1111.016172] x13: ffff000008916788 x12: 0000000000000040
[ 1111.021475] x11: ffff80007fda9af0 x10: 0000000000000001
[ 1111.026777] x9 : ffff00000801bea0 x8 : 0000000000000004
[ 1111.032080] x7 : 0000000000000000 x6 : ffff80007fda9aa8
[ 1111.037383] x5 : ffff00000801bea0 x4 : 0000000000000010
[ 1111.042685] x3 : ffff00000801be98 x2 : 0000000000000614
[ 1111.047988] x1 : 0000000000000000 x0 : 0000000000000000
[ 1111.053290] Call trace:
[ 1111.055728] Exception stack(0xffff00000801bcd0 to 0xffff00000801be10)
[ 1111.062158] bcc0: 0000000000000000 0000000000000000
[ 1111.069978] bce0: 0000000000000614 ffff00000801be98 0000000000000010 ffff00000801bea0
[ 1111.077798] bd00: ffff80007fda9aa8 0000000000000000 0000000000000004 ffff00000801bea0
[ 1111.085618] bd20: 0000000000000001 ffff80007fda9af0 0000000000000040 ffff000008916788
[ 1111.093437] bd40: 0000000000000000 ffff0000089167b0 0000000000000058 ffff0000089167b0
[ 1111.101256] bd60: 0000000000000000 ffff8000787a81c0 0000000000000101 0000000000000000
[ 1111.109075] bd80: 0000000000000000 dead000000000200 0000000000000000 0000000000000000
[ 1111.116895] bda0: ffff000008c160c0 ffff000008bf87c8 ffff000008bf7018 ffff00000801be10
[ 1111.124715] bdc0: ffff0000080ff29c ffff00000801be10 0000000000000000 0000000000400145
[ 1111.132534] bde0: ffff8000787a81c0 ffff00000801bde8 0000ffffffffffff 000001029eb19be8
[ 1111.140353] be00: ffff00000801be10 0000000000000000
[ 1111.145220] [< (null)>] (null)
[ 1111.149917] [<ffff0000080ff77c>] run_timer_softirq+0x184/0x398
[ 1111.155741] [<ffff000008081938>] __do_softirq+0x100/0x1fc
[ 1111.161130] [<ffff0000080a2e28>] irq_exit+0x80/0xd8
[ 1111.166002] [<ffff0000080ea708>] __handle_domain_irq+0x88/0xb0
[ 1111.171825] [<ffff000008081678>] gic_handle_irq+0x68/0xb0
[ 1111.177213] Exception stack(0xffff0000090cbe30 to 0xffff0000090cbf70)
[ 1111.183642] be20: 0000000000000020 0000000000000000
[ 1111.191461] be40: 0000000000000001 0000000000000000 00008000771af000 0000000000000000
[ 1111.199281] be60: ffff000008c95180 0000000000000000 ffff000008c19360 ffff0000090cbef0
[ 1111.207101] be80: 0000000000000810 0000000000000400 0000000000000098 ffff000000000000
[ 1111.214920] bea0: 0000000000000001 ffff0000089167b0 0000000000000000 ffff0000089167b0
[ 1111.222740] bec0: 0000000000000000 ffff000008c198e8 ffff000008bf7018 ffff000008c19000
[ 1111.230559] bee0: 0000000000000000 0000000000000000 ffff8000787a81c0 ffff000008018000
[ 1111.238380] bf00: ffff00000801c000 ffff00000913ba34 ffff8000787a81c0 ffff0000090cbf70
[ 1111.246199] bf20: ffff0000080857cc ffff0000090cbf70 ffff0000080857d0 0000000000400145
[ 1111.254020] bf40: ffff000008018000 ffff00000801c000 ffffffffffffffff ffff0000080fa574
[ 1111.261838] bf60: ffff0000090cbf70 ffff0000080857d0
[ 1111.266706] [<ffff0000080832e8>] el1_irq+0xe8/0x18c
[ 1111.271576] [<ffff0000080857d0>] arch_cpu_idle+0x10/0x18
[ 1111.276880] [<ffff0000080d7de4>] do_idle+0xec/0x1b8
[ 1111.281748] [<ffff0000080d8020>] cpu_startup_entry+0x20/0x28
[ 1111.287399] [<ffff00000808f81c>] secondary_start_kernel+0x104/0x110
[ 1111.293662] Code: bad PC value
[ 1111.296710] ---[ end trace 555b6ca4363c3edd ]---
[ 1111.301318] Kernel panic - not syncing: Fatal exception in interrupt
[ 1111.307661] SMP: stopping secondary CPUs
[ 1111.311574] Kernel Offset: disabled
[ 1111.315053] CPU features: 0x0002000
[ 1111.318530] Memory Limit: none
[ 1111.321575] Rebooting in 3 seconds..
With some added debug output / delays we were able to push the crash from
the timer callback runner into the callback function and by that shedding
some light on which object holding the timer gets corrupted:
[ 401.720899] Unable to handle kernel read from unreadable memory at virtual address 00000868
[...]
[ 402.335836] [<ffff0000088fafa4>] _raw_spin_lock_bh+0x14/0x48
[ 402.341548] [<ffff000000dbe684>] mesh_path_timer+0x10c/0x248 [mac80211]
[ 402.348154] [<ffff0000080ff29c>] call_timer_fn.isra.27+0x24/0x78
[ 402.354150] [<ffff0000080ff77c>] run_timer_softirq+0x184/0x398
[ 402.359974] [<ffff000008081938>] __do_softirq+0x100/0x1fc
[ 402.365362] [<ffff0000080a2e28>] irq_exit+0x80/0xd8
[ 402.370231] [<ffff0000080ea708>] __handle_domain_irq+0x88/0xb0
[ 402.376053] [<ffff000008081678>] gic_handle_irq+0x68/0xb0
The issue happens due to the following sequence of events:
1) mesh_path_start_discovery():
-> spin_unlock_bh(&mpath->state_lock) before mesh_path_sel_frame_tx()
2) mesh_path_free_rcu()
-> del_timer_sync(&mpath->timer)
[...]
-> kfree_rcu(mpath)
3) mesh_path_start_discovery():
-> mod_timer(&mpath->timer, ...)
[...]
-> rcu_read_unlock()
4) mesh_path_free_rcu()'s kfree_rcu():
-> kfree(mpath)
5) mesh_path_timer() starts after timeout, using freed mpath object
So a use-after-free issue due to a timer re-arming bug caused by an
early spin-unlocking.
This patch fixes this issue by re-checking if mpath is about to be
free'd and if so bails out of re-arming the timer.
Cc: stable@vger.kernel.org
Fixes: 050ac52cbe1f ("mac80211: code for on-demand Hybrid Wireless Mesh Protocol")
Cc: Simon Wunderlich <sw@simonwunderlich.de>
Signed-off-by: Linus Lüssing <ll@simonwunderlich.de>
Link: https://lore.kernel.org/r/20200522170413.14973-1-linus.luessing@c0d3.blue
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
Terratec Cinergy S2 PCIe Dual is a PCIe device with two tuners that
actually contains two USB devices. The devices are visible in the
lsusb printout.
Bus 004 Device 002: ID 153b:1182 TerraTec Electronic GmbH Cinergy S2 PCIe Dual Port 2
Bus 003 Device 002: ID 153b:1181 TerraTec Electronic GmbH Cinergy S2 PCIe Dual Port 1
The devices use the Montage M88DS3000/M88TS2022 demod/tuner.
Signed-off-by: Dirk Nehring <dnehring@gmx.net>
Signed-off-by: Sean Young <sean@mess.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
|
|
Fixes bug exposed by:
[a3fbc2e6bb0: media: mc-entity.c: use WARN_ON, validate link pads]
The dvbdev incorrectly requests a tuner sink pad to connect to a demux
sink pad. The media controller failure percolates back and the dvb device
creation fails. Fix this by requesting a tuner source pad. Instead of
forcing that pad to be index zero, check if a negative integer error
is returned. A note is added that first source pad found is chosen.
Affected bridges cx231xx and em28xx printed the below warning[s]
when a variety of media controller dvb enabled devices were connected.
The warning returns an error causing all affected devices to fail DVB
device creation.
[ 253.138332] ------------[ cut here ]------------
[ 253.138339] WARNING: CPU: 0 PID: 1550 at drivers/media/mc/mc-entity.c:669 media_create_pad_link+0x1e0/0x200 [mc]
[ 253.138339] Modules linked in: si2168 em28xx_dvb(+) em28xx si2157 lgdt3306a cx231xx_dvb dvb_core cx231xx_alsa cx25840 cx231xx tveeprom cx2341x i2c_mux videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videobuf2_common videodev mc ir_rc5_decoder rc_hauppauge mceusb rc_core eda
c_mce_amd kvm nls_iso8859_1 crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel crypto_simd cryptd glue_helper efi_pstore wmi_bmof k10temp asix usbnet mii nouveau snd_hda_codec_realtek snd_hda_codec_generic input_leds ledtrig_audio snd_hda_codec_hdmi mxm_wmi snd_hda_in
tel video snd_intel_dspcfg ttm snd_hda_codec drm_kms_helper snd_hda_core drm snd_hwdep snd_seq_midi snd_seq_midi_event i2c_algo_bit snd_pcm snd_rawmidi fb_sys_fops snd_seq syscopyarea sysfillrect snd_seq_device sysimgblt snd_timer snd soundcore ccp mac_hid sch_fq_codel parport_p
c ppdev lp parport ip_tables x_tables autofs4 vfio_pci irqbypass vfio_virqfd vfio_iommu_type1 vfio hid_generic usbhid hid i2c_piix4 ahci libahci wmi gpio_amdpt
[ 253.138370] gpio_generic
[ 253.138372] CPU: 0 PID: 1550 Comm: modprobe Tainted: G W 5.7.0-rc2+ #181
[ 253.138373] Hardware name: MSI MS-7A39/B350M GAMING PRO (MS-7A39), BIOS 2.G0 04/27/2018
[ 253.138376] RIP: 0010:media_create_pad_link+0x1e0/0x200 [mc]
[ 253.138378] Code: 26 fd ff ff 44 8b 4d d0 eb d9 0f 0b 41 b9 ea ff ff ff 44 89 c8 c3 0f 0b 41 b9 ea ff ff ff eb f2 0f 0b 41 b9 ea ff ff ff eb e8 <0f> 0b 41 b9 ea ff ff ff eb af 0f 0b 41 b9 ea ff ff ff eb a5 66 90
[ 253.138379] RSP: 0018:ffffb9ecc0ee7a78 EFLAGS: 00010246
[ 253.138380] RAX: ffff943f706c99d8 RBX: 0000000000000000 RCX: 0000000000000000
[ 253.138381] RDX: ffff943f613e0180 RSI: 0000000000000000 RDI: ffff943f706c9958
[ 253.138381] RBP: ffffb9ecc0ee7ab0 R08: 0000000000000001 R09: ffff943f613e0180
[ 253.138382] R10: ffff943f613e0180 R11: ffff943f706c9400 R12: 0000000000000000
[ 253.138383] R13: 0000000000000001 R14: ffff943f706c9958 R15: 0000000000000001
[ 253.138384] FS: 00007f3cd29ba540(0000) GS:ffff943f8ec00000(0000) knlGS:0000000000000000
[ 253.138385] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 253.138385] CR2: 000055f7de0ca830 CR3: 00000003dd208000 CR4: 00000000003406f0
[ 253.138386] Call Trace:
[ 253.138392] media_create_pad_links+0x104/0x1b0 [mc]
[ 253.138397] dvb_create_media_graph+0x350/0x5f0 [dvb_core]
[ 253.138402] em28xx_dvb_init+0x5ea/0x2600 [em28xx_dvb]
[ 253.138408] em28xx_register_extension+0x63/0xc0 [em28xx]
[ 253.138410] ? 0xffffffffc039c000
[ 253.138412] em28xx_dvb_register+0x15/0x1000 [em28xx_dvb]
[ 253.138416] do_one_initcall+0x71/0x250
[ 253.138418] ? do_init_module+0x27/0x22e
[ 253.138421] ? _cond_resched+0x1a/0x50
[ 253.138423] ? kmem_cache_alloc_trace+0x1ec/0x270
[ 253.138425] ? __vunmap+0x1e3/0x240
[ 253.138427] do_init_module+0x5f/0x22e
[ 253.138430] load_module+0x2525/0x2d40
[ 253.138436] __do_sys_finit_module+0xe5/0x120
[ 253.138438] ? __do_sys_finit_module+0xe5/0x120
[ 253.138442] __x64_sys_finit_module+0x1a/0x20
[ 253.138443] do_syscall_64+0x57/0x1b0
[ 253.138445] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 253.138446] RIP: 0033:0x7f3cd24dc839
[ 253.138448] Code: 00 f3 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 1f f6 2c 00 f7 d8 64 89 01 48
[ 253.138449] RSP: 002b:00007ffe4fc514d8 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
[ 253.138450] RAX: ffffffffffffffda RBX: 000055a9237f63f0 RCX: 00007f3cd24dc839
[ 253.138451] RDX: 0000000000000000 RSI: 000055a922c3ad2e RDI: 0000000000000000
[ 253.138451] RBP: 000055a922c3ad2e R08: 0000000000000000 R09: 0000000000000000
[ 253.138452] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
[ 253.138453] R13: 000055a9237f5550 R14: 0000000000040000 R15: 000055a9237f63f0
[ 253.138456] ---[ end trace a60f19c54aa96ec4 ]---
[ 234.915628] ------------[ cut here ]------------
[ 234.915640] WARNING: CPU: 0 PID: 1502 at drivers/media/mc/mc-entity.c:669 media_create_pad_link+0x1e0/0x200 [mc]
[ 234.915641] Modules linked in: si2157 lgdt3306a cx231xx_dvb(+) dvb_core cx231xx_alsa cx25840 cx231xx tveeprom cx2341x i2c_mux videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videobuf2_common videodev mc ir_rc5_decoder rc_hauppauge mceusb rc_core edac_mce_amd kvm nls_iso8859
_1 crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel crypto_simd cryptd glue_helper efi_pstore wmi_bmof k10temp asix usbnet mii nouveau snd_hda_codec_realtek snd_hda_codec_generic input_leds ledtrig_audio snd_hda_codec_hdmi mxm_wmi snd_hda_intel video snd_intel_dspcf
g ttm snd_hda_codec drm_kms_helper snd_hda_core drm snd_hwdep snd_seq_midi snd_seq_midi_event i2c_algo_bit snd_pcm snd_rawmidi fb_sys_fops snd_seq syscopyarea sysfillrect snd_seq_device sysimgblt snd_timer snd soundcore ccp mac_hid sch_fq_codel parport_pc ppdev lp parport ip_tab
les x_tables autofs4 vfio_pci irqbypass vfio_virqfd vfio_iommu_type1 vfio hid_generic usbhid hid i2c_piix4 ahci libahci wmi gpio_amdpt gpio_generic
[ 234.915700] CPU: 0 PID: 1502 Comm: modprobe Not tainted 5.7.0-rc2+ #181
[ 234.915702] Hardware name: MSI MS-7A39/B350M GAMING PRO (MS-7A39), BIOS 2.G0 04/27/2018
[ 234.915709] RIP: 0010:media_create_pad_link+0x1e0/0x200 [mc]
[ 234.915712] Code: 26 fd ff ff 44 8b 4d d0 eb d9 0f 0b 41 b9 ea ff ff ff 44 89 c8 c3 0f 0b 41 b9 ea ff ff ff eb f2 0f 0b 41 b9 ea ff ff ff eb e8 <0f> 0b 41 b9 ea ff ff ff eb af 0f 0b 41 b9 ea ff ff ff eb a5 66 90
[ 234.915714] RSP: 0018:ffffb9ecc1b6fa50 EFLAGS: 00010246
[ 234.915717] RAX: ffff943f8c94a9d8 RBX: 0000000000000000 RCX: 0000000000000000
[ 234.915719] RDX: ffff943f613e0900 RSI: 0000000000000000 RDI: ffff943f8c94a958
[ 234.915721] RBP: ffffb9ecc1b6fa88 R08: 0000000000000001 R09: ffff943f613e0900
[ 234.915723] R10: ffff943f613e0900 R11: ffff943f6b590c00 R12: 0000000000000000
[ 234.915724] R13: 0000000000000001 R14: ffff943f8c94a958 R15: 0000000000000001
[ 234.915727] FS: 00007f4ca3646540(0000) GS:ffff943f8ec00000(0000) knlGS:0000000000000000
[ 234.915729] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 234.915731] CR2: 00007fff7a53ba18 CR3: 00000003da614000 CR4: 00000000003406f0
[ 234.915733] Call Trace:
[ 234.915745] media_create_pad_links+0x104/0x1b0 [mc]
[ 234.915756] dvb_create_media_graph+0x350/0x5f0 [dvb_core]
[ 234.915766] dvb_init.part.4+0x691/0x1360 [cx231xx_dvb]
[ 234.915780] dvb_init+0x1a/0x20 [cx231xx_dvb]
[ 234.915787] cx231xx_register_extension+0x71/0xa0 [cx231xx]
[ 234.915791] ? 0xffffffffc042f000
[ 234.915796] cx231xx_dvb_register+0x15/0x1000 [cx231xx_dvb]
[ 234.915802] do_one_initcall+0x71/0x250
[ 234.915807] ? do_init_module+0x27/0x22e
[ 234.915811] ? _cond_resched+0x1a/0x50
[ 234.915816] ? kmem_cache_alloc_trace+0x1ec/0x270
[ 234.915820] ? __vunmap+0x1e3/0x240
[ 234.915826] do_init_module+0x5f/0x22e
[ 234.915831] load_module+0x2525/0x2d40
[ 234.915848] __do_sys_finit_module+0xe5/0x120
[ 234.915850] ? __do_sys_finit_module+0xe5/0x120
[ 234.915862] __x64_sys_finit_module+0x1a/0x20
[ 234.915865] do_syscall_64+0x57/0x1b0
[ 234.915870] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 234.915872] RIP: 0033:0x7f4ca3168839
[ 234.915876] Code: 00 f3 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 1f f6 2c 00 f7 d8 64 89 01 48
[ 234.915878] RSP: 002b:00007ffcea3db3b8 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
[ 234.915881] RAX: ffffffffffffffda RBX: 000055af22c29340 RCX: 00007f4ca3168839
[ 234.915882] RDX: 0000000000000000 RSI: 000055af22c38390 RDI: 0000000000000001
[ 234.915884] RBP: 000055af22c38390 R08: 0000000000000000 R09: 0000000000000000
[ 234.915885] R10: 0000000000000001 R11: 0000000000000246 R12: 0000000000000000
[ 234.915887] R13: 000055af22c29060 R14: 0000000000040000 R15: 0000000000000000
[ 234.915896] ---[ end trace a60f19c54aa96ec3 ]---
Signed-off-by: Brad Love <brad@nextdimension.cc>
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Sean Young <sean@mess.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
|
|
The previous commit:
c6e7bd7afaeb: ("sched/core: Optimize ttwu() spinning on p->on_cpu")
avoids spinning on p->on_rq when the task is descheduling, but only if the
wakee is on a CPU that does not share cache with the waker.
This patch offloads the activation of the wakee to the CPU that is about to
go idle if the task is the only one on the runqueue. This potentially allows
the waker task to continue making progress when the wakeup is not strictly
synchronous.
This is very obvious with netperf UDP_STREAM running on localhost. The
waker is sending packets as quickly as possible without waiting for any
reply. It frequently wakes the server for the processing of packets and
when netserver is using local memory, it quickly completes the processing
and goes back to idle. The waker often observes that netserver is on_rq
and spins excessively leading to a drop in throughput.
This is a comparison of 5.7-rc6 against "sched: Optimize ttwu() spinning
on p->on_cpu" and against this patch labeled vanilla, optttwu-v1r1 and
localwakelist-v1r2 respectively.
5.7.0-rc6 5.7.0-rc6 5.7.0-rc6
vanilla optttwu-v1r1 localwakelist-v1r2
Hmean send-64 251.49 ( 0.00%) 258.05 * 2.61%* 305.59 * 21.51%*
Hmean send-128 497.86 ( 0.00%) 519.89 * 4.43%* 600.25 * 20.57%*
Hmean send-256 944.90 ( 0.00%) 997.45 * 5.56%* 1140.19 * 20.67%*
Hmean send-1024 3779.03 ( 0.00%) 3859.18 * 2.12%* 4518.19 * 19.56%*
Hmean send-2048 7030.81 ( 0.00%) 7315.99 * 4.06%* 8683.01 * 23.50%*
Hmean send-3312 10847.44 ( 0.00%) 11149.43 * 2.78%* 12896.71 * 18.89%*
Hmean send-4096 13436.19 ( 0.00%) 13614.09 ( 1.32%) 15041.09 * 11.94%*
Hmean send-8192 22624.49 ( 0.00%) 23265.32 * 2.83%* 24534.96 * 8.44%*
Hmean send-16384 34441.87 ( 0.00%) 36457.15 * 5.85%* 35986.21 * 4.48%*
Note that this benefit is not universal to all wakeups, it only applies
to the case where the waker often spins on p->on_rq.
The impact can be seen from a "perf sched latency" report generated from
a single iteration of one packet size:
-----------------------------------------------------------------------------------------------------------------
Task | Runtime ms | Switches | Average delay ms | Maximum delay ms | Maximum delay at |
-----------------------------------------------------------------------------------------------------------------
vanilla
netperf:4337 | 21709.193 ms | 2932 | avg: 0.002 ms | max: 0.041 ms | max at: 112.154512 s
netserver:4338 | 14629.459 ms | 5146990 | avg: 0.001 ms | max: 1615.864 ms | max at: 140.134496 s
localwakelist-v1r2
netperf:4339 | 29789.717 ms | 2460 | avg: 0.002 ms | max: 0.059 ms | max at: 138.205389 s
netserver:4340 | 18858.767 ms | 7279005 | avg: 0.001 ms | max: 0.362 ms | max at: 135.709683 s
-----------------------------------------------------------------------------------------------------------------
Note that the average wakeup delay is quite small on both the vanilla
kernel and with the two patches applied. However, there are significant
outliers with the vanilla kernel with the maximum one measured as 1615
milliseconds with a vanilla kernel but never worse than 0.362 ms with
both patches applied and a much higher rate of context switching.
Similarly a separate profile of cycles showed that 2.83% of all cycles
were spent in try_to_wake_up() with almost half of the cycles spent
on spinning on p->on_rq. With the two patches, the percentage of cycles
spent in try_to_wake_up() drops to 1.13%
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Jirka Hladky <jhladky@redhat.com>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Cc: valentin.schneider@arm.com
Cc: Hillf Danton <hdanton@sina.com>
Cc: Rik van Riel <riel@surriel.com>
Link: https://lore.kernel.org/r/20200524202956.27665-3-mgorman@techsingularity.net
|
|
Both Rik and Mel reported seeing ttwu() spend significant time on:
smp_cond_load_acquire(&p->on_cpu, !VAL);
Attempt to avoid this by queueing the wakeup on the CPU that owns the
p->on_cpu value. This will then allow the ttwu() to complete without
further waiting.
Since we run schedule() with interrupts disabled, the IPI is
guaranteed to happen after p->on_cpu is cleared, this is what makes it
safe to queue early.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Jirka Hladky <jhladky@redhat.com>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Cc: valentin.schneider@arm.com
Cc: Hillf Danton <hdanton@sina.com>
Cc: Rik van Riel <riel@surriel.com>
Link: https://lore.kernel.org/r/20200524202956.27665-2-mgorman@techsingularity.net
|
|
vxlan_fdb_info() is not always called with RTNL held or from an RCU
read-side critical section. For example, in the following call path:
vxlan_cleanup()
vxlan_fdb_destroy()
vxlan_fdb_notify()
__vxlan_fdb_notify()
vxlan_fdb_info()
The use of rtnl_dereference() can therefore result in the following
splat [1].
Fix this by dereferencing the nexthop under RCU read-side critical
section.
[1]
[May24 22:56] =============================
[ +0.004676] WARNING: suspicious RCU usage
[ +0.004614] 5.7.0-rc5-custom-16219-g201392003491 #2772 Not tainted
[ +0.007116] -----------------------------
[ +0.004657] drivers/net/vxlan.c:276 suspicious rcu_dereference_check() usage!
[ +0.008164]
other info that might help us debug this:
[ +0.009126]
rcu_scheduler_active = 2, debug_locks = 1
[ +0.007504] 5 locks held by bash/6892:
[ +0.004392] #0: ffff8881d47e3410 (&sig->cred_guard_mutex){+.+.}-{3:3}, at: __do_execve_file.isra.27+0x392/0x23c0
[ +0.011795] #1: ffff8881d47e34b0 (&sig->exec_update_mutex){+.+.}-{3:3}, at: flush_old_exec+0x510/0x2030
[ +0.010947] #2: ffff8881a141b0b0 (ptlock_ptr(page)#2){+.+.}-{2:2}, at: unmap_page_range+0x9c0/0x2590
[ +0.010585] #3: ffff888230009d50 ((&vxlan->age_timer)){+.-.}-{0:0}, at: call_timer_fn+0xe8/0x800
[ +0.010192] #4: ffff888183729bc8 (&vxlan->hash_lock[h]){+.-.}-{2:2}, at: vxlan_cleanup+0x133/0x4a0
[ +0.010382]
stack backtrace:
[ +0.005103] CPU: 1 PID: 6892 Comm: bash Not tainted 5.7.0-rc5-custom-16219-g201392003491 #2772
[ +0.009675] Hardware name: Mellanox Technologies Ltd. MSN2100-CB2FO/SA001017, BIOS 5.6.5 06/07/2016
[ +0.010155] Call Trace:
[ +0.002775] <IRQ>
[ +0.002313] dump_stack+0xfd/0x178
[ +0.003895] lockdep_rcu_suspicious+0x14a/0x153
[ +0.005157] vxlan_fdb_info+0xe39/0x12a0
[ +0.004775] __vxlan_fdb_notify+0xb8/0x160
[ +0.004672] vxlan_fdb_notify+0x8e/0xe0
[ +0.004370] vxlan_fdb_destroy+0x117/0x330
[ +0.004662] vxlan_cleanup+0x1aa/0x4a0
[ +0.004329] call_timer_fn+0x1c4/0x800
[ +0.004357] run_timer_softirq+0x129d/0x17e0
[ +0.004762] __do_softirq+0x24c/0xaef
[ +0.004232] irq_exit+0x167/0x190
[ +0.003767] smp_apic_timer_interrupt+0x1dd/0x6a0
[ +0.005340] apic_timer_interrupt+0xf/0x20
[ +0.004620] </IRQ>
Fixes: 1274e1cc4226 ("vxlan: ecmp support for mac fdb entries")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reported-by: Amit Cohen <amitc@mellanox.com>
Acked-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Ido Schimmel says:
====================
mlxsw: Various trap changes - part 1
This patch set contains various changes in mlxsw trap configuration.
Another set will perform similar changes before exposing control traps
(e.g., IGMP query, ARP request) via devlink-trap.
Tested with existing devlink-trap selftests. Please see individual
patches for a detailed changelog.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Fix incorrect spelling of "advertisement".
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|