Age | Commit message (Collapse) | Author |
|
Since both unused block groups and reclaim bgs lists are protected by
unused_bgs_lock then free them in the same critical section without
doing an extra unlock/lock pair.
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
When enabling quotas, we attempt to commit a transaction while holding the
mutex fs_info->qgroup_ioctl_lock. This can result on a deadlock with other
quota operations such as:
- qgroup creation and deletion, ioctl BTRFS_IOC_QGROUP_CREATE;
- adding and removing qgroup relations, ioctl BTRFS_IOC_QGROUP_ASSIGN.
This is because these operations join a transaction and after that they
attempt to lock the mutex fs_info->qgroup_ioctl_lock. Acquiring that mutex
after joining or starting a transaction is a pattern followed everywhere
in qgroups, so the quota enablement operation is the one at fault here,
and should not commit a transaction while holding that mutex.
Fix this by making the transaction commit while not holding the mutex.
We are safe from two concurrent tasks trying to enable quotas because
we are serialized by the rw semaphore fs_info->subvol_sem at
btrfs_ioctl_quota_ctl(), which is the only call site for enabling
quotas.
When this deadlock happens, it produces a trace like the following:
INFO: task syz-executor:25604 blocked for more than 143 seconds.
Not tainted 5.15.0-rc6 #4
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:syz-executor state:D stack:24800 pid:25604 ppid: 24873 flags:0x00004004
Call Trace:
context_switch kernel/sched/core.c:4940 [inline]
__schedule+0xcd9/0x2530 kernel/sched/core.c:6287
schedule+0xd3/0x270 kernel/sched/core.c:6366
btrfs_commit_transaction+0x994/0x2e90 fs/btrfs/transaction.c:2201
btrfs_quota_enable+0x95c/0x1790 fs/btrfs/qgroup.c:1120
btrfs_ioctl_quota_ctl fs/btrfs/ioctl.c:4229 [inline]
btrfs_ioctl+0x637e/0x7b70 fs/btrfs/ioctl.c:5010
vfs_ioctl fs/ioctl.c:51 [inline]
__do_sys_ioctl fs/ioctl.c:874 [inline]
__se_sys_ioctl fs/ioctl.c:860 [inline]
__x64_sys_ioctl+0x193/0x200 fs/ioctl.c:860
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7f86920b2c4d
RSP: 002b:00007f868f61ac58 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 00007f86921d90a0 RCX: 00007f86920b2c4d
RDX: 0000000020005e40 RSI: 00000000c0109428 RDI: 0000000000000008
RBP: 00007f869212bd80 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00007f86921d90a0
R13: 00007fff6d233e4f R14: 00007fff6d233ff0 R15: 00007f868f61adc0
INFO: task syz-executor:25628 blocked for more than 143 seconds.
Not tainted 5.15.0-rc6 #4
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:syz-executor state:D stack:29080 pid:25628 ppid: 24873 flags:0x00004004
Call Trace:
context_switch kernel/sched/core.c:4940 [inline]
__schedule+0xcd9/0x2530 kernel/sched/core.c:6287
schedule+0xd3/0x270 kernel/sched/core.c:6366
schedule_preempt_disabled+0xf/0x20 kernel/sched/core.c:6425
__mutex_lock_common kernel/locking/mutex.c:669 [inline]
__mutex_lock+0xc96/0x1680 kernel/locking/mutex.c:729
btrfs_remove_qgroup+0xb7/0x7d0 fs/btrfs/qgroup.c:1548
btrfs_ioctl_qgroup_create fs/btrfs/ioctl.c:4333 [inline]
btrfs_ioctl+0x683c/0x7b70 fs/btrfs/ioctl.c:5014
vfs_ioctl fs/ioctl.c:51 [inline]
__do_sys_ioctl fs/ioctl.c:874 [inline]
__se_sys_ioctl fs/ioctl.c:860 [inline]
__x64_sys_ioctl+0x193/0x200 fs/ioctl.c:860
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x44/0xae
Reported-by: Hao Sun <sunhao.th@gmail.com>
Link: https://lore.kernel.org/linux-btrfs/CACkBjsZQF19bQ1C6=yetF3BvL10OSORpFUcWXTP6HErshDB4dQ@mail.gmail.com/
Fixes: 340f1aa27f36 ("btrfs: qgroups: Move transaction management inside btrfs_quota_enable/disable")
CC: stable@vger.kernel.org # 4.19
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
When doing a direct IO write against a file range that either has
preallocated extents in that range or has regular extents and the file
has the NOCOW attribute set, the write fails with -ENOSPC when all of
the following conditions are met:
1) There are no data blocks groups with enough free space matching
the size of the write;
2) There's not enough unallocated space for allocating a new data block
group;
3) The extents in the target file range are not shared, neither through
snapshots nor through reflinks.
This is wrong because a NOCOW write can be done in such case, and in fact
it's possible to do it using a buffered IO write, since when failing to
allocate data space, the buffered IO path checks if a NOCOW write is
possible.
The failure in direct IO write path comes from the fact that early on,
at btrfs_dio_iomap_begin(), we try to allocate data space for the write
and if it that fails we return the error and stop - we never check if we
can do NOCOW. But later, at btrfs_get_blocks_direct_write(), we check
if we can do a NOCOW write into the range, or a subset of the range, and
then release the previously reserved data space.
Fix this by doing the data reservation only if needed, when we must COW,
at btrfs_get_blocks_direct_write() instead of doing it at
btrfs_dio_iomap_begin(). This also simplifies a bit the logic and removes
the inneficiency of doing unnecessary data reservations.
The following example test script reproduces the problem:
$ cat dio-nocow-enospc.sh
#!/bin/bash
DEV=/dev/sdj
MNT=/mnt/sdj
# Use a small fixed size (1G) filesystem so that it's quick to fill
# it up.
# Make sure the mixed block groups feature is not enabled because we
# later want to not have more space available for allocating data
# extents but still have enough metadata space free for the file writes.
mkfs.btrfs -f -b $((1024 * 1024 * 1024)) -O ^mixed-bg $DEV
mount $DEV $MNT
# Create our test file with the NOCOW attribute set.
touch $MNT/foobar
chattr +C $MNT/foobar
# Now fill in all unallocated space with data for our test file.
# This will allocate a data block group that will be full and leave
# no (or a very small amount of) unallocated space in the device, so
# that it will not be possible to allocate a new block group later.
echo
echo "Creating test file with initial data..."
xfs_io -c "pwrite -S 0xab -b 1M 0 900M" $MNT/foobar
# Now try a direct IO write against file range [0, 10M[.
# This should succeed since this is a NOCOW file and an extent for the
# range was previously allocated.
echo
echo "Trying direct IO write over allocated space..."
xfs_io -d -c "pwrite -S 0xcd -b 10M 0 10M" $MNT/foobar
umount $MNT
When running the test:
$ ./dio-nocow-enospc.sh
(...)
Creating test file with initial data...
wrote 943718400/943718400 bytes at offset 0
900 MiB, 900 ops; 0:00:01.43 (625.526 MiB/sec and 625.5265 ops/sec)
Trying direct IO write over allocated space...
pwrite: No space left on device
A test case for fstests will follow, testing both this direct IO write
scenario as well as the buffered IO write scenario to make it less likely
to get future regressions on the buffered IO case.
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
Use dma_set_mask_and_coherent() instead of unrolling it with some
dma_set_mask()+dma_set_coherent_mask().
Moreover, as stated in [1], dma_set_mask() with a 64-bit mask will never
fail if dev->dma_mask is non-NULL.
So, if it fails, the 32 bits case will also fail for the same reason.
That said, 'high_dma' can only be 1 after a successful
dma_set_mask_and_coherent().
Simplify code and remove some dead code accordingly, including a now
useless parameter to vxge_device_register().
[1]: https://lkml.org/lkml/2021/6/7/398
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Use dma_set_mask_and_coherent() instead of unrolling it with some
dma_set_mask()+dma_set_coherent_mask().
Moreover, as stated in [1], dma_set_mask() with a 64-bit mask will never
fail if dev->dma_mask is non-NULL.
So, if it fails, the 32 bits case will also fail for the same reason.
That said, 'dma_flag' can only be 'true' after a successful
dma_set_mask_and_coherent().
Simplify code and remove some dead code accordingly, including the now
useless 'high_dma_flag' field in 'struct s2io_nic'.
[1]: https://lkml.org/lkml/2021/6/7/398
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Sorry for being rude but new vendors/drivers are supposed to be disabled
by default, otherwise we will have to manually keep track of all vendors
we are not interested in building.
Fixes: 2f207cbf0dd4 ("net: vertexcom: Add MSE102x SPI support")
CC: Stefan Wahren <stefan.wahren@i2se.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
GPIO library now accepts fwnode as a firmware node, so
switch the driver to use it and hence rectify the ACPI
case which uses software nodes.
Note, in this case it's rather logical fix that doesn't
affect functionality, thus no backporting required.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Serge Semin <fancer.lancer@gmail.com>
Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl>
|
|
If the driver sets the fwnode in struct gpio_chip, let it take
precedence over the parent's fwnode.
This is a follow up to the commit 9126a738edc1 ("gpiolib: of: make
fwnode take precedence in struct gpio_chip").
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl>
|
|
Add Doug and Florian as maintainers for gpio-brcmstb, and remove myself.
Signed-off-by: Gregory Fong <gregory.0xf0@gmail.com>
Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl>
|
|
Each aspeed sgpio bank has 64 gpio pins(32 input pins and 32 output pins).
The hwirq base for each sgpio bank should be multiples of 64 rather than
multiples of 32.
Signed-off-by: Steven Lee <steven_lee@aspeedtech.com>
Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl>
|
|
The commit 6c56c6cd8031 ("gpio: samsung: Drop support for Exynos SoCs")
removed support for the Samsung Exynos SoC in lrgacy GPIO driver, since
it was moved to new pinctrl driver. Remove old, unused bindings.
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl>
|
|
kfree() and bitmap_free() are the same. But using the later is more
consistent when freeing memory allocated with bitmap_alloc().
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl>
|
|
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux
Pull perf tools fixes from Arnaldo Carvalho de Melo:
- Fix TUI exit screen refresh race condition in 'perf top'.
- Fix parsing of Intel PT VM time correlation arguments.
- Honour CPU filtering command line request of a script's switch events
in 'perf script'.
- Fix printing of switch events in Intel PT python script.
- Fix duplicate alias events list printing in 'perf list', noticed on
heterogeneous arm64 systems.
- Fix return value of ids__new(), users expect NULL for failure, not
ERR_PTR(-ENOMEM).
* tag 'perf-tools-fixes-for-v5.16-2022-01-02' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux:
perf top: Fix TUI exit screen refresh race condition
perf pmu: Fix alias events list
perf scripts python: intel-pt-events.py: Fix printing of switch events
perf script: Fix CPU filtering of a script's switch events
perf intel-pt: Fix parsing of VM time correlation arguments
perf expr: Fix return value of ids__new()
|
|
Colin Foster says:
====================
lynx pcs interface cleanup
The current Felix driver (and Seville) rely directly on the lynx_pcs
device. There are other possible PCS interfaces that can be used with
this hardware, so this should be abstracted from felix. The generic
phylink_pcs is used instead.
While going through the code, there were some opportunities to change
some misleading variable names. Those are included in this patch set.
v1->v2
* compile-time fixes for freescale parts
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
pcs-lynx.c used lynx_pcs and lynx as a variable name within the same file.
This standardizes all internal variables to just "lynx"
Signed-off-by: Colin Foster <colin.foster@in-advantage.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
A simple variable update from "pcs" to "mdio_device" for the mdio device
will make things a little cleaner.
Signed-off-by: Colin Foster <colin.foster@in-advantage.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
A simple variable update from "pcs" to "mdio_device" for the mdio device
will make things a little cleaner.
Signed-off-by: Colin Foster <colin.foster@in-advantage.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Simple rename of a variable to make things more logical.
Signed-off-by: Colin Foster <colin.foster@in-advantage.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Remove references to lynx_pcs structures so drivers like the Felix DSA
can reference alternate PCS drivers.
Signed-off-by: Colin Foster <colin.foster@in-advantage.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
commit 26eee0210ad7 ("net/fsl: fix a bug in xgmac_mdio") fixed a bug in
the QorIQ mdio driver but left the (now unused) incorrect bit definition
for MDIO_DATA_BSY in the code. This commit removes it.
Signed-off-by: Markus Koch <markus@notsyncing.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux
Pull i2c fixes from Wolfram Sang:
"Better input validation for compat ioctls and a documentation bugfix
for 5.16"
* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
Docs: Fixes link to I2C specification
i2c: validate user data in compat ioctl
|
|
As the only user of pcmcia_release_io() is not interested in its return
value, and we cannot do anything on failure, convert the function to return
void.
Reported-by: Jason Wang <wangborong@cdjrlc.com>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
|
|
nonstatic_find_mem_region()
In nonstatic_find_mem_region(), pcmcia_make_resource() is assigned to
res and used in pci_bus_alloc_resource(). There a dereference of res
in pci_bus_alloc_resource(), which could lead to a NULL pointer
dereference on failure of pcmcia_make_resource().
Fix this bug by adding a check of res.
This bug was found by a static analyzer. The analysis employs
differential checking to identify inconsistent security operations
(e.g., checks or kfrees) between two code paths and confirms that the
inconsistent operations are not recovered in the current function or
the callers, so they constitute bugs.
Note that, as a bug found by static analysis, it can be a false
positive or hard to trigger. Multiple researchers have cross-reviewed
the bug.
Builds with CONFIG_PCCARD_NONSTATIC=y show no new warnings,
and our static analyzer no longer warns about this code.
Fixes: 49b1153adfe1 ("pcmcia: move all pcmcia_resource_ops providers into one module")
Signed-off-by: Zhou Qingyang <zhou1615@umn.edu>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
|
|
__nonstatic_find_io_region()
In __nonstatic_find_io_region(), pcmcia_make_resource() is assigned to
res and used in pci_bus_alloc_resource(). There is a dereference of res
in pci_bus_alloc_resource(), which could lead to a NULL pointer
dereference on failure of pcmcia_make_resource().
Fix this bug by adding a check of res.
This bug was found by a static analyzer. The analysis employs
differential checking to identify inconsistent security operations
(e.g., checks or kfrees) between two code paths and confirms that the
inconsistent operations are not recovered in the current function or
the callers, so they constitute bugs.
Note that, as a bug found by static analysis, it can be a false
positive or hard to trigger. Multiple researchers have cross-reviewed
the bug.
Builds with CONFIG_PCCARD_NONSTATIC=y show no new warnings,
and our static analyzer no longer warns about this code.
Fixes: 49b1153adfe1 ("pcmcia: move all pcmcia_resource_ops providers into one module")
Signed-off-by: Zhou Qingyang <zhou1615@umn.edu>
[linux@dominikbrodowski.net: Fix typo in commit message]
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
|
|
The exca_readw() function is currently unused; therefore, comment it out.
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
|
|
Use the helper macro SET_NOIRQ_SYSTEM_SLEEP_PM_OPS() instead of
the verbose operators ".suspend_noirq /.resume_noirq/.freeze_noirq/
.thaw_noirq/.poweroff_noirq/.restore_noirq", because the
SET_NOIRQ_SYSTEM_SLEEP_PM_OPS() is a nice helper macro that could
be brought in to make code a little clearer, a little more concise.
Signed-off-by: Cai Huoqing <caihuoqing@baidu.com>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
|
|
Commit 9d3239147d6d ("ARM: pxa: remove Compulab pxa2xx boards") removes
the config MACH_ARMCORE in ./arch/arm/mach-pxa/Kconfig.
Hence, ./scripts/checkkconfigsymbols.py warns on non-existing configs:
MACH_ARMCORE
Referencing files: drivers/pcmcia/Kconfig, drivers/pcmcia/Makefile
Clean up the dead remains of pcmcia drivers for Compulab pxa2xx boards.
Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fix from Borislav Petkov:
- Use the proper CONFIG symbol in a preprocessor check.
* tag 'x86_urgent_for_v5.16_rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/build: Use the proper name CONFIG_FW_LOADER
|
|
In [1], Christoph Hellwig has proposed to remove the wrappers in
include/linux/pci-dma-compat.h.
Some reasons why this API should be removed have been given by Julia
Lawall in [2].
A coccinelle script has been used to perform the needed transformation
Only relevant parts are given below.
@@
expression e1, e2;
@@
- pci_dma_mapping_error(e1, e2)
+ dma_mapping_error(&e1->dev, e2)
[1]: https://lore.kernel.org/kernel-janitors/20200421081257.GA131897@infradead.org/
[2]: https://lore.kernel.org/kernel-janitors/alpine.DEB.2.22.394.2007120902170.2424@hadrien/
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Use dma_set_mask_and_coherent() instead of unrolling it with some
dma_set_mask()+dma_set_coherent_mask().
Moreover, as stated in [1], dma_set_mask() with a 64-bit mask will never
fail if dev->dma_mask is non-NULL.
So, if it fails, the 32 bits case will also fail for the same reason.
Simplify code and remove some dead code accordingly.
Now that qed_set_coherency_mask() is mostly a single call to
dma_set_mask_and_coherent(), fold it in its only caller.
[1]: https://lkml.org/lkml/2021/6/7/398
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Use dma_set_mask_and_coherent() instead of unrolling it with some
dma_set_mask()+dma_set_coherent_mask().
Moreover, as stated in [1], dma_set_mask() with a 64-bit mask will never
fail if dev->dma_mask is non-NULL.
So, if it fails, the 32 bits case will also fail for the same reason.
That said, 'pci_using_dac' can only be 1 after a successful
dma_set_mask_and_coherent().
Simplify code and remove some dead code accordingly.
[1]: https://lkml.org/lkml/2021/6/7/398
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Use dma_set_mask_and_coherent() instead of unrolling it with some
dma_set_mask()+dma_set_coherent_mask().
Moreover, as stated in [1], dma_set_mask() with a 64-bit mask will never
fail if dev->dma_mask is non-NULL.
So, if it fails, the 32 bits case will also fail for the same reason.
That said, 'pci_using_dac' can only be 1 after a successful
dma_set_mask_and_coherent().
Simplify code and remove some dead code accordingly.
[1]: https://lkml.org/lkml/2021/6/7/398
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Hytera makes a range of digital (DMR) radios. These radios can be
programmed to a allow a computer to control them over Ethernet over USB,
either using NCM or RNDIS.
This commit adds support for RNDIS for Hytera radios. I tested with a
Hytera PD785 and a Hytera MD785G. When these radios are programmed to
set up a Radio to PC Network using RNDIS, an USB interface will be added
with class 2 (Communications), subclass 2 (Abstract Modem Control) and
an interface protocol of 255 ("vendor specific" - lsusb even hints "MSFT
RNDIS?").
This patch is similar to the solution of this StackOverflow user, but
that only works for the Hytera MD785:
https://stackoverflow.com/a/53550858
To use the "Radio to PC Network" functionality of Hytera DMR radios, the
radios need to be programmed correctly in CPS (Hytera's Customer
Programming Software). "Forward to PC" should be checked in "Network"
(under "General Setting" in "Conventional") and the "USB Network
Communication Protocol" should be set to RNDIS.
Signed-off-by: Thomas Toye <thomas@toye.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add comments for both smc_link_sendable() and smc_link_usable()
to help better distinguish and use them.
No function changes.
Signed-off-by: Dust Li <dust.li@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When the following command is executed several times, a coredump file is
generated.
$ timeout -k 9 5 perf top -e task-clock
*******
*******
*******
0.01% [kernel] [k] __do_softirq
0.01% libpthread-2.28.so [.] __pthread_mutex_lock
0.01% [kernel] [k] __ll_sc_atomic64_sub_return
double free or corruption (!prev) perf top --sort comm,dso
timeout: the monitored command dumped core
When we terminate "perf top" using sending signal method,
SLsmg_reset_smg() called. SLsmg_reset_smg() resets the SLsmg screen
management routines by freeing all memory allocated while it was active.
However SLsmg_reinit_smg() maybe be called by another thread.
SLsmg_reinit_smg() will free the same memory accessed by
SLsmg_reset_smg(), thus it results in a double free.
SLsmg_reinit_smg() is called already protected by ui__lock, so we fix
the problem by adding pthread_mutex_trylock of ui__lock when calling
SLsmg_reset_smg().
Signed-off-by: Wenyu Liu <liuwenyu7@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: wuxu.wu@huawei.com
Link: http://lore.kernel.org/lkml/a91e3943-7ddc-f5c0-a7f5-360f073c20e6@huawei.com
Signed-off-by: Hewenliang <hewenliang4@huawei.com>
Signed-off-by: yaowenbin <yaowenbin1@huawei.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Commit 0e0ae8742207c3b4 ("perf list: Display hybrid PMU events with cpu
type") changes the event list for uncore PMUs or arm64 heterogeneous CPU
systems, such that duplicate aliases are incorrectly listed per PMU
(which they should not be), like:
# perf list
...
unc_cbo_cache_lookup.any_es
[Unit: uncore_cbox L3 Lookup any request that access cache and found
line in E or S-state]
unc_cbo_cache_lookup.any_es
[Unit: uncore_cbox L3 Lookup any request that access cache and found
line in E or S-state]
unc_cbo_cache_lookup.any_i
[Unit: uncore_cbox L3 Lookup any request that access cache and found
line in I-state]
unc_cbo_cache_lookup.any_i
[Unit: uncore_cbox L3 Lookup any request that access cache and found
line in I-state]
...
Notice how the events are listed twice.
The named commit changed how we remove duplicate events, in that events
for different PMUs are not treated as duplicates. I suppose this is to
handle how "Each hybrid pmu event has been assigned with a pmu name".
Fix PMU alias listing by restoring behaviour to remove duplicates for
non-hybrid PMUs.
Fixes: 0e0ae8742207c3b4 ("perf list: Display hybrid PMU events with cpu type")
Signed-off-by: John Garry <john.garry@huawei.com>
Tested-by: Zhengjun Xing <zhengjun.xing@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/1640103090-140490-1-git-send-email-john.garry@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The same fix in commit 5ec7d18d1813 ("sctp: use call_rcu to free endpoint")
is also needed for dumping one asoc and sock after the lookup.
Fixes: 86fdb3448cc1 ("sctp: ensure ep is not destroyed before doing the dump")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Arthur Kiyanovski says:
====================
ENA driver bug fixes
Patchset V2 chages:
-------------------
Updated SHA1 of Fixes tag in patch 3/3 to be 12 digits long
Original cover letter:
----------------------
ENA driver bug fixes
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The role of ena_calc_max_io_queue_num() is to return the number
of queues supported by the device, which means the return value
should be >=0.
The function that calls ena_calc_max_io_queue_num(), checks
the return value. If it is 0, it means the device reported
it supports 0 IO queues. This case is considered an error
and is handled by the calling function accordingly.
However the current implementation of ena_calc_max_io_queue_num()
is wrong, since when it detects the device supports 0 IO queues,
it returns -EFAULT.
In such a case the calling function doesn't detect the error,
and therefore doesn't handle it.
This commit changes ena_calc_max_io_queue_num() to return 0
in case the device reported it supports 0 queues, allowing the
calling function to properly handle the error case.
Fixes: 736ce3f414cc ("net: ena: make ethtool -l show correct max number of queues")
Signed-off-by: Shay Agroskin <shayagr@amazon.com>
Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
A wrong request id received from the device is a sign that
something is wrong with it, therefore trigger a device reset.
Also add some debug info to the "Page is NULL" print to make
it easier to debug.
Fixes: 1738cd3ed342 ("net: ena: Add a driver for Amazon Elastic Network Adapters (ENA)")
Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
ena_com_tx_comp_req_id_get() checks the req_id of a received completion,
and if it is out of bounds returns -EINVAL. This is a sign that
something is wrong with the device and it needs to be reset.
The current code does not reset the device in this case, which leaves
the driver in an undefined state, where this completion is not properly
handled.
This commit adds a call to handle_invalid_req_id() in ena_clean_tx_irq()
and ena_clean_xdp_irq() which resets the device to fix the issue.
This commit also removes unnecessary request id checks from
validate_tx_req_id() and validate_xdp_req_id(). This check is unneeded
because it was already performed in ena_com_tx_comp_req_id_get(), which
is called right before these functions.
Fixes: 548c4940b9f1 ("net: ena: Implement XDP_TX action")
Signed-off-by: Shay Agroskin <shayagr@amazon.com>
Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Use dma_set_mask_and_coherent() instead of unrolling it with some
dma_set_mask()+dma_set_coherent_mask().
Moreover, as stated in [1], dma_set_mask_and_coherent() with a 64-bit mask
will never fail if dev->dma_mask is non-NULL.
So, if it fails, the 32 bits case will also fail for the same reason.
That said, 'pci_using_dac' can only be 1 after a successful
dma_set_mask_and_coherent().
Simplify code and remove some dead code accordingly.
[1]: https://lkml.org/lkml/2021/6/7/398
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Use dma_set_mask_and_coherent() instead of unrolling it with some
dma_set_mask()+dma_set_coherent_mask().
This simplifies code and removes some dead code (dma_set_coherent_mask()
can not fail after a successful dma_set_mask())
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Removed spaces and added a tab that was causing an error on checkpatch
Signed-off-by: Hamish MacDonald <elusivenode@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add neighbour source flag in mctp_neigh_remove(...) to allow removal of
only static neighbours.
This should be a no-op change and might be useful later when mctp can
have MCTP_NEIGH_DISCOVER neighbours.
Signed-off-by: Gagan Kumar <gagan1kumar.cs@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
v3:
- Report 'backlog' (bytes) instead of 'qlen' (number of packets)
v2:
- Fix sparse warning (use rcu_dereference)
This patch adds support for the queue depth in IOAM trace data fields.
The draft [1] says the following:
The "queue depth" field is a 4-octet unsigned integer field. This
field indicates the current length of the egress interface queue of
the interface from where the packet is forwarded out. The queue
depth is expressed as the current amount of memory buffers used by
the queue (a packet could consume one or more memory buffers,
depending on its size).
An existing function (i.e., qdisc_qstats_qlen_backlog) is used to
retrieve the current queue length without reinventing the wheel.
Note: it was tested and qlen is increasing when an artificial delay is
added on the egress with tc.
[1] https://datatracker.ietf.org/doc/html/draft-ietf-ippm-ioam-data#section-5.4.2.7
Signed-off-by: Justin Iurman <justin.iurman@uliege.be>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The pointer link is being re-assigned the same value that it was
initialized with in the previous declaration statement. The
re-assignment is redundant and can be removed.
Fixes: 387707fdf486 ("net/smc: convert static link ID to dynamic references")
Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Reviewed-by: Tony Lu <tonylu@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This implements TCP ULP for SMC, helps applications to replace TCP with
SMC protocol in place. And we use it to implement transparent
replacement.
This replaces original TCP sockets with SMC, reuse TCP as clcsock when
calling setsockopt with TCP_ULP option, and without any overhead.
To replace TCP sockets with SMC, there are two approaches:
- use setsockopt() syscall with TCP_ULP option, if error, it would
fallback to TCP.
- use BPF prog with types BPF_CGROUP_INET_SOCK_CREATE or others to
replace transparently. BPF hooks some points in create socket, bind
and others, users can inject their BPF logics without modifying their
applications, and choose which connections should be replaced with SMC
by calling setsockopt() in BPF prog, based on rules, such as TCP tuples,
PID, cgroup, etc...
BPF doesn't support calling setsockopt with TCP_ULP now, I will send the
patches after this accepted.
Signed-off-by: Tony Lu <tonylu@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Tony Lu says:
====================
RDMA device net namespace support for SMC
This patch set introduces net namespace support for linkgroups.
Path 1 is the main approach to implement net ns support.
Path 2 - 4 are the additional modifications to let us know the netns.
Also, I will submit changes of smc-tools to github later.
Currently, smc doesn't support net namespace isolation. The ibdevs
registered to smc are shared for all linkgroups and connections. When
running applications in different net namespaces, such as container
environment, applications should only use the ibdevs that belongs to the
same net namespace.
This adds a new field, net, in smc linkgroup struct. During first
contact, it checks and find the linkgroup has same net namespace, if
not, it is going to create and initialized the net field with first
link's ibdev net namespace. When finding the rdma devices, it also checks
the sk net device's and ibdev's net namespaces. After net namespace
destroyed, the net device and ibdev move to root net namespace,
linkgroups won't be matched, and wait for lgr free.
If rdma net namespace exclusive mode is not enabled, it behaves as
before.
Steps to enable and test net namespaces:
1. enable RDMA device net namespace exclusive support
rdma system set netns exclusive # default is shared
2. create new net namespace, move and initialize them
ip netns add test1
rdma dev set mlx5_1 netns test1
ip link set dev eth2 netns test1
ip netns exec test1 ip link set eth2 up
ip netns exec test1 ip addr add ${HOST_IP}/26 dev eth2
3. setup server and client, connect N <-> M
ip netns exec test1 smc_run sockperf server --tcp # server
ip netns exec test1 smc_run sockperf pp --tcp -i ${SERVER_IP} # client
4. netns isolated linkgroups (2 * 2 mesh) with their own linkgroups
- server
LG-ID LG-Role LG-Type VLAN #Conns PNET-ID
00000100 SERV SINGLE 0 0
00000200 SERV SINGLE 0 0
00000300 SERV SINGLE 0 0
00000400 SERV SINGLE 0 0
- client
LG-ID LG-Role LG-Type VLAN #Conns PNET-ID
00000100 CLNT SINGLE 0 0
00000200 CLNT SINGLE 0 0
00000300 CLNT SINGLE 0 0
00000400 CLNT SINGLE 0 0
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|