Age | Commit message (Collapse) | Author |
|
Split off part of devfreq_get_devfreq_by_phandle into a separate
function. This allows callers to fetch devfreq instances by enumerating
devicetree instead of explicit phandles.
Acked-by: Krzysztof Kozlowski <krzk@kernel.org>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Signed-off-by: Leonard Crestez <leonard.crestez@nxp.com>
[cw00.choi: Export devfreq_get_devfreq_by_node function and
add function to devfreq.h when CONFIG_PM_DEVFREQ is enabled.]
Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
|
|
An error occues when sampling non-PEBS INST_RETIRED.PREC_DIST(0x01c0)
event.
perf record -e cpu/event=0xc0,umask=0x01/ -- sleep 1
Error:
The sys_perf_event_open() syscall returned with 22 (Invalid argument)
for event (cpu/event=0xc0,umask=0x01/).
/bin/dmesg | grep -i perf may provide additional information.
The idxmsk64 of the event is set to 0. The event never be successfully
scheduled.
The event should be limit to the fixed counter 0.
Fixes: 6017608936c1 ("perf/x86/intel: Add Icelake support")
Reported-by: Yi, Ammy <ammy.yi@intel.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20200928134726.13090-1-kan.liang@linux.intel.com
|
|
The "MiB" result of the IMC free-running bandwidth events,
uncore_imc_free_running/read/ and uncore_imc_free_running/write/ are 16
times too small.
The "MiB" value equals the raw IMC free-running bandwidth counter value
times a "scale" which is inaccurate.
The IMC free-running bandwidth events should be incremented per 64B
cache line, not DWs (4 bytes). The "scale" should be 6.103515625e-5.
Fix the "scale" for both Snow Ridge and Ice Lake.
Fixes: 2b3b76b5ec67 ("perf/x86/intel/uncore: Add Ice Lake server uncore support")
Fixes: ee49532b38dd ("perf/x86/intel/uncore: Add IMC uncore support for Snow Ridge")
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20200928133240.12977-1-kan.liang@linux.intel.com
|
|
Introduced early attributes /sys/devices/uncore_iio_<pmu_idx>/die* are
initialized by skx_iio_set_mapping(), however, for example, for multiple
segment platforms skx_iio_get_topology() returns -EPERM before a list of
attributes in skx_iio_mapping_group will have been initialized.
As a result the list is being NULL. Thus the warning
"sysfs: (bin_)attrs not set by subsystem for group: uncore_iio_*/" appears
and uncore_iio pmus are not available in sysfs. Clear IIO attr_update
to properly handle the cases when topology information cannot be
retrieved.
Fixes: bb42b3d39781 ("perf/x86/intel/uncore: Expose an Uncore unit to IIO PMON mapping")
Reported-by: Kyle Meyer <kyle.meyer@hpe.com>
Suggested-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Alexander Antonov <alexander.antonov@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Alexei Budankov <alexey.budankov@linux.intel.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Link: https://lkml.kernel.org/r/20200928102133.61041-1-alexander.antonov@linux.intel.com
|
|
The Jasper Lake processor is also a Tremont microarchitecture. From the
perspective of perf MSR, there is nothing changed compared with
Elkhart Lake.
Share the code path with Elkhart Lake.
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/1601296242-32763-2-git-send-email-kan.liang@linux.intel.com
|
|
The Jasper Lake processor is also a Tremont microarchitecture. From the
perspective of Intel PMU, there is nothing changed compared with
Elkhart Lake.
Share the perf code with Elkhart Lake.
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/1601296242-32763-1-git-send-email-kan.liang@linux.intel.com
|
|
An oops is triggered by the fuzzy test.
[ 327.853081] unchecked MSR access error: RDMSR from 0x70c at rIP:
0xffffffffc082c820 (uncore_msr_read_counter+0x10/0x50 [intel_uncore])
[ 327.853083] Call Trace:
[ 327.853085] <IRQ>
[ 327.853089] uncore_pmu_event_start+0x85/0x170 [intel_uncore]
[ 327.853093] uncore_pmu_event_add+0x1a4/0x410 [intel_uncore]
[ 327.853097] ? event_sched_in.isra.118+0xca/0x240
There are 2 GP counters for each CBOX, but the current code claims 4
counters. Accessing the invalid registers triggers the oops.
Fixes: 6e394376ee89 ("perf/x86/intel/uncore: Add Intel Icelake uncore support")
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20200925134905.8839-3-kan.liang@linux.intel.com
|
|
There are some updates for the Icelake model specific uncore performance
monitors. (The update can be found at 10th generation intel core
processors families specification update Revision 004, ICL068)
1) Counter 0 of ARB uncore unit is not available for software use
2) The global 'enable bit' (bit 29) and 'freeze bit' (bit 31) of
MSR_UNC_PERF_GLOBAL_CTRL cannot be used to control counter behavior.
Needs to use local enable in event select MSR.
Accessing the modified bit/registers will be ignored by HW. Users may
observe inaccurate results with the current code.
The changes of the MSR_UNC_PERF_GLOBAL_CTRL imply that groups cannot be
read atomically anymore. Although the error of the result for a group
becomes a bit bigger, it still far lower than not using a group. The
group support is still kept. Only Remove the *_box() related
implementation.
Since the counter 0 of ARB uncore unit is not available, update the MSR
address for the ARB uncore unit.
There is no change for IMC uncore unit, which only include free-running
counters.
Fixes: 6e394376ee89 ("perf/x86/intel/uncore: Add Intel Icelake uncore support")
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20200925134905.8839-2-kan.liang@linux.intel.com
|
|
Previously, the MSR uncore for the Ice Lake and Tiger Lake are
identical. The code path is shared. However, with recent update, the
global MSR_UNC_PERF_GLOBAL_CTRL register and ARB uncore unit are changed
for the Ice Lake. Split the Ice Lake and Tiger Lake MSR uncore support.
The changes only impact the MSR ops() and the ARB uncore unit. Other
codes can still be shared between the Ice Lake and the Tiger Lake.
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20200925134905.8839-1-kan.liang@linux.intel.com
|
|
Qian Cai reported a BFS_EQUEUEFULL warning [1] after read recursive
deadlock detection merged into tip tree recently. Unlike the previous
lockep graph searching, which iterate every lock class (every node in
the graph) exactly once, the graph searching for read recurisve deadlock
detection needs to iterate every lock dependency (every edge in the
graph) once, as a result, the maximum memory cost of the circular queue
changes from O(V), where V is the number of lock classes (nodes or
vertices) in the graph, to O(E), where E is the number of lock
dependencies (edges), because every lock class or dependency gets
enqueued once in the BFS. Therefore we hit the BFS_EQUEUEFULL case.
However, actually we don't need to enqueue all dependencies for the BFS,
because every time we enqueue a dependency, we almostly enqueue all
other dependencies in the same dependency list ("almostly" is because
we currently check before enqueue, so if a dependency doesn't pass the
check stage we won't enqueue it, however, we can always do in reverse
ordering), based on this, we can only enqueue the first dependency from
a dependency list and every time we want to fetch a new dependency to
work, we can either:
1) fetch the dependency next to the current dependency in the
dependency list
or
2) if the dependency in 1) doesn't exist, fetch the dependency from
the queue.
With this approach, the "max bfs queue depth" for a x86_64_defconfig +
lockdep and selftest config kernel can get descreased from:
max bfs queue depth: 201
to (after apply this patch)
max bfs queue depth: 61
While I'm at it, clean up the code logic a little (e.g. directly return
other than set a "ret" value and goto the "exit" label).
[1]: https://lore.kernel.org/lkml/17343f6f7f2438fc376125384133c5ba70c2a681.camel@redhat.com/
Reported-by: Qian Cai <cai@redhat.com>
Reported-by: syzbot+62ebe501c1ce9a91f68c@syzkaller.appspotmail.com
Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20200917080210.108095-1-boqun.feng@gmail.com
|
|
Like all genl families ethtool_genl_family needs to not
be a straight up constant, because it's modified/initialized
by genl_register_family(). After init, however, it's only
passed to genlmsg_put() & co. therefore we can mark it
as __ro_after_init.
Since genl_family structure contains function pointers
mark this as a fix.
Fixes: 2b4a8990b7df ("ethtool: introduce ethtool netlink interface")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Validation flags are missing kdoc, add it.
Fixes: ef6243acb478 ("genetlink: optionally validate strictly/dumps")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Adds the driver_info and usb ids of the AX88179 based MCT U3-A9003 USB
3.0 ethernet adapter.
Signed-off-by: Wilken Gottwalt <wilken.gottwalt@mailbox.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Adds the missing .stop entry in the Belkin driver_info structure.
Fixes: e20bd60bf62a ("net: usb: asix88179_178a: Add support for the Belkin B2B128")
Signed-off-by: Wilken Gottwalt <wilken.gottwalt@mailbox.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Touchpad on this laptop is not detected properly during boot, as PNP
enumerates (wrongly) AUX port as disabled on this machine.
Fix that by adding this board (with admittedly quite funny DMI
identifiers) to nopnp quirk list.
Reported-by: Andrés Barrantes Silman <andresbs2000@protonmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Link: https://lore.kernel.org/r/nycvar.YFH.7.76.2009252337340.3336@cbobk.fhfr.pm
Cc: stable@vger.kernel.org
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
|
|
Add Synaptics IDs in trackpoint_start_protocol() to mark them as valid.
Signed-off-by: Vincent Huang <vincent.huang@tw.synaptics.com>
Fixes: 6c77545af100 ("Input: trackpoint - add new trackpoint variant IDs")
Reviewed-by: Harry Cutts <hcutts@chromium.org>
Tested-by: Harry Cutts <hcutts@chromium.org>
Link: https://lore.kernel.org/r/20200924053013.1056953-1-vincent.huang@tw.synaptics.com
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
|
|
The rcu read locks are needed to avoid potential race condition while
dereferencing radix tree from multiple threads. The issue was identified
by syzbot. Below is the crash report:
=============================
WARNING: suspicious RCU usage
5.7.0-syzkaller #0 Not tainted
-----------------------------
include/linux/radix-tree.h:176 suspicious rcu_dereference_check() usage!
other info that might help us debug this:
rcu_scheduler_active = 2, debug_locks = 1
2 locks held by kworker/u4:1/21:
#0: ffff88821b097938 ((wq_completion)qrtr_ns_handler){+.+.}-{0:0}, at: spin_unlock_irq include/linux/spinlock.h:403 [inline]
#0: ffff88821b097938 ((wq_completion)qrtr_ns_handler){+.+.}-{0:0}, at: process_one_work+0x6df/0xfd0 kernel/workqueue.c:2241
#1: ffffc90000dd7d80 ((work_completion)(&qrtr_ns.work)){+.+.}-{0:0}, at: process_one_work+0x71e/0xfd0 kernel/workqueue.c:2243
stack backtrace:
CPU: 0 PID: 21 Comm: kworker/u4:1 Not tainted 5.7.0-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Workqueue: qrtr_ns_handler qrtr_ns_worker
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x1e9/0x30e lib/dump_stack.c:118
radix_tree_deref_slot include/linux/radix-tree.h:176 [inline]
ctrl_cmd_new_lookup net/qrtr/ns.c:558 [inline]
qrtr_ns_worker+0x2aff/0x4500 net/qrtr/ns.c:674
process_one_work+0x76e/0xfd0 kernel/workqueue.c:2268
worker_thread+0xa7f/0x1450 kernel/workqueue.c:2414
kthread+0x353/0x380 kernel/kthread.c:268
Fixes: 0c2204a4ad71 ("net: qrtr: Migrate nameservice to kernel from userspace")
Reported-and-tested-by: syzbot+0f84f6eed90503da72fc@syzkaller.appspotmail.com
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Implement the IOMMU device feature callbacks to support the SVA feature.
At the moment dev_has_feat() returns false since I/O Page Faults and BTM
aren't yet implemented.
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/20200918101852.582559-12-jean-philippe@linaro.org
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Aggregate all sanity-checks for sharing CPU page tables with the SMMU
under a single ARM_SMMU_FEAT_SVA bit. For PCIe SVA, users also need to
check FEAT_ATS and FEAT_PRI. For platform SVA, they will have to check
FEAT_STALLS.
Introduce ARM_SMMU_FEAT_BTM (Broadcast TLB Maintenance), but don't
enable it at the moment. Since the entire VMID space is shared with the
CPU, enabling DVM (by clearing SMMU_CR2.PTM) could result in
over-invalidation and affect performance of stage-2 mappings.
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20200918101852.582559-11-jean-philippe@linaro.org
Signed-off-by: Will Deacon <will@kernel.org>
|
|
The SMMU has a single ASID space, the union of shared and private ASID
sets. This means that the SMMU driver competes with the arch allocator
for ASIDs. Shared ASIDs are those of Linux processes, allocated by the
arch, and contribute in broadcast TLB maintenance. Private ASIDs are
allocated by the SMMU driver and used for "classic" map/unmap DMA. They
require command-queue TLB invalidations.
When we pin down an mm_context and get an ASID that is already in use by
the SMMU, it belongs to a private context. We used to simply abort the
bind, but this is unfair to users that would be unable to bind a few
seemingly random processes. Try to allocate a new private ASID for the
context, and make the old ASID shared.
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/20200918101852.582559-10-jean-philippe@linaro.org
Signed-off-by: Will Deacon <will@kernel.org>
|
|
With Shared Virtual Addressing (SVA), we need to mirror CPU TTBR, TCR,
MAIR and ASIDs in SMMU contexts. Each SMMU has a single ASID space split
into two sets, shared and private. Shared ASIDs correspond to those
obtained from the arch ASID allocator, and private ASIDs are used for
"classic" map/unmap DMA.
A possible conflict happens when trying to use a shared ASID that has
already been allocated for private use by the SMMU driver. This will be
addressed in a later patch by replacing the private ASID. At the
moment we return -EBUSY.
Each mm_struct shared with the SMMU will have a single context
descriptor. Add a refcount to keep track of this. It will be protected
by the global SVA lock.
Introduce a new arm-smmu-v3-sva.c file and the CONFIG_ARM_SMMU_V3_SVA
option to let users opt in SVA support.
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/20200918101852.582559-9-jean-philippe@linaro.org
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Allow sharing structure definitions with the upcoming SVA support for
Arm SMMUv3, by moving them to a separate header. We could surgically
extract only what is needed but keeping all definitions in one place
looks nicer.
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/20200918101852.582559-8-jean-philippe@linaro.org
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Extract some of the most generic TCR defines, so they can be reused by
the page table sharing code.
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20200918101852.582559-6-jean-philippe@linaro.org
Signed-off-by: Will Deacon <will@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux into for-joerg/arm-smmu/updates
Pull in core arm64 changes required to enable Shared Virtual Memory
(SVM) using SMMUv3. This brings us increasingly closer to being able to
share page-tables directly between user-space tasks running on the CPU
and their corresponding contexts on coherent devices performing DMA
through the SMMU.
Signed-off-by: Will Deacon <will@kernel.org>
|
|
We offer both PTRACE_PEEKMTETAGS and PTRACE_POKEMTETAGS requests via
ptrace().
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Taehee Yoo says:
====================
net: core: fix a lockdep splat in the dev_addr_list.
This patchset is to avoid lockdep splat.
When a stacked interface graph is changed, netif_addr_lock() is called
recursively and it internally calls spin_lock_nested().
The parameter of spin_lock_nested() is 'dev->lower_level',
this is called subclass.
The problem of 'dev->lower_level' is that while 'dev->lower_level' is
being used as a subclass of spin_lock_nested(), its value can be changed.
So, spin_lock_nested() would be called recursively with the same
subclass value, the lockdep understands a deadlock.
In order to avoid this, a new variable is needed and it is going to be
used as a parameter of spin_lock_nested().
The first and second patch is a preparation patch for the third patch.
In the third patch, the problem will be fixed.
The first patch is to add __netdev_upper_dev_unlink().
An existed netdev_upper_dev_unlink() is renamed to
__netdev_upper_dev_unlink(). and netdev_upper_dev_unlink()
is added as an wrapper of this function.
The second patch is to add the netdev_nested_priv structure.
netdev_walk_all_{ upper | lower }_dev() pass both private functions
and "data" pointer to handle their own things.
At this point, the data pointer type is void *.
In order to make it easier to expand common variables and functions,
this new netdev_nested_priv structure is added.
The third patch is to add a new variable 'nested_level'
into the net_device structure.
This variable will be used as a parameter of spin_lock_nested() of
dev->addr_list_lock.
Due to this variable, it can avoid lockdep splat.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch is to add a new variable 'nested_level' into the net_device
structure.
This variable will be used as a parameter of spin_lock_nested() of
dev->addr_list_lock.
netif_addr_lock() can be called recursively so spin_lock_nested() is
used instead of spin_lock() and dev->lower_level is used as a parameter
of spin_lock_nested().
But, dev->lower_level value can be updated while it is being used.
So, lockdep would warn a possible deadlock scenario.
When a stacked interface is deleted, netif_{uc | mc}_sync() is
called recursively.
So, spin_lock_nested() is called recursively too.
At this moment, the dev->lower_level variable is used as a parameter of it.
dev->lower_level value is updated when interfaces are being unlinked/linked
immediately.
Thus, After unlinking, dev->lower_level shouldn't be a parameter of
spin_lock_nested().
A (macvlan)
|
B (vlan)
|
C (bridge)
|
D (macvlan)
|
E (vlan)
|
F (bridge)
A->lower_level : 6
B->lower_level : 5
C->lower_level : 4
D->lower_level : 3
E->lower_level : 2
F->lower_level : 1
When an interface 'A' is removed, it releases resources.
At this moment, netif_addr_lock() would be called.
Then, netdev_upper_dev_unlink() is called recursively.
Then dev->lower_level is updated.
There is no problem.
But, when the bridge module is removed, 'C' and 'F' interfaces
are removed at once.
If 'F' is removed first, a lower_level value is like below.
A->lower_level : 5
B->lower_level : 4
C->lower_level : 3
D->lower_level : 2
E->lower_level : 1
F->lower_level : 1
Then, 'C' is removed. at this moment, netif_addr_lock() is called
recursively.
The ordering is like this.
C(3)->D(2)->E(1)->F(1)
At this moment, the lower_level value of 'E' and 'F' are the same.
So, lockdep warns a possible deadlock scenario.
In order to avoid this problem, a new variable 'nested_level' is added.
This value is the same as dev->lower_level - 1.
But this value is updated in rtnl_unlock().
So, this variable can be used as a parameter of spin_lock_nested() safely
in the rtnl context.
Test commands:
ip link add br0 type bridge vlan_filtering 1
ip link add vlan1 link br0 type vlan id 10
ip link add macvlan2 link vlan1 type macvlan
ip link add br3 type bridge vlan_filtering 1
ip link set macvlan2 master br3
ip link add vlan4 link br3 type vlan id 10
ip link add macvlan5 link vlan4 type macvlan
ip link add br6 type bridge vlan_filtering 1
ip link set macvlan5 master br6
ip link add vlan7 link br6 type vlan id 10
ip link add macvlan8 link vlan7 type macvlan
ip link set br0 up
ip link set vlan1 up
ip link set macvlan2 up
ip link set br3 up
ip link set vlan4 up
ip link set macvlan5 up
ip link set br6 up
ip link set vlan7 up
ip link set macvlan8 up
modprobe -rv bridge
Splat looks like:
[ 36.057436][ T744] WARNING: possible recursive locking detected
[ 36.058848][ T744] 5.9.0-rc6+ #728 Not tainted
[ 36.059959][ T744] --------------------------------------------
[ 36.061391][ T744] ip/744 is trying to acquire lock:
[ 36.062590][ T744] ffff8c4767509280 (&vlan_netdev_addr_lock_key){+...}-{2:2}, at: dev_set_rx_mode+0x19/0x30
[ 36.064922][ T744]
[ 36.064922][ T744] but task is already holding lock:
[ 36.066626][ T744] ffff8c4767769280 (&vlan_netdev_addr_lock_key){+...}-{2:2}, at: dev_uc_add+0x1e/0x60
[ 36.068851][ T744]
[ 36.068851][ T744] other info that might help us debug this:
[ 36.070731][ T744] Possible unsafe locking scenario:
[ 36.070731][ T744]
[ 36.072497][ T744] CPU0
[ 36.073238][ T744] ----
[ 36.074007][ T744] lock(&vlan_netdev_addr_lock_key);
[ 36.075290][ T744] lock(&vlan_netdev_addr_lock_key);
[ 36.076590][ T744]
[ 36.076590][ T744] *** DEADLOCK ***
[ 36.076590][ T744]
[ 36.078515][ T744] May be due to missing lock nesting notation
[ 36.078515][ T744]
[ 36.080491][ T744] 3 locks held by ip/744:
[ 36.081471][ T744] #0: ffffffff98571df0 (rtnl_mutex){+.+.}-{3:3}, at: rtnetlink_rcv_msg+0x236/0x490
[ 36.083614][ T744] #1: ffff8c4767769280 (&vlan_netdev_addr_lock_key){+...}-{2:2}, at: dev_uc_add+0x1e/0x60
[ 36.085942][ T744] #2: ffff8c476c8da280 (&bridge_netdev_addr_lock_key/4){+...}-{2:2}, at: dev_uc_sync+0x39/0x80
[ 36.088400][ T744]
[ 36.088400][ T744] stack backtrace:
[ 36.089772][ T744] CPU: 6 PID: 744 Comm: ip Not tainted 5.9.0-rc6+ #728
[ 36.091364][ T744] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014
[ 36.093630][ T744] Call Trace:
[ 36.094416][ T744] dump_stack+0x77/0x9b
[ 36.095385][ T744] __lock_acquire+0xbc3/0x1f40
[ 36.096522][ T744] lock_acquire+0xb4/0x3b0
[ 36.097540][ T744] ? dev_set_rx_mode+0x19/0x30
[ 36.098657][ T744] ? rtmsg_ifinfo+0x1f/0x30
[ 36.099711][ T744] ? __dev_notify_flags+0xa5/0xf0
[ 36.100874][ T744] ? rtnl_is_locked+0x11/0x20
[ 36.101967][ T744] ? __dev_set_promiscuity+0x7b/0x1a0
[ 36.103230][ T744] _raw_spin_lock_bh+0x38/0x70
[ 36.104348][ T744] ? dev_set_rx_mode+0x19/0x30
[ 36.105461][ T744] dev_set_rx_mode+0x19/0x30
[ 36.106532][ T744] dev_set_promiscuity+0x36/0x50
[ 36.107692][ T744] __dev_set_promiscuity+0x123/0x1a0
[ 36.108929][ T744] dev_set_promiscuity+0x1e/0x50
[ 36.110093][ T744] br_port_set_promisc+0x1f/0x40 [bridge]
[ 36.111415][ T744] br_manage_promisc+0x8b/0xe0 [bridge]
[ 36.112728][ T744] __dev_set_promiscuity+0x123/0x1a0
[ 36.113967][ T744] ? __hw_addr_sync_one+0x23/0x50
[ 36.115135][ T744] __dev_set_rx_mode+0x68/0x90
[ 36.116249][ T744] dev_uc_sync+0x70/0x80
[ 36.117244][ T744] dev_uc_add+0x50/0x60
[ 36.118223][ T744] macvlan_open+0x18e/0x1f0 [macvlan]
[ 36.119470][ T744] __dev_open+0xd6/0x170
[ 36.120470][ T744] __dev_change_flags+0x181/0x1d0
[ 36.121644][ T744] dev_change_flags+0x23/0x60
[ 36.122741][ T744] do_setlink+0x30a/0x11e0
[ 36.123778][ T744] ? __lock_acquire+0x92c/0x1f40
[ 36.124929][ T744] ? __nla_validate_parse.part.6+0x45/0x8e0
[ 36.126309][ T744] ? __lock_acquire+0x92c/0x1f40
[ 36.127457][ T744] __rtnl_newlink+0x546/0x8e0
[ 36.128560][ T744] ? lock_acquire+0xb4/0x3b0
[ 36.129623][ T744] ? deactivate_slab.isra.85+0x6a1/0x850
[ 36.130946][ T744] ? __lock_acquire+0x92c/0x1f40
[ 36.132102][ T744] ? lock_acquire+0xb4/0x3b0
[ 36.133176][ T744] ? is_bpf_text_address+0x5/0xe0
[ 36.134364][ T744] ? rtnl_newlink+0x2e/0x70
[ 36.135445][ T744] ? rcu_read_lock_sched_held+0x32/0x60
[ 36.136771][ T744] ? kmem_cache_alloc_trace+0x2d8/0x380
[ 36.138070][ T744] ? rtnl_newlink+0x2e/0x70
[ 36.139164][ T744] rtnl_newlink+0x47/0x70
[ ... ]
Fixes: 845e0ebb4408 ("net: change addr_list_lock back to static key")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
infrastructure
Functions related to nested interface infrastructure such as
netdev_walk_all_{ upper | lower }_dev() pass both private functions
and "data" pointer to handle their own things.
At this point, the data pointer type is void *.
In order to make it easier to expand common variables and functions,
this new netdev_nested_priv structure is added.
In the following patch, a new member variable will be added into this
struct to fix the lockdep issue.
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The netdev_upper_dev_unlink() has to work differently according to flags.
This idea is the same with __netdev_upper_dev_link().
In the following patches, new flags will be added.
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Reading the 'prod' MMIO register in order to determine whether or not
there is valid data beyond 'cons' for a given queue does not provide
sufficient dependency ordering, as the resulting access is address
dependent only on 'cons' and can therefore be speculated ahead of time,
potentially allowing stale data to be read by the CPU.
Use readl() instead of readl_relaxed() when updating the shadow copy of
the 'prod' pointer, so that all speculated memory reads from the
corresponding queue can occur only from valid slots.
Signed-off-by: Zhou Wang <wangzhou1@hisilicon.com>
Link: https://lore.kernel.org/r/1601281922-117296-1-git-send-email-wangzhou1@hisilicon.com
[will: Use readl() instead of explicit barrier. Update 'cons' side to match.]
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Dentries that represent no-key names must have a dentry_operations that
includes fscrypt_d_revalidate(). Currently, this is handled by
fscrypt_prepare_lookup() installing fscrypt_d_ops.
However, ceph support for encryption
(https://lore.kernel.org/r/20200914191707.380444-1-jlayton@kernel.org)
can't use fscrypt_d_ops, since ceph already has its own
dentry_operations.
Similarly, ext4 and f2fs support for directories that are both encrypted
and casefolded
(https://lore.kernel.org/r/20200923010151.69506-1-drosen@google.com)
can't use fscrypt_d_ops either, since casefolding requires some dentry
operations too.
To satisfy both users, we need to move the responsibility of installing
the dentry_operations to filesystems.
In preparation for this, export fscrypt_d_revalidate() and give it a
!CONFIG_FS_ENCRYPTION stub.
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Link: https://lore.kernel.org/r/20200924054721.187797-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
|
|
This is a Chinese translated version of Documentation/arm64/amu.rst
Signed-off-by: Bailu Lin <bailu.lin@vivo.com>
Reviewed-by: Alex Shi <alex.shi@linux.alibaba.com>
Link: https://lore.kernel.org/r/20200926025233.47214-1-bailu.lin@vivo.com
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
|
|
The SMMUv3 driver would like to read the MMFR0 PARANGE field in order to
share CPU page tables with devices. Allow the driver to be built as
module by exporting the read_sanitized_ftr_reg() cpufeature symbol.
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Acked-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20200918101852.582559-7-jean-philippe@linaro.org
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Add arm64 subdirectory into the table of Contents for zh_CN,
then add other translations in arm64 conveniently.
Signed-off-by: Bailu Lin <bailu.lin@vivo.com>
Reviewed-by: Alex Shi <alex.shi@linux.alibaba.com>
Link: https://lore.kernel.org/r/20200926022558.46232-1-bailu.lin@vivo.com
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
|
|
Map the address to my private mail, because my Marvell account has been suspended.
Signed-off-by: Mark Starovoytov <mstarovo@pm.me>
Link: https://lore.kernel.org/r/20200928183948.589-1-mstarovo@pm.me
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
|
|
There are behavioural requirements on the seq_file next() function in
terms of how it updates *pos at end-of-file, and these are now enforced
by a warning.
I was recently attempting to justify the reason this was needed, and
couldn't remember the details, and didn't find them in the
documentation.
So I re-read the code until I understood it again, and updated the
documentation to match.
I also enhanced the text about SEQ_START_TOKEN as it seemed potentially
misleading.
Cc: Vasily Averin <vvs@virtuozzo.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Link: https://lore.kernel.org/r/87eemqiazh.fsf@notabene.neil.brown.name
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
|
|
To enable address space sharing with the IOMMU, introduce
arm64_mm_context_get() and arm64_mm_context_put(), that pin down a
context and ensure that it will keep its ASID after a rollover. Export
the symbols to let the modular SMMUv3 driver use them.
Pinning is necessary because a device constantly needs a valid ASID,
unlike tasks that only require one when running. Without pinning, we would
need to notify the IOMMU when we're about to use a new ASID for a task,
and it would get complicated when a new task is assigned a shared ASID.
Consider the following scenario with no ASID pinned:
1. Task t1 is running on CPUx with shared ASID (gen=1, asid=1)
2. Task t2 is scheduled on CPUx, gets ASID (1, 2)
3. Task tn is scheduled on CPUy, a rollover occurs, tn gets ASID (2, 1)
We would now have to immediately generate a new ASID for t1, notify
the IOMMU, and finally enable task tn. We are holding the lock during
all that time, since we can't afford having another CPU trigger a
rollover. The IOMMU issues invalidation commands that can take tens of
milliseconds.
It gets needlessly complicated. All we wanted to do was schedule task tn,
that has no business with the IOMMU. By letting the IOMMU pin tasks when
needed, we avoid stalling the slow path, and let the pinning fail when
we're out of shareable ASIDs.
After a rollover, the allocator expects at least one ASID to be available
in addition to the reserved ones (one per CPU). So (NR_ASIDS - NR_CPUS -
1) is the maximum number of ASIDs that can be shared with the IOMMU.
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/20200918101852.582559-5-jean-philippe@linaro.org
Signed-off-by: Will Deacon <will@kernel.org>
|
|
_sdei_event_unregister() is called by sdei_event_unregister() and
sdei_device_freeze(). _sdei_event_unregister() covers the shared
and private events, but sdei_device_freeze() only covers the shared
events. So the logic to cover the private events isn't needed by
sdei_device_freeze().
sdei_event_unregister sdei_device_freeze
_sdei_event_unregister sdei_unregister_shared
_sdei_event_unregister
This removes _sdei_event_unregister(). Its logic is moved to its
callers accordingly. This shouldn't cause any logical changes.
Signed-off-by: Gavin Shan <gshan@redhat.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: James Morse <james.morse@arm.com>
Link: https://lore.kernel.org/r/20200922130423.10173-14-gshan@redhat.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
The function _sdei_event_register() is called by sdei_event_register()
and sdei_device_thaw() as the following functional call chain shows.
_sdei_event_register() covers the shared and private events, but
sdei_device_thaw() only covers the shared events. So the logic to
cover the private events in _sdei_event_register() isn't needed by
sdei_device_thaw().
Similarly, sdei_reregister_event_llocked() covers the shared and
private events in the regard of reenablement. The logic to cover
the private events isn't needed by sdei_device_thaw() either.
sdei_event_register sdei_device_thaw
_sdei_event_register sdei_reregister_shared
sdei_reregister_event_llocked
_sdei_event_register
This removes _sdei_event_register() and sdei_reregister_event_llocked().
Their logic is moved to sdei_event_register() and sdei_reregister_shared().
This shouldn't cause any logical changes.
Signed-off-by: Gavin Shan <gshan@redhat.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: James Morse <james.morse@arm.com>
Link: https://lore.kernel.org/r/20200922130423.10173-13-gshan@redhat.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
During the CPU hotplug, the private events are registered, enabled
or unregistered on the specific CPU. It repeats the same steps:
initializing cross call argument, make function call on local CPU,
check the returned error.
This introduces sdei_do_local_call() to cover the first steps. The
other benefit is to make CROSSCALL_INIT and struct sdei_crosscall_args
are only visible to sdei_do_{cross, local}_call().
Signed-off-by: Gavin Shan <gshan@redhat.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: James Morse <james.morse@arm.com>
Link: https://lore.kernel.org/r/20200922130423.10173-12-gshan@redhat.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
This applies cleanup on the cross call functions, no functional
changes are introduced:
* Wrap the code block of CROSSCALL_INIT inside "do { } while (0)"
as linux kernel usually does. Otherwise, scripts/checkpatch.pl
reports warning regarding this.
* Use smp_call_func_t for @fn argument in sdei_do_cross_call()
as the function is called on target CPU(s).
* Remove unnecessary space before @event in sdei_do_cross_call()
Signed-off-by: Gavin Shan <gshan@redhat.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: James Morse <james.morse@arm.com>
Link: https://lore.kernel.org/r/20200922130423.10173-11-gshan@redhat.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
This removes the unnecessary while loop in sdei_event_unregister()
because of the following two reasons. This shouldn't cause any
functional changes.
* The while loop is executed for once, meaning it's not needed
in theory.
* With the while loop removed, the nested statements can be
avoid to make the code a bit cleaner.
Signed-off-by: Gavin Shan <gshan@redhat.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/20200922130423.10173-10-gshan@redhat.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
This removes the unnecessary while loop in sdei_event_register()
because of the following two reasons. This shouldn't cause any
functional changes.
* The while loop is executed for once, meaning it's not needed
in theory.
* With the while loop removed, the nested statements can be
avoid to make the code a bit cleaner.
Signed-off-by: Gavin Shan <gshan@redhat.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/20200922130423.10173-9-gshan@redhat.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
This removes the redundant error message in sdei_probe() because
the case can be identified from the errno in next error message.
Signed-off-by: Gavin Shan <gshan@redhat.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Acked-by: James Morse <james.morse@arm.com>
Link: https://lore.kernel.org/r/20200922130423.10173-8-gshan@redhat.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
The following two checks are duplicate because @acpi_disabled doesn't
depend on CONFIG_ACPI. So the duplicate check (IS_ENABLED(CONFIG_ACPI))
can be dropped. More details is provided to keep the commit log complete:
* @acpi_disabled is defined in arch/arm64/kernel/acpi.c when
CONFIG_ACPI is enabled.
* @acpi_disabled in defined in include/acpi.h when CONFIG_ACPI
is disabled.
Signed-off-by: Gavin Shan <gshan@redhat.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Acked-by: James Morse <james.morse@arm.com>
Link: https://lore.kernel.org/r/20200922130423.10173-7-gshan@redhat.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
The SDEI platform device is created from device-tree node or ACPI
(SDEI) table. For the later case, the platform device is created
explicitly by this module. It'd better to unregister the driver on
failure to create the device to keep the symmetry. The driver, owned
by this module, isn't needed if the device isn't existing.
Besides, the errno (@ret) should be updated accordingly in this
case.
Signed-off-by: Gavin Shan <gshan@redhat.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: James Morse <james.morse@arm.com>
Link: https://lore.kernel.org/r/20200922130423.10173-6-gshan@redhat.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
In sdei_init(), the nested statements can be avoided by bailing
on error from platform_driver_register() or absent ACPI SDEI table.
With it, the code looks a bit more readable.
Signed-off-by: Gavin Shan <gshan@redhat.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: James Morse <james.morse@arm.com>
Link: https://lore.kernel.org/r/20200922130423.10173-5-gshan@redhat.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
In sdei_event_create(), the event number is retrieved from the
variable @event_num for the shared event. The event number was
stored in the event instance. So we can fetch it from the event
instance, similar to what we're doing for the private event.
Signed-off-by: Gavin Shan <gshan@redhat.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Acked-by: James Morse <james.morse@arm.com>
Link: https://lore.kernel.org/r/20200922130423.10173-4-gshan@redhat.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
There are multiple calls of kfree(event) in the failing path of
sdei_event_create() to free the SDEI event. It means we need to
call it again when adding more code in the failing path. It's
prone to miss doing that and introduce memory leakage.
This introduces common block for failing path in sdei_event_create()
to resolve the issue. This shouldn't cause functional changes.
Signed-off-by: Gavin Shan <gshan@redhat.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Acked-by: James Morse <james.morse@arm.com>
Link: https://lore.kernel.org/r/20200922130423.10173-3-gshan@redhat.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
sdei_is_err() is only called by sdei_to_linux_errno(). The logic of
checking on the error number is common to them. They can be combined
finely.
This removes sdei_is_err() and its logic is combined to the function
sdei_to_linux_errno(). Also, the assignment of @err to zero is also
dropped in invoke_sdei_fn() because it's always overridden afterwards.
This shouldn't cause functional changes.
Signed-off-by: Gavin Shan <gshan@redhat.com>
Reviewed-by: James Morse <james.morse@arm.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/20200922130423.10173-2-gshan@redhat.com
Signed-off-by: Will Deacon <will@kernel.org>
|