linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2020-09-29	PM / devfreq: Add devfreq_get_devfreq_by_node function	Leonard Crestez
	Split off part of devfreq_get_devfreq_by_phandle into a separate function. This allows callers to fetch devfreq instances by enumerating devicetree instead of explicit phandles. Acked-by: Krzysztof Kozlowski <krzk@kernel.org> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com> Signed-off-by: Leonard Crestez <leonard.crestez@nxp.com> [cw00.choi: Export devfreq_get_devfreq_by_node function and add function to devfreq.h when CONFIG_PM_DEVFREQ is enabled.] Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
2020-09-29	perf/x86/intel: Fix Ice Lake event constraint table	Kan Liang
	An error occues when sampling non-PEBS INST_RETIRED.PREC_DIST(0x01c0) event. perf record -e cpu/event=0xc0,umask=0x01/ -- sleep 1 Error: The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (cpu/event=0xc0,umask=0x01/). /bin/dmesg \| grep -i perf may provide additional information. The idxmsk64 of the event is set to 0. The event never be successfully scheduled. The event should be limit to the fixed counter 0. Fixes: 6017608936c1 ("perf/x86/intel: Add Icelake support") Reported-by: Yi, Ammy <ammy.yi@intel.com> Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20200928134726.13090-1-kan.liang@linux.intel.com
2020-09-29	perf/x86/intel/uncore: Fix the scale of the IMC free-running events	Kan Liang
	The "MiB" result of the IMC free-running bandwidth events, uncore_imc_free_running/read/ and uncore_imc_free_running/write/ are 16 times too small. The "MiB" value equals the raw IMC free-running bandwidth counter value times a "scale" which is inaccurate. The IMC free-running bandwidth events should be incremented per 64B cache line, not DWs (4 bytes). The "scale" should be 6.103515625e-5. Fix the "scale" for both Snow Ridge and Ice Lake. Fixes: 2b3b76b5ec67 ("perf/x86/intel/uncore: Add Ice Lake server uncore support") Fixes: ee49532b38dd ("perf/x86/intel/uncore: Add IMC uncore support for Snow Ridge") Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20200928133240.12977-1-kan.liang@linux.intel.com
2020-09-29	perf/x86/intel/uncore: Fix for iio mapping on Skylake Server	Alexander Antonov
	Introduced early attributes /sys/devices/uncore_iio_<pmu_idx>/die* are initialized by skx_iio_set_mapping(), however, for example, for multiple segment platforms skx_iio_get_topology() returns -EPERM before a list of attributes in skx_iio_mapping_group will have been initialized. As a result the list is being NULL. Thus the warning "sysfs: (bin_)attrs not set by subsystem for group: uncore_iio_*/" appears and uncore_iio pmus are not available in sysfs. Clear IIO attr_update to properly handle the cases when topology information cannot be retrieved. Fixes: bb42b3d39781 ("perf/x86/intel/uncore: Expose an Uncore unit to IIO PMON mapping") Reported-by: Kyle Meyer <kyle.meyer@hpe.com> Suggested-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Alexander Antonov <alexander.antonov@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Alexei Budankov <alexey.budankov@linux.intel.com> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Link: https://lkml.kernel.org/r/20200928102133.61041-1-alexander.antonov@linux.intel.com
2020-09-29	perf/x86/msr: Add Jasper Lake support	Kan Liang
	The Jasper Lake processor is also a Tremont microarchitecture. From the perspective of perf MSR, there is nothing changed compared with Elkhart Lake. Share the code path with Elkhart Lake. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/1601296242-32763-2-git-send-email-kan.liang@linux.intel.com
2020-09-29	perf/x86/intel: Add Jasper Lake support	Kan Liang
	The Jasper Lake processor is also a Tremont microarchitecture. From the perspective of Intel PMU, there is nothing changed compared with Elkhart Lake. Share the perf code with Elkhart Lake. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/1601296242-32763-1-git-send-email-kan.liang@linux.intel.com
2020-09-29	perf/x86/intel/uncore: Reduce the number of CBOX counters	Kan Liang
	An oops is triggered by the fuzzy test. [ 327.853081] unchecked MSR access error: RDMSR from 0x70c at rIP: 0xffffffffc082c820 (uncore_msr_read_counter+0x10/0x50 [intel_uncore]) [ 327.853083] Call Trace: [ 327.853085] <IRQ> [ 327.853089] uncore_pmu_event_start+0x85/0x170 [intel_uncore] [ 327.853093] uncore_pmu_event_add+0x1a4/0x410 [intel_uncore] [ 327.853097] ? event_sched_in.isra.118+0xca/0x240 There are 2 GP counters for each CBOX, but the current code claims 4 counters. Accessing the invalid registers triggers the oops. Fixes: 6e394376ee89 ("perf/x86/intel/uncore: Add Intel Icelake uncore support") Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20200925134905.8839-3-kan.liang@linux.intel.com
2020-09-29	perf/x86/intel/uncore: Update Ice Lake uncore units	Kan Liang
	There are some updates for the Icelake model specific uncore performance monitors. (The update can be found at 10th generation intel core processors families specification update Revision 004, ICL068) 1) Counter 0 of ARB uncore unit is not available for software use 2) The global 'enable bit' (bit 29) and 'freeze bit' (bit 31) of MSR_UNC_PERF_GLOBAL_CTRL cannot be used to control counter behavior. Needs to use local enable in event select MSR. Accessing the modified bit/registers will be ignored by HW. Users may observe inaccurate results with the current code. The changes of the MSR_UNC_PERF_GLOBAL_CTRL imply that groups cannot be read atomically anymore. Although the error of the result for a group becomes a bit bigger, it still far lower than not using a group. The group support is still kept. Only Remove the *_box() related implementation. Since the counter 0 of ARB uncore unit is not available, update the MSR address for the ARB uncore unit. There is no change for IMC uncore unit, which only include free-running counters. Fixes: 6e394376ee89 ("perf/x86/intel/uncore: Add Intel Icelake uncore support") Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20200925134905.8839-2-kan.liang@linux.intel.com
2020-09-29	perf/x86/intel/uncore: Split the Ice Lake and Tiger Lake MSR uncore support	Kan Liang
	Previously, the MSR uncore for the Ice Lake and Tiger Lake are identical. The code path is shared. However, with recent update, the global MSR_UNC_PERF_GLOBAL_CTRL register and ARB uncore unit are changed for the Ice Lake. Split the Ice Lake and Tiger Lake MSR uncore support. The changes only impact the MSR ops() and the ARB uncore unit. Other codes can still be shared between the Ice Lake and the Tiger Lake. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20200925134905.8839-1-kan.liang@linux.intel.com
2020-09-29	lockdep: Optimize the memory usage of circular queue	Boqun Feng
	Qian Cai reported a BFS_EQUEUEFULL warning [1] after read recursive deadlock detection merged into tip tree recently. Unlike the previous lockep graph searching, which iterate every lock class (every node in the graph) exactly once, the graph searching for read recurisve deadlock detection needs to iterate every lock dependency (every edge in the graph) once, as a result, the maximum memory cost of the circular queue changes from O(V), where V is the number of lock classes (nodes or vertices) in the graph, to O(E), where E is the number of lock dependencies (edges), because every lock class or dependency gets enqueued once in the BFS. Therefore we hit the BFS_EQUEUEFULL case. However, actually we don't need to enqueue all dependencies for the BFS, because every time we enqueue a dependency, we almostly enqueue all other dependencies in the same dependency list ("almostly" is because we currently check before enqueue, so if a dependency doesn't pass the check stage we won't enqueue it, however, we can always do in reverse ordering), based on this, we can only enqueue the first dependency from a dependency list and every time we want to fetch a new dependency to work, we can either: 1) fetch the dependency next to the current dependency in the dependency list or 2) if the dependency in 1) doesn't exist, fetch the dependency from the queue. With this approach, the "max bfs queue depth" for a x86_64_defconfig + lockdep and selftest config kernel can get descreased from: max bfs queue depth: 201 to (after apply this patch) max bfs queue depth: 61 While I'm at it, clean up the code logic a little (e.g. directly return other than set a "ret" value and goto the "exit" label). [1]: https://lore.kernel.org/lkml/17343f6f7f2438fc376125384133c5ba70c2a681.camel@redhat.com/ Reported-by: Qian Cai <cai@redhat.com> Reported-by: syzbot+62ebe501c1ce9a91f68c@syzkaller.appspotmail.com Signed-off-by: Boqun Feng <boqun.feng@gmail.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20200917080210.108095-1-boqun.feng@gmail.com
2020-09-28	ethtool: mark netlink family as __ro_after_init	Jakub Kicinski
	Like all genl families ethtool_genl_family needs to not be a straight up constant, because it's modified/initialized by genl_register_family(). After init, however, it's only passed to genlmsg_put() & co. therefore we can mark it as __ro_after_init. Since genl_family structure contains function pointers mark this as a fix. Fixes: 2b4a8990b7df ("ethtool: introduce ethtool netlink interface") Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-28	genetlink: add missing kdoc for validation flags	Jakub Kicinski
	Validation flags are missing kdoc, add it. Fixes: ef6243acb478 ("genetlink: optionally validate strictly/dumps") Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-28	net: usb: ax88179_178a: add MCT usb 3.0 adapter	Wilken Gottwalt
	Adds the driver_info and usb ids of the AX88179 based MCT U3-A9003 USB 3.0 ethernet adapter. Signed-off-by: Wilken Gottwalt <wilken.gottwalt@mailbox.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-28	net: usb: ax88179_178a: fix missing stop entry in driver_info	Wilken Gottwalt
	Adds the missing .stop entry in the Belkin driver_info structure. Fixes: e20bd60bf62a ("net: usb: asix88179_178a: Add support for the Belkin B2B128") Signed-off-by: Wilken Gottwalt <wilken.gottwalt@mailbox.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-28	Input: i8042 - add nopnp quirk for Acer Aspire 5 A515	Jiri Kosina
	Touchpad on this laptop is not detected properly during boot, as PNP enumerates (wrongly) AUX port as disabled on this machine. Fix that by adding this board (with admittedly quite funny DMI identifiers) to nopnp quirk list. Reported-by: Andrés Barrantes Silman <andresbs2000@protonmail.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz> Link: https://lore.kernel.org/r/nycvar.YFH.7.76.2009252337340.3336@cbobk.fhfr.pm Cc: stable@vger.kernel.org Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2020-09-28	Input: trackpoint - enable Synaptics trackpoints	Vincent Huang
	Add Synaptics IDs in trackpoint_start_protocol() to mark them as valid. Signed-off-by: Vincent Huang <vincent.huang@tw.synaptics.com> Fixes: 6c77545af100 ("Input: trackpoint - add new trackpoint variant IDs") Reviewed-by: Harry Cutts <hcutts@chromium.org> Tested-by: Harry Cutts <hcutts@chromium.org> Link: https://lore.kernel.org/r/20200924053013.1056953-1-vincent.huang@tw.synaptics.com Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2020-09-28	net: qrtr: ns: Protect radix_tree_deref_slot() using rcu read locks	Manivannan Sadhasivam
	The rcu read locks are needed to avoid potential race condition while dereferencing radix tree from multiple threads. The issue was identified by syzbot. Below is the crash report: ============================= WARNING: suspicious RCU usage 5.7.0-syzkaller #0 Not tainted ----------------------------- include/linux/radix-tree.h:176 suspicious rcu_dereference_check() usage! other info that might help us debug this: rcu_scheduler_active = 2, debug_locks = 1 2 locks held by kworker/u4:1/21: #0: ffff88821b097938 ((wq_completion)qrtr_ns_handler){+.+.}-{0:0}, at: spin_unlock_irq include/linux/spinlock.h:403 [inline] #0: ffff88821b097938 ((wq_completion)qrtr_ns_handler){+.+.}-{0:0}, at: process_one_work+0x6df/0xfd0 kernel/workqueue.c:2241 #1: ffffc90000dd7d80 ((work_completion)(&qrtr_ns.work)){+.+.}-{0:0}, at: process_one_work+0x71e/0xfd0 kernel/workqueue.c:2243 stack backtrace: CPU: 0 PID: 21 Comm: kworker/u4:1 Not tainted 5.7.0-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Workqueue: qrtr_ns_handler qrtr_ns_worker Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x1e9/0x30e lib/dump_stack.c:118 radix_tree_deref_slot include/linux/radix-tree.h:176 [inline] ctrl_cmd_new_lookup net/qrtr/ns.c:558 [inline] qrtr_ns_worker+0x2aff/0x4500 net/qrtr/ns.c:674 process_one_work+0x76e/0xfd0 kernel/workqueue.c:2268 worker_thread+0xa7f/0x1450 kernel/workqueue.c:2414 kthread+0x353/0x380 kernel/kthread.c:268 Fixes: 0c2204a4ad71 ("net: qrtr: Migrate nameservice to kernel from userspace") Reported-and-tested-by: syzbot+0f84f6eed90503da72fc@syzkaller.appspotmail.com Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-28	iommu/arm-smmu-v3: Add SVA device feature	Jean-Philippe Brucker
	Implement the IOMMU device feature callbacks to support the SVA feature. At the moment dev_has_feat() returns false since I/O Page Faults and BTM aren't yet implemented. Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20200918101852.582559-12-jean-philippe@linaro.org Signed-off-by: Will Deacon <will@kernel.org>
2020-09-28	iommu/arm-smmu-v3: Check for SVA features	Jean-Philippe Brucker
	Aggregate all sanity-checks for sharing CPU page tables with the SMMU under a single ARM_SMMU_FEAT_SVA bit. For PCIe SVA, users also need to check FEAT_ATS and FEAT_PRI. For platform SVA, they will have to check FEAT_STALLS. Introduce ARM_SMMU_FEAT_BTM (Broadcast TLB Maintenance), but don't enable it at the moment. Since the entire VMID space is shared with the CPU, enabling DVM (by clearing SMMU_CR2.PTM) could result in over-invalidation and affect performance of stage-2 mappings. Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Cc: Suzuki K Poulose <suzuki.poulose@arm.com> Link: https://lore.kernel.org/r/20200918101852.582559-11-jean-philippe@linaro.org Signed-off-by: Will Deacon <will@kernel.org>
2020-09-28	iommu/arm-smmu-v3: Seize private ASID	Jean-Philippe Brucker
	The SMMU has a single ASID space, the union of shared and private ASID sets. This means that the SMMU driver competes with the arch allocator for ASIDs. Shared ASIDs are those of Linux processes, allocated by the arch, and contribute in broadcast TLB maintenance. Private ASIDs are allocated by the SMMU driver and used for "classic" map/unmap DMA. They require command-queue TLB invalidations. When we pin down an mm_context and get an ASID that is already in use by the SMMU, it belongs to a private context. We used to simply abort the bind, but this is unfair to users that would be unable to bind a few seemingly random processes. Try to allocate a new private ASID for the context, and make the old ASID shared. Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20200918101852.582559-10-jean-philippe@linaro.org Signed-off-by: Will Deacon <will@kernel.org>
2020-09-28	iommu/arm-smmu-v3: Share process page tables	Jean-Philippe Brucker
	With Shared Virtual Addressing (SVA), we need to mirror CPU TTBR, TCR, MAIR and ASIDs in SMMU contexts. Each SMMU has a single ASID space split into two sets, shared and private. Shared ASIDs correspond to those obtained from the arch ASID allocator, and private ASIDs are used for "classic" map/unmap DMA. A possible conflict happens when trying to use a shared ASID that has already been allocated for private use by the SMMU driver. This will be addressed in a later patch by replacing the private ASID. At the moment we return -EBUSY. Each mm_struct shared with the SMMU will have a single context descriptor. Add a refcount to keep track of this. It will be protected by the global SVA lock. Introduce a new arm-smmu-v3-sva.c file and the CONFIG_ARM_SMMU_V3_SVA option to let users opt in SVA support. Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20200918101852.582559-9-jean-philippe@linaro.org Signed-off-by: Will Deacon <will@kernel.org>
2020-09-28	iommu/arm-smmu-v3: Move definitions to a header	Jean-Philippe Brucker
	Allow sharing structure definitions with the upcoming SVA support for Arm SMMUv3, by moving them to a separate header. We could surgically extract only what is needed but keeping all definitions in one place looks nicer. Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Reviewed-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20200918101852.582559-8-jean-philippe@linaro.org Signed-off-by: Will Deacon <will@kernel.org>
2020-09-28	iommu/io-pgtable-arm: Move some definitions to a header	Jean-Philippe Brucker
	Extract some of the most generic TCR defines, so they can be reused by the page table sharing code. Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Acked-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20200918101852.582559-6-jean-philippe@linaro.org Signed-off-by: Will Deacon <will@kernel.org>
2020-09-28	Merge branch 'for-next/svm' of ↵	Will Deacon
	git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux into for-joerg/arm-smmu/updates Pull in core arm64 changes required to enable Shared Virtual Memory (SVM) using SMMUv3. This brings us increasingly closer to being able to share page-tables directly between user-space tasks running on the CPU and their corresponding contexts on coherent devices performing DMA through the SMMU. Signed-off-by: Will Deacon <will@kernel.org>
2020-09-28	arm64: mte: Fix typo in memory tagging ABI documentation	Will Deacon
	We offer both PTRACE_PEEKMTETAGS and PTRACE_POKEMTETAGS requests via ptrace(). Reported-by: Andrey Konovalov <andreyknvl@google.com> Signed-off-by: Will Deacon <will@kernel.org>
2020-09-28	Merge branch 'net-core-fix-a-lockdep-splat-in-the-dev_addr_list'	David S. Miller
	Taehee Yoo says: ==================== net: core: fix a lockdep splat in the dev_addr_list. This patchset is to avoid lockdep splat. When a stacked interface graph is changed, netif_addr_lock() is called recursively and it internally calls spin_lock_nested(). The parameter of spin_lock_nested() is 'dev->lower_level', this is called subclass. The problem of 'dev->lower_level' is that while 'dev->lower_level' is being used as a subclass of spin_lock_nested(), its value can be changed. So, spin_lock_nested() would be called recursively with the same subclass value, the lockdep understands a deadlock. In order to avoid this, a new variable is needed and it is going to be used as a parameter of spin_lock_nested(). The first and second patch is a preparation patch for the third patch. In the third patch, the problem will be fixed. The first patch is to add __netdev_upper_dev_unlink(). An existed netdev_upper_dev_unlink() is renamed to __netdev_upper_dev_unlink(). and netdev_upper_dev_unlink() is added as an wrapper of this function. The second patch is to add the netdev_nested_priv structure. netdev_walk_all_{ upper \| lower }_dev() pass both private functions and "data" pointer to handle their own things. At this point, the data pointer type is void *. In order to make it easier to expand common variables and functions, this new netdev_nested_priv structure is added. The third patch is to add a new variable 'nested_level' into the net_device structure. This variable will be used as a parameter of spin_lock_nested() of dev->addr_list_lock. Due to this variable, it can avoid lockdep splat. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-28	net: core: add nested_level variable in net_device	Taehee Yoo
	This patch is to add a new variable 'nested_level' into the net_device structure. This variable will be used as a parameter of spin_lock_nested() of dev->addr_list_lock. netif_addr_lock() can be called recursively so spin_lock_nested() is used instead of spin_lock() and dev->lower_level is used as a parameter of spin_lock_nested(). But, dev->lower_level value can be updated while it is being used. So, lockdep would warn a possible deadlock scenario. When a stacked interface is deleted, netif_{uc \| mc}_sync() is called recursively. So, spin_lock_nested() is called recursively too. At this moment, the dev->lower_level variable is used as a parameter of it. dev->lower_level value is updated when interfaces are being unlinked/linked immediately. Thus, After unlinking, dev->lower_level shouldn't be a parameter of spin_lock_nested(). A (macvlan) \| B (vlan) \| C (bridge) \| D (macvlan) \| E (vlan) \| F (bridge) A->lower_level : 6 B->lower_level : 5 C->lower_level : 4 D->lower_level : 3 E->lower_level : 2 F->lower_level : 1 When an interface 'A' is removed, it releases resources. At this moment, netif_addr_lock() would be called. Then, netdev_upper_dev_unlink() is called recursively. Then dev->lower_level is updated. There is no problem. But, when the bridge module is removed, 'C' and 'F' interfaces are removed at once. If 'F' is removed first, a lower_level value is like below. A->lower_level : 5 B->lower_level : 4 C->lower_level : 3 D->lower_level : 2 E->lower_level : 1 F->lower_level : 1 Then, 'C' is removed. at this moment, netif_addr_lock() is called recursively. The ordering is like this. C(3)->D(2)->E(1)->F(1) At this moment, the lower_level value of 'E' and 'F' are the same. So, lockdep warns a possible deadlock scenario. In order to avoid this problem, a new variable 'nested_level' is added. This value is the same as dev->lower_level - 1. But this value is updated in rtnl_unlock(). So, this variable can be used as a parameter of spin_lock_nested() safely in the rtnl context. Test commands: ip link add br0 type bridge vlan_filtering 1 ip link add vlan1 link br0 type vlan id 10 ip link add macvlan2 link vlan1 type macvlan ip link add br3 type bridge vlan_filtering 1 ip link set macvlan2 master br3 ip link add vlan4 link br3 type vlan id 10 ip link add macvlan5 link vlan4 type macvlan ip link add br6 type bridge vlan_filtering 1 ip link set macvlan5 master br6 ip link add vlan7 link br6 type vlan id 10 ip link add macvlan8 link vlan7 type macvlan ip link set br0 up ip link set vlan1 up ip link set macvlan2 up ip link set br3 up ip link set vlan4 up ip link set macvlan5 up ip link set br6 up ip link set vlan7 up ip link set macvlan8 up modprobe -rv bridge Splat looks like: [ 36.057436][ T744] WARNING: possible recursive locking detected [ 36.058848][ T744] 5.9.0-rc6+ #728 Not tainted [ 36.059959][ T744] -------------------------------------------- [ 36.061391][ T744] ip/744 is trying to acquire lock: [ 36.062590][ T744] ffff8c4767509280 (&vlan_netdev_addr_lock_key){+...}-{2:2}, at: dev_set_rx_mode+0x19/0x30 [ 36.064922][ T744] [ 36.064922][ T744] but task is already holding lock: [ 36.066626][ T744] ffff8c4767769280 (&vlan_netdev_addr_lock_key){+...}-{2:2}, at: dev_uc_add+0x1e/0x60 [ 36.068851][ T744] [ 36.068851][ T744] other info that might help us debug this: [ 36.070731][ T744] Possible unsafe locking scenario: [ 36.070731][ T744] [ 36.072497][ T744] CPU0 [ 36.073238][ T744] ---- [ 36.074007][ T744] lock(&vlan_netdev_addr_lock_key); [ 36.075290][ T744] lock(&vlan_netdev_addr_lock_key); [ 36.076590][ T744] [ 36.076590][ T744] * DEADLOCK * [ 36.076590][ T744] [ 36.078515][ T744] May be due to missing lock nesting notation [ 36.078515][ T744] [ 36.080491][ T744] 3 locks held by ip/744: [ 36.081471][ T744] #0: ffffffff98571df0 (rtnl_mutex){+.+.}-{3:3}, at: rtnetlink_rcv_msg+0x236/0x490 [ 36.083614][ T744] #1: ffff8c4767769280 (&vlan_netdev_addr_lock_key){+...}-{2:2}, at: dev_uc_add+0x1e/0x60 [ 36.085942][ T744] #2: ffff8c476c8da280 (&bridge_netdev_addr_lock_key/4){+...}-{2:2}, at: dev_uc_sync+0x39/0x80 [ 36.088400][ T744] [ 36.088400][ T744] stack backtrace: [ 36.089772][ T744] CPU: 6 PID: 744 Comm: ip Not tainted 5.9.0-rc6+ #728 [ 36.091364][ T744] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014 [ 36.093630][ T744] Call Trace: [ 36.094416][ T744] dump_stack+0x77/0x9b [ 36.095385][ T744] __lock_acquire+0xbc3/0x1f40 [ 36.096522][ T744] lock_acquire+0xb4/0x3b0 [ 36.097540][ T744] ? dev_set_rx_mode+0x19/0x30 [ 36.098657][ T744] ? rtmsg_ifinfo+0x1f/0x30 [ 36.099711][ T744] ? __dev_notify_flags+0xa5/0xf0 [ 36.100874][ T744] ? rtnl_is_locked+0x11/0x20 [ 36.101967][ T744] ? __dev_set_promiscuity+0x7b/0x1a0 [ 36.103230][ T744] _raw_spin_lock_bh+0x38/0x70 [ 36.104348][ T744] ? dev_set_rx_mode+0x19/0x30 [ 36.105461][ T744] dev_set_rx_mode+0x19/0x30 [ 36.106532][ T744] dev_set_promiscuity+0x36/0x50 [ 36.107692][ T744] __dev_set_promiscuity+0x123/0x1a0 [ 36.108929][ T744] dev_set_promiscuity+0x1e/0x50 [ 36.110093][ T744] br_port_set_promisc+0x1f/0x40 [bridge] [ 36.111415][ T744] br_manage_promisc+0x8b/0xe0 [bridge] [ 36.112728][ T744] __dev_set_promiscuity+0x123/0x1a0 [ 36.113967][ T744] ? __hw_addr_sync_one+0x23/0x50 [ 36.115135][ T744] __dev_set_rx_mode+0x68/0x90 [ 36.116249][ T744] dev_uc_sync+0x70/0x80 [ 36.117244][ T744] dev_uc_add+0x50/0x60 [ 36.118223][ T744] macvlan_open+0x18e/0x1f0 [macvlan] [ 36.119470][ T744] __dev_open+0xd6/0x170 [ 36.120470][ T744] __dev_change_flags+0x181/0x1d0 [ 36.121644][ T744] dev_change_flags+0x23/0x60 [ 36.122741][ T744] do_setlink+0x30a/0x11e0 [ 36.123778][ T744] ? __lock_acquire+0x92c/0x1f40 [ 36.124929][ T744] ? __nla_validate_parse.part.6+0x45/0x8e0 [ 36.126309][ T744] ? __lock_acquire+0x92c/0x1f40 [ 36.127457][ T744] __rtnl_newlink+0x546/0x8e0 [ 36.128560][ T744] ? lock_acquire+0xb4/0x3b0 [ 36.129623][ T744] ? deactivate_slab.isra.85+0x6a1/0x850 [ 36.130946][ T744] ? __lock_acquire+0x92c/0x1f40 [ 36.132102][ T744] ? lock_acquire+0xb4/0x3b0 [ 36.133176][ T744] ? is_bpf_text_address+0x5/0xe0 [ 36.134364][ T744] ? rtnl_newlink+0x2e/0x70 [ 36.135445][ T744] ? rcu_read_lock_sched_held+0x32/0x60 [ 36.136771][ T744] ? kmem_cache_alloc_trace+0x2d8/0x380 [ 36.138070][ T744] ? rtnl_newlink+0x2e/0x70 [ 36.139164][ T744] rtnl_newlink+0x47/0x70 [ ... ] Fixes: 845e0ebb4408 ("net: change addr_list_lock back to static key") Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-28	net: core: introduce struct netdev_nested_priv for nested interface ↵	Taehee Yoo
	infrastructure Functions related to nested interface infrastructure such as netdev_walk_all_{ upper \| lower }_dev() pass both private functions and "data" pointer to handle their own things. At this point, the data pointer type is void *. In order to make it easier to expand common variables and functions, this new netdev_nested_priv structure is added. In the following patch, a new member variable will be added into this struct to fix the lockdep issue. Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-28	net: core: add __netdev_upper_dev_unlink()	Taehee Yoo
	The netdev_upper_dev_unlink() has to work differently according to flags. This idea is the same with __netdev_upper_dev_link(). In the following patches, new flags will be added. Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-28	iommu/arm-smmu-v3: Ensure queue is read after updating prod pointer	Zhou Wang
	Reading the 'prod' MMIO register in order to determine whether or not there is valid data beyond 'cons' for a given queue does not provide sufficient dependency ordering, as the resulting access is address dependent only on 'cons' and can therefore be speculated ahead of time, potentially allowing stale data to be read by the CPU. Use readl() instead of readl_relaxed() when updating the shadow copy of the 'prod' pointer, so that all speculated memory reads from the corresponding queue can occur only from valid slots. Signed-off-by: Zhou Wang <wangzhou1@hisilicon.com> Link: https://lore.kernel.org/r/1601281922-117296-1-git-send-email-wangzhou1@hisilicon.com [will: Use readl() instead of explicit barrier. Update 'cons' side to match.] Signed-off-by: Will Deacon <will@kernel.org>
2020-09-28	fscrypt: export fscrypt_d_revalidate()	Eric Biggers
	Dentries that represent no-key names must have a dentry_operations that includes fscrypt_d_revalidate(). Currently, this is handled by fscrypt_prepare_lookup() installing fscrypt_d_ops. However, ceph support for encryption (https://lore.kernel.org/r/20200914191707.380444-1-jlayton@kernel.org) can't use fscrypt_d_ops, since ceph already has its own dentry_operations. Similarly, ext4 and f2fs support for directories that are both encrypted and casefolded (https://lore.kernel.org/r/20200923010151.69506-1-drosen@google.com) can't use fscrypt_d_ops either, since casefolding requires some dentry operations too. To satisfy both users, we need to move the responsibility of installing the dentry_operations to filesystems. In preparation for this, export fscrypt_d_revalidate() and give it a !CONFIG_FS_ENCRYPTION stub. Reviewed-by: Jeff Layton <jlayton@kernel.org> Link: https://lore.kernel.org/r/20200924054721.187797-1-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com>
2020-09-28	Documentation: Chinese translation of Documentation/arm64/amu.rst	Bailu Lin
	This is a Chinese translated version of Documentation/arm64/amu.rst Signed-off-by: Bailu Lin <bailu.lin@vivo.com> Reviewed-by: Alex Shi <alex.shi@linux.alibaba.com> Link: https://lore.kernel.org/r/20200926025233.47214-1-bailu.lin@vivo.com Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2020-09-28	arm64: cpufeature: Export symbol read_sanitised_ftr_reg()	Jean-Philippe Brucker
	The SMMUv3 driver would like to read the MMFR0 PARANGE field in order to share CPU page tables with devices. Allow the driver to be built as module by exporting the read_sanitized_ftr_reg() cpufeature symbol. Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Acked-by: Suzuki K Poulose <suzuki.poulose@arm.com> Link: https://lore.kernel.org/r/20200918101852.582559-7-jean-philippe@linaro.org Signed-off-by: Will Deacon <will@kernel.org>
2020-09-28	doc: zh_CN: index files in arm64 subdirectory	Bailu Lin
	Add arm64 subdirectory into the table of Contents for zh_CN, then add other translations in arm64 conveniently. Signed-off-by: Bailu Lin <bailu.lin@vivo.com> Reviewed-by: Alex Shi <alex.shi@linux.alibaba.com> Link: https://lore.kernel.org/r/20200926022558.46232-1-bailu.lin@vivo.com Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2020-09-28	mailmap: add entry for <mstarovoitov@marvell.com>	Mark Starovoytov
	Map the address to my private mail, because my Marvell account has been suspended. Signed-off-by: Mark Starovoytov <mstarovo@pm.me> Link: https://lore.kernel.org/r/20200928183948.589-1-mstarovo@pm.me Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2020-09-28	doc: seq_file: clarify role of *pos in ->next()	NeilBrown
	There are behavioural requirements on the seq_file next() function in terms of how it updates *pos at end-of-file, and these are now enforced by a warning. I was recently attempting to justify the reason this was needed, and couldn't remember the details, and didn't find them in the documentation. So I re-read the code until I understood it again, and updated the documentation to match. I also enhanced the text about SEQ_START_TOKEN as it seemed potentially misleading. Cc: Vasily Averin <vvs@virtuozzo.com> Signed-off-by: NeilBrown <neilb@suse.de> Link: https://lore.kernel.org/r/87eemqiazh.fsf@notabene.neil.brown.name Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2020-09-28	arm64: mm: Pin down ASIDs for sharing mm with devices	Jean-Philippe Brucker
	To enable address space sharing with the IOMMU, introduce arm64_mm_context_get() and arm64_mm_context_put(), that pin down a context and ensure that it will keep its ASID after a rollover. Export the symbols to let the modular SMMUv3 driver use them. Pinning is necessary because a device constantly needs a valid ASID, unlike tasks that only require one when running. Without pinning, we would need to notify the IOMMU when we're about to use a new ASID for a task, and it would get complicated when a new task is assigned a shared ASID. Consider the following scenario with no ASID pinned: 1. Task t1 is running on CPUx with shared ASID (gen=1, asid=1) 2. Task t2 is scheduled on CPUx, gets ASID (1, 2) 3. Task tn is scheduled on CPUy, a rollover occurs, tn gets ASID (2, 1) We would now have to immediately generate a new ASID for t1, notify the IOMMU, and finally enable task tn. We are holding the lock during all that time, since we can't afford having another CPU trigger a rollover. The IOMMU issues invalidation commands that can take tens of milliseconds. It gets needlessly complicated. All we wanted to do was schedule task tn, that has no business with the IOMMU. By letting the IOMMU pin tasks when needed, we avoid stalling the slow path, and let the pinning fail when we're out of shareable ASIDs. After a rollover, the allocator expects at least one ASID to be available in addition to the reserved ones (one per CPU). So (NR_ASIDS - NR_CPUS - 1) is the maximum number of ASIDs that can be shared with the IOMMU. Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20200918101852.582559-5-jean-philippe@linaro.org Signed-off-by: Will Deacon <will@kernel.org>
2020-09-28	firmware: arm_sdei: Remove _sdei_event_unregister()	Gavin Shan
	_sdei_event_unregister() is called by sdei_event_unregister() and sdei_device_freeze(). _sdei_event_unregister() covers the shared and private events, but sdei_device_freeze() only covers the shared events. So the logic to cover the private events isn't needed by sdei_device_freeze(). sdei_event_unregister sdei_device_freeze _sdei_event_unregister sdei_unregister_shared _sdei_event_unregister This removes _sdei_event_unregister(). Its logic is moved to its callers accordingly. This shouldn't cause any logical changes. Signed-off-by: Gavin Shan <gshan@redhat.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: James Morse <james.morse@arm.com> Link: https://lore.kernel.org/r/20200922130423.10173-14-gshan@redhat.com Signed-off-by: Will Deacon <will@kernel.org>
2020-09-28	firmware: arm_sdei: Remove _sdei_event_register()	Gavin Shan
	The function _sdei_event_register() is called by sdei_event_register() and sdei_device_thaw() as the following functional call chain shows. _sdei_event_register() covers the shared and private events, but sdei_device_thaw() only covers the shared events. So the logic to cover the private events in _sdei_event_register() isn't needed by sdei_device_thaw(). Similarly, sdei_reregister_event_llocked() covers the shared and private events in the regard of reenablement. The logic to cover the private events isn't needed by sdei_device_thaw() either. sdei_event_register sdei_device_thaw _sdei_event_register sdei_reregister_shared sdei_reregister_event_llocked _sdei_event_register This removes _sdei_event_register() and sdei_reregister_event_llocked(). Their logic is moved to sdei_event_register() and sdei_reregister_shared(). This shouldn't cause any logical changes. Signed-off-by: Gavin Shan <gshan@redhat.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: James Morse <james.morse@arm.com> Link: https://lore.kernel.org/r/20200922130423.10173-13-gshan@redhat.com Signed-off-by: Will Deacon <will@kernel.org>
2020-09-28	firmware: arm_sdei: Introduce sdei_do_local_call()	Gavin Shan
	During the CPU hotplug, the private events are registered, enabled or unregistered on the specific CPU. It repeats the same steps: initializing cross call argument, make function call on local CPU, check the returned error. This introduces sdei_do_local_call() to cover the first steps. The other benefit is to make CROSSCALL_INIT and struct sdei_crosscall_args are only visible to sdei_do_{cross, local}_call(). Signed-off-by: Gavin Shan <gshan@redhat.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: James Morse <james.morse@arm.com> Link: https://lore.kernel.org/r/20200922130423.10173-12-gshan@redhat.com Signed-off-by: Will Deacon <will@kernel.org>
2020-09-28	firmware: arm_sdei: Cleanup on cross call function	Gavin Shan
	This applies cleanup on the cross call functions, no functional changes are introduced: * Wrap the code block of CROSSCALL_INIT inside "do { } while (0)" as linux kernel usually does. Otherwise, scripts/checkpatch.pl reports warning regarding this. * Use smp_call_func_t for @fn argument in sdei_do_cross_call() as the function is called on target CPU(s). * Remove unnecessary space before @event in sdei_do_cross_call() Signed-off-by: Gavin Shan <gshan@redhat.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: James Morse <james.morse@arm.com> Link: https://lore.kernel.org/r/20200922130423.10173-11-gshan@redhat.com Signed-off-by: Will Deacon <will@kernel.org>
2020-09-28	firmware: arm_sdei: Remove while loop in sdei_event_unregister()	Gavin Shan
	This removes the unnecessary while loop in sdei_event_unregister() because of the following two reasons. This shouldn't cause any functional changes. * The while loop is executed for once, meaning it's not needed in theory. * With the while loop removed, the nested statements can be avoid to make the code a bit cleaner. Signed-off-by: Gavin Shan <gshan@redhat.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20200922130423.10173-10-gshan@redhat.com Signed-off-by: Will Deacon <will@kernel.org>
2020-09-28	firmware: arm_sdei: Remove while loop in sdei_event_register()	Gavin Shan
	This removes the unnecessary while loop in sdei_event_register() because of the following two reasons. This shouldn't cause any functional changes. * The while loop is executed for once, meaning it's not needed in theory. * With the while loop removed, the nested statements can be avoid to make the code a bit cleaner. Signed-off-by: Gavin Shan <gshan@redhat.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20200922130423.10173-9-gshan@redhat.com Signed-off-by: Will Deacon <will@kernel.org>
2020-09-28	firmware: arm_sdei: Remove redundant error message in sdei_probe()	Gavin Shan
	This removes the redundant error message in sdei_probe() because the case can be identified from the errno in next error message. Signed-off-by: Gavin Shan <gshan@redhat.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Acked-by: James Morse <james.morse@arm.com> Link: https://lore.kernel.org/r/20200922130423.10173-8-gshan@redhat.com Signed-off-by: Will Deacon <will@kernel.org>
2020-09-28	firmware: arm_sdei: Remove duplicate check in sdei_get_conduit()	Gavin Shan
	The following two checks are duplicate because @acpi_disabled doesn't depend on CONFIG_ACPI. So the duplicate check (IS_ENABLED(CONFIG_ACPI)) can be dropped. More details is provided to keep the commit log complete: * @acpi_disabled is defined in arch/arm64/kernel/acpi.c when CONFIG_ACPI is enabled. * @acpi_disabled in defined in include/acpi.h when CONFIG_ACPI is disabled. Signed-off-by: Gavin Shan <gshan@redhat.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Acked-by: James Morse <james.morse@arm.com> Link: https://lore.kernel.org/r/20200922130423.10173-7-gshan@redhat.com Signed-off-by: Will Deacon <will@kernel.org>
2020-09-28	firmware: arm_sdei: Unregister driver on error in sdei_init()	Gavin Shan
	The SDEI platform device is created from device-tree node or ACPI (SDEI) table. For the later case, the platform device is created explicitly by this module. It'd better to unregister the driver on failure to create the device to keep the symmetry. The driver, owned by this module, isn't needed if the device isn't existing. Besides, the errno (@ret) should be updated accordingly in this case. Signed-off-by: Gavin Shan <gshan@redhat.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: James Morse <james.morse@arm.com> Link: https://lore.kernel.org/r/20200922130423.10173-6-gshan@redhat.com Signed-off-by: Will Deacon <will@kernel.org>
2020-09-28	firmware: arm_sdei: Avoid nested statements in sdei_init()	Gavin Shan
	In sdei_init(), the nested statements can be avoided by bailing on error from platform_driver_register() or absent ACPI SDEI table. With it, the code looks a bit more readable. Signed-off-by: Gavin Shan <gshan@redhat.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: James Morse <james.morse@arm.com> Link: https://lore.kernel.org/r/20200922130423.10173-5-gshan@redhat.com Signed-off-by: Will Deacon <will@kernel.org>
2020-09-28	firmware: arm_sdei: Retrieve event number from event instance	Gavin Shan
	In sdei_event_create(), the event number is retrieved from the variable @event_num for the shared event. The event number was stored in the event instance. So we can fetch it from the event instance, similar to what we're doing for the private event. Signed-off-by: Gavin Shan <gshan@redhat.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Acked-by: James Morse <james.morse@arm.com> Link: https://lore.kernel.org/r/20200922130423.10173-4-gshan@redhat.com Signed-off-by: Will Deacon <will@kernel.org>
2020-09-28	firmware: arm_sdei: Common block for failing path in sdei_event_create()	Gavin Shan
	There are multiple calls of kfree(event) in the failing path of sdei_event_create() to free the SDEI event. It means we need to call it again when adding more code in the failing path. It's prone to miss doing that and introduce memory leakage. This introduces common block for failing path in sdei_event_create() to resolve the issue. This shouldn't cause functional changes. Signed-off-by: Gavin Shan <gshan@redhat.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Acked-by: James Morse <james.morse@arm.com> Link: https://lore.kernel.org/r/20200922130423.10173-3-gshan@redhat.com Signed-off-by: Will Deacon <will@kernel.org>
2020-09-28	firmware: arm_sdei: Remove sdei_is_err()	Gavin Shan
	sdei_is_err() is only called by sdei_to_linux_errno(). The logic of checking on the error number is common to them. They can be combined finely. This removes sdei_is_err() and its logic is combined to the function sdei_to_linux_errno(). Also, the assignment of @err to zero is also dropped in invoke_sdei_fn() because it's always overridden afterwards. This shouldn't cause functional changes. Signed-off-by: Gavin Shan <gshan@redhat.com> Reviewed-by: James Morse <james.morse@arm.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20200922130423.10173-2-gshan@redhat.com Signed-off-by: Will Deacon <will@kernel.org>