summaryrefslogtreecommitdiff
path: root/include
AgeCommit message (Collapse)Author
2019-10-29KVM: arm/arm64: vgic: Remove the declaration of kvm_send_userspace_msi()Zenghui Yu
The callsite of kvm_send_userspace_msi() is currently arch agnostic. There seems no reason to keep an extra declaration of it in arm_vgic.h (we already have one in include/linux/kvm_host.h). Remove it. Signed-off-by: Zenghui Yu <yuzenghui@huawei.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Eric Auger <eric.auger@redhat.com> Link: https://lore.kernel.org/r/20191029071919.177-2-yuzenghui@huawei.com
2019-10-29regulator: fixed: add off-on-delayPeng Fan
Depends on board design, the gpio controlling regulator may connects with a big capacitance. When need off, it takes some time to let the regulator to be truly off. If not add enough delay, the regulator might have always been on, so introduce off-on-delay to handle such case. Signed-off-by: Peng Fan <peng.fan@nxp.com> Link: https://lore.kernel.org/r/1572311875-22880-3-git-send-email-peng.fan@nxp.com Signed-off-by: Mark Brown <broonie@kernel.org>
2019-10-29drm: bridge: dw-hdmi: Report connector status using callbackCheng-Yi Chiang
Allow codec driver register callback function for plug event. The callback registration flow: dw-hdmi <--- hw-hdmi-i2s-audio <--- hdmi-codec dw-hdmi-i2s-audio implements hook_plugged_cb op so codec driver can register the callback. dw-hdmi exports a function dw_hdmi_set_plugged_cb so platform device can register the callback. When connector plug/unplug event happens, report this event using the callback. Make sure that audio and drm are using the single source of truth for connector status. Signed-off-by: Cheng-Yi Chiang <cychiang@chromium.org> Link: https://lore.kernel.org/r/20191028071930.145899-2-cychiang@chromium.org Signed-off-by: Mark Brown <broonie@kernel.org>
2019-10-29resource: Add a resource_list_first_type helperRob Herring
A common pattern is looping over a resource_list just to get a matching entry with a specific type. Add resource_list_first_type() helper which implements this. Signed-off-by: Rob Herring <robh@kernel.org> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
2019-10-29sched/kcpustat: Introduce vtime-aware kcpustat accessor for CPUTIME_SYSTEMFrederic Weisbecker
Kcpustat is not correctly supported on nohz_full CPUs. The tick doesn't fire and the cputime therefore doesn't move forward. The issue has shown up after the vanishing of the remaining 1Hz which has made the stall visible. We are solving that with checking the task running on a CPU through RCU and reading its vtime delta that we add to the raw kcpustat values. We make sure that we fetch a coherent raw-kcpustat/vtime-delta couple sequence while checking that the CPU referred by the target vtime is the correct one, under the locked vtime seqcount. Only CPUTIME_SYSTEM is handled here as a start because it's the trivial case. User and guest time will require more preparation work to correctly handle niceness. Reported-by: Yauheni Kaliuta <yauheni.kaliuta@redhat.com> Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@surriel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Wanpeng Li <wanpengli@tencent.com> Link: https://lkml.kernel.org/r/20191025020303.19342-1-frederic@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-10-29context_tracking: Check static key on context_tracking_enabled_*cpu()Frederic Weisbecker
guest_enter() doesn't call context_tracking_enabled() before calling context_tracking_enabled_this_cpu(). Therefore the guest code doesn't benefit from the static key on the fast path. Just make sure that context_tracking_enabled_*cpu() functions check the static key by themselves to propagate the optimization. Reported-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Jacek Anaszewski <jacek.anaszewski@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Pavel Machek <pavel@ucw.cz> Cc: Rafael J . Wysocki <rjw@rjwysocki.net> Cc: Rik van Riel <riel@surriel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Viresh Kumar <viresh.kumar@linaro.org> Cc: Wanpeng Li <wanpengli@tencent.com> Cc: Yauheni Kaliuta <yauheni.kaliuta@redhat.com> Link: https://lkml.kernel.org/r/20191016025700.31277-11-frederic@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-10-29sched/vtime: Introduce vtime_accounting_enabled_cpu()Frederic Weisbecker
This allows us to check if a remote CPU runs vtime accounting (ie: is nohz_full). We'll need that to reliably support reading kcpustat on nohz_full CPUs. Also simplify a bit the condition in the local flavoured function while at it. Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Jacek Anaszewski <jacek.anaszewski@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Pavel Machek <pavel@ucw.cz> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rafael J . Wysocki <rjw@rjwysocki.net> Cc: Rik van Riel <riel@surriel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Viresh Kumar <viresh.kumar@linaro.org> Cc: Wanpeng Li <wanpengli@tencent.com> Cc: Yauheni Kaliuta <yauheni.kaliuta@redhat.com> Link: https://lkml.kernel.org/r/20191016025700.31277-10-frederic@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-10-29sched/vtime: Rename vtime_accounting_cpu_enabled() to ↵Frederic Weisbecker
vtime_accounting_enabled_this_cpu() Standardize the naming on top of the vtime_accounting_enabled_*() base. Also make it clear we are checking the vtime state of the *current* CPU with this function. We'll need to add an API to check that state on remote CPUs as well, so we must disambiguate the naming. Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Jacek Anaszewski <jacek.anaszewski@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Pavel Machek <pavel@ucw.cz> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rafael J . Wysocki <rjw@rjwysocki.net> Cc: Rik van Riel <riel@surriel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Viresh Kumar <viresh.kumar@linaro.org> Cc: Wanpeng Li <wanpengli@tencent.com> Cc: Yauheni Kaliuta <yauheni.kaliuta@redhat.com> Link: https://lkml.kernel.org/r/20191016025700.31277-9-frederic@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-10-29context_tracking: Introduce context_tracking_enabled_cpu()Frederic Weisbecker
This allows us to check if a remote CPU runs context tracking (ie: is nohz_full). We'll need that to reliably support "nice" accounting on kcpustat. Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Jacek Anaszewski <jacek.anaszewski@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Pavel Machek <pavel@ucw.cz> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rafael J . Wysocki <rjw@rjwysocki.net> Cc: Rik van Riel <riel@surriel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Viresh Kumar <viresh.kumar@linaro.org> Cc: Wanpeng Li <wanpengli@tencent.com> Cc: Yauheni Kaliuta <yauheni.kaliuta@redhat.com> Link: https://lkml.kernel.org/r/20191016025700.31277-8-frederic@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-10-29context_tracking: Rename context_tracking_is_cpu_enabled() to ↵Frederic Weisbecker
context_tracking_enabled_this_cpu() Standardize the naming on top of the context_tracking_enabled_*() base. Also make it clear we are checking the context tracking state of the *current* CPU with this function. We'll need to add an API to check that state on remote CPUs as well, so we must disambiguate the naming. Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Jacek Anaszewski <jacek.anaszewski@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Pavel Machek <pavel@ucw.cz> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rafael J . Wysocki <rjw@rjwysocki.net> Cc: Rik van Riel <riel@surriel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Viresh Kumar <viresh.kumar@linaro.org> Cc: Wanpeng Li <wanpengli@tencent.com> Cc: Yauheni Kaliuta <yauheni.kaliuta@redhat.com> Link: https://lkml.kernel.org/r/20191016025700.31277-7-frederic@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-10-29context_tracking: Rename context_tracking_is_enabled() => ↵Frederic Weisbecker
context_tracking_enabled() Remove the superfluous "is" in the middle of the name. We want to standardize the naming so that it can be expanded through suffixes: context_tracking_enabled() context_tracking_enabled_cpu() context_tracking_enabled_this_cpu() Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Jacek Anaszewski <jacek.anaszewski@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Pavel Machek <pavel@ucw.cz> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rafael J . Wysocki <rjw@rjwysocki.net> Cc: Rik van Riel <riel@surriel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Viresh Kumar <viresh.kumar@linaro.org> Cc: Wanpeng Li <wanpengli@tencent.com> Cc: Yauheni Kaliuta <yauheni.kaliuta@redhat.com> Link: https://lkml.kernel.org/r/20191016025700.31277-6-frederic@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-10-29context_tracking: Remove context_tracking_active()Frederic Weisbecker
This function is a leftover from old removal or rename. We can drop it. Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Jacek Anaszewski <jacek.anaszewski@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Pavel Machek <pavel@ucw.cz> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rafael J . Wysocki <rjw@rjwysocki.net> Cc: Rik van Riel <riel@surriel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Viresh Kumar <viresh.kumar@linaro.org> Cc: Wanpeng Li <wanpengli@tencent.com> Cc: Yauheni Kaliuta <yauheni.kaliuta@redhat.com> Link: https://lkml.kernel.org/r/20191016025700.31277-5-frederic@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-10-29sched/cputime: Add vtime guest task stateFrederic Weisbecker
Record guest as a VTIME state instead of guessing it from VTIME_SYS and PF_VCPU. This is going to simplify the cputime read side especially as its state machine is going to further expand in order to fully support kcpustat on nohz_full. Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Jacek Anaszewski <jacek.anaszewski@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Pavel Machek <pavel@ucw.cz> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rafael J . Wysocki <rjw@rjwysocki.net> Cc: Rik van Riel <riel@surriel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Viresh Kumar <viresh.kumar@linaro.org> Cc: Wanpeng Li <wanpengli@tencent.com> Cc: Yauheni Kaliuta <yauheni.kaliuta@redhat.com> Link: https://lkml.kernel.org/r/20191016025700.31277-4-frederic@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-10-29sched/cputime: Add vtime idle task stateFrederic Weisbecker
Record idle as a VTIME state instead of guessing it from VTIME_SYS and is_idle_task(). This is going to simplify the cputime read side especially as its state machine is going to further expand in order to fully support kcpustat on nohz_full. Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Jacek Anaszewski <jacek.anaszewski@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Pavel Machek <pavel@ucw.cz> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rafael J . Wysocki <rjw@rjwysocki.net> Cc: Rik van Riel <riel@surriel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Viresh Kumar <viresh.kumar@linaro.org> Cc: Wanpeng Li <wanpengli@tencent.com> Cc: Yauheni Kaliuta <yauheni.kaliuta@redhat.com> Link: https://lkml.kernel.org/r/20191016025700.31277-3-frederic@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-10-29sched/vtime: Record CPU under seqcount for kcpustat needsFrederic Weisbecker
In order to compute the kcpustat delta on a nohz CPU, we'll need to fetch the task running on that target. Checking that its vtime state snapshot actually refers to the relevant target involves recording that CPU under the seqcount locked on task switch. This is a step toward making kcpustat moving forward on full nohz CPUs. Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Jacek Anaszewski <jacek.anaszewski@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Pavel Machek <pavel@ucw.cz> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rafael J . Wysocki <rjw@rjwysocki.net> Cc: Rik van Riel <riel@surriel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Viresh Kumar <viresh.kumar@linaro.org> Cc: Wanpeng Li <wanpengli@tencent.com> Cc: Yauheni Kaliuta <yauheni.kaliuta@redhat.com> Link: https://lkml.kernel.org/r/20191016025700.31277-2-frederic@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-10-29dt-bindings: reset: Add Realtek RTD1295Andreas Färber
Add a header with symbolic reset indices for Realtek RTD1295 SoC. Naming was derived from reset-names in an OEM's Device Tree. Acked-by: Rob Herring <robh@kernel.org> [AF: Dropped RTD1295 specific binding definition, updated SPDX] Acked-by: Philipp Zabel <p.zabel@pengutronix.de> Signed-off-by: Andreas Färber <afaerber@suse.de>
2019-10-28net: dsa: Add support for devlink device parametersAndrew Lunn
Add plumbing to allow DSA drivers to register parameters with devlink. To keep with the abstraction, the DSA drivers pass the ds structure to these helpers, and the DSA core then translates that to the devlink structure associated to the device. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-10-28net: fix sk_page_frag() recursion from memory reclaimTejun Heo
sk_page_frag() optimizes skb_frag allocations by using per-task skb_frag cache when it knows it's the only user. The condition is determined by seeing whether the socket allocation mask allows blocking - if the allocation may block, it obviously owns the task's context and ergo exclusively owns current->task_frag. Unfortunately, this misses recursion through memory reclaim path. Please take a look at the following backtrace. [2] RIP: 0010:tcp_sendmsg_locked+0xccf/0xe10 ... tcp_sendmsg+0x27/0x40 sock_sendmsg+0x30/0x40 sock_xmit.isra.24+0xa1/0x170 [nbd] nbd_send_cmd+0x1d2/0x690 [nbd] nbd_queue_rq+0x1b5/0x3b0 [nbd] __blk_mq_try_issue_directly+0x108/0x1b0 blk_mq_request_issue_directly+0xbd/0xe0 blk_mq_try_issue_list_directly+0x41/0xb0 blk_mq_sched_insert_requests+0xa2/0xe0 blk_mq_flush_plug_list+0x205/0x2a0 blk_flush_plug_list+0xc3/0xf0 [1] blk_finish_plug+0x21/0x2e _xfs_buf_ioapply+0x313/0x460 __xfs_buf_submit+0x67/0x220 xfs_buf_read_map+0x113/0x1a0 xfs_trans_read_buf_map+0xbf/0x330 xfs_btree_read_buf_block.constprop.42+0x95/0xd0 xfs_btree_lookup_get_block+0x95/0x170 xfs_btree_lookup+0xcc/0x470 xfs_bmap_del_extent_real+0x254/0x9a0 __xfs_bunmapi+0x45c/0xab0 xfs_bunmapi+0x15/0x30 xfs_itruncate_extents_flags+0xca/0x250 xfs_free_eofblocks+0x181/0x1e0 xfs_fs_destroy_inode+0xa8/0x1b0 destroy_inode+0x38/0x70 dispose_list+0x35/0x50 prune_icache_sb+0x52/0x70 super_cache_scan+0x120/0x1a0 do_shrink_slab+0x120/0x290 shrink_slab+0x216/0x2b0 shrink_node+0x1b6/0x4a0 do_try_to_free_pages+0xc6/0x370 try_to_free_mem_cgroup_pages+0xe3/0x1e0 try_charge+0x29e/0x790 mem_cgroup_charge_skmem+0x6a/0x100 __sk_mem_raise_allocated+0x18e/0x390 __sk_mem_schedule+0x2a/0x40 [0] tcp_sendmsg_locked+0x8eb/0xe10 tcp_sendmsg+0x27/0x40 sock_sendmsg+0x30/0x40 ___sys_sendmsg+0x26d/0x2b0 __sys_sendmsg+0x57/0xa0 do_syscall_64+0x42/0x100 entry_SYSCALL_64_after_hwframe+0x44/0xa9 In [0], tcp_send_msg_locked() was using current->page_frag when it called sk_wmem_schedule(). It already calculated how many bytes can be fit into current->page_frag. Due to memory pressure, sk_wmem_schedule() called into memory reclaim path which called into xfs and then IO issue path. Because the filesystem in question is backed by nbd, the control goes back into the tcp layer - back into tcp_sendmsg_locked(). nbd sets sk_allocation to (GFP_NOIO | __GFP_MEMALLOC) which makes sense - it's in the process of freeing memory and wants to be able to, e.g., drop clean pages to make forward progress. However, this confused sk_page_frag() called from [2]. Because it only tests whether the allocation allows blocking which it does, it now thinks current->page_frag can be used again although it already was being used in [0]. After [2] used current->page_frag, the offset would be increased by the used amount. When the control returns to [0], current->page_frag's offset is increased and the previously calculated number of bytes now may overrun the end of allocated memory leading to silent memory corruptions. Fix it by adding gfpflags_normal_context() which tests sleepable && !reclaim and use it to determine whether to use current->task_frag. v2: Eric didn't like gfp flags being tested twice. Introduce a new helper gfpflags_normal_context() and combine the two tests. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Josef Bacik <josef@toxicpanda.com> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: stable@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>
2019-10-28ACPICA: Update version to 20191018Bob Moore
ACPICA commit 3d70fd4894824ed1e685f2d059ca22ccd9ac6163 Version 20191018. Link: https://github.com/acpica/acpica/commit/3d70fd48 Signed-off-by: Bob Moore <robert.moore@intel.com> Signed-off-by: Erik Schmauss <erik.schmauss@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2019-10-28ACPICA: make acpi_load_table() return table indexNikolaus Voss
ACPICA commit d1716a829d19be23277d9157c575a03b9abb7457 For unloading an ACPI table, it is necessary to provide the index of the table. The method intended for dynamically loading or hotplug addition of tables, acpi_load_table(), should provide this information via an optional pointer to the loaded table index. This patch fixes the table unload function of acpi_configfs. Reported-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Fixes: d06c47e3dd07f ("ACPI: configfs: Resolve objects on host-directed table loads") Link: https://github.com/acpica/acpica/commit/d1716a82 Signed-off-by: Nikolaus Voss <nikolaus.voss@loewensteinmedical.de> Signed-off-by: Erik Schmauss <erik.schmauss@intel.com> Signed-off-by: Bob Moore <robert.moore@intel.com> Tested-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2019-10-28ACPICA: Add new external interface, acpi_unload_table()Bob Moore
ACPICA commit c69369cd9cf0134e1aac516e97d612947daa8dc2 Unload a table via the table_index. Link: https://github.com/acpica/acpica/commit/c69369cd Signed-off-by: Bob Moore <robert.moore@intel.com> Signed-off-by: Erik Schmauss <erik.schmauss@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2019-10-28net: Fix various misspellings of "connect"Geert Uytterhoeven
Fix misspellings of "disconnect", "disconnecting", "connections", and "disconnected". Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Acked-by: Kalle Valo <kvalo@codeaurora.org> Acked-by: Simon Horman <horms@verge.net.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-10-28net: Fix misspellings of "configure" and "configuration"Geert Uytterhoeven
Fix various misspellings of "configuration" and "configure". Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Acked-by: Kalle Valo <kvalo@codeaurora.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-10-28net: add skb_queue_empty_lockless()Eric Dumazet
Some paths call skb_queue_empty() without holding the queue lock. We must use a barrier in order to not let the compiler do strange things, and avoid KCSAN splats. Adding a barrier in skb_queue_empty() might be overkill, I prefer adding a new helper to clearly identify points where the callers might be lockless. This might help us finding real bugs. The corresponding WRITE_ONCE() should add zero cost for current compilers. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-10-28Merge branch 'odp_rework' into rdma.git for-nextJason Gunthorpe
Jason Gunthorpe says: ==================== In order to hoist the interval tree code out of the drivers and into the mmu_notifiers it is necessary for the drivers to not use the interval tree for other things. This series replaces the interval tree with an xarray and along the way re-aligns all the locking to use a sensible SRCU model where the 'update' step is done by modifying an xarray. The result is overall much simpler and with less locking in the critical path. Many functions were reworked for clarity and small details like using 'imr' to refer to the implicit MR make the entire code flow here more readable. This also squashes at least two race bugs on its own, and quite possibily more that haven't been identified. ==================== Merge conflicts with the odp statistics patch resolved. * branch 'odp_rework': RDMA/odp: Remove broken debugging call to invalidate_range RDMA/mlx5: Do not race with mlx5_ib_invalidate_range during create and destroy RDMA/mlx5: Do not store implicit children in the odp_mkeys xarray RDMA/mlx5: Rework implicit ODP destroy RDMA/mlx5: Avoid double lookups on the pagefault path RDMA/mlx5: Reduce locking in implicit_mr_get_data() RDMA/mlx5: Use an xarray for the children of an implicit ODP RDMA/mlx5: Split implicit handling from pagefault_mr RDMA/mlx5: Set the HW IOVA of the child MRs to their place in the tree RDMA/mlx5: Lift implicit_mr_alloc() into the two routines that call it RDMA/mlx5: Rework implicit_mr_get_data RDMA/mlx5: Delete struct mlx5_priv->mkey_table RDMA/mlx5: Use a dedicated mkey xarray for ODP RDMA/mlx5: Split sig_err MR data into its own xarray RDMA/mlx5: Use SRCU properly in ODP prefetch Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-10-28RDMA/mlx5: Rework implicit ODP destroyJason Gunthorpe
Use SRCU in a sensible way by removing all MRs in the implicit tree from the two xarrays (the update operation), then a synchronize, followed by a normal single threaded teardown. This is only a little unusual from the normal pattern as there can still be some work pending in the unbound wq that may also require a workqueue flush. This is tracked with a single atomic, consolidating the redundant existing atomics and wait queue. For understand-ability the entire ODP implicit create/destroy flow now largely exists in a single pair of functions within odp.c, with a few support functions for tearing down an unused child. Link: https://lore.kernel.org/r/20191009160934.3143-13-jgg@ziepe.ca Reviewed-by: Artemy Kovalyov <artemyko@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-10-28RDMA/mlx5: Use an xarray for the children of an implicit ODPJason Gunthorpe
Currently the child leaves are stored in the shared interval tree and every lookup for a child must be done under the interval tree rwsem. This is further complicated by dropping the rwsem during iteration (ie the odp_lookup(), odp_next() pattern), which requires a very tricky an difficult to understand locking scheme with SRCU. Instead reserve the interval tree for the exclusive use of the mmu notifier related code in umem_odp.c and give each implicit MR a xarray containing all the child MRs. Since the size of each child is 1GB of VA, a 1 level xarray will index 64G of VA, and a 2 level will index 2TB, making xarray a much better data structure choice than an interval tree. The locking properties of xarray will be used in the next patches to rework the implicit ODP locking scheme into something simpler. At this point, the xarray is locked by the implicit MR's umem_mutex, and read can also be locked by the odp_srcu. Link: https://lore.kernel.org/r/20191009160934.3143-10-jgg@ziepe.ca Reviewed-by: Artemy Kovalyov <artemyko@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-10-28RDMA/mlx5: Delete struct mlx5_priv->mkey_tableJason Gunthorpe
No users are left, delete it. Link: https://lore.kernel.org/r/20191009160934.3143-5-jgg@ziepe.ca Reviewed-by: Artemy Kovalyov <artemyko@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-10-28Merge tag 'v5.4-rc5' into rdma.git for-nextJason Gunthorpe
Linux 5.4-rc5 For dependencies in the next patches Conflict resolved by keeping the delete of the unlock. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-10-28seccomp: rework define for SECCOMP_USER_NOTIF_FLAG_CONTINUEChristian Brauner
Switch from BIT(0) to (1UL << 0). First, there are already two different forms used in the header, so there's no need to add a third. Second, the BIT() macros is kernel internal and afaict not actually exposed to userspace. Maybe there's some magic there I'm missing but it definitely causes issues when compiling a program that tries to use SECCOMP_USER_NOTIF_FLAG_CONTINUE. It currently fails in the following way: # github.com/lxc/lxd/lxd /usr/bin/ld: $WORK/b001/_x003.o: in function `__do_user_notification_continue': lxd/main_checkfeature.go:240: undefined reference to `BIT' collect2: error: ld returned 1 exit status Switching to (1UL << 0) should prevent that and is more in line what is already done in the rest of the header. Cc: Kees Cook <keescook@chromium.org> Cc: Andy Lutomirski <luto@amacapital.net> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com> Link: https://lore.kernel.org/r/20191024212539.4059-1-christian.brauner@ubuntu.com Signed-off-by: Kees Cook <keescook@chromium.org>
2019-10-28RDMA/vmw_pvrdma: Use resource ids from physical device if availableBryan Tan
This change allows the RDMA stack to use physical resource numbers if they are passed up from the device. This is accomplished by separating the concept of the QP number from the QP handle. Previously, the two were the same, as the QP number was exposed to the guest and also used to reference a virtual QP in the device backend. With physical resource numbers exposed, the QP number given to the guest is the number assigned from the physical HCA's QP, while the QP handle is still the internal handle used to reference a virtual QP. Regardless of whether the device is exposing physical ids, the driver will still try to pick up the QP handle from the backend if possible. The MR keys exposed to the guest will also be the MR keys created by the physical HCA, instead of virtual MR keys. The distinction between handle and keys is already present for MRs so there is no need to do anything special here. A new version of the create QP response has been added to the device API to pass up the QP number and handle. The driver will also report these to userspace in the udata response if userspace supports it or not create the queuepair if not. I also had to do a refactor of the destroy qp code to reuse it if we fail to copy to userspace. Link: https://lore.kernel.org/r/20191028181444.19448-1-aditr@vmware.com Reviewed-by: Jorgen Hansen <jhansen@vmware.com> Signed-off-by: Adit Ranadive <aditr@vmware.com> Signed-off-by: Bryan Tan <bryantan@vmware.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-10-28RDMA/core: Fix ib_dma_max_seg_size()Bart Van Assche
If dev->dma_device->params == NULL then the maximum DMA segment size is 64 KB. See also the dma_get_max_seg_size() implementation. This patch fixes the following kernel warning: DMA-API: infiniband rxe0: mapping sg segment longer than device claims to support [len=126976] [max=65536] WARNING: CPU: 4 PID: 4848 at kernel/dma/debug.c:1220 debug_dma_map_sg+0x3d9/0x450 RIP: 0010:debug_dma_map_sg+0x3d9/0x450 Call Trace: srp_queuecommand+0x626/0x18d0 [ib_srp] scsi_queue_rq+0xd02/0x13e0 [scsi_mod] __blk_mq_try_issue_directly+0x2b3/0x3f0 blk_mq_request_issue_directly+0xac/0xf0 blk_insert_cloned_request+0xdf/0x170 dm_mq_queue_rq+0x43d/0x830 [dm_mod] __blk_mq_try_issue_directly+0x2b3/0x3f0 blk_mq_request_issue_directly+0xac/0xf0 blk_mq_try_issue_list_directly+0xb8/0x170 blk_mq_sched_insert_requests+0x23c/0x3b0 blk_mq_flush_plug_list+0x529/0x730 blk_flush_plug_list+0x21f/0x260 blk_mq_make_request+0x56b/0xf20 generic_make_request+0x196/0x660 submit_bio+0xae/0x290 blkdev_direct_IO+0x822/0x900 generic_file_direct_write+0x110/0x200 __generic_file_write_iter+0x124/0x2a0 blkdev_write_iter+0x168/0x270 aio_write+0x1c4/0x310 io_submit_one+0x971/0x1390 __x64_sys_io_submit+0x12a/0x390 do_syscall_64+0x6f/0x2e0 entry_SYSCALL_64_after_hwframe+0x49/0xbe Link: https://lore.kernel.org/r/20191025225830.257535-2-bvanassche@acm.org Cc: <stable@vger.kernel.org> Fixes: 0b5cb3300ae5 ("RDMA/srp: Increase max_segment_size") Signed-off-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-10-28rdma: Remove nes ABI headerJason Gunthorpe
This was missed when nes was removed. Fixes: 2d3c72ed5041 ("rdma: Remove nes") Link: https://lore.kernel.org/r/20191024135059.GA20084@ziepe.ca Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-10-28KVM: arm64: vgic-v4: Move the GICv4 residency flow to be driven by vcpu_load/putMarc Zyngier
When the VHE code was reworked, a lot of the vgic stuff was moved around, but the GICv4 residency code did stay untouched, meaning that we come in and out of residency on each flush/sync, which is obviously suboptimal. To address this, let's move things around a bit: - Residency entry (flush) moves to vcpu_load - Residency exit (sync) moves to vcpu_put - On blocking (entry to WFI), we "put" - On unblocking (exit from WFI), we "load" Because these can nest (load/block/put/load/unblock/put, for example), we now have per-VPE tracking of the residency state. Additionally, vgic_v4_put gains a "need doorbell" parameter, which only gets set to true when blocking because of a WFI. This allows a finer control of the doorbell, which now also gets disabled as soon as it gets signaled. Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20191027144234.8395-2-maz@kernel.org
2019-10-28Merge tag 'reset-for-v5.5' of git://git.pengutronix.de/git/pza/linux into ↵Olof Johansson
arm/drivers Reset controller updates for v5.5 This tag adds support for Meson SM1 ARB resets, Uniphier Pro5 USB3 resets, the Meson-A1 reset controller, SocFPGA Agilex resets, and Realtek RTD1195/RTD1295 resets. It adds some reset controller API keywords for get_maintainers.pl and makes a few remaining reset_control_ops const. Also included are a conversion of the Qualcomm device tree bindings to yaml and a few small kerneldoc improvements. * tag 'reset-for-v5.5' of git://git.pengutronix.de/git/pza/linux: reset: document (devm_)reset_control_get_optional variants reset: improve of_xlate documentation reset: simple: Add Realtek RTD1195/RTD1295 reset: simple: Keep alphabetical order MAINTAINERS: add reset controller framework keywords reset: zynqmp: Make reset_control_ops const reset: hisilicon: hi3660: Make reset_control_ops const reset: build simple reset controller driver for Agilex reset: add support for the Meson-A1 SoC Reset Controller dt-bindings: reset: add bindings for the Meson-A1 SoC Reset Controller reset: uniphier-glue: Add Pro5 USB3 support dt-bindings: reset: pdc: Convert PDC Global bindings to yaml dt-bindings: reset: aoss: Convert AOSS reset bindings to yaml reset: Remove copy'n'paste redundancy in the comments reset: meson-audio-arb: add sm1 support reset: dt-bindings: meson: update arb bindings for sm1 Link: https://lore.kernel.org/r/ede6874508472d0917dca770ef80b90626b0f205.camel@pengutronix.de Signed-off-by: Olof Johansson <olof@lixom.net>
2019-10-28export: avoid code duplication in include/linux/export.hMasahiro Yamada
include/linux/export.h has lots of code duplication between EXPORT_SYMBOL and EXPORT_SYMBOL_NS. To improve the maintainability and readability, unify the implementation. When the symbol has no namespace, pass the empty string "" to the 'ns' parameter. The drawback of this change is, it grows the code size. When the symbol has no namespace, sym->namespace was previously NULL, but it is now an empty string "". So, it increases 1 byte for every no namespace EXPORT_SYMBOL. A typical kernel configuration has 10K exported symbols, so it increases 10KB in rough estimation. I did not come up with a good idea to refactor it without increasing the code size. I am not sure how big a deal it is, but at least include/linux/export.h looks nicer. Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> [maennich: rebase on top of 3 fixes for the namespace feature] Signed-off-by: Matthias Maennich <maennich@google.com> Signed-off-by: Jessica Yu <jeyu@kernel.org>
2019-10-28fs: add generic UNRESVSP and ZERO_RANGE ioctl handlersChristoph Hellwig
These use the same scheme as the pre-existing mapping of the XFS RESVP ioctls to ->falloc, so just extend it and remove the XFS implementation. Signed-off-by: Christoph Hellwig <hch@lst.de> [darrick: fix compile error on s390] Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-10-28Merge remote-tracking branch 'arm64/for-next/fixes' into for-next/coreCatalin Marinas
This is required to solve the conflicts with subsequent merges of two more errata workaround branches. * arm64/for-next/fixes: arm64: tags: Preserve tags for addresses translated via TTBR1 arm64: mm: fix inverted PAR_EL1.F check arm64: sysreg: fix incorrect definition of SYS_PAR_EL1_F arm64: entry.S: Do not preempt from IRQ before all cpufeatures are enabled arm64: hibernate: check pgd table allocation arm64: cpufeature: Treat ID_AA64ZFR0_EL1 as RAZ when SVE is not enabled arm64: Fix kcore macros after 52-bit virtual addressing fallout arm64: Allow CAVIUM_TX2_ERRATUM_219 to be selected arm64: Avoid Cavium TX2 erratum 219 when switching TTBR arm64: Enable workaround for Cavium TX2 erratum 219 when running SMT arm64: KVM: Trap VM ops when ARM64_WORKAROUND_CAVIUM_TX2_219_TVM is set
2019-10-28ASoC: SOF: ipc: introduce message for DSP power gatingKeyon Jie
Add new ipc messages which will be sent from driver to FW, to ask FW to enter specific power saving state. Signed-off-by: Keyon Jie <yang.jie@linux.intel.com> Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Link: https://lore.kernel.org/r/20191025224122.7718-14-pierre-louis.bossart@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org>
2019-10-28ASoC: SOF: token: add tokens for PCM compatible with D0i3 substateKeyon Jie
Add stream token SOF_TKN_STREAM_PLAYBACK_COMPATIBLE_D0I3 and SOF_TKN_STREAM_CAPTURE_COMPATIBLE_D0I3 to denote if the stream can be opened at low power d0i3 status or not. Signed-off-by: Keyon Jie <yang.jie@linux.intel.com> Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Link: https://lore.kernel.org/r/20191025224122.7718-9-pierre-louis.bossart@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org>
2019-10-28ACPI: button: Remove unused acpi_lid_notifier_[un]register() functionsHans de Goede
There are no users of the acpi_lid_notifier_[un]register functions, so lets remove them. Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2019-10-28RDMA/cm: Update copyright together with SPDX tagLeon Romanovsky
Add Mellanox to lust of copyright holders and replace copyright boilerplate with relevant SPDX tag. Link: https://lore.kernel.org/r/20191020071559.9743-4-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-10-28RDMA/cm: Use specific keyword to check defineLeon Romanovsky
There is a specific define keyword to check if define exists or not, let's use it instead of open-coded variant. Link: https://lore.kernel.org/r/20191020071559.9743-3-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-10-28mac80211: fix a typo of "function"Joe Perches
Signed-off-by: Joe Perches <joe@perches.com> Link: https://lore.kernel.org/r/4d53be6c963542878d370ff1a6dc7c3a89b28d23.camel@perches.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2019-10-28mac80211: typo fixes in kerneldoc commentsChris Packham
Correct some trivial typos in kerneldoc comments. Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz> Link: https://lore.kernel.org/r/20191024213647.5507-1-chris.packham@alliedtelesis.co.nz Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2019-10-28perf/core, perf/x86: Introduce swap_task_ctx() method at 'struct pmu'Alexey Budankov
Declare swap_task_ctx() methods at the generic and x86 specific pmu types to bridge calls to platform specific PMU code on optimized context switch path between equivalent task perf event contexts. Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <songliubraving@fb.com> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Link: https://lkml.kernel.org/r/9a0aa84a-f062-9b64-3133-373658550c4b@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-10-28Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhostLinus Torvalds
Pull virtio fixes from Michael Tsirkin: "Some minor fixes" * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: vringh: fix copy direction of vringh_iov_push_kern() vsock/virtio: remove unused 'work' field from 'struct virtio_vsock_pkt' virtio_ring: fix stalls for packed rings
2019-10-28Merge branch 'for-linus' into for-nextTakashi Iwai
Back-merge the development process for catching up the HD-audio fix (and apply a new one on top of that). Signed-off-by: Takashi Iwai <tiwai@suse.de>
2019-10-28Merge branch 'perf/urgent' into perf/core, to pick up fixesIngo Molnar
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-10-28clk: imx7ulp: do not export out IMX7ULP_CLK_MIPI_PLL clockFancy Fang
The mipi pll clock comes from the MIPI PHY PLL output, so it should not be a fixed clock. MIPI PHY PLL is in the MIPI DSI space, and it is used as the bit clock for transferring the pixel data out and its output clock is configured according to the display mode. So it should be used only for MIPI DSI and not be exported out for other usages. Signed-off-by: Fancy Fang <chen.fang@nxp.com> Signed-off-by: Shawn Guo <shawnguo@kernel.org>