summaryrefslogtreecommitdiff
path: root/include/linux
AgeCommit message (Collapse)Author
2020-07-21fscrypt: use smp_load_acquire() for ->i_crypt_infoEric Biggers
Normally smp_store_release() or cmpxchg_release() is paired with smp_load_acquire(). Sometimes smp_load_acquire() can be replaced with the more lightweight READ_ONCE(). However, for this to be safe, all the published memory must only be accessed in a way that involves the pointer itself. This may not be the case if allocating the object also involves initializing a static or global variable, for example. fscrypt_info includes various sub-objects which are internal to and are allocated by other kernel subsystems such as keyrings and crypto. So by using READ_ONCE() for ->i_crypt_info, we're relying on internal implementation details of these other kernel subsystems. Remove this fragile assumption by using smp_load_acquire() instead. (Note: I haven't seen any real-world problems here. This change is just fixing the code to be guaranteed correct and less fragile.) Fixes: e37a784d8b6a ("fscrypt: use READ_ONCE() to access ->i_crypt_info") Link: https://lore.kernel.org/r/20200721225920.114347-5-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com>
2020-07-21net: phylink: add interface to configure clause 22 PCS PHYRussell King
Add an interface to configure the advertisement for a clause 22 PCS PHY, and set the AN enable flag in the BMCR appropriately. Reviewed-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-21net: phylink: add struct phylink_pcsRussell King
Add a way for MAC PCS to have private data while keeping independence from struct phylink_config, which is used for the MAC itself. We need this independence as we will have stand-alone code for PCS that is independent of the MAC. Introduce struct phylink_pcs, which is designed to be embedded in a driver private data structure. This structure does not include a mdio_device as there are PCS implementations such as the Marvell DSA and network drivers where this is not necessary. Reviewed-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-21net: phylink: re-implement interface configuration with PCSRussell King
With PCS support, how we implement interface reconfiguration (or other major reconfiguration) is not up to the job; we end up reconfiguring the PCS for an interface change while the link could potentially be up. In order to solve this, add two additional MAC methods for major configuration, one to prepare for the change, and one to finish the change. This allows mvneta and mvpp2 to shutdown what they require prior to the MAC and PCS configuration calls, and then restart as appropriate. This impacts ksettings_set(), which now needs to identify whether the change is a minor tweak to the advertisement masks or whether the interface mode has changed, and call the appropriate function for that update. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-21net: phylink: in-band pause mode advertisement update for PCSRussell King
Re-code the pause in-band advertisement update in light of the addition of PCS support, so that we perform the minimum required; only the PCS configuration function needs to be called in this case, followed by the request to trigger a restart of negotiation if the programmed advertisement changed. We need to change the pcs_config() signature to pass whether resolved pause should be passed to the MAC for setups such as mvneta and mvpp2 where doing so overrides the MAC manual flow controls. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-21Input: synaptics-rmi4 - drop a duplicated wordRandy Dunlap
Drop the repeated word "to" in a comment. Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Link: https://lore.kernel.org/r/20200719003131.21050-1-rdunlap@infradead.org Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2020-07-21remoteproc: Add inline coredump functionalityRishabh Bhatnagar
The current coredump implementation uses vmalloc area to copy all the segments. But this might put strain on low memory targets as the firmware size sometimes is in tens of MBs. The situation becomes worse if there are multiple remote processors undergoing recovery at the same time. This patch adds inline coredump functionality that avoids extra memory usage. This requires recovery to be halted until data is read by userspace and free function is called. Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org> Reviewed-by: Sibi Sankar <sibis@codeaurora.org> Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org> Signed-off-by: Rishabh Bhatnagar <rishabhb@codeaurora.org> Tested-by: Sibi Sankar <sibis@codeaurora.org> Link: https://lore.kernel.org/r/1594938035-7327-5-git-send-email-rishabhb@codeaurora.org Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
2020-07-21remoteproc: Pass size and offset as arguments to segment dump functionRishabh Bhatnagar
Change the segment dump API signature to include size and offset arguments. Refactor the qcom_q6v5_mss driver to use these arguments while copying the segment. Doing this lays the ground work for "inline" coredump functionality being added in the next patch. Tested-by: Sibi Sankar <sibis@codeaurora.org> Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org> Reviewed-by: Sibi Sankar <sibis@codeaurora.org> Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org> Signed-off-by: Rishabh Bhatnagar <rishabhb@codeaurora.org> Link: https://lore.kernel.org/r/1594938035-7327-4-git-send-email-rishabhb@codeaurora.org Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
2020-07-21bpf: net: Use precomputed btf_id for bpf iteratorsYonghong Song
One additional field btf_id is added to struct bpf_ctx_arg_aux to store the precomputed btf_ids. The btf_id is computed at build time with BTF_ID_LIST or BTF_ID_LIST_GLOBAL macro definitions. All existing bpf iterators are changed to used pre-compute btf_ids. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200720163403.1393551-1-yhs@fb.com
2020-07-21bpf: Make btf_sock_ids globalYonghong Song
tcp and udp bpf_iter can reuse some socket ids in btf_sock_ids, so make it global. I put the extern definition in btf_ids.h as a central place so it can be easily discovered by developers. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200720163402.1393427-1-yhs@fb.com
2020-07-21bpf: Add BTF_ID_LIST_GLOBAL in btf_ids.hYonghong Song
Existing BTF_ID_LIST used a local static variable to store btf_ids. This patch provided a new macro BTF_ID_LIST_GLOBAL to store btf_ids in a global variable which can be shared among multiple files. The existing BTF_ID_LIST is still retained. Two reasons. First, BTF_ID_LIST is also used to build btf_ids for helper arguments which typically is an array of 5. Since typically different helpers have different signature, it makes little sense to share them. Second, some current computed btf_ids are indeed local. If later those btf_ids are shared between different files, they can use BTF_ID_LIST_GLOBAL then. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Jiri Olsa <jolsa@redhat.com> Link: https://lore.kernel.org/bpf/20200720163401.1393159-1-yhs@fb.com
2020-07-21bpf: Compute bpf_skc_to_*() helper socket btf ids at build timeYonghong Song
Currently, socket types (struct tcp_sock, udp_sock, etc.) used by bpf_skc_to_*() helpers are computed when vmlinux_btf is first built in the kernel. Commit 5a2798ab32ba ("bpf: Add BTF_ID_LIST/BTF_ID/BTF_ID_UNUSED macros") implemented a mechanism to compute btf_ids at kernel build time which can simplify kernel implementation and reduce runtime overhead by removing in-kernel btf_id calculation. This patch did exactly this, removing in-kernel btf_id computation and utilizing build-time btf_id computation. If CONFIG_DEBUG_INFO_BTF is not defined, BTF_ID_LIST will define an array with size of 5, which is not enough for btf_sock_ids. So define its own static array if CONFIG_DEBUG_INFO_BTF is not defined. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200720163358.1393023-1-yhs@fb.com
2020-07-21power_supply: Add additional health properties to the headerDan Murphy
Add HEALTH_WARM, HEALTH_COOL and HEALTH_HOT to the health enum. HEALTH_WARM, HEALTH_COOL, and HEALTH_HOT properties are taken from JEITA specification JISC8712:2015 Acked-by: Andrew F. Davis <afd@ti.com> Tested-by: Guru Das Srinagesh <gurus@codeaurora.org> Signed-off-by: Dan Murphy <dmurphy@ti.com> Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
2020-07-21audit: purge audit_log_string from the intra-kernel audit APIRichard Guy Briggs
audit_log_string() was inteded to be an internal audit function and since there are only two internal uses, remove them. Purge all external uses of it by restructuring code to use an existing audit_log_format() or using audit_log_format(). Please see the upstream issue https://github.com/linux-audit/audit-kernel/issues/84 Signed-off-by: Richard Guy Briggs <rgb@redhat.com> Signed-off-by: Paul Moore <paul@paul-moore.com>
2020-07-22powerpc/64s: Remove PROT_SAO supportNicholas Piggin
ISA v3.1 does not support the SAO storage control attribute required to implement PROT_SAO. PROT_SAO was used by specialised system software (Lx86) that has been discontinued for about 7 years, and is not thought to be used elsewhere, so removal should not cause problems. We rather remove it than keep support for older processors, because live migrating guest partitions to newer processors may not be possible if SAO is in use (or worse allowed with silent races). - PROT_SAO stays in the uapi header so code using it would still build. - arch_validate_prot() is removed, the generic version rejects PROT_SAO so applications would get a failure at mmap() time. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> [mpe: Drop KVM change for the time being] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200703011958.1166620-3-npiggin@gmail.com
2020-07-21coresight: Add default sink selection to CoreSight baseMike Leach
Adds a method to select a suitable sink connected to a given source. In cases where no sink is defined, the coresight_find_default_sink routine can search from a given source, through the child connections until a suitable sink is found. The suitability is defined in by the sink coresight_dev_subtype on the CoreSight device, and the distance from the source by counting connections. Higher value subtype is preferred - where these are equal, shorter distance from source is used as a tie-break. This allows for default sink to be discovered were none is specified (e.g. perf command line) Signed-off-by: Mike Leach <mike.leach@linaro.org> Suggested-by: Suzuki K Poulose <suzuki.poulose@arm.com> Reviewed-by: Leo Yan <leo.yan@linaro.org> Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org> Link: https://lore.kernel.org/r/20200716175746.3338735-15-mathieu.poirier@linaro.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-07-21coresight: Fix comment in main header fileMike Leach
Comment for an elemnt in the coresight_device structure appears to have been corrupted and makes no sense. Fix this before making further changes. Signed-off-by: Mike Leach <mike.leach@linaro.org> Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org> Link: https://lore.kernel.org/r/20200716175746.3338735-12-mathieu.poirier@linaro.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-07-21exec: Implement kernel_execveEric W. Biederman
To allow the kernel not to play games with set_fs to call exec implement kernel_execve. The function kernel_execve takes pointers into kernel memory and copies the values pointed to onto the new userspace stack. The calls with arguments from kernel space of do_execve are replaced with calls to kernel_execve. The calls do_execve and do_execveat are made static as there are now no callers outside of exec. The comments that mention do_execve are updated to refer to kernel_execve or execve depending on the circumstances. In addition to correcting the comments, this makes it easy to grep for do_execve and verify it is not used. Inspired-by: https://lkml.kernel.org/r/20200627072704.2447163-1-hch@lst.de Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lkml.kernel.org/r/87wo365ikj.fsf@x220.int.ebiederm.org Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2020-07-21exec: Move initialization of bprm->filename into alloc_bprmEric W. Biederman
Currently it is necessary for the usermode helper code and the code that launches init to use set_fs so that pages coming from the kernel look like they are coming from userspace. To allow that usage of set_fs to be removed cleanly the argument copying from userspace needs to happen earlier. Move the computation of bprm->filename and possible allocation of a name in the case of execveat into alloc_bprm to make that possible. The exectuable name, the arguments, and the environment are copied into the new usermode stack which is stored in bprm until exec passes the point of no return. As the executable name is copied first onto the usermode stack it needs to be known. As there are no dependencies to computing the executable name, compute it early in alloc_bprm. As an implementation detail if the filename needs to be generated because it embeds a file descriptor store that filename in a new field bprm->fdpath, and free it in free_bprm. Previously this was done in an independent variable pathbuf. I have renamed pathbuf fdpath because fdpath is more suggestive of what kind of path is in the variable. I moved fdpath into struct linux_binprm because it is tightly tied to the other variables in struct linux_binprm, and as such is needed to allow the call alloc_binprm to move. Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lkml.kernel.org/r/87k0z66x8f.fsf@x220.int.ebiederm.org Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2020-07-21exec: Remove unnecessary spaces from binfmts.hEric W. Biederman
The general convention in the linux kernel is to define a pointer member as "type *name". The declaration of struct linux_binprm has several pointer defined as "type * name". Update them to the form of "type *name" for consistency. Suggested-by: Kees Cook <keescook@chromium.org> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lkml.kernel.org/r/87v9iq6x9x.fsf@x220.int.ebiederm.org Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2020-07-21Merge tag 'thunderbolt-for-v5.9' of ↵Greg Kroah-Hartman
git://git.kernel.org/pub/scm/linux/kernel/git/westeri/thunderbolt into usb-next Mika writes: thunderbolt: Changes for v5.9 merge window This includes following Thunderbolt/USB4 changes for v5.9 merge window: * Improvements around NHI (Native Host Interface) HopID allocation * Improvements to tunneling and USB3 bandwidth management support * Add KUnit tests for path walking and tunneling * Initial support for USB4 retimer firmware upgrade * Implement Thunderbolt device firmware upgrade mechanism that runs the NVM image authentication when the device is disconnected. * A couple of small non-critical fixes * tag 'thunderbolt-for-v5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/westeri/thunderbolt: (32 commits) thunderbolt: Fix old style declaration warning thunderbolt: Add support for authenticate on disconnect thunderbolt: Add support for separating the flush to SPI and authenticate thunderbolt: Ensure left shift of 512 does not overflow a 32 bit int thunderbolt: Add support for on-board retimers thunderbolt: Implement USB4 port sideband operations for retimer access thunderbolt: Retry USB4 block read operation thunderbolt: Generalize usb4_switch_do_[read|write]_data() thunderbolt: Split common NVM functionality into a separate file thunderbolt: Add Intel USB-IF ID to the NVM upgrade supported list thunderbolt: Add KUnit tests for tunneling thunderbolt: Add USB3 bandwidth management thunderbolt: Make tb_port_get_link_speed() available to other files thunderbolt: Implement USB3 bandwidth negotiation routines thunderbolt: Increase DP DPRX wait timeout thunderbolt: Report consumed bandwidth in both directions thunderbolt: Make usb4_switch_map_pcie_down() also return enabled ports thunderbolt: Make usb4_switch_map_usb3_down() also return enabled ports thunderbolt: Do not tunnel USB3 if link is not USB4 thunderbolt: Add DP IN resources for all routers ...
2020-07-21USB: Replace HTTP links with HTTPS onesAlexander A. Klimov
Rationale: Reduces attack surface on kernel devs opening the links for MITM as HTTPS traffic is much harder to manipulate. Deterministic algorithm: For each file: If not .svg: For each line: If doesn't contain `\bxmlns\b`: For each link, `\bhttp://[^# \t\r\n]*(?:\w|/)`: If neither `\bgnu\.org/license`, nor `\bmozilla\.org/MPL\b`: If both the HTTP and HTTPS versions return 200 OK and serve the same content: Replace HTTP with HTTPS. Signed-off-by: Alexander A. Klimov <grandmaster@al2klimov.de> Link: https://lore.kernel.org/r/20200719160910.60018-1-grandmaster@al2klimov.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-07-21usb: typec: tcpm: Stay in BIST mode till hardreset or unattachedBadhri Jagan Sridharan
Port starts to toggle when transitioning to unattached state. This is incorrect while in BIST mode. 6.4.3.1 BIST Carrier Mode Upon receipt of a BIST Message, with a BIST Carrier Mode BIST Data Object, the UUT Shall send out a continuous string of BMC encoded alternating "1"s and “0”s. The UUT Shall exit the Continuous BIST Mode within tBISTContMode of this Continuous BIST Mode being enabled(see Section 6.6.7.2). 6.4.3.2 BIST Test Data Upon receipt of a BIST Message, with a BIST Test Data BIST Data Object, the UUT Shall return a GoodCRC Message and Shall enter a test mode in which it sends no further Messages except for GoodCRC Messages in response to received Messages. See Section 5.9.2 for the definition of the Test Data Frame. The test Shall be ended by sending Hard Reset Signaling to reset the UUT. Signed-off-by: Badhri Jagan Sridharan <badhri@google.com> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Link: https://lore.kernel.org/r/20200716034128.1251728-3-badhri@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-07-21usb: typec: tcpm: Support bist test data mode for complianceBadhri Jagan Sridharan
TCPM supports BIST carried mode. PD compliance tests require BIST Test Data to be supported as well. Introducing set_bist_data callback to signal tcpc driver for configuring the port controller hardware to enable/disable BIST Test Data mode. Signed-off-by: Badhri Jagan Sridharan <badhri@google.com> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Link: https://lore.kernel.org/r/20200716034128.1251728-1-badhri@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-07-21soundwire: intel: revisit SHIM programming sequences.Pierre-Louis Bossart
Somehow the existing code is not aligned with the steps described in the documentation, refactor code and make sure the register programming sequences are correct. Also add missing power-up, power-down and wake capabilities (the last two are used in follow-up patches but introduced here for consistency). Some of the SHIM registers exposed fields that are link specific, and in addition some of the power-related registers (SPA/CPA) take time to be updated. Uncontrolled access leads to timeouts or errors. Add a mutex, shared by all links, so that all accesses to such registers are serialized, and follow a pattern of read-modify-write. This includes making sure SHIM_SYNC is programmed only once, before the first master is powered on. We use a 'shim_mask' field, shared between all links and protected by a mutex, to deal with power-up and power-down sequences. Note that the SYNCPRD value is tied only to the XTAL value and not the current bus frequency or the frame rate. BugLink: https://github.com/thesofproject/linux/issues/1555 Signed-off-by: Rander Wang <rander.wang@intel.com> Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Signed-off-by: Bard Liao <yung-chuan.liao@linux.intel.com> Link: https://lore.kernel.org/r/20200716150947.22119-3-yung-chuan.liao@linux.intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2020-07-21compiler.h: Move compiletime_assert() macros into compiler_types.hWill Deacon
The kernel test robot reports that moving READ_ONCE() out into its own header breaks a W=1 build for parisc, which is relying on the definition of compiletime_assert() being available: | In file included from ./arch/parisc/include/generated/asm/rwonce.h:1, | from ./include/asm-generic/barrier.h:16, | from ./arch/parisc/include/asm/barrier.h:29, | from ./arch/parisc/include/asm/atomic.h:11, | from ./include/linux/atomic.h:7, | from kernel/locking/percpu-rwsem.c:2: | ./arch/parisc/include/asm/atomic.h: In function 'atomic_read': | ./include/asm-generic/rwonce.h:36:2: error: implicit declaration of function 'compiletime_assert' [-Werror=implicit-function-declaration] | 36 | compiletime_assert(__native_word(t) || sizeof(t) == sizeof(long long), \ | | ^~~~~~~~~~~~~~~~~~ | ./include/asm-generic/rwonce.h:49:2: note: in expansion of macro 'compiletime_assert_rwonce_type' | 49 | compiletime_assert_rwonce_type(x); \ | | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | ./arch/parisc/include/asm/atomic.h:73:9: note: in expansion of macro 'READ_ONCE' | 73 | return READ_ONCE((v)->counter); | | ^~~~~~~~~ Move these macros into compiler_types.h, so that they are available to READ_ONCE() and friends. Link: http://lists.infradead.org/pipermail/linux-arm-kernel/2020-July/587094.html Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Will Deacon <will@kernel.org>
2020-07-21include/linux: Remove smp_read_barrier_depends() from commentsWill Deacon
smp_read_barrier_depends() doesn't exist any more, so reword the two comments that mention it to refer to "dependency ordering" instead. Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Will Deacon <will@kernel.org>
2020-07-21asm/rwonce: Don't pull <asm/barrier.h> into 'asm-generic/rwonce.h'Will Deacon
Now that 'smp_read_barrier_depends()' has gone the way of the Norwegian Blue, drop the inclusion of <asm/barrier.h> in 'asm-generic/rwonce.h'. This requires fixups to some architecture vdso headers which were previously relying on 'asm/barrier.h' coming in via 'linux/compiler.h'. Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Will Deacon <will@kernel.org>
2020-07-21compiler.h: Split {READ,WRITE}_ONCE definitions out into rwonce.hWill Deacon
In preparation for allowing architectures to define their own implementation of the READ_ONCE() macro, move the generic {READ,WRITE}_ONCE() definitions out of the unwieldy 'linux/compiler.h' file and into a new 'rwonce.h' header under 'asm-generic'. Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Will Deacon <will@kernel.org>
2020-07-21thermal/drivers/clock_cooling: Remove clock_cooling codeAmit Kucheria
clock_cooling has no in-kernel users. It has never found any use in drivers as far as I can tell. Remove the code. Signed-off-by: Amit Kucheria <amit.kucheria@linaro.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/aa5d5ac2589cf7b14ece882130731b4a916849a6.1593619943.git.amit.kucheria@linaro.org
2020-07-21dma-fence: prime lockdep annotationsDaniel Vetter
Two in one go: - it is allowed to call dma_fence_wait() while holding a dma_resv_lock(). This is fundamental to how eviction works with ttm, so required. - it is allowed to call dma_fence_wait() from memory reclaim contexts, specifically from shrinker callbacks (which i915 does), and from mmu notifier callbacks (which amdgpu does, and which i915 sometimes also does, and probably always should, but that's kinda a debate). Also for stuff like HMM we really need to be able to do this, or things get real dicey. Consequence is that any critical path necessary to get to a dma_fence_signal for a fence must never a) call dma_resv_lock nor b) allocate memory with GFP_KERNEL. Also by implication of dma_resv_lock(), no userspace faulting allowed. That's some supremely obnoxious limitations, which is why we need to sprinkle the right annotations to all relevant paths. The one big locking context we're leaving out here is mmu notifiers, added in commit 23b68395c7c78a764e8963fc15a7cfd318bf187f Author: Daniel Vetter <daniel.vetter@ffwll.ch> Date: Mon Aug 26 22:14:21 2019 +0200 mm/mmu_notifiers: add a lockdep map for invalidate_range_start/end that one covers a lot of other callsites, and it's also allowed to wait on dma-fences from mmu notifiers. But there's no ready-made functions exposed to prime this, so I've left it out for now. v2: Also track against mmu notifier context. v3: kerneldoc to spec the cross-driver contract. Note that currently i915 throws in a hard-coded 10s timeout on foreign fences (not sure why that was done, but it's there), which is why that rule is worded with SHOULD instead of MUST. Also some of the mmu_notifier/shrinker rules might surprise SoC drivers, I haven't fully audited them all. Which is infeasible anyway, we'll need to run them with lockdep and dma-fence annotations and see what goes boom. v4: A spelling fix from Mika v5: #ifdef for CONFIG_MMU_NOTIFIER. Reported by 0day. Unfortunately this means lockdep enforcement is slightly inconsistent, it won't spot GFP_NOIO and GFP_NOFS allocations in the wrong spot if CONFIG_MMU_NOTIFIER is disabled in the kernel config. Oh well. v5: Note that only drivers/gpu has a reasonable (or at least historical) excuse to use dma_fence_wait() from shrinker and mmu notifier callbacks. Everyone else should either have a better memory manager model, or better hardware. This reflects discussions with Jason Gunthorpe. Cc: Jason Gunthorpe <jgg@mellanox.com> Cc: Felix Kuehling <Felix.Kuehling@amd.com> Cc: kernel test robot <lkp@intel.com> Acked-by: Christian König <christian.koenig@amd.com> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@intel.com> (v4) Cc: Mika Kuoppala <mika.kuoppala@intel.com> Cc: Thomas Hellstrom <thomas.hellstrom@intel.com> Cc: linux-media@vger.kernel.org Cc: linaro-mm-sig@lists.linaro.org Cc: linux-rdma@vger.kernel.org Cc: amd-gfx@lists.freedesktop.org Cc: intel-gfx@lists.freedesktop.org Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Christian König <christian.koenig@amd.com> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200707201229.472834-3-daniel.vetter@ffwll.ch
2020-07-21dma-fence: basic lockdep annotationsDaniel Vetter
Design is similar to the lockdep annotations for workers, but with some twists: - We use a read-lock for the execution/worker/completion side, so that this explicit annotation can be more liberally sprinkled around. With read locks lockdep isn't going to complain if the read-side isn't nested the same way under all circumstances, so ABBA deadlocks are ok. Which they are, since this is an annotation only. - We're using non-recursive lockdep read lock mode, since in recursive read lock mode lockdep does not catch read side hazards. And we _very_ much want read side hazards to be caught. For full details of this limitation see commit e91498589746065e3ae95d9a00b068e525eec34f Author: Peter Zijlstra <peterz@infradead.org> Date: Wed Aug 23 13:13:11 2017 +0200 locking/lockdep/selftests: Add mixed read-write ABBA tests - To allow nesting of the read-side explicit annotations we explicitly keep track of the nesting. lock_is_held() allows us to do that. - The wait-side annotation is a write lock, and entirely done within dma_fence_wait() for everyone by default. - To be able to freely annotate helper functions I want to make it ok to call dma_fence_begin/end_signalling from soft/hardirq context. First attempt was using the hardirq locking context for the write side in lockdep, but this forces all normal spinlocks nested within dma_fence_begin/end_signalling to be spinlocks. That bollocks. The approach now is to simple check in_atomic(), and for these cases entirely rely on the might_sleep() check in dma_fence_wait(). That will catch any wrong nesting against spinlocks from soft/hardirq contexts. The idea here is that every code path that's critical for eventually signalling a dma_fence should be annotated with dma_fence_begin/end_signalling. The annotation ideally starts right after a dma_fence is published (added to a dma_resv, exposed as a sync_file fd, attached to a drm_syncobj fd, or anything else that makes the dma_fence visible to other kernel threads), up to and including the dma_fence_wait(). Examples are irq handlers, the scheduler rt threads, the tail of execbuf (after the corresponding fences are visible), any workers that end up signalling dma_fences and really anything else. Not annotated should be code paths that only complete fences opportunistically as the gpu progresses, like e.g. shrinker/eviction code. The main class of deadlocks this is supposed to catch are: Thread A: mutex_lock(A); mutex_unlock(A); dma_fence_signal(); Thread B: mutex_lock(A); dma_fence_wait(); mutex_unlock(A); Thread B is blocked on A signalling the fence, but A never gets around to that because it cannot acquire the lock A. Note that dma_fence_wait() is allowed to be nested within dma_fence_begin/end_signalling sections. To allow this to happen the read lock needs to be upgraded to a write lock, which means that any other lock is acquired between the dma_fence_begin_signalling() call and the call to dma_fence_wait(), and still held, this will result in an immediate lockdep complaint. The only other option would be to not annotate such calls, defeating the point. Therefore these annotations cannot be sprinkled over the code entirely mindless to avoid false positives. Originally I hope that the cross-release lockdep extensions would alleviate the need for explicit annotations: https://lwn.net/Articles/709849/ But there's a few reasons why that's not an option: - It's not happening in upstream, since it got reverted due to too many false positives: commit e966eaeeb623f09975ef362c2866fae6f86844f9 Author: Ingo Molnar <mingo@kernel.org> Date: Tue Dec 12 12:31:16 2017 +0100 locking/lockdep: Remove the cross-release locking checks This code (CONFIG_LOCKDEP_CROSSRELEASE=y and CONFIG_LOCKDEP_COMPLETIONS=y), while it found a number of old bugs initially, was also causing too many false positives that caused people to disable lockdep - which is arguably a worse overall outcome. - cross-release uses the complete() call to annotate the end of critical sections, for dma_fence that would be dma_fence_signal(). But we do not want all dma_fence_signal() calls to be treated as critical, since many are opportunistic cleanup of gpu requests. If these get stuck there's still the main completion interrupt and workers who can unblock everyone. Automatically annotating all dma_fence_signal() calls would hence cause false positives. - cross-release had some educated guesses for when a critical section starts, like fresh syscall or fresh work callback. This would again cause false positives without explicit annotations, since for dma_fence the critical sections only starts when we publish a fence. - Furthermore there can be cases where a thread never does a dma_fence_signal, but is still critical for reaching completion of fences. One example would be a scheduler kthread which picks up jobs and pushes them into hardware, where the interrupt handler or another completion thread calls dma_fence_signal(). But if the scheduler thread hangs, then all the fences hang, hence we need to manually annotate it. cross-release aimed to solve this by chaining cross-release dependencies, but the dependency from scheduler thread to the completion interrupt handler goes through hw where cross-release code can't observe it. In short, without manual annotations and careful review of the start and end of critical sections, cross-relese dependency tracking doesn't work. We need explicit annotations. v2: handle soft/hardirq ctx better against write side and dont forget EXPORT_SYMBOL, drivers can't use this otherwise. v3: Kerneldoc. v4: Some spelling fixes from Mika v5: Amend commit message to explain in detail why cross-release isn't the solution. v6: Pull out misplaced .rst hunk. Acked-by: Christian König <christian.koenig@amd.com> Acked-by: Dave Airlie <airlied@redhat.com> Cc: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@intel.com> Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Cc: Thomas Hellstrom <thomas.hellstrom@intel.com> Cc: linux-media@vger.kernel.org Cc: linaro-mm-sig@lists.linaro.org Cc: linux-rdma@vger.kernel.org Cc: amd-gfx@lists.freedesktop.org Cc: intel-gfx@lists.freedesktop.org Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Christian König <christian.koenig@amd.com> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200707201229.472834-2-daniel.vetter@ffwll.ch
2020-07-20qed: add support for the extended speed and FEC modesAlexander Lobakin
Add all necessary code (NVM parsing, MFW and Ethtool reports etc.) to support extended speed and FEC modes. These new modes are supported by the new boards revisions and newer MFW versions. Misc: correct port type for MEDIA_KR. Signed-off-by: Alexander Lobakin <alobakin@marvell.com> Signed-off-by: Igor Russkikh <irusskikh@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-20qed: add missing loopback modesAlexander Lobakin
These modes are relevant only for several boards, but may be reported by MFW as well as the others. Signed-off-by: Alexander Lobakin <alobakin@marvell.com> Signed-off-by: Igor Russkikh <irusskikh@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-20qed: add support for Forward Error CorrectionAlexander Lobakin
Add all necessary routines for reading supported FEC modes from NVM and querying FEC control to the MFW (if the running version supports it). Signed-off-by: Alexander Lobakin <alobakin@marvell.com> Signed-off-by: Igor Russkikh <irusskikh@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-20qed: reformat several structures a bitAlexander Lobakin
Prior to adding new fields and bitfields, reformat the related structures according to the Linux style (spaces to tabs, lowercase hex, indentation etc.). Signed-off-by: Alexander Lobakin <alobakin@marvell.com> Signed-off-by: Igor Russkikh <irusskikh@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-20qed, qede, qedf: convert link mode from u32 to ETHTOOL_LINK_MODEAlexander Lobakin
Currently qed driver already ran out of 32 bits to store link modes, and this doesn't allow to add and support more speeds. Convert custom link mode to generic Ethtool bitmap and definitions (convenient Phylink shorthands are used for elegance and readability). This allowed us to drop all conversions/mappings between the driver and Ethtool. This involves changes in qede and qedf as well, as they used definitions from shared "qed_if.h". Suggested-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Alexander Lobakin <alobakin@marvell.com> Signed-off-by: Igor Russkikh <irusskikh@marvell.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-20linkmode: introduce linkmode_intersects()Alexander Lobakin
Add a new helper to find intersections between Ethtool link modes, linkmode_intersects(), similar to the other linkmode helpers. Signed-off-by: Alexander Lobakin <alobakin@marvell.com> Signed-off-by: Igor Russkikh <irusskikh@marvell.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-20sched: sch_api: add missing rcu read lock to silence the warningJiri Pirko
In case the qdisc_match_from_root function() is called from non-rcu path with rtnl mutex held, a suspiciout rcu usage warning appears: [ 241.504354] ============================= [ 241.504358] WARNING: suspicious RCU usage [ 241.504366] 5.8.0-rc4-custom-01521-g72a7c7d549c3 #32 Not tainted [ 241.504370] ----------------------------- [ 241.504378] net/sched/sch_api.c:270 RCU-list traversed in non-reader section!! [ 241.504382] other info that might help us debug this: [ 241.504388] rcu_scheduler_active = 2, debug_locks = 1 [ 241.504394] 1 lock held by tc/1391: [ 241.504398] #0: ffffffff85a27850 (rtnl_mutex){+.+.}-{3:3}, at: rtnetlink_rcv_msg+0x49a/0xbd0 [ 241.504431] stack backtrace: [ 241.504440] CPU: 0 PID: 1391 Comm: tc Not tainted 5.8.0-rc4-custom-01521-g72a7c7d549c3 #32 [ 241.504446] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-2.fc32 04/01/2014 [ 241.504453] Call Trace: [ 241.504465] dump_stack+0x100/0x184 [ 241.504482] lockdep_rcu_suspicious+0x153/0x15d [ 241.504499] qdisc_match_from_root+0x293/0x350 Fix this by passing the rtnl held lockdep condition down to hlist_for_each_entry_rcu() Reported-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-20Merge branch 'clk-doc' into clk-nextStephen Boyd
* clk-doc: clk: <linux/clk-provider.h>: drop a duplicated word
2020-07-20clk: <linux/clk-provider.h>: drop a duplicated wordRandy Dunlap
Drop the repeated word "not" in a comment. Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Michael Turquette <mturquette@baylibre.com> Cc: Stephen Boyd <sboyd@kernel.org> Cc: linux-clk@vger.kernel.org Link: https://lore.kernel.org/r/20200719002830.20319-1-rdunlap@infradead.org Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2020-07-20block: remove blk_queue_stack_limitsChristoph Hellwig
This function is just a tiny wrapper around blk_stack_limits. Open code it int the two callers. Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com> Tested-by: Damien Le Moal <damien.lemoal@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-07-20block: remove bdev_stack_limitsChristoph Hellwig
This function is just a tiny wrapper around blk_stack_limit and has two callers. Simplify the stack a bit by open coding it in the two callers. Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com> Tested-by: Damien Le Moal <damien.lemoal@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-07-20block: inherit the zoned characteristics in blk_stack_limitsChristoph Hellwig
Lift the code from device mapper into blk_stack_limits to inherity the stacking limitations. This ensures we do the right thing for all stacked zoned block devices. Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com> Tested-by: Damien Le Moal <damien.lemoal@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-07-20Merge branch 'for-5.9/drivers' into for-5.9/block-mergeJens Axboe
* for-5.9/drivers: (38 commits) block: add max_active_zones to blk-sysfs block: add max_open_zones to blk-sysfs s390/dasd: Use struct_size() helper s390/dasd: fix inability to use DASD with DIAG driver md-cluster: fix wild pointer of unlock_all_bitmaps() md/raid5-cache: clear MD_SB_CHANGE_PENDING before flushing stripes md: fix deadlock causing by sysfs_notify md: improve io stats accounting md: raid0/linear: fix dereference before null check on pointer mddev rsxx: switch from 'pci_free_consistent()' to 'dma_free_coherent()' nvme: remove ns->disk checks nvme-pci: use standard block status symbolic names nvme-pci: use the consistent return type of nvme_pci_iod_alloc_size() nvme-pci: add a blank line after declarations nvme-pci: fix some comments issues nvme-pci: remove redundant segment validation nvme: document quirked Intel models nvme: expose reconnect_delay and ctrl_loss_tmo via sysfs nvme: support for zoned namespaces nvme: support for multiple Command Sets Supported and Effects log pages ...
2020-07-20Merge branch 'for-5.9/block' into for-5.9/block-mergeJens Axboe
* for-5.9/block: (124 commits) blk-cgroup: show global disk stats in root cgroup io.stat blk-cgroup: make iostat functions visible to stat printing block: improve discard bio alignment in __blkdev_issue_discard() block: change REQ_OP_ZONE_RESET and REQ_OP_ZONE_RESET_ALL to be odd numbers block: defer flush request no matter whether we have elevator block: make blk_timeout_init() static block: remove retry loop in ioc_release_fn() block: remove unnecessary ioc nested locking block: integrate bd_start_claiming into __blkdev_get block: use bd_prepare_to_claim directly in the loop driver block: refactor bd_start_claiming block: simplify the restart case in __blkdev_get Revert "blk-rq-qos: remove redundant finish_wait to rq_qos_wait." block: always remove partitions from blk_drop_partitions() block: relax jiffies rounding for timeouts blk-mq: remove redundant validation in __blk_mq_end_request() blk-mq: Remove unnecessary local variable writeback: remove bdi->congested_fn writeback: remove struct bdi_writeback_congested writeback: remove {set,clear}_wb_congested ...
2020-07-20ima: Support additional conditionals in the KEXEC_CMDLINE hook functionTyler Hicks
Take the properties of the kexec kernel's inode and the current task ownership into consideration when matching a KEXEC_CMDLINE operation to the rules in the IMA policy. This allows for some uniformity when writing IMA policy rules for KEXEC_KERNEL_CHECK, KEXEC_INITRAMFS_CHECK, and KEXEC_CMDLINE operations. Prior to this patch, it was not possible to write a set of rules like this: dont_measure func=KEXEC_KERNEL_CHECK obj_type=foo_t dont_measure func=KEXEC_INITRAMFS_CHECK obj_type=foo_t dont_measure func=KEXEC_CMDLINE obj_type=foo_t measure func=KEXEC_KERNEL_CHECK measure func=KEXEC_INITRAMFS_CHECK measure func=KEXEC_CMDLINE The inode information associated with the kernel being loaded by a kexec_kernel_load(2) syscall can now be included in the decision to measure or not Additonally, the uid, euid, and subj_* conditionals can also now be used in KEXEC_CMDLINE rules. There was no technical reason as to why those conditionals weren't being considered previously other than ima_match_rules() didn't have a valid inode to use so it immediately bailed out for KEXEC_CMDLINE operations rather than going through the full list of conditional comparisons. Signed-off-by: Tyler Hicks <tyhicks@linux.microsoft.com> Cc: Eric Biederman <ebiederm@xmission.com> Cc: kexec@lists.infradead.org Reviewed-by: Lakshmi Ramasubramanian <nramas@linux.microsoft.com> Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2020-07-20uuid: remove unused uuid_le_to_bin() definitionAndy Shevchenko
There is no more user, so remove it. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2020-07-20sched_clock: Expose struct clock_read_dataPeter Zijlstra
In order to support perf_event_mmap_page::cap_time features, an architecture needs, aside from a userspace readable counter register, to expose the exact clock data so that userspace can convert the counter register into a correct timestamp. Provide struct clock_read_data and two (seqcount) helpers so that architectures (arm64 in specific) can expose the numbers to userspace. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Leo Yan <leo.yan@linaro.org> Link: https://lore.kernel.org/r/20200716051130.4359-2-leo.yan@linaro.org Signed-off-by: Will Deacon <will@kernel.org>
2020-07-20backlight: backlight: Make of_find_backlight staticSam Ravnborg
There are no external users of of_find_backlight, as they have all changed to use the managed version. Make of_find_backlight static to prevent new external users. Signed-off-by: Sam Ravnborg <sam@ravnborg.org> Reviewed-by: Daniel Thompson <daniel.thompson@linaro.org> Signed-off-by: Lee Jones <lee.jones@linaro.org>