summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2020-09-04arm64: mte: ptrace: Add PTRACE_{PEEK,POKE}MTETAGS supportCatalin Marinas
Add support for bulk setting/getting of the MTE tags in a tracee's address space at 'addr' in the ptrace() syscall prototype. 'data' points to a struct iovec in the tracer's address space with iov_base representing the address of a tracer's buffer of length iov_len. The tags to be copied to/from the tracer's buffer are stored as one tag per byte. On successfully copying at least one tag, ptrace() returns 0 and updates the tracer's iov_len with the number of tags copied. In case of error, either -EIO or -EFAULT is returned, trying to follow the ptrace() man page. Note that the tag copying functions are not performance critical, therefore they lack optimisations found in typical memory copy routines. Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Alan Hayward <Alan.Hayward@arm.com> Cc: Luis Machado <luis.machado@linaro.org> Cc: Omair Javaid <omair.javaid@linaro.org>
2020-09-04arm64: mte: Allow {set,get}_tagged_addr_ctrl() on non-current tasksCatalin Marinas
In preparation for ptrace() access to the prctl() value, allow calling these functions on non-current tasks. Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org>
2020-09-04arm64: mte: Restore the GCR_EL1 register after a suspendCatalin Marinas
The CPU resume/suspend routines only take care of the common system registers. Restore GCR_EL1 in addition via the __cpu_suspend_exit() function. Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Reviewed-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
2020-09-04arm64: mte: Allow user control of the generated random tags via prctl()Catalin Marinas
The IRG, ADDG and SUBG instructions insert a random tag in the resulting address. Certain tags can be excluded via the GCR_EL1.Exclude bitmap when, for example, the user wants a certain colour for freed buffers. Since the GCR_EL1 register is not accessible at EL0, extend the prctl(PR_SET_TAGGED_ADDR_CTRL) interface to include a 16-bit field in the first argument for controlling which tags can be generated by the above instruction (an include rather than exclude mask). Note that by default all non-zero tags are excluded. This setting is per-thread. Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org>
2020-09-04arm64: mte: Allow user control of the tag check mode via prctl()Catalin Marinas
By default, even if PROT_MTE is set on a memory range, there is no tag check fault reporting (SIGSEGV). Introduce a set of option to the exiting prctl(PR_SET_TAGGED_ADDR_CTRL) to allow user control of the tag check fault mode: PR_MTE_TCF_NONE - no reporting (default) PR_MTE_TCF_SYNC - synchronous tag check fault reporting PR_MTE_TCF_ASYNC - asynchronous tag check fault reporting These options translate into the corresponding SCTLR_EL1.TCF0 bitfield, context-switched by the kernel. Note that the kernel accesses to the user address space (e.g. read() system call) are not checked if the user thread tag checking mode is PR_MTE_TCF_NONE or PR_MTE_TCF_ASYNC. If the tag checking mode is PR_MTE_TCF_SYNC, the kernel makes a best effort to check its user address accesses, however it cannot always guarantee it. Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org>
2020-09-04mm: Allow arm64 mmap(PROT_MTE) on RAM-based filesCatalin Marinas
Since arm64 memory (allocation) tags can only be stored in RAM, mapping files with PROT_MTE is not allowed by default. RAM-based files like those in a tmpfs mount or memfd_create() can support memory tagging, so update the vm_flags accordingly in shmem_mmap(). Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Andrew Morton <akpm@linux-foundation.org>
2020-09-04arm64: mte: Validate the PROT_MTE request via arch_validate_flags()Catalin Marinas
Make use of the newly introduced arch_validate_flags() hook to sanity-check the PROT_MTE request passed to mmap() and mprotect(). If the mapping does not support MTE, these syscalls will return -EINVAL. Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org>
2020-09-04mm: Introduce arch_validate_flags()Catalin Marinas
Similarly to arch_validate_prot() called from do_mprotect_pkey(), an architecture may need to sanity-check the new vm_flags. Define a dummy function always returning true. In addition to do_mprotect_pkey(), also invoke it from mmap_region() prior to updating vma->vm_page_prot to allow the architecture code to veto potentially inconsistent vm_flags. Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Andrew Morton <akpm@linux-foundation.org>
2020-09-04arm64: mte: Add PROT_MTE support to mmap() and mprotect()Catalin Marinas
To enable tagging on a memory range, the user must explicitly opt in via a new PROT_MTE flag passed to mmap() or mprotect(). Since this is a new memory type in the AttrIndx field of a pte, simplify the or'ing of these bits over the protection_map[] attributes by making MT_NORMAL index 0. There are two conditions for arch_vm_get_page_prot() to return the MT_NORMAL_TAGGED memory type: (1) the user requested it via PROT_MTE, registered as VM_MTE in the vm_flags, and (2) the vma supports MTE, decided during the mmap() call (only) and registered as VM_MTE_ALLOWED. arch_calc_vm_prot_bits() is responsible for registering the user request as VM_MTE. The newly introduced arch_calc_vm_flag_bits() sets VM_MTE_ALLOWED if the mapping is MAP_ANONYMOUS. An MTE-capable filesystem (RAM-based) may be able to set VM_MTE_ALLOWED during its mmap() file ops call. In addition, update VM_DATA_DEFAULT_FLAGS to allow mprotect(PROT_MTE) on stack or brk area. The Linux mmap() syscall currently ignores unknown PROT_* flags. In the presence of MTE, an mmap(PROT_MTE) on a file which does not support MTE will not report an error and the memory will not be mapped as Normal Tagged. For consistency, mprotect(PROT_MTE) will not report an error either if the memory range does not support MTE. Two subsequent patches in the series will propose tightening of this behaviour. Co-developed-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org>
2020-09-04mm: Introduce arch_calc_vm_flag_bits()Kevin Brodsky
Similarly to arch_calc_vm_prot_bits(), introduce a dummy arch_calc_vm_flag_bits() invoked from calc_vm_flag_bits(). This macro can be overridden by architectures to insert specific VM_* flags derived from the mmap() MAP_* flags. Signed-off-by: Kevin Brodsky <Kevin.Brodsky@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Andrew Morton <akpm@linux-foundation.org>
2020-09-04arm64: mte: Tags-aware aware memcmp_pages() implementationCatalin Marinas
When the Memory Tagging Extension is enabled, two pages are identical only if both their data and tags are identical. Make the generic memcmp_pages() a __weak function and add an arm64-specific implementation which returns non-zero if any of the two pages contain valid MTE tags (PG_mte_tagged set). There isn't much benefit in comparing the tags of two pages since these are normally used for heap allocations and likely to differ anyway. Co-developed-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org>
2020-09-04arm64: Avoid unnecessary clear_user_page() indirectionCatalin Marinas
Since clear_user_page() calls clear_page() directly, avoid the unnecessary indirection. Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org>
2020-09-04arm64: mte: Tags-aware copy_{user_,}highpage() implementationsVincenzo Frascino
When the Memory Tagging Extension is enabled, the tags need to be preserved across page copy (e.g. for copy-on-write, page migration). Introduce MTE-aware copy_{user_,}highpage() functions to copy tags to the destination if the source page has the PG_mte_tagged flag set. copy_user_page() does not need to handle tag copying since, with this patch, it is only called by the DAX code where there is no source page structure (and no source tags). Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Co-developed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org>
2020-09-04arm64: mte: Clear the tags when a page is mapped in user-space with PROT_MTECatalin Marinas
Pages allocated by the kernel are not guaranteed to have the tags zeroed, especially as the kernel does not (yet) use MTE itself. To ensure the user can still access such pages when mapped into its address space, clear the tags via set_pte_at(). A new page flag - PG_mte_tagged (PG_arch_2) - is used to track pages with valid allocation tags. Since the zero page is mapped as pte_special(), it won't be covered by the above set_pte_at() mechanism. Clear its tags during early MTE initialisation. Co-developed-by: Steven Price <steven.price@arm.com> Signed-off-by: Steven Price <steven.price@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org>
2020-09-04mm: Preserve the PG_arch_2 flag in __split_huge_page_tail()Catalin Marinas
When a huge page is split into normal pages, part of the head page flags are transferred to the tail pages. However, the PG_arch_* flags are not part of the preserved set. PG_arch_2 is used by the arm64 MTE support to mark pages that have valid tags. The absence of such flag would cause the arm64 set_pte_at() to clear the tags in order to avoid stale tags exposed to user or the swapping out hooks to ignore the tags. Not preserving PG_arch_2 on huge page splitting leads to tag corruption in the tail pages. Preserve the newly added PG_arch_2 flag in __split_huge_page_tail(). Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Andrew Morton <akpm@linux-foundation.org>
2020-09-04mm: Add PG_arch_2 page flagSteven Price
For arm64 MTE support it is necessary to be able to mark pages that contain user space visible tags that will need to be saved/restored e.g. when swapped out. To support this add a new arch specific flag (PG_arch_2). This flag is only available on 64-bit architectures due to the limited number of spare page flags on the 32-bit ones. Signed-off-by: Steven Price <steven.price@arm.com> [catalin.marinas@arm.com: use CONFIG_64BIT for guarding this new flag] Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Andrew Morton <akpm@linux-foundation.org>
2020-09-04arm64: mte: Handle synchronous and asynchronous tag check faultsVincenzo Frascino
The Memory Tagging Extension has two modes of notifying a tag check fault at EL0, configurable through the SCTLR_EL1.TCF0 field: 1. Synchronous raising of a Data Abort exception with DFSC 17. 2. Asynchronous setting of a cumulative bit in TFSRE0_EL1. Add the exception handler for the synchronous exception and handling of the asynchronous TFSRE0_EL1.TF0 bit setting via a new TIF flag in do_notify_resume(). On a tag check failure in user-space, whether synchronous or asynchronous, a SIGSEGV will be raised on the faulting thread. Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Co-developed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org>
2020-09-04arm64: mte: Add specific SIGSEGV codesVincenzo Frascino
Add MTE-specific SIGSEGV codes to siginfo.h and update the x86 BUILD_BUG_ON(NSIGSEGV != 7) compile check. Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com> [catalin.marinas@arm.com: renamed precise/imprecise to sync/async] [catalin.marinas@arm.com: dropped #ifdef __aarch64__, renumbered] Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Will Deacon <will@kernel.org>
2020-09-04arm64: kvm: mte: Hide the MTE CPUID information from the guestsCatalin Marinas
KVM does not support MTE in guests yet, so clear the corresponding field in the ID_AA64PFR1_EL1 register. In addition, inject an undefined exception in the guest if it accesses one of the GCR_EL1, RGSR_EL1, TFSR_EL1 or TFSRE0_EL1 registers. While the emulate_sys_reg() function already injects an undefined exception, this patch prevents the unnecessary printk. Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Steven Price <steven.price@arm.com> Acked-by: Marc Zyngier <maz@kernel.org>
2020-09-04drm/virtio: drop virtio_gpu_output->enabledGerd Hoffmann
Not needed, already tracked by drm_crtc_state->active. Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/20200818072511.6745-3-kraxel@redhat.com (cherry picked from commit 1174c8a0f33c1e5c442ac40381fe124248c08b3a)
2020-09-04Merge tag 'phy-fixes-5.9' of ↵Greg Kroah-Hartman
git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy into char-misc-linus Vinod writes: phy: fixes for 5.9 *) platform_no_drv_owner.cocci and return value check qcom ipq806x-usb driver *) correcting register programming for ipq8074 phy *) disable PHY charger detect for omap-usb2-phy * tag 'phy-fixes-5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy: phy: omap-usb2-phy: disable PHY charger detect phy: qcom-qmp: Use correct values for ipq8074 PCIe Gen2 PHY init phy: qualcomm: fix return value check in qcom_ipq806x_usb_phy_probe() phy: qualcomm: fix platform_no_drv_owner.cocci warnings
2020-09-04Merge tag 'soundwire-5.9-fixes' of ↵Greg Kroah-Hartman
git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/soundwire into char-misc-linus Vinod writes: soundwire fixes for v5.8 This contains two fixes to sdw core for dangling pointer and a typo for INTSTAT register * tag 'soundwire-5.9-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/soundwire: soundwire: fix double free of dangling pointer soundwire: bus: fix typo in comment on INTSTAT registers
2020-09-04iommu/vt-d: Handle 36bit addressing for x86-32Chris Wilson
Beware that the address size for x86-32 may exceed unsigned long. [ 0.368971] UBSAN: shift-out-of-bounds in drivers/iommu/intel/iommu.c:128:14 [ 0.369055] shift exponent 36 is too large for 32-bit type 'long unsigned int' If we don't handle the wide addresses, the pages are mismapped and the device read/writes go astray, detected as DMAR faults and leading to device failure. The behaviour changed (from working to broken) in commit fa954e683178 ("iommu/vt-d: Delegate the dma domain to upper layer"), but the error looks older. Fixes: fa954e683178 ("iommu/vt-d: Delegate the dma domain to upper layer") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Lu Baolu <baolu.lu@linux.intel.com> Cc: James Sewart <jamessewart@arista.com> Cc: Lu Baolu <baolu.lu@linux.intel.com> Cc: Joerg Roedel <jroedel@suse.de> Cc: <stable@vger.kernel.org> # v5.3+ Link: https://lore.kernel.org/r/20200822160209.28512-1-chris@chris-wilson.co.uk Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-09-04iommu/amd: Do not use IOMMUv2 functionality when SME is activeJoerg Roedel
When memory encryption is active the device is likely not in a direct mapped domain. Forbid using IOMMUv2 functionality for now until finer grained checks for this have been implemented. Signed-off-by: Joerg Roedel <jroedel@suse.de> Link: https://lore.kernel.org/r/20200824105415.21000-3-joro@8bytes.org Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-09-04iommu/amd: Do not force direct mapping when SME is activeJoerg Roedel
Do not force devices supporting IOMMUv2 to be direct mapped when memory encryption is active. This might cause them to be unusable because their DMA mask does not include the encryption bit. Signed-off-by: Joerg Roedel <jroedel@suse.de> Link: https://lore.kernel.org/r/20200824105415.21000-2-joro@8bytes.org Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-09-04KVM: arm64: Update page shift if stage 2 block mapping not supportedAlexandru Elisei
Commit 196f878a7ac2e (" KVM: arm/arm64: Signal SIGBUS when stage2 discovers hwpoison memory") modifies user_mem_abort() to send a SIGBUS signal when the fault IPA maps to a hwpoisoned page. Commit 1559b7583ff6 ("KVM: arm/arm64: Re-check VMA on detecting a poisoned page") changed kvm_send_hwpoison_signal() to use the page shift instead of the VMA because at that point the code had already released the mmap lock, which means userspace could have modified the VMA. If userspace uses hugetlbfs for the VM memory, user_mem_abort() tries to map the guest fault IPA using block mappings in stage 2. That is not always possible, if, for example, userspace uses dirty page logging for the VM. Update the page shift appropriately in those cases when we downgrade the stage 2 entry from a block mapping to a page. Fixes: 1559b7583ff6 ("KVM: arm/arm64: Re-check VMA on detecting a poisoned page") Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Gavin Shan <gshan@redhat.com> Link: https://lore.kernel.org/r/20200901133357.52640-2-alexandru.elisei@arm.com
2020-09-04KVM: arm64: Fix address truncation in tracesMarc Zyngier
Owing to their ARMv7 origins, the trace events are truncating most address values to 32bits. That's not really helpful. Expand the printing of such values to their full glory. Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-09-04iommu/amd: Use cmpxchg_double() when updating 128-bit IRTESuravee Suthikulpanit
When using 128-bit interrupt-remapping table entry (IRTE) (a.k.a GA mode), current driver disables interrupt remapping when it updates the IRTE so that the upper and lower 64-bit values can be updated safely. However, this creates a small window, where the interrupt could arrive and result in IO_PAGE_FAULT (for interrupt) as shown below. IOMMU Driver Device IRQ ============ =========== irte.RemapEn=0 ... change IRTE IRQ from device ==> IO_PAGE_FAULT !! ... irte.RemapEn=1 This scenario has been observed when changing irq affinity on a system running I/O-intensive workload, in which the destination APIC ID in the IRTE is updated. Instead, use cmpxchg_double() to update the 128-bit IRTE at once without disabling the interrupt remapping. However, this means several features, which require GA (128-bit IRTE) support will also be affected if cmpxchg16b is not supported (which is unprecedented for AMD processors w/ IOMMU). Fixes: 880ac60e2538 ("iommu/amd: Introduce interrupt remapping ops structure") Reported-by: Sean Osborne <sean.m.osborne@oracle.com> Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Tested-by: Erik Rockstrom <erik.rockstrom@oracle.com> Reviewed-by: Joao Martins <joao.m.martins@oracle.com> Link: https://lore.kernel.org/r/20200903093822.52012-3-suravee.suthikulpanit@amd.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-09-04iommu/amd: Restore IRTE.RemapEn bit after programming IRTESuravee Suthikulpanit
Currently, the RemapEn (valid) bit is accidentally cleared when programming IRTE w/ guestMode=0. It should be restored to the prior state. Fixes: b9fc6b56f478 ("iommu/amd: Implements irq_set_vcpu_affinity() hook to setup vapic mode for pass-through devices") Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Reviewed-by: Joao Martins <joao.m.martins@oracle.com> Link: https://lore.kernel.org/r/20200903093822.52012-2-suravee.suthikulpanit@amd.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-09-04thermal: core: Fix use-after-free in thermal_zone_device_unregister()Dmitry Osipenko
The user-after-free bug in thermal_zone_device_unregister() is reported by KASAN. It happens because struct thermal_zone_device is released during of device_unregister() invocation, and hence the "tz" variable shouldn't be touched by thermal_notify_tz_delete(tz->id). Fixes: 55cdf0a283b8 ("thermal: core: Add notifications call in the framework") Signed-off-by: Dmitry Osipenko <digetx@gmail.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20200817235854.26816-1-digetx@gmail.com
2020-09-04thermal: qcom-spmi-temp-alarm: Don't suppress negative tempVeera Vegivada
Currently driver is suppressing the negative temperature readings from the vadc. Consumers of the thermal zones need to read the negative temperature too. Don't suppress the readings. Fixes: c610afaa21d3c6e ("thermal: Add QPNP PMIC temperature alarm driver") Signed-off-by: Veera Vegivada <vvegivad@codeaurora.org> Signed-off-by: Guru Das Srinagesh <gurus@codeaurora.org> Reviewed-by: Stephen Boyd <sboyd@kernel.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/944856eb819081268fab783236a916257de120e4.1596040416.git.gurus@codeaurora.org
2020-09-04KVM: arm64: Do not try to map PUDs when they are folded into PMDMarc Zyngier
For the obscure cases where PMD and PUD are the same size (64kB pages with 42bit VA, for example, which results in only two levels of page tables), we can't map anything as a PUD, because there is... erm... no PUD to speak of. Everything is either a PMD or a PTE. So let's only try and map a PUD when its size is different from that of a PMD. Cc: stable@vger.kernel.org Fixes: b8e0ba7c8bea ("KVM: arm64: Add support for creating PUD hugepages at stage 2") Reported-by: Gavin Shan <gshan@redhat.com> Reported-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Alexandru Elisei <alexandru.elisei@arm.com> Reviewed-by: Gavin Shan <gshan@redhat.com> Tested-by: Gavin Shan <gshan@redhat.com> Tested-by: Eric Auger <eric.auger@redhat.com> Tested-by: Alexandru Elisei <alexandru.elisei@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-09-04thermal: ti-soc-thermal: Fix bogus thermal shutdowns for omap4430Tony Lindgren
We can sometimes get bogus thermal shutdowns on omap4430 at least with droid4 running idle with a battery charger connected: thermal thermal_zone0: critical temperature reached (143 C), shutting down Dumping out the register values shows we can occasionally get a 0x7f value that is outside the TRM listed values in the ADC conversion table. And then we get a normal value when reading again after that. Reading the register multiple times does not seem help avoiding the bogus values as they stay until the next sample is ready. Looking at the TRM chapter "18.4.10.2.3 ADC Codes Versus Temperature", we should have values from 13 to 107 listed with a total of 95 values. But looking at the omap4430_adc_to_temp array, the values are off, and the end values are missing. And it seems that the 4430 ADC table is similar to omap3630 rather than omap4460. Let's fix the issue by using values based on the omap3630 table and just ignoring invalid values. Compared to the 4430 TRM, the omap3630 table has the missing values added while the TRM table only shows every second value. Note that sometimes the ADC register values within the valid table can also be way off for about 1 out of 10 values. But it seems that those just show about 25 C too low values rather than too high values. So those do not cause a bogus thermal shutdown. Fixes: 1a31270e54d7 ("staging: omap-thermal: add OMAP4 data structures") Cc: Merlijn Wajer <merlijn@wizzup.org> Cc: Pavel Machek <pavel@ucw.cz> Cc: Sebastian Reichel <sebastian.reichel@collabora.com> Signed-off-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20200706183338.25622-1-tony@atomide.com
2020-09-04iommu/vt-d: Fix NULL pointer dereference in dev_iommu_priv_set()Lu Baolu
The dev_iommu_priv_set() must be called after probe_device(). This fixes a NULL pointer deference bug when booting a system with kernel cmdline "intel_iommu=on,igfx_off", where the dev_iommu_priv_set() is abused. The following stacktrace was produced: Command line: BOOT_IMAGE=/isolinux/bzImage console=tty1 intel_iommu=on,igfx_off ... DMAR: Host address width 39 DMAR: DRHD base: 0x000000fed90000 flags: 0x0 DMAR: dmar0: reg_base_addr fed90000 ver 1:0 cap 1c0000c40660462 ecap 19e2ff0505e DMAR: DRHD base: 0x000000fed91000 flags: 0x1 DMAR: dmar1: reg_base_addr fed91000 ver 1:0 cap d2008c40660462 ecap f050da DMAR: RMRR base: 0x0000009aa9f000 end: 0x0000009aabefff DMAR: RMRR base: 0x0000009d000000 end: 0x0000009f7fffff DMAR: No ATSR found BUG: kernel NULL pointer dereference, address: 0000000000000038 #PF: supervisor write access in kernel mode #PF: error_code(0x0002) - not-present page PGD 0 P4D 0 Oops: 0002 [#1] SMP PTI CPU: 1 PID: 1 Comm: swapper/0 Not tainted 5.9.0-devel+ #2 Hardware name: LENOVO 20HGS0TW00/20HGS0TW00, BIOS N1WET46S (1.25s ) 03/30/2018 RIP: 0010:intel_iommu_init+0xed0/0x1136 Code: fe e9 61 02 00 00 bb f4 ff ff ff e9 57 02 00 00 48 63 d1 48 c1 e2 04 48 03 50 20 48 8b 12 48 85 d2 74 0b 48 8b 92 d0 02 00 00 48 89 7a 38 ff c1 e9 15 f5 ff ff 48 c7 c7 60 99 ac a7 49 c7 c7 a0 RSP: 0000:ffff96d180073dd0 EFLAGS: 00010282 RAX: ffff8c91037a7d20 RBX: 0000000000000000 RCX: 0000000000000000 RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffffffffffff RBP: ffff96d180073e90 R08: 0000000000000001 R09: ffff8c91039fe3c0 R10: 0000000000000226 R11: 0000000000000226 R12: 000000000000000b R13: ffff8c910367c650 R14: ffffffffa8426d60 R15: 0000000000000000 FS: 0000000000000000(0000) GS:ffff8c9107480000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000038 CR3: 00000004b100a001 CR4: 00000000003706e0 Call Trace: ? _raw_spin_unlock_irqrestore+0x1f/0x30 ? call_rcu+0x10e/0x320 ? trace_hardirqs_on+0x2c/0xd0 ? rdinit_setup+0x2c/0x2c ? e820__memblock_setup+0x8b/0x8b pci_iommu_init+0x16/0x3f do_one_initcall+0x46/0x1e4 kernel_init_freeable+0x169/0x1b2 ? rest_init+0x9f/0x9f kernel_init+0xa/0x101 ret_from_fork+0x22/0x30 Modules linked in: CR2: 0000000000000038 ---[ end trace 3653722a6f936f18 ]--- Fixes: 01b9d4e21148c ("iommu/vt-d: Use dev_iommu_priv_get/set()") Reported-by: Torsten Hilbrich <torsten.hilbrich@secunet.com> Reported-by: Wendy Wang <wendy.wang@intel.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Tested-by: Torsten Hilbrich <torsten.hilbrich@secunet.com> Link: https://lore.kernel.org/linux-iommu/96717683-70be-7388-3d2f-61131070a96a@secunet.com/ Link: https://lore.kernel.org/r/20200903065132.16879-1-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-09-04iommu/vt-d: Serialize IOMMU GCMD register modificationsLu Baolu
The VT-d spec requires (10.4.4 Global Command Register, GCMD_REG General Description) that: If multiple control fields in this register need to be modified, software must serialize the modifications through multiple writes to this register. However, in irq_remapping.c, modifications of IRE and CFI are done in one write. We need to do two separate writes with STS checking after each. It also checks the status register before writing command register to avoid unnecessary register write. Fixes: af8d102f999a4 ("x86/intel/irq_remapping: Clean up x2apic opt-out security warning mess") Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Jacob Pan <jacob.jun.pan@linux.intel.com> Cc: Kevin Tian <kevin.tian@intel.com> Cc: Ashok Raj <ashok.raj@intel.com> Link: https://lore.kernel.org/r/20200828000615.8281-1-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-09-04MAINTAINERS: Update QUALCOMM IOMMU after Arm SMMU drivers moveLukas Bulwahn
Commit e86d1aa8b60f ("iommu/arm-smmu: Move Arm SMMU drivers into their own subdirectory") moved drivers/iommu/qcom_iommu.c to drivers/iommu/arm/arm-smmu/qcom_iommu.c amongst other moves, adjusted some sections in MAINTAINERS, but missed adjusting the QUALCOMM IOMMU section. Hence, ./scripts/get_maintainer.pl --self-test=patterns complains: warning: no file matches F: drivers/iommu/qcom_iommu.c Update the file entry in MAINTAINERS to the new location. Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com> Acked-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20200825053828.4166-1-lukas.bulwahn@gmail.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-09-04drm/sun4i: backend: Disable alpha on the lowest plane on the A20Maxime Ripard
Unlike we previously thought, the per-pixel alpha is just as broken on the A20 as it is on the A10. Remove the quirk that says we can use it. Fixes: dcf496a6a608 ("drm/sun4i: sun4i: Introduce a quirk for lowest plane alpha support") Signed-off-by: Maxime Ripard <maxime@cerno.tech> Reviewed-by: Chen-Yu Tsai <wens@csie.org> Cc: Paul Kocialkowski <paul.kocialkowski@bootlin.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200728134810.883457-2-maxime@cerno.tech
2020-09-04drm/sun4i: backend: Support alpha property on lowest planeMaxime Ripard
Unlike what we previously thought, only the per-pixel alpha is broken on the lowest plane and the per-plane alpha isn't. Remove the check on the alpha property being set on the lowest plane to reject a mode. Fixes: dcf496a6a608 ("drm/sun4i: sun4i: Introduce a quirk for lowest plane alpha support") Signed-off-by: Maxime Ripard <maxime@cerno.tech> Reviewed-by: Chen-Yu Tsai <wens@csie.org> Cc: Paul Kocialkowski <paul.kocialkowski@bootlin.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200728134810.883457-1-maxime@cerno.tech
2020-09-04drm/sun4i: Fix DE2 YVU handlingJernej Skrabec
Function sun8i_vi_layer_get_csc_mode() is supposed to return CSC mode but due to inproper return type (bool instead of u32) it returns just 0 or 1. Colors are wrong for YVU formats because of that. Fixes: daab3d0e8e2b ("drm/sun4i: de2: csc_mode in de2 format struct is mostly redundant") Reported-by: Roman Stratiienko <r.stratiienko@gmail.com> Signed-off-by: Jernej Skrabec <jernej.skrabec@siol.net> Tested-by: Roman Stratiienko <r.stratiienko@gmail.com> Signed-off-by: Maxime Ripard <maxime@cerno.tech> Link: https://patchwork.freedesktop.org/patch/msgid/20200901220305.6809-1-jernej.skrabec@siol.net
2020-09-04x86/entry/64: Do not include inst.h in calling.hUros Bizjak
inst.h was included in calling.h solely to instantiate the RDPID macro. The usage of RDPID was removed in 6a3ea3e68b8a ("x86/entry/64: Do not use RDPID in paranoid entry to accomodate KVM") so remove the include. Fixes: 6a3ea3e68b8a ("x86/entry/64: Do not use RDPID in paranoid entry to accomodate KVM") Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20200827171735.93825-1-ubizjak@gmail.com
2020-09-04xen: add helpers to allocate unpopulated memoryRoger Pau Monne
To be used in order to create foreign mappings. This is based on the ZONE_DEVICE facility which is used by persistent memory devices in order to create struct pages and kernel virtual mappings for the IOMEM areas of such devices. Note that on kernels without support for ZONE_DEVICE Xen will fallback to use ballooned pages in order to create foreign mappings. The newly added helpers use the same parameters as the existing {alloc/free}_xenballooned_pages functions, which allows for in-place replacement of the callers. Once a memory region has been added to be used as scratch mapping space it will no longer be released, and pages returned are kept in a linked list. This allows to have a buffer of pages and prevents resorting to frequent additions and removals of regions. If enabled (because ZONE_DEVICE is supported) the usage of the new functionality untangles Xen balloon and RAM hotplug from the usage of unpopulated physical memory ranges to map foreign pages, which is the correct thing to do in order to avoid mappings of foreign pages depend on memory hotplug. Note the driver is currently not enabled on Arm platforms because it would interfere with the identity mapping required on some platforms. Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Reviewed-by: Juergen Gross <jgross@suse.com> Link: https://lore.kernel.org/r/20200901083326.21264-4-roger.pau@citrix.com Signed-off-by: Juergen Gross <jgross@suse.com>
2020-09-04memremap: rename MEMORY_DEVICE_DEVDAX to MEMORY_DEVICE_GENERICRoger Pau Monne
This is in preparation for the logic behind MEMORY_DEVICE_DEVDAX also being used by non DAX devices. No functional change intended. Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Acked-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Link: https://lore.kernel.org/r/20200901083326.21264-3-roger.pau@citrix.com Signed-off-by: Juergen Gross <jgross@suse.com>
2020-09-04xen/balloon: add header guardRoger Pau Monne
In order to protect against the header being included multiple times on the same compilation unit. Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Link: https://lore.kernel.org/r/20200901083326.21264-2-roger.pau@citrix.com Signed-off-by: Juergen Gross <jgross@suse.com>
2020-09-04padata: fix possible padata_works_lock deadlockDaniel Jordan
syzbot reports, WARNING: inconsistent lock state 5.9.0-rc2-syzkaller #0 Not tainted -------------------------------- inconsistent {IN-SOFTIRQ-W} -> {SOFTIRQ-ON-W} usage. syz-executor.0/26715 takes: (padata_works_lock){+.?.}-{2:2}, at: padata_do_parallel kernel/padata.c:220 {IN-SOFTIRQ-W} state was registered at: spin_lock include/linux/spinlock.h:354 [inline] padata_do_parallel kernel/padata.c:220 ... __do_softirq kernel/softirq.c:298 ... sysvec_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1091 asm_sysvec_apic_timer_interrupt arch/x86/include/asm/idtentry.h:581 Possible unsafe locking scenario: CPU0 ---- lock(padata_works_lock); <Interrupt> lock(padata_works_lock); padata_do_parallel() takes padata_works_lock with softirqs enabled, so a deadlock is possible if, on the same CPU, the lock is acquired in process context and then softirq handling done in an interrupt leads to the same path. Fix by leaving softirqs disabled while do_parallel holds padata_works_lock. Reported-by: syzbot+f4b9f49e38e25eb4ef52@syzkaller.appspotmail.com Fixes: 4611ce2246889 ("padata: allocate work structures for parallel jobs from a pool") Signed-off-by: Daniel Jordan <daniel.m.jordan@oracle.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Steffen Klassert <steffen.klassert@secunet.com> Cc: linux-crypto@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2020-09-04x86, fakenuma: Fix invalid starting node IDHuang Ying
Commit: cc9aec03e58f ("x86/numa_emulation: Introduce uniform split capability") uses "-1" as the starting node ID, which causes the strange kernel log as follows, when "numa=fake=32G" is added to the kernel command line: Faking node -1 at [mem 0x0000000000000000-0x0000000893ffffff] (35136MB) Faking node 0 at [mem 0x0000001840000000-0x000000203fffffff] (32768MB) Faking node 1 at [mem 0x0000000894000000-0x000000183fffffff] (64192MB) Faking node 2 at [mem 0x0000002040000000-0x000000283fffffff] (32768MB) Faking node 3 at [mem 0x0000002840000000-0x000000303fffffff] (32768MB) And finally the kernel crashes: BUG: Bad page state in process swapper pfn:00011 page:(____ptrval____) refcount:0 mapcount:1 mapping:(____ptrval____) index:0x55cd7e44b270 pfn:0x11 failed to read mapping contents, not a valid kernel address? flags: 0x5(locked|uptodate) raw: 0000000000000005 000055cd7e44af30 000055cd7e44af50 0000000100000006 raw: 000055cd7e44b270 000055cd7e44b290 0000000000000000 000055cd7e44b510 page dumped because: page still charged to cgroup page->mem_cgroup:000055cd7e44b510 Modules linked in: CPU: 0 PID: 0 Comm: swapper Not tainted 5.9.0-rc2 #1 Hardware name: Intel Corporation S2600WFT/S2600WFT, BIOS SE5C620.86B.02.01.0008.031920191559 03/19/2019 Call Trace: dump_stack+0x57/0x80 bad_page.cold+0x63/0x94 __free_pages_ok+0x33f/0x360 memblock_free_all+0x127/0x195 mem_init+0x23/0x1f5 start_kernel+0x219/0x4f5 secondary_startup_64+0xb6/0xc0 Fix this bug via using 0 as the starting node ID. This restores the original behavior before cc9aec03e58f. [ mingo: Massaged the changelog. ] Fixes: cc9aec03e58f ("x86/numa_emulation: Introduce uniform split capability") Signed-off-by: "Huang, Ying" <ying.huang@intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20200904061047.612950-1-ying.huang@intel.com
2020-09-03Merge tag 'perf-tools-fixes-for-v5.9-2020-09-03' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux Pull more perf tools fixes from Arnaldo Carvalho de Melo: - Use uintptr_t when casting numbers to pointers - Keep output expected by 3rd parties: Turn off summary for interval mode by default. - BPF is in kernel space, make sure do_validate_kcore_modules() knows about that. - Explicitly call out event modifiers in the documentation. - Fix jevents() allocation of space for regular expressions. - Address libtraceevent build warnings on 32-bit arches. - Fix checking of functions returns using ERR_PTR() in 'perf bench'. * tag 'perf-tools-fixes-for-v5.9-2020-09-03' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: perf tools: Add bpf image check to __map__is_kmodule perf record/stat: Explicitly call out event modifiers in the documentation perf bench: The do_run_multi_threaded() function must use IS_ERR(perf_session__new()) perf stat: Turn off summary for interval mode by default libtraceevent: Fix build warning on 32-bit arches perf jevents: Fix suspicious code in fixregex() perf parse-events: Use uintptr_t when casting numbers to pointers
2020-09-03Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netLinus Torvalds
Pull networking fixes from David Miller: 1) Use netif_rx_ni() when necessary in batman-adv stack, from Jussi Kivilinna. 2) Fix loss of RTT samples in rxrpc, from David Howells. 3) Memory leak in hns_nic_dev_probe(), from Dignhao Liu. 4) ravb module cannot be unloaded, fix from Yuusuke Ashizuka. 5) We disable BH for too lokng in sctp_get_port_local(), add a cond_resched() here as well, from Xin Long. 6) Fix memory leak in st95hf_in_send_cmd, from Dinghao Liu. 7) Out of bound access in bpf_raw_tp_link_fill_link_info(), from Yonghong Song. 8) Missing of_node_put() in mt7530 DSA driver, from Sumera Priyadarsini. 9) Fix crash in bnxt_fw_reset_task(), from Michael Chan. 10) Fix geneve tunnel checksumming bug in hns3, from Yi Li. 11) Memory leak in rxkad_verify_response, from Dinghao Liu. 12) In tipc, don't use smp_processor_id() in preemptible context. From Tuong Lien. 13) Fix signedness issue in mlx4 memory allocation, from Shung-Hsi Yu. 14) Missing clk_disable_prepare() in gemini driver, from Dan Carpenter. 15) Fix ABI mismatch between driver and firmware in nfp, from Louis Peens. * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (110 commits) net/smc: fix sock refcounting in case of termination net/smc: reset sndbuf_desc if freed net/smc: set rx_off for SMCR explicitly net/smc: fix toleration of fake add_link messages tg3: Fix soft lockup when tg3_reset_task() fails. doc: net: dsa: Fix typo in config code sample net: dp83867: Fix WoL SecureOn password nfp: flower: fix ABI mismatch between driver and firmware tipc: fix shutdown() of connectionless socket ipv6: Fix sysctl max for fib_multipath_hash_policy drivers/net/wan/hdlc: Change the default of hard_header_len to 0 net: gemini: Fix another missing clk_disable_unprepare() in probe net: bcmgenet: fix mask check in bcmgenet_validate_flow() amd-xgbe: Add support for new port mode net: usb: dm9601: Add USB ID of Keenetic Plus DSL vhost: fix typo in error message net: ethernet: mlx4: Fix memory allocation in mlx4_buddy_init() pktgen: fix error message with wrong function name net: ethernet: ti: am65-cpsw: fix rmii 100Mbit link mode cxgb4: fix thermal zone device registration ...
2020-09-03Merge branch 'gate-page-refcount' (patches from Dave Hansen)Linus Torvalds
Merge gate page refcount fix from Dave Hansen: "During the conversion over to pin_user_pages(), gate pages were missed. The fix is pretty simple, and is accompanied by a new test from Andy which probably would have caught this earlier" * emailed patches from Dave Hansen <dave.hansen@linux.intel.com>: selftests/x86/test_vsyscall: Improve the process_vm_readv() test mm: fix pin vs. gup mismatch with gate pages
2020-09-03selftests/x86/test_vsyscall: Improve the process_vm_readv() testAndy Lutomirski
The existing code accepted process_vm_readv() success or failure as long as it didn't return garbage. This is too weak: if the vsyscall page is readable, then process_vm_readv() should succeed and, if the page is not readable, then it should fail. Signed-off-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Cc: x86@kernel.org Cc: Peter Zijlstra <peterz@infradead.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Jann Horn <jannh@google.com> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-09-03mm: fix pin vs. gup mismatch with gate pagesDave Hansen
Gate pages were missed when converting from get to pin_user_pages(). This can lead to refcount imbalances. This is reliably and quickly reproducible running the x86 selftests when vsyscall=emulate is enabled (the default). Fix by using try_grab_page() with appropriate flags passed. The long story: Today, pin_user_pages() and get_user_pages() are similar interfaces for manipulating page reference counts. However, "pins" use a "bias" value and manipulate the actual reference count by 1024 instead of 1 used by plain "gets". That means that pin_user_pages() must be matched with unpin_user_pages() and can't be mixed with a plain put_user_pages() or put_page(). Enter gate pages, like the vsyscall page. They are pages usually in the kernel image, but which are mapped to userspace. Userspace is allowed access to them, including interfaces using get/pin_user_pages(). The refcount of these kernel pages is manipulated just like a normal user page on the get/pin side so that the put/unpin side can work the same for normal user pages or gate pages. get_gate_page() uses try_get_page() which only bumps the refcount by 1, not 1024, even if called in the pin_user_pages() path. If someone pins a gate page, this happens: pin_user_pages() get_gate_page() try_get_page() // bump refcount +1 ... some time later unpin_user_pages() page_ref_sub_and_test(page, 1024)) ... and boom, we get a refcount off by 1023. This is reliably and quickly reproducible running the x86 selftests when booted with vsyscall=emulate (the default). The selftests use ptrace(), but I suspect anything using pin_user_pages() on gate pages could hit this. To fix it, simply use try_grab_page() instead of try_get_page(), and pass 'gup_flags' in so that FOLL_PIN can be respected. This bug traces back to the very beginning of the FOLL_PIN support in commit 3faa52c03f44 ("mm/gup: track FOLL_PIN pages"), which showed up in the 5.7 release. Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Fixes: 3faa52c03f44 ("mm/gup: track FOLL_PIN pages") Reported-by: Peter Zijlstra <peterz@infradead.org> Reviewed-by: John Hubbard <jhubbard@nvidia.com> Acked-by: Andy Lutomirski <luto@kernel.org> Cc: x86@kernel.org Cc: Jann Horn <jannh@google.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>