summaryrefslogtreecommitdiff
path: root/include/linux
AgeCommit message (Collapse)Author
2020-09-28nl80211: support S1G capability overrides in assocThomas Pedersen
NL80211_ATTR_S1G_CAPABILITY can be passed along with NL80211_ATTR_S1G_CAPABILITY_MASK to NL80211_CMD_ASSOCIATE to indicate S1G capabilities which should override the hardware capabilities in eg. the association request. Signed-off-by: Thomas Pedersen <thomas@adapt-ip.com> Link: https://lore.kernel.org/r/20200922022818.15855-4-thomas@adapt-ip.com [johannes: always require both attributes together, commit message] Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2020-09-28kgdb: Honour the kprobe blocklist when setting breakpointsDaniel Thompson
Currently kgdb has absolutely no safety rails in place to discourage or prevent a user from placing a breakpoint in dangerous places such as the debugger's own trap entry/exit and other places where it is not safe to take synchronous traps. Introduce a new config symbol KGDB_HONOUR_BLOCKLIST and modify the default implementation of kgdb_validate_break_address() so that we use the kprobe blocklist to prohibit instrumentation of critical functions if the config symbol is set. The config symbol dependencies are set to ensure that the blocklist will be enabled by default if we enable KGDB and are compiling for an architecture where we HAVE_KPROBES. Suggested-by: Peter Zijlstra <peterz@infradead.org> Reviewed-by: Douglas Anderson <dianders@chromium.org> Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org> Link: https://lore.kernel.org/r/20200927211531.1380577-2-daniel.thompson@linaro.org Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
2020-09-28Merge branch 'irq/ipi-as-irq', remote-tracking branches 'origin/irq/dw' and ↵Marc Zyngier
'origin/irq/owl' into irq/irqchip-next Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-09-28Merge branch 'fixes' into nextUlf Hansson
2020-09-28memstick: Skip allocating card when removing hostKai-Heng Feng
After commit 6827ca573c03 ("memstick: rtsx_usb_ms: Support runtime power management"), removing module rtsx_usb_ms will be stuck. The deadlock is caused by powering on and powering off at the same time, the former one is when memstick_check() is flushed, and the later is called by memstick_remove_host(). Soe let's skip allocating card to prevent this issue. Fixes: 6827ca573c03 ("memstick: rtsx_usb_ms: Support runtime power management") Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Link: https://lore.kernel.org/r/20200925084952.13220-1-kai.heng.feng@canonical.com Cc: stable@vger.kernel.org Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2020-09-28hv: hyperv.h: Introduce some hvpfn helper functionsBoqun Feng
When a guest communicate with the hypervisor, it must use HV_HYP_PAGE to calculate PFN, so introduce a few hvpfn helper functions as the counterpart of the page helper functions. This is the preparation for supporting guest whose PAGE_SIZE is not 4k. Signed-off-by: Boqun Feng <boqun.feng@gmail.com> Reviewed-by: Michael Kelley <mikelley@microsoft.com> Link: https://lore.kernel.org/r/20200916034817.30282-7-boqun.feng@gmail.com Signed-off-by: Wei Liu <wei.liu@kernel.org>
2020-09-28Drivers: hv: vmbus: Move virt_to_hvpfn() to hyperv headerBoqun Feng
There will be more places other than vmbus where we need to calculate the Hyper-V page PFN from a virtual address, so move virt_to_hvpfn() to hyperv generic header. Signed-off-by: Boqun Feng <boqun.feng@gmail.com> Reviewed-by: Michael Kelley <mikelley@microsoft.com> Link: https://lore.kernel.org/r/20200916034817.30282-6-boqun.feng@gmail.com Signed-off-by: Wei Liu <wei.liu@kernel.org>
2020-09-28Drivers: hv: vmbus: Introduce types of GPADLBoqun Feng
This patch introduces two types of GPADL: HV_GPADL_{BUFFER, RING}. The types of GPADL are purely the concept in the guest, IOW the hypervisor treat them as the same. The reason of introducing the types for GPADL is to support guests whose page size is not 4k (the page size of Hyper-V hypervisor). In these guests, both the headers and the data parts of the ringbuffers need to be aligned to the PAGE_SIZE, because 1) some of the ringbuffers will be mapped into userspace and 2) we use "double mapping" mechanism to support fast wrap-around, and "double mapping" relies on ringbuffers being page-aligned. However, the Hyper-V hypervisor only uses 4k (HV_HYP_PAGE_SIZE) headers. Our solution to this is that we always make the headers of ringbuffers take one guest page and when GPADL is established between the guest and hypervisor, the only first 4k of header is used. To handle this special case, we need the types of GPADL to differ different guest memory usage for GPADL. Type enum is introduced along with several general interfaces to describe the differences between normal buffer GPADL and ringbuffer GPADL. Signed-off-by: Boqun Feng <boqun.feng@gmail.com> Reviewed-by: Michael Kelley <mikelley@microsoft.com> Link: https://lore.kernel.org/r/20200916034817.30282-4-boqun.feng@gmail.com Signed-off-by: Wei Liu <wei.liu@kernel.org>
2020-09-27ptp: add stub function for ptp_get_msgtype()Yangbo Lu
Added the missing stub function for ptp_get_msgtype(). Fixes: 036c508ba95e ("ptp: Add generic ptp message type function") Reported-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com> Acked-by: Randy Dunlap <rdunlap@infradead.org> # build-tested Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-27mm/fork: Pass new vma pointer into copy_page_range()Peter Xu
This prepares for the future work to trigger early cow on pinned pages during fork(). No functional change intended. Signed-off-by: Peter Xu <peterx@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-09-27mm: Introduce mm_struct.has_pinnedPeter Xu
(Commit message majorly collected from Jason Gunthorpe) Reduce the chance of false positive from page_maybe_dma_pinned() by keeping track if the mm_struct has ever been used with pin_user_pages(). This allows cases that might drive up the page ref_count to avoid any penalty from handling dma_pinned pages. Future work is planned, to provide a more sophisticated solution, likely to turn it into a real counter. For now, make it atomic_t but use it as a boolean for simplicity. Suggested-by: Jason Gunthorpe <jgg@ziepe.ca> Signed-off-by: Peter Xu <peterx@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-09-26fs: remove KSTAT_QUERY_FLAGSChristoph Hellwig
KSTAT_QUERY_FLAGS expands to AT_STATX_SYNC_TYPE, which itself already is a mask. Remove the double name, especially given that the prefix is a little confusing vs the normal AT_* flags. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-09-26fs: move vfs_fstatat out of lineChristoph Hellwig
This allows to keep vfs_statx static in fs/stat.c to prepare for the following changes. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-09-26fs: implement vfs_stat and vfs_lstat in terms of vfs_fstatatChristoph Hellwig
Go through vfs_fstatat instead of duplicating the *stat to statx mapping three times. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-09-26fs: remove vfs_statx_fdChristoph Hellwig
vfs_statx_fd is only used to implement vfs_fstat. Remove vfs_statx_fd and just implement vfs_fstat directly. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-09-26Merge tag 'drivers_soc_for_5.10' of ↵Olof Johansson
git://git.kernel.org/pub/scm/linux/kernel/git/ssantosh/linux-keystone into arm/drivers ARM: soc: TI driver updates for v5.10 Consist of: - Add Ring accelerator support for AM65x - Add TI PRUSS platform driver and enable it on available platforms - Extend PRUSS driver for CORECLK_MUX/IEPCLK_MUX support - UDMA rx ring pair fix - Add socinfo entry for J7200 * tag 'drivers_soc_for_5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/ssantosh/linux-keystone: Add missing '#' to fix schema errors: soc: ti: Convert to DEFINE_SHOW_ATTRIBUTE dmaengine: ti: k3-udma-glue: Fix parameters for rx ring pair request soc: ti: k3-socinfo: Add entry for J7200 soc: ti: pruss: support CORECLK_MUX and IEPCLK_MUX dt-bindings: soc: ti: Update TI PRUSS bindings regarding clock-muxes firmware: ti_sci: allow frequency change for disabled clocks by default soc: ti: ti_sci_pm_domains: switch to use multiple genpds instead of one soc: ti: pruss: Enable support for ICSSG subsystems on K3 J721E SoCs soc: ti: pruss: Enable support for ICSSG subsystems on K3 AM65x SoCs soc: ti: pruss: Add support for PRU-ICSS subsystems on 66AK2G SoC soc: ti: pruss: Add support for PRU-ICSS subsystems on AM57xx SoCs soc: ti: pruss: Add support for PRU-ICSSs on AM437x SoCs soc: ti: pruss: Add a platform driver for PRUSS in TI SoCs dt-bindings: soc: ti: Add TI PRUSS bindings bindings: soc: ti: soc: ringacc: remove ti,dma-ring-reset-quirk soc: ti: k3: ringacc: add am65x sr2.0 support Link: https://lore.kernel.org/r/1600656828-29267-1-git-send-email-santosh.shilimkar@oracle.com Signed-off-by: Olof Johansson <olof@lixom.net>
2020-09-26Merge tag 'scsi-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "Three fixes: one in drivers (lpfc) and two for zoned block devices. The latter also impinges on the block layer but only to introduce a new block API for setting the zone model rather than fiddling with the queue directly in the zoned block driver" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: sd: sd_zbc: Fix ZBC disk initialization scsi: sd: sd_zbc: Fix handling of host-aware ZBC disks scsi: lpfc: Fix initial FLOGI failure due to BBSCN not supported
2020-09-26Merge tag 'block-5.9-2020-09-25' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull block fixes from Jens Axboe: "NVMe pull request from Christoph, and removal of a dead define. - fix error during controller probe that cause double free irqs (Keith Busch) - FC connection establishment fix (James Smart) - properly handle completions for invalid tags (Xianting Tian) - pass the correct nsid to the command effects and supported log (Chaitanya Kulkarni)" * tag 'block-5.9-2020-09-25' of git://git.kernel.dk/linux-block: block: remove unused BLK_QC_T_EAGAIN flag nvme-core: don't use NVME_NSID_ALL for command effects and supported log nvme-fc: fail new connections to a deleted host or remote port nvme-pci: fix NULL req in completion handler nvme: return errors for hwmon init
2020-09-26mm: don't rely on system state to detect hot-plug operationsLaurent Dufour
In register_mem_sect_under_node() the system_state's value is checked to detect whether the call is made during boot time or during an hot-plug operation. Unfortunately, that check against SYSTEM_BOOTING is wrong because regular memory is registered at SYSTEM_SCHEDULING state. In addition, memory hot-plug operation can be triggered at this system state by the ACPI [1]. So checking against the system state is not enough. The consequence is that on system with interleaved node's ranges like this: Early memory node ranges node 1: [mem 0x0000000000000000-0x000000011fffffff] node 2: [mem 0x0000000120000000-0x000000014fffffff] node 1: [mem 0x0000000150000000-0x00000001ffffffff] node 0: [mem 0x0000000200000000-0x000000048fffffff] node 2: [mem 0x0000000490000000-0x00000007ffffffff] This can be seen on PowerPC LPAR after multiple memory hot-plug and hot-unplug operations are done. At the next reboot the node's memory ranges can be interleaved and since the call to link_mem_sections() is made in topology_init() while the system is in the SYSTEM_SCHEDULING state, the node's id is not checked, and the sections registered to multiple nodes: $ ls -l /sys/devices/system/memory/memory21/node* total 0 lrwxrwxrwx 1 root root 0 Aug 24 05:27 node1 -> ../../node/node1 lrwxrwxrwx 1 root root 0 Aug 24 05:27 node2 -> ../../node/node2 In that case, the system is able to boot but if later one of theses memory blocks is hot-unplugged and then hot-plugged, the sysfs inconsistency is detected and this is triggering a BUG_ON(): kernel BUG at /Users/laurent/src/linux-ppc/mm/memory_hotplug.c:1084! Oops: Exception in kernel mode, sig: 5 [#1] LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries Modules linked in: rpadlpar_io rpaphp pseries_rng rng_core vmx_crypto gf128mul binfmt_misc ip_tables x_tables xfs libcrc32c crc32c_vpmsum autofs4 CPU: 8 PID: 10256 Comm: drmgr Not tainted 5.9.0-rc1+ #25 Call Trace: add_memory_resource+0x23c/0x340 (unreliable) __add_memory+0x5c/0xf0 dlpar_add_lmb+0x1b4/0x500 dlpar_memory+0x1f8/0xb80 handle_dlpar_errorlog+0xc0/0x190 dlpar_store+0x198/0x4a0 kobj_attr_store+0x30/0x50 sysfs_kf_write+0x64/0x90 kernfs_fop_write+0x1b0/0x290 vfs_write+0xe8/0x290 ksys_write+0xdc/0x130 system_call_exception+0x160/0x270 system_call_common+0xf0/0x27c This patch addresses the root cause by not relying on the system_state value to detect whether the call is due to a hot-plug operation. An extra parameter is added to link_mem_sections() detailing whether the operation is due to a hot-plug operation. [1] According to Oscar Salvador, using this qemu command line, ACPI memory hotplug operations are raised at SYSTEM_SCHEDULING state: $QEMU -enable-kvm -machine pc -smp 4,sockets=4,cores=1,threads=1 -cpu host -monitor pty \ -m size=$MEM,slots=255,maxmem=4294967296k \ -numa node,nodeid=0,cpus=0-3,mem=512 -numa node,nodeid=1,mem=512 \ -object memory-backend-ram,id=memdimm0,size=134217728 -device pc-dimm,node=0,memdev=memdimm0,id=dimm0,slot=0 \ -object memory-backend-ram,id=memdimm1,size=134217728 -device pc-dimm,node=0,memdev=memdimm1,id=dimm1,slot=1 \ -object memory-backend-ram,id=memdimm2,size=134217728 -device pc-dimm,node=0,memdev=memdimm2,id=dimm2,slot=2 \ -object memory-backend-ram,id=memdimm3,size=134217728 -device pc-dimm,node=0,memdev=memdimm3,id=dimm3,slot=3 \ -object memory-backend-ram,id=memdimm4,size=134217728 -device pc-dimm,node=1,memdev=memdimm4,id=dimm4,slot=4 \ -object memory-backend-ram,id=memdimm5,size=134217728 -device pc-dimm,node=1,memdev=memdimm5,id=dimm5,slot=5 \ -object memory-backend-ram,id=memdimm6,size=134217728 -device pc-dimm,node=1,memdev=memdimm6,id=dimm6,slot=6 \ Fixes: 4fbce633910e ("mm/memory_hotplug.c: make register_mem_sect_under_node() a callback of walk_memory_range()") Signed-off-by: Laurent Dufour <ldufour@linux.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Oscar Salvador <osalvador@suse.de> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: "Rafael J. Wysocki" <rafael@kernel.org> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Nathan Lynch <nathanl@linux.ibm.com> Cc: Scott Cheloha <cheloha@linux.ibm.com> Cc: Tony Luck <tony.luck@intel.com> Cc: <stable@vger.kernel.org> Link: https://lkml.kernel.org/r/20200915094143.79181-3-ldufour@linux.ibm.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-09-26mm: replace memmap_context by meminit_contextLaurent Dufour
Patch series "mm: fix memory to node bad links in sysfs", v3. Sometimes, firmware may expose interleaved memory layout like this: Early memory node ranges node 1: [mem 0x0000000000000000-0x000000011fffffff] node 2: [mem 0x0000000120000000-0x000000014fffffff] node 1: [mem 0x0000000150000000-0x00000001ffffffff] node 0: [mem 0x0000000200000000-0x000000048fffffff] node 2: [mem 0x0000000490000000-0x00000007ffffffff] In that case, we can see memory blocks assigned to multiple nodes in sysfs: $ ls -l /sys/devices/system/memory/memory21 total 0 lrwxrwxrwx 1 root root 0 Aug 24 05:27 node1 -> ../../node/node1 lrwxrwxrwx 1 root root 0 Aug 24 05:27 node2 -> ../../node/node2 -rw-r--r-- 1 root root 65536 Aug 24 05:27 online -r--r--r-- 1 root root 65536 Aug 24 05:27 phys_device -r--r--r-- 1 root root 65536 Aug 24 05:27 phys_index drwxr-xr-x 2 root root 0 Aug 24 05:27 power -r--r--r-- 1 root root 65536 Aug 24 05:27 removable -rw-r--r-- 1 root root 65536 Aug 24 05:27 state lrwxrwxrwx 1 root root 0 Aug 24 05:25 subsystem -> ../../../../bus/memory -rw-r--r-- 1 root root 65536 Aug 24 05:25 uevent -r--r--r-- 1 root root 65536 Aug 24 05:27 valid_zones The same applies in the node's directory with a memory21 link in both the node1 and node2's directory. This is wrong but doesn't prevent the system to run. However when later, one of these memory blocks is hot-unplugged and then hot-plugged, the system is detecting an inconsistency in the sysfs layout and a BUG_ON() is raised: kernel BUG at /Users/laurent/src/linux-ppc/mm/memory_hotplug.c:1084! LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries Modules linked in: rpadlpar_io rpaphp pseries_rng rng_core vmx_crypto gf128mul binfmt_misc ip_tables x_tables xfs libcrc32c crc32c_vpmsum autofs4 CPU: 8 PID: 10256 Comm: drmgr Not tainted 5.9.0-rc1+ #25 Call Trace: add_memory_resource+0x23c/0x340 (unreliable) __add_memory+0x5c/0xf0 dlpar_add_lmb+0x1b4/0x500 dlpar_memory+0x1f8/0xb80 handle_dlpar_errorlog+0xc0/0x190 dlpar_store+0x198/0x4a0 kobj_attr_store+0x30/0x50 sysfs_kf_write+0x64/0x90 kernfs_fop_write+0x1b0/0x290 vfs_write+0xe8/0x290 ksys_write+0xdc/0x130 system_call_exception+0x160/0x270 system_call_common+0xf0/0x27c This has been seen on PowerPC LPAR. The root cause of this issue is that when node's memory is registered, the range used can overlap another node's range, thus the memory block is registered to multiple nodes in sysfs. There are two issues here: (a) The sysfs memory and node's layouts are broken due to these multiple links (b) The link errors in link_mem_sections() should not lead to a system panic. To address (a) register_mem_sect_under_node should not rely on the system state to detect whether the link operation is triggered by a hot plug operation or not. This is addressed by the patches 1 and 2 of this series. Issue (b) will be addressed separately. This patch (of 2): The memmap_context enum is used to detect whether a memory operation is due to a hot-add operation or happening at boot time. Make it general to the hotplug operation and rename it as meminit_context. There is no functional change introduced by this patch Suggested-by: David Hildenbrand <david@redhat.com> Signed-off-by: Laurent Dufour <ldufour@linux.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Oscar Salvador <osalvador@suse.de> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: "Rafael J . Wysocki" <rafael@kernel.org> Cc: Nathan Lynch <nathanl@linux.ibm.com> Cc: Scott Cheloha <cheloha@linux.ibm.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: <stable@vger.kernel.org> Link: https://lkml.kernel.org/r/20200915094143.79181-1-ldufour@linux.ibm.com Link: https://lkml.kernel.org/r/20200915132624.9723-1-ldufour@linux.ibm.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-09-26mm/gup: fix gup_fast with dynamic page table foldingVasily Gorbik
Currently to make sure that every page table entry is read just once gup_fast walks perform READ_ONCE and pass pXd value down to the next gup_pXd_range function by value e.g.: static int gup_pud_range(p4d_t p4d, unsigned long addr, unsigned long end, unsigned int flags, struct page **pages, int *nr) ... pudp = pud_offset(&p4d, addr); This function passes a reference on that local value copy to pXd_offset, and might get the very same pointer in return. This happens when the level is folded (on most arches), and that pointer should not be iterated. On s390 due to the fact that each task might have different 5,4 or 3-level address translation and hence different levels folded the logic is more complex and non-iteratable pointer to a local copy leads to severe problems. Here is an example of what happens with gup_fast on s390, for a task with 3-level paging, crossing a 2 GB pud boundary: // addr = 0x1007ffff000, end = 0x10080001000 static int gup_pud_range(p4d_t p4d, unsigned long addr, unsigned long end, unsigned int flags, struct page **pages, int *nr) { unsigned long next; pud_t *pudp; // pud_offset returns &p4d itself (a pointer to a value on stack) pudp = pud_offset(&p4d, addr); do { // on second iteratation reading "random" stack value pud_t pud = READ_ONCE(*pudp); // next = 0x10080000000, due to PUD_SIZE/MASK != PGDIR_SIZE/MASK on s390 next = pud_addr_end(addr, end); ... } while (pudp++, addr = next, addr != end); // pudp++ iterating over stack return 1; } This happens since s390 moved to common gup code with commit d1874a0c2805 ("s390/mm: make the pxd_offset functions more robust") and commit 1a42010cdc26 ("s390/mm: convert to the generic get_user_pages_fast code"). s390 tried to mimic static level folding by changing pXd_offset primitives to always calculate top level page table offset in pgd_offset and just return the value passed when pXd_offset has to act as folded. What is crucial for gup_fast and what has been overlooked is that PxD_SIZE/MASK and thus pXd_addr_end should also change correspondingly. And the latter is not possible with dynamic folding. To fix the issue in addition to pXd values pass original pXdp pointers down to gup_pXd_range functions. And introduce pXd_offset_lockless helpers, which take an additional pXd entry value parameter. This has already been discussed in https://lkml.kernel.org/r/20190418100218.0a4afd51@mschwideX1 Fixes: 1a42010cdc26 ("s390/mm: convert to the generic get_user_pages_fast code") Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com> Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Mike Rapoport <rppt@linux.ibm.com> Reviewed-by: John Hubbard <jhubbard@nvidia.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Russell King <linux@armlinux.org.uk> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Jeff Dike <jdike@addtoit.com> Cc: Richard Weinberger <richard@nod.at> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Claudio Imbrenda <imbrenda@linux.ibm.com> Cc: <stable@vger.kernel.org> [5.2+] Link: https://lkml.kernel.org/r/patch.git-943f1e5dcff2.your-ad-here.call-01599856292-ext-8676@work.hours Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-09-25bpf: Add comment to document BTF type PTR_TO_BTF_ID_OR_NULLJohn Fastabend
The meaning of PTR_TO_BTF_ID_OR_NULL differs slightly from other types denoted with the *_OR_NULL type. For example the types PTR_TO_SOCKET and PTR_TO_SOCKET_OR_NULL can be used for branch analysis because the type PTR_TO_SOCKET is guaranteed to _not_ have a null value. In contrast PTR_TO_BTF_ID and BTF_TO_BTF_ID_OR_NULL have slightly different meanings. A PTR_TO_BTF_TO_ID may be a pointer to NULL value, but it is safe to read this pointer in the program context because the program context will handle any faults. The fallout is for PTR_TO_BTF_ID the verifier can assume reads are safe, but can not use the type in branch analysis. Additionally, authors need to be extra careful when passing PTR_TO_BTF_ID into helpers. In general helpers consuming type PTR_TO_BTF_ID will need to assume it may be null. Seeing the above is not obvious to readers without the back knowledge lets add a comment in the type definition. Editorial comment, as networking and tracing programs get closer and more tightly merged we may need to consider a new type that we can ensure is non-null for branch analysis and also passing into helpers. Signed-off-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Lorenz Bauer <lmb@cloudflare.com>
2020-09-25net: stmmac: Add option for VLAN filter fail queue enableChuah, Kim Tatt
Add option in plat_stmmacenet_data struct to enable VLAN Filter Fail Queuing. This option allows packets that fail VLAN filter to be routed to a specific Rx queue when Receive All is also set. When this option is enabled: - Enable VFFQ only when entering promiscuous mode, because Receive All will pass up all rx packets that failed address filtering (similar to promiscuous mode). - VLAN-promiscuous mode is never entered to allow rx packet to fail VLAN filters and get routed to selected VFFQ Rx queue. Reviewed-by: Voon Weifeng <weifeng.voon@intel.com> Reviewed-by: Ong Boon Leong <boon.leong.ong@intel.com> Signed-off-by: Chuah, Kim Tatt <kim.tatt.chuah@intel.com> Signed-off-by: Ong Boon Leong <boon.leong.ong@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-25mm/page_ref: Convert the open coded tracepoint enabled to the new helperSteven Rostedt (VMware)
As more use cases of checking if a tracepoint is enabled in a header are coming to fruition, a helper macro, tracepoint_enabled(), has been added to check if a tracepoint is enabled or not, and can be used with minimal header requirements (avoid "include hell"). Convert the page_ref logic over to the new helper macro. Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Michal Nazarewicz <mina86@mina86.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Minchan Kim <minchan@kernel.org> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com> Cc: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-09-25tracepoints: Add helper to test if tracepoint is enabled in a headerSteven Rostedt (VMware)
As tracepoints are discouraged from being added in a header because it can cause side effects if other tracepoints are in headers, as well as bloat the kernel as the trace_<tracepoint>() function is not a small inline, the common workaround is to add a function call that calls a wrapper function in a C file that then calls the tracepoint. But as function calls add overhead, this function should only be called when the tracepoint in question is enabled. To get around this overhead, a static_branch can be used to only have the tracepoint wrapper get called when the tracepoint is enabled. Add a tracepoint_enabled(tp) macro that gets passed the name of the tracepoint, and this becomes a static_branch that is enabled when the tracepoint is enabled and is a nop when the tracepoint is disabled. Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-09-25SUNRPC/NFSD: Implement xdr_reserve_space_vec()Anna Schumaker
Reserving space for a large READ payload requires special handling when reserving space in the xdr buffer pages. One problem we can have is use of the scratch buffer, which is used to get a pointer to a contiguous region of data up to PAGE_SIZE. When using the scratch buffer, calls to xdr_commit_encode() shift the data to it's proper alignment in the xdr buffer. If we've reserved several pages in a vector, then this could potentially invalidate earlier pointers and result in incorrect READ data being sent to the client. I get around this by looking at the amount of space left in the current page, and never reserve more than that for each entry in the read vector. This lets us place data directly where it needs to go in the buffer pages. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2020-09-25efi: Add definition of EFI_MEMORY_CPU_CRYPTO and ability to report itArd Biesheuvel
Incorporate the definition of EFI_MEMORY_CPU_CRYPTO from the UEFI specification v2.8, and wire it into our memory map dumping routine as well. To make a bit of space in the output buffer, which is provided by the various callers, shorten the descriptive names of the memory types. Reviewed-by: Laszlo Ersek <lersek@redhat.com> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2020-09-25bpf: Enable bpf_skc_to_* sock casting helper to networking prog typeMartin KaFai Lau
There is a constant need to add more fields into the bpf_tcp_sock for the bpf programs running at tc, sock_ops...etc. A current workaround could be to use bpf_probe_read_kernel(). However, other than making another helper call for reading each field and missing CO-RE, it is also not as intuitive to use as directly reading "tp->lsndtime" for example. While already having perfmon cap to do bpf_probe_read_kernel(), it will be much easier if the bpf prog can directly read from the tcp_sock. This patch tries to do that by using the existing casting-helpers bpf_skc_to_*() whose func_proto returns a btf_id. For example, the func_proto of bpf_skc_to_tcp_sock returns the btf_id of the kernel "struct tcp_sock". These helpers are also added to is_ptr_cast_function(). It ensures the returning reg (BPF_REF_0) will also carries the ref_obj_id. That will keep the ref-tracking works properly. The bpf_skc_to_* helpers are made available to most of the bpf prog types in filter.c. The bpf_skc_to_* helpers will be limited by perfmon cap. This patch adds a ARG_PTR_TO_BTF_ID_SOCK_COMMON. The helper accepting this arg can accept a btf-id-ptr (PTR_TO_BTF_ID + &btf_sock_ids[BTF_SOCK_TYPE_SOCK_COMMON]) or a legacy-ctx-convert-skc-ptr (PTR_TO_SOCK_COMMON). The bpf_skc_to_*() helpers are changed to take ARG_PTR_TO_BTF_ID_SOCK_COMMON such that they will accept pointer obtained from skb->sk. Instead of specifying both arg_type and arg_btf_id in the same func_proto which is how the current ARG_PTR_TO_BTF_ID does, the arg_btf_id of the new ARG_PTR_TO_BTF_ID_SOCK_COMMON is specified in the compatible_reg_types[] in verifier.c. The reason is the arg_btf_id is always the same. Discussion in this thread: https://lore.kernel.org/bpf/20200922070422.1917351-1-kafai@fb.com/ The ARG_PTR_TO_BTF_ID_ part gives a clear expectation that the helper is expecting a PTR_TO_BTF_ID which could be NULL. This is the same behavior as the existing helper taking ARG_PTR_TO_BTF_ID. The _SOCK_COMMON part means the helper is also expecting the legacy SOCK_COMMON pointer. By excluding the _OR_NULL part, the bpf prog cannot call helper with a literal NULL which doesn't make sense in most cases. e.g. bpf_skc_to_tcp_sock(NULL) will be rejected. All PTR_TO_*_OR_NULL reg has to do a NULL check first before passing into the helper or else the bpf prog will be rejected. This behavior is nothing new and consistent with the current expectation during bpf-prog-load. [ ARG_PTR_TO_BTF_ID_SOCK_COMMON will be used to replace ARG_PTR_TO_SOCK* of other existing helpers later such that those existing helpers can take the PTR_TO_BTF_ID returned by the bpf_skc_to_*() helpers. The only special case is bpf_sk_lookup_assign() which can accept a literal NULL ptr. It has to be handled specially in another follow up patch if there is a need (e.g. by renaming ARG_PTR_TO_SOCKET_OR_NULL to ARG_PTR_TO_BTF_ID_SOCK_COMMON_OR_NULL). ] [ When converting the older helpers that take ARG_PTR_TO_SOCK* in the later patch, if the kernel does not support BTF, ARG_PTR_TO_BTF_ID_SOCK_COMMON will behave like ARG_PTR_TO_SOCK_COMMON because no reg->type could have PTR_TO_BTF_ID in this case. It is not a concern for the newer-btf-only helper like the bpf_skc_to_*() here though because these helpers must require BTF vmlinux to begin with. ] Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20200925000350.3855720-1-kafai@fb.com
2020-09-25compat.h: fix a spelling error in <linux/compat.h>Christoph Hellwig
There is no compat_sys_readv64v2 syscall, only a compat_sys_preadv64v2 one. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-09-25genirq: Add stub for set_handle_irq() when !GENERIC_IRQ_MULTI_HANDLERZhen Lei
In order to avoid compilation errors when a driver references set_handle_irq(), but that the architecture doesn't select GENERIC_IRQ_MULTI_HANDLER, add a stub function that will just WARN_ON_ONCE() if ever used. Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com> [maz: commit message] Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20200924071754.4509-2-thunder.leizhen@huawei.com
2020-09-25soc: mediatek: cmdq: add clear option in cmdq_pkt_wfe apiDennis YC Hsieh
Add clear parameter to let client decide if event should be clear to 0 after GCE receive it. Signed-off-by: Dennis YC Hsieh <dennis-yc.hsieh@mediatek.com> Acked-by: Chun-Kuang Hu <chunkuang.hu@kernel.org> Link: https://lore.kernel.org/r/1594136714-11650-9-git-send-email-dennis-yc.hsieh@mediatek.com [mb: fix commit message] Signed-off-by: Matthias Brugger <matthias.bgg@gmail.com>
2020-09-25soc: mediatek: cmdq: add jump functionDennis YC Hsieh
Add jump function so that client can jump to any address which contains instruction. Signed-off-by: Dennis YC Hsieh <dennis-yc.hsieh@mediatek.com> Link: https://lore.kernel.org/r/1594136714-11650-8-git-send-email-dennis-yc.hsieh@mediatek.com Signed-off-by: Matthias Brugger <matthias.bgg@gmail.com>
2020-09-25soc: mediatek: cmdq: add write_s_mask value functionDennis YC Hsieh
add write_s_mask_value function in cmdq helper functions which writes a constant value to address with mask and large dma access support. Signed-off-by: Dennis YC Hsieh <dennis-yc.hsieh@mediatek.com> Link: https://lore.kernel.org/r/1594136714-11650-7-git-send-email-dennis-yc.hsieh@mediatek.com Signed-off-by: Matthias Brugger <matthias.bgg@gmail.com>
2020-09-25soc: mediatek: cmdq: add write_s value functionDennis YC Hsieh
add write_s function in cmdq helper functions which writes a constant value to address with large dma access support. Signed-off-by: Dennis YC Hsieh <dennis-yc.hsieh@mediatek.com> Link: https://lore.kernel.org/r/1594136714-11650-6-git-send-email-dennis-yc.hsieh@mediatek.com Signed-off-by: Matthias Brugger <matthias.bgg@gmail.com>
2020-09-25soc: mediatek: cmdq: add read_s functionDennis YC Hsieh
Add read_s function in cmdq helper functions which support read value from register or dma physical address into gce internal register. Signed-off-by: Dennis YC Hsieh <dennis-yc.hsieh@mediatek.com> Link: https://lore.kernel.org/r/1594136714-11650-5-git-send-email-dennis-yc.hsieh@mediatek.com Signed-off-by: Matthias Brugger <matthias.bgg@gmail.com>
2020-09-25soc: mediatek: cmdq: add write_s_mask functionDennis YC Hsieh
add write_s_mask function in cmdq helper functions which writes value contains in internal register to address with mask and large dma access support. Signed-off-by: Dennis YC Hsieh <dennis-yc.hsieh@mediatek.com> Link: https://lore.kernel.org/r/1594136714-11650-4-git-send-email-dennis-yc.hsieh@mediatek.com Signed-off-by: Matthias Brugger <matthias.bgg@gmail.com>
2020-09-25soc: mediatek: cmdq: add write_s functionDennis YC Hsieh
add write_s function in cmdq helper functions which writes value contains in internal register to address with large dma access support. Signed-off-by: Dennis YC Hsieh <dennis-yc.hsieh@mediatek.com> Link: https://lore.kernel.org/r/1594136714-11650-3-git-send-email-dennis-yc.hsieh@mediatek.com Signed-off-by: Matthias Brugger <matthias.bgg@gmail.com>
2020-09-25USB: correct API of usb_control_msg_send/recvOliver Neukum
They need to specify how memory is to be allocated, as control messages need to work in contexts that require GFP_NOIO. Signed-off-by: Oliver Neukum <oneukum@suse.com> Link: https://lore.kernel.org/r/20200923134348.23862-9-oneukum@suse.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-09-25iopoll: update kerneldoc of read_poll_timeout_atomic()Chunfeng Yun
Arguments description of read_poll_timeout_atomic() is out of date, update it. Cc: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Chunfeng Yun <chunfeng.yun@mediatek.com> Link: https://lore.kernel.org/r/1600668815-12135-11-git-send-email-chunfeng.yun@mediatek.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-09-25dm: add support for REQ_NOWAIT and enable it for linear targetKonstantin Khlebnikov
Add DM target feature flag DM_TARGET_NOWAIT which advertises that target works with REQ_NOWAIT bios. Add dm_table_supports_nowait() and update dm_table_set_restrictions() to set/clear QUEUE_FLAG_NOWAIT accordingly. Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-09-25block: add QUEUE_FLAG_NOWAITMike Snitzer
Add QUEUE_FLAG_NOWAIT to allow a block device to advertise support for REQ_NOWAIT. Bio-based devices may set QUEUE_FLAG_NOWAIT where applicable. Update QUEUE_FLAG_MQ_DEFAULT to include QUEUE_FLAG_NOWAIT. Also update submit_bio_checks() to verify it is set for REQ_NOWAIT bios. Reported-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Suggested-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-09-25block: add a bdev_is_partition helperChristoph Hellwig
Add a littler helper to make the somewhat arcane bd_contains checks a little more obvious. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-09-25block: remove unused BLK_QC_T_EAGAIN flagJeffle Xu
commit 7b6620d7db56 ("block: remove REQ_NOWAIT_INLINE") removed the REQ_NOWAIT_INLINE related code, but the diff wasn't applied to blk_types.h somehow. Then commit 2771cefeac49 ("block: remove the REQ_NOWAIT_INLINE flag") removed the REQ_NOWAIT_INLINE flag while the BLK_QC_T_EAGAIN flag still remains. Fixes: 7b6620d7db56 ("block: remove REQ_NOWAIT_INLINE") Signed-off-by: Jeffle Xu <jefflexu@linux.alibaba.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-09-25rseq/membarrier: Add MEMBARRIER_CMD_PRIVATE_EXPEDITED_RSEQPeter Oskolkov
This patchset is based on Google-internal RSEQ work done by Paul Turner and Andrew Hunter. When working with per-CPU RSEQ-based memory allocations, it is sometimes important to make sure that a global memory location is no longer accessed from RSEQ critical sections. For example, there can be two per-CPU lists, one is "active" and accessed per-CPU, while another one is inactive and worked on asynchronously "off CPU" (e.g. garbage collection is performed). Then at some point the two lists are swapped, and a fast RCU-like mechanism is required to make sure that the previously active list is no longer accessed. This patch introduces such a mechanism: in short, membarrier() syscall issues an IPI to a CPU, restarting a potentially active RSEQ critical section on the CPU. Signed-off-by: Peter Oskolkov <posk@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Link: https://lkml.kernel.org/r/20200923233618.2572849-1-posk@google.com
2020-09-25ACPI: Remove three unused inline functionsYueHaibing
There is no callers in tree. Signed-off-by: YueHaibing <yuehaibing@huawei.com> [ rjw: Subject edit ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-09-25Fonts: Support FONT_EXTRA_WORDS macros for built-in fontsPeilin Ye
syzbot has reported an issue in the framebuffer layer, where a malicious user may overflow our built-in font data buffers. In order to perform a reliable range check, subsystems need to know `FONTDATAMAX` for each built-in font. Unfortunately, our font descriptor, `struct console_font` does not contain `FONTDATAMAX`, and is part of the UAPI, making it infeasible to modify it. For user-provided fonts, the framebuffer layer resolves this issue by reserving four extra words at the beginning of data buffers. Later, whenever a function needs to access them, it simply uses the following macros: Recently we have gathered all the above macros to <linux/font.h>. Let us do the same thing for built-in fonts, prepend four extra words (including `FONTDATAMAX`) to their data buffers, so that subsystems can use these macros for all fonts, no matter built-in or user-provided. This patch depends on patch "fbdev, newport_con: Move FONT_EXTRA_WORDS macros into linux/font.h". Cc: stable@vger.kernel.org Link: https://syzkaller.appspot.com/bug?id=08b8be45afea11888776f897895aef9ad1c3ecfd Signed-off-by: Peilin Ye <yepeilin.cs@gmail.com> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/ef18af00c35fb3cc826048a5f70924ed6ddce95b.1600953813.git.yepeilin.cs@gmail.com
2020-09-25fbdev, newport_con: Move FONT_EXTRA_WORDS macros into linux/font.hPeilin Ye
drivers/video/console/newport_con.c is borrowing FONT_EXTRA_WORDS macros from drivers/video/fbdev/core/fbcon.h. To keep things simple, move all definitions into <linux/font.h>. Since newport_con now uses four extra words, initialize the fourth word in newport_set_font() properly. Cc: stable@vger.kernel.org Signed-off-by: Peilin Ye <yepeilin.cs@gmail.com> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/7fb8bc9b0abc676ada6b7ac0e0bd443499357267.1600953813.git.yepeilin.cs@gmail.com
2020-09-25X.509: support OSCCA certificate parseTianjia Zhang
The digital certificate format based on SM2 crypto algorithm as specified in GM/T 0015-2012. It was published by State Encryption Management Bureau, China. This patch adds the OID object identifier defined by OSCCA. The x509 certificate supports SM2-with-SM3 type certificate parsing. It uses the standard elliptic curve public key, and the sm2 algorithm signs the hash generated by sm3. Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com> Tested-by: Xufeng Zhang <yunbo.xufeng@linux.alibaba.com> Reviewed-by: Vitaly Chikunov <vt@altlinux.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2020-09-25lib/mpi: Introduce ec implementation to MPI libraryTianjia Zhang
The implementation of EC is introduced from libgcrypt as the basic algorithm of elliptic curve, which can be more perfectly integrated with MPI implementation. Some other algorithms will be developed based on mpi ecc, such as SM2. Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com> Tested-by: Xufeng Zhang <yunbo.xufeng@linux.alibaba.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2020-09-25lib/mpi: Extend the MPI libraryTianjia Zhang
Expand the mpi library based on libgcrypt, and the ECC algorithm of mpi based on libgcrypt requires these functions. Some other algorithms will be developed based on mpi ecc, such as SM2. Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com> Tested-by: Xufeng Zhang <yunbo.xufeng@linux.alibaba.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>