git.armlinux.org.uk/linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2020-09-27	mm/fork: Pass new vma pointer into copy_page_range()	Peter Xu
	This prepares for the future work to trigger early cow on pinned pages during fork(). No functional change intended. Signed-off-by: Peter Xu <peterx@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-09-27	mm: Introduce mm_struct.has_pinned	Peter Xu
	(Commit message majorly collected from Jason Gunthorpe) Reduce the chance of false positive from page_maybe_dma_pinned() by keeping track if the mm_struct has ever been used with pin_user_pages(). This allows cases that might drive up the page ref_count to avoid any penalty from handling dma_pinned pages. Future work is planned, to provide a more sophisticated solution, likely to turn it into a real counter. For now, make it atomic_t but use it as a boolean for simplicity. Suggested-by: Jason Gunthorpe <jgg@ziepe.ca> Signed-off-by: Peter Xu <peterx@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-09-27	Merge tag 'timers-v5.10' of ↵	Thomas Gleixner
	https://git.linaro.org/people/daniel.lezcano/linux into timers/core Pull clocksource/event updates from Daniel Lezcano: - Add DT binding documentation to support the r8a7742 and r8a774e1 platforms (Lad Prabhakar) - Add sp804 variant support for the Hisilicon platforms (Kefeng Wang)
2020-09-26	fs: remove KSTAT_QUERY_FLAGS	Christoph Hellwig
	KSTAT_QUERY_FLAGS expands to AT_STATX_SYNC_TYPE, which itself already is a mask. Remove the double name, especially given that the prefix is a little confusing vs the normal AT_* flags. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-09-26	fs: move vfs_fstatat out of line	Christoph Hellwig
	This allows to keep vfs_statx static in fs/stat.c to prepare for the following changes. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-09-26	fs: implement vfs_stat and vfs_lstat in terms of vfs_fstatat	Christoph Hellwig
	Go through vfs_fstatat instead of duplicating the *stat to statx mapping three times. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-09-26	fs: remove vfs_statx_fd	Christoph Hellwig
	vfs_statx_fd is only used to implement vfs_fstat. Remove vfs_statx_fd and just implement vfs_fstat directly. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-09-26	net: dsa: point out the tail taggers	Vladimir Oltean
	The Marvell 88E6060 uses tag_trailer.c and the KSZ8795, KSZ9477 and KSZ9893 switches also use tail tags. Tell that to the DSA core, since this makes a difference for the flow dissector. Most switches break the parsing of frame headers, but these ones don't, so no flow dissector adjustment needs to be done for them. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-26	net: dsa: add a generic procedure for the flow dissector	Vladimir Oltean
	For all DSA formats that don't use tail tags, it looks like behind the obscure number crunching they're all doing the same thing: locating the real EtherType behind the DSA tag. Nonetheless, this is not immediately obvious, so create a generic helper for those DSA taggers that put the header before the EtherType. Another assumption for the generic function is that the DSA tags are of equal length on RX and on TX. Prior to the previous patch, this was not true for ocelot and for gswip. The problem was resolved for ocelot, but for gswip it still remains, so that can't use this helper yet. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-26	net: dsa: make the .flow_dissect tagger callback return void	Vladimir Oltean
	There is no tagger that returns anything other than zero, so just change the return type appropriately. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-26	net: dsa: tag_ocelot: use a short prefix on both ingress and egress	Vladimir Oltean
	There are 2 goals that we follow: - Reduce the header size - Make the header size equal between RX and TX The issue that required long prefix on RX was the fact that the ocelot DSA tag, being put before Ethernet as it is, would overlap with the area that a DSA master uses for RX filtering (destination MAC address mainly). Now that we can ask DSA to put the master in promiscuous mode, in theory we could remove the prefix altogether and call it a day, but it looks like we can't. Using no prefix on ingress, some packets (such as ICMP) would be received, while others (such as PTP) would not be received. This is because the DSA master we use (enetc) triggers parse errors ("MAC rx frame errors") presumably because it sees Ethernet frames with a bad length. And indeed, when using no prefix, the EtherType (bytes 12-13 of the frame, bits 96-111) falls over the REW_VAL field from the extraction header, aka the PTP timestamp. When turning the short (32-bit) prefix on, the EtherType overlaps with bits 64-79 of the extraction header, which are a reserved area transmitted as zero by the switch. The packets are not dropped by the DSA master with a short prefix. Actually, the frames look like this in tcpdump (below is a PTP frame, with an extra dsa_8021q tag - dadb 0482 - added by a downstream sja1105). 89:0c:a9:f2:01:00 > 88:80:00:0a:00:1d, 802.3, length 0: LLC, \ dsap Unknown (0x10) Individual, ssap ProWay NM (0x0e) Response, \ ctrl 0x0004: Information, send seq 2, rcv seq 0, \ Flags [Response], length 78 0x0000: 8880 000a 001d 890c a9f2 0100 0000 100f ................ 0x0010: 0400 0000 0180 c200 000e 001f 7b63 0248 ............{c.H 0x0020: dadb 0482 88f7 1202 0036 0000 0000 0000 .........6...... 0x0030: 0000 0000 0000 0000 0000 001f 7bff fe63 ............{..c 0x0040: 0248 0001 1f81 0500 0000 0000 0000 0000 .H.............. 0x0050: 0000 0000 0000 0000 0000 0000 ............ So the short prefix is our new default: we've shortened our RX frames by 12 octets, increased TX by 4, and headers are now equal between RX and TX. Note that we still need promiscuous mode for the DSA master to not drop it. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-26	net: dsa: allow drivers to request promiscuous mode on master	Vladimir Oltean
	Currently DSA assumes that taggers don't mess with the destination MAC address of the frames on RX. That is not always the case. Some DSA headers are placed before the Ethernet header (ocelot), and others simply mangle random bytes from the destination MAC address (sja1105 with its incl_srcpt option). Currently the DSA master goes to promiscuous mode automatically when the slave devices go too (such as when enslaved to a bridge), but in standalone mode this is a problem that needs to be dealt with. So give drivers the possibility to signal that their tagging protocol will get randomly dropped otherwise, and let DSA deal with fixing that. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-26	net: mscc: ocelot: move NPI port configuration to DSA	Vladimir Oltean
	Remove the ocelot_configure_cpu() function, which was in fact bringing up 2 ports: the CPU port module, which both switchdev and DSA have, and the NPI port, which only DSA has. The (non-Ethernet) CPU port module is at a fixed index in the analyzer, whereas the NPI port is selected through the "ethernet" property in the device tree. Therefore, the function to set up an NPI port is DSA-specific, so we move it there, simplifying the ocelot switch library a little bit. Cc: Horatiu Vultur <horatiu.vultur@microchip.com> Cc: Alexandre Belloni <alexandre.belloni@bootlin.com> Cc: UNGLinuxDriver <UNGLinuxDriver@microchip.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-26	Merge tag 'drivers_soc_for_5.10' of ↵	Olof Johansson
	git://git.kernel.org/pub/scm/linux/kernel/git/ssantosh/linux-keystone into arm/drivers ARM: soc: TI driver updates for v5.10 Consist of: - Add Ring accelerator support for AM65x - Add TI PRUSS platform driver and enable it on available platforms - Extend PRUSS driver for CORECLK_MUX/IEPCLK_MUX support - UDMA rx ring pair fix - Add socinfo entry for J7200 * tag 'drivers_soc_for_5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/ssantosh/linux-keystone: Add missing '#' to fix schema errors: soc: ti: Convert to DEFINE_SHOW_ATTRIBUTE dmaengine: ti: k3-udma-glue: Fix parameters for rx ring pair request soc: ti: k3-socinfo: Add entry for J7200 soc: ti: pruss: support CORECLK_MUX and IEPCLK_MUX dt-bindings: soc: ti: Update TI PRUSS bindings regarding clock-muxes firmware: ti_sci: allow frequency change for disabled clocks by default soc: ti: ti_sci_pm_domains: switch to use multiple genpds instead of one soc: ti: pruss: Enable support for ICSSG subsystems on K3 J721E SoCs soc: ti: pruss: Enable support for ICSSG subsystems on K3 AM65x SoCs soc: ti: pruss: Add support for PRU-ICSS subsystems on 66AK2G SoC soc: ti: pruss: Add support for PRU-ICSS subsystems on AM57xx SoCs soc: ti: pruss: Add support for PRU-ICSSs on AM437x SoCs soc: ti: pruss: Add a platform driver for PRUSS in TI SoCs dt-bindings: soc: ti: Add TI PRUSS bindings bindings: soc: ti: soc: ringacc: remove ti,dma-ring-reset-quirk soc: ti: k3: ringacc: add am65x sr2.0 support Link: https://lore.kernel.org/r/1600656828-29267-1-git-send-email-santosh.shilimkar@oracle.com Signed-off-by: Olof Johansson <olof@lixom.net>
2020-09-26	Merge tag 'tegra-for-5.10-soc' of ↵	Olof Johansson
	git://git.kernel.org/pub/scm/linux/kernel/git/tegra/linux into arm/drivers soc/tegra: Changes for v5.10-rc1 These changes contain a bit of cleanup and chip support for the upcoming Tegra234 SoC. * tag 'tegra-for-5.10-soc' of git://git.kernel.org/pub/scm/linux/kernel/git/tegra/linux: soc/tegra: pmc: Add Tegra234 support soc/tegra: pmc: Reorder reset sources/levels definitions soc/tegra: misc: Add Tegra234 support soc/tegra: fuse: Add Tegra234 support soc/tegra: fuse: Implement tegra_is_silicon() soc/tegra: fuse: Extract tegra_get_platform() Link: https://lore.kernel.org/r/20200918150303.3938852-2-thierry.reding@gmail.com Signed-off-by: Olof Johansson <olof@lixom.net>
2020-09-26	Merge tag 'renesas-drivers-for-v5.10-tag2' of ↵	Olof Johansson
	git://git.kernel.org/pub/scm/linux/kernel/git/geert/renesas-devel into arm/drivers Renesas driver updates for v5.10 (take two) - Add core support for the R-Car V3U (R8A779A0) SoC, including System Controller (SYSC) and Reset (RST) support, - Various Kconfig cleanups. * tag 'renesas-drivers-for-v5.10-tag2' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/renesas-devel: soc: renesas: r8a779a0-sysc: Add r8a779a0 support soc: renesas: rcar-rst: Add support for R-Car V3U soc: renesas: Identify R-Car V3U soc: renesas: Sort driver description title soc: renesas: Use ARM32/ARM64 for menu description dt-bindings: clock: Add r8a779a0 CPG Core Clock Definitions dt-bindings: power: Add r8a779a0 SYSC power domain definitions Link: https://lore.kernel.org/r/20200918124800.15555-4-geert+renesas@glider.be Signed-off-by: Olof Johansson <olof@lixom.net>
2020-09-26	Merge tag 'scsi-fixes' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "Three fixes: one in drivers (lpfc) and two for zoned block devices. The latter also impinges on the block layer but only to introduce a new block API for setting the zone model rather than fiddling with the queue directly in the zoned block driver" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: sd: sd_zbc: Fix ZBC disk initialization scsi: sd: sd_zbc: Fix handling of host-aware ZBC disks scsi: lpfc: Fix initial FLOGI failure due to BBSCN not supported
2020-09-26	Merge tag 'block-5.9-2020-09-25' of git://git.kernel.dk/linux-block	Linus Torvalds
	Pull block fixes from Jens Axboe: "NVMe pull request from Christoph, and removal of a dead define. - fix error during controller probe that cause double free irqs (Keith Busch) - FC connection establishment fix (James Smart) - properly handle completions for invalid tags (Xianting Tian) - pass the correct nsid to the command effects and supported log (Chaitanya Kulkarni)" * tag 'block-5.9-2020-09-25' of git://git.kernel.dk/linux-block: block: remove unused BLK_QC_T_EAGAIN flag nvme-core: don't use NVME_NSID_ALL for command effects and supported log nvme-fc: fail new connections to a deleted host or remote port nvme-pci: fix NULL req in completion handler nvme: return errors for hwmon init
2020-09-26	mm: don't rely on system state to detect hot-plug operations	Laurent Dufour
	In register_mem_sect_under_node() the system_state's value is checked to detect whether the call is made during boot time or during an hot-plug operation. Unfortunately, that check against SYSTEM_BOOTING is wrong because regular memory is registered at SYSTEM_SCHEDULING state. In addition, memory hot-plug operation can be triggered at this system state by the ACPI [1]. So checking against the system state is not enough. The consequence is that on system with interleaved node's ranges like this: Early memory node ranges node 1: [mem 0x0000000000000000-0x000000011fffffff] node 2: [mem 0x0000000120000000-0x000000014fffffff] node 1: [mem 0x0000000150000000-0x00000001ffffffff] node 0: [mem 0x0000000200000000-0x000000048fffffff] node 2: [mem 0x0000000490000000-0x00000007ffffffff] This can be seen on PowerPC LPAR after multiple memory hot-plug and hot-unplug operations are done. At the next reboot the node's memory ranges can be interleaved and since the call to link_mem_sections() is made in topology_init() while the system is in the SYSTEM_SCHEDULING state, the node's id is not checked, and the sections registered to multiple nodes: $ ls -l /sys/devices/system/memory/memory21/node* total 0 lrwxrwxrwx 1 root root 0 Aug 24 05:27 node1 -> ../../node/node1 lrwxrwxrwx 1 root root 0 Aug 24 05:27 node2 -> ../../node/node2 In that case, the system is able to boot but if later one of theses memory blocks is hot-unplugged and then hot-plugged, the sysfs inconsistency is detected and this is triggering a BUG_ON(): kernel BUG at /Users/laurent/src/linux-ppc/mm/memory_hotplug.c:1084! Oops: Exception in kernel mode, sig: 5 [#1] LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries Modules linked in: rpadlpar_io rpaphp pseries_rng rng_core vmx_crypto gf128mul binfmt_misc ip_tables x_tables xfs libcrc32c crc32c_vpmsum autofs4 CPU: 8 PID: 10256 Comm: drmgr Not tainted 5.9.0-rc1+ #25 Call Trace: add_memory_resource+0x23c/0x340 (unreliable) __add_memory+0x5c/0xf0 dlpar_add_lmb+0x1b4/0x500 dlpar_memory+0x1f8/0xb80 handle_dlpar_errorlog+0xc0/0x190 dlpar_store+0x198/0x4a0 kobj_attr_store+0x30/0x50 sysfs_kf_write+0x64/0x90 kernfs_fop_write+0x1b0/0x290 vfs_write+0xe8/0x290 ksys_write+0xdc/0x130 system_call_exception+0x160/0x270 system_call_common+0xf0/0x27c This patch addresses the root cause by not relying on the system_state value to detect whether the call is due to a hot-plug operation. An extra parameter is added to link_mem_sections() detailing whether the operation is due to a hot-plug operation. [1] According to Oscar Salvador, using this qemu command line, ACPI memory hotplug operations are raised at SYSTEM_SCHEDULING state: $QEMU -enable-kvm -machine pc -smp 4,sockets=4,cores=1,threads=1 -cpu host -monitor pty \ -m size=$MEM,slots=255,maxmem=4294967296k \ -numa node,nodeid=0,cpus=0-3,mem=512 -numa node,nodeid=1,mem=512 \ -object memory-backend-ram,id=memdimm0,size=134217728 -device pc-dimm,node=0,memdev=memdimm0,id=dimm0,slot=0 \ -object memory-backend-ram,id=memdimm1,size=134217728 -device pc-dimm,node=0,memdev=memdimm1,id=dimm1,slot=1 \ -object memory-backend-ram,id=memdimm2,size=134217728 -device pc-dimm,node=0,memdev=memdimm2,id=dimm2,slot=2 \ -object memory-backend-ram,id=memdimm3,size=134217728 -device pc-dimm,node=0,memdev=memdimm3,id=dimm3,slot=3 \ -object memory-backend-ram,id=memdimm4,size=134217728 -device pc-dimm,node=1,memdev=memdimm4,id=dimm4,slot=4 \ -object memory-backend-ram,id=memdimm5,size=134217728 -device pc-dimm,node=1,memdev=memdimm5,id=dimm5,slot=5 \ -object memory-backend-ram,id=memdimm6,size=134217728 -device pc-dimm,node=1,memdev=memdimm6,id=dimm6,slot=6 \ Fixes: 4fbce633910e ("mm/memory_hotplug.c: make register_mem_sect_under_node() a callback of walk_memory_range()") Signed-off-by: Laurent Dufour <ldufour@linux.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Oscar Salvador <osalvador@suse.de> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: "Rafael J. Wysocki" <rafael@kernel.org> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Nathan Lynch <nathanl@linux.ibm.com> Cc: Scott Cheloha <cheloha@linux.ibm.com> Cc: Tony Luck <tony.luck@intel.com> Cc: <stable@vger.kernel.org> Link: https://lkml.kernel.org/r/20200915094143.79181-3-ldufour@linux.ibm.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-09-26	mm: replace memmap_context by meminit_context	Laurent Dufour
	Patch series "mm: fix memory to node bad links in sysfs", v3. Sometimes, firmware may expose interleaved memory layout like this: Early memory node ranges node 1: [mem 0x0000000000000000-0x000000011fffffff] node 2: [mem 0x0000000120000000-0x000000014fffffff] node 1: [mem 0x0000000150000000-0x00000001ffffffff] node 0: [mem 0x0000000200000000-0x000000048fffffff] node 2: [mem 0x0000000490000000-0x00000007ffffffff] In that case, we can see memory blocks assigned to multiple nodes in sysfs: $ ls -l /sys/devices/system/memory/memory21 total 0 lrwxrwxrwx 1 root root 0 Aug 24 05:27 node1 -> ../../node/node1 lrwxrwxrwx 1 root root 0 Aug 24 05:27 node2 -> ../../node/node2 -rw-r--r-- 1 root root 65536 Aug 24 05:27 online -r--r--r-- 1 root root 65536 Aug 24 05:27 phys_device -r--r--r-- 1 root root 65536 Aug 24 05:27 phys_index drwxr-xr-x 2 root root 0 Aug 24 05:27 power -r--r--r-- 1 root root 65536 Aug 24 05:27 removable -rw-r--r-- 1 root root 65536 Aug 24 05:27 state lrwxrwxrwx 1 root root 0 Aug 24 05:25 subsystem -> ../../../../bus/memory -rw-r--r-- 1 root root 65536 Aug 24 05:25 uevent -r--r--r-- 1 root root 65536 Aug 24 05:27 valid_zones The same applies in the node's directory with a memory21 link in both the node1 and node2's directory. This is wrong but doesn't prevent the system to run. However when later, one of these memory blocks is hot-unplugged and then hot-plugged, the system is detecting an inconsistency in the sysfs layout and a BUG_ON() is raised: kernel BUG at /Users/laurent/src/linux-ppc/mm/memory_hotplug.c:1084! LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries Modules linked in: rpadlpar_io rpaphp pseries_rng rng_core vmx_crypto gf128mul binfmt_misc ip_tables x_tables xfs libcrc32c crc32c_vpmsum autofs4 CPU: 8 PID: 10256 Comm: drmgr Not tainted 5.9.0-rc1+ #25 Call Trace: add_memory_resource+0x23c/0x340 (unreliable) __add_memory+0x5c/0xf0 dlpar_add_lmb+0x1b4/0x500 dlpar_memory+0x1f8/0xb80 handle_dlpar_errorlog+0xc0/0x190 dlpar_store+0x198/0x4a0 kobj_attr_store+0x30/0x50 sysfs_kf_write+0x64/0x90 kernfs_fop_write+0x1b0/0x290 vfs_write+0xe8/0x290 ksys_write+0xdc/0x130 system_call_exception+0x160/0x270 system_call_common+0xf0/0x27c This has been seen on PowerPC LPAR. The root cause of this issue is that when node's memory is registered, the range used can overlap another node's range, thus the memory block is registered to multiple nodes in sysfs. There are two issues here: (a) The sysfs memory and node's layouts are broken due to these multiple links (b) The link errors in link_mem_sections() should not lead to a system panic. To address (a) register_mem_sect_under_node should not rely on the system state to detect whether the link operation is triggered by a hot plug operation or not. This is addressed by the patches 1 and 2 of this series. Issue (b) will be addressed separately. This patch (of 2): The memmap_context enum is used to detect whether a memory operation is due to a hot-add operation or happening at boot time. Make it general to the hotplug operation and rename it as meminit_context. There is no functional change introduced by this patch Suggested-by: David Hildenbrand <david@redhat.com> Signed-off-by: Laurent Dufour <ldufour@linux.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Oscar Salvador <osalvador@suse.de> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: "Rafael J . Wysocki" <rafael@kernel.org> Cc: Nathan Lynch <nathanl@linux.ibm.com> Cc: Scott Cheloha <cheloha@linux.ibm.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: <stable@vger.kernel.org> Link: https://lkml.kernel.org/r/20200915094143.79181-1-ldufour@linux.ibm.com Link: https://lkml.kernel.org/r/20200915132624.9723-1-ldufour@linux.ibm.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-09-26	mm/gup: fix gup_fast with dynamic page table folding	Vasily Gorbik
	Currently to make sure that every page table entry is read just once gup_fast walks perform READ_ONCE and pass pXd value down to the next gup_pXd_range function by value e.g.: static int gup_pud_range(p4d_t p4d, unsigned long addr, unsigned long end, unsigned int flags, struct page *pages, int nr) ... pudp = pud_offset(&p4d, addr); This function passes a reference on that local value copy to pXd_offset, and might get the very same pointer in return. This happens when the level is folded (on most arches), and that pointer should not be iterated. On s390 due to the fact that each task might have different 5,4 or 3-level address translation and hence different levels folded the logic is more complex and non-iteratable pointer to a local copy leads to severe problems. Here is an example of what happens with gup_fast on s390, for a task with 3-level paging, crossing a 2 GB pud boundary: // addr = 0x1007ffff000, end = 0x10080001000 static int gup_pud_range(p4d_t p4d, unsigned long addr, unsigned long end, unsigned int flags, struct page *pages, int nr) { unsigned long next; pud_t pudp; // pud_offset returns &p4d itself (a pointer to a value on stack) pudp = pud_offset(&p4d, addr); do { // on second iteratation reading "random" stack value pud_t pud = READ_ONCE(pudp); // next = 0x10080000000, due to PUD_SIZE/MASK != PGDIR_SIZE/MASK on s390 next = pud_addr_end(addr, end); ... } while (pudp++, addr = next, addr != end); // pudp++ iterating over stack return 1; } This happens since s390 moved to common gup code with commit d1874a0c2805 ("s390/mm: make the pxd_offset functions more robust") and commit 1a42010cdc26 ("s390/mm: convert to the generic get_user_pages_fast code"). s390 tried to mimic static level folding by changing pXd_offset primitives to always calculate top level page table offset in pgd_offset and just return the value passed when pXd_offset has to act as folded. What is crucial for gup_fast and what has been overlooked is that PxD_SIZE/MASK and thus pXd_addr_end should also change correspondingly. And the latter is not possible with dynamic folding. To fix the issue in addition to pXd values pass original pXdp pointers down to gup_pXd_range functions. And introduce pXd_offset_lockless helpers, which take an additional pXd entry value parameter. This has already been discussed in https://lkml.kernel.org/r/20190418100218.0a4afd51@mschwideX1 Fixes: 1a42010cdc26 ("s390/mm: convert to the generic get_user_pages_fast code") Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com> Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Mike Rapoport <rppt@linux.ibm.com> Reviewed-by: John Hubbard <jhubbard@nvidia.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Russell King <linux@armlinux.org.uk> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Jeff Dike <jdike@addtoit.com> Cc: Richard Weinberger <richard@nod.at> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Claudio Imbrenda <imbrenda@linux.ibm.com> Cc: <stable@vger.kernel.org> [5.2+] Link: https://lkml.kernel.org/r/patch.git-943f1e5dcff2.your-ad-here.call-01599856292-ext-8676@work.hours Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-09-26	Merge tag 'tegra-for-5.10-arm64-dt' of ↵	Olof Johansson
	git://git.kernel.org/pub/scm/linux/kernel/git/tegra/linux into arm/dt arm64: tegra: Changes for v5.10-rc1 This set of changes fixes some minor issues in existing device trees and adds ID EEPROMs on the Jetson Xavier NX. All ID EEPROMs are now labelled to allow them to be detected by software. It also adds support for the Tegra234 VDK board, which is a pre-silicon platform for the upcoming Orin SoC. * tag 'tegra-for-5.10-arm64-dt' of git://git.kernel.org/pub/scm/linux/kernel/git/tegra/linux: arm64: tegra: Initial Tegra234 VDK support arm64: tegra: Populate EEPROMs for Jetson Xavier NX arm64: tegra: Add label properties for EEPROMs arm64: tegra: Add DT binding for AHUB components arm64: tegra: Enable ACONNECT, ADMA and AGIC on Jetson Nano arm64: tegra: Properly size register regions for GPU on Tegra194 arm64: tegra: Use valid PWM period for VDD_GPU on Tegra210 arm64: tegra: Describe display controller outputs for Tegra210 arm64: tegra: Disable SD card write-protection on Jetson Nano arm64: tegra: Add VBUS supply for micro USB port on Jetson Nano arm64: tegra: Wire up pinctrl states for all DPAUX controllers arm64: tegra: Add ID EEPROMs on Jetson AGX Xavier Link: https://lore.kernel.org/r/20200918150303.3938852-5-thierry.reding@gmail.com Signed-off-by: Olof Johansson <olof@lixom.net>
2020-09-26	Merge tag 'tegra-for-5.10-dt-bindings' of ↵	Olof Johansson
	git://git.kernel.org/pub/scm/linux/kernel/git/tegra/linux into arm/dt dt-bindings: Changes for v5.10-rc1 This set of changes adds compatible strings for Tegra234 to existing device tree bindings. * tag 'tegra-for-5.10-dt-bindings' of git://git.kernel.org/pub/scm/linux/kernel/git/tegra/linux: dt-bindings: power: supply: Add device-tree binding for Summit SMB3xx dt-bindings: tegra: pmc: Add Tegra234 support dt-bindings: fuse: tegra: Add Tegra234 support dt-bindings: tegra: Add Tegra234 VDK compatible dt-bindings: misc: tegra186-misc: Add Tegra234 support dt-bindings: misc: tegra186-misc: Add missing compatible string dt-bindings: misc: tegra-apbmisc: Add missing compatible strings Link: https://lore.kernel.org/r/20200918150303.3938852-1-thierry.reding@gmail.com Signed-off-by: Olof Johansson <olof@lixom.net>
2020-09-26	Merge tag 'renesas-arm-dt-for-v5.10-tag2' of ↵	Olof Johansson
	git://git.kernel.org/pub/scm/linux/kernel/git/geert/renesas-devel into arm/dt Renesas ARM DT updates for v5.10 (take two) - PCIe endpoint support for the RZ/G2H SoC, - SATA support for the HopeRun HiHope RZ/G2H board, - Increase support (CAN, LED, SPI NOR, VIN, VSP) for the RZ/G1H SoC on the iWave Qseven board (G21D), and its camera add-on board, - Initial support for the R-Car V3U SoC on the Falcon CPU and BreakOut boards, - HDMI display and sound support for the R-Car M3-W+ SoC on the Salvator-XS board, - Digital Radio Interface (DRIF) support for the R-Car E3 SoC, - Minor fixes and cleanups. * tag 'renesas-arm-dt-for-v5.10-tag2' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/renesas-devel: (24 commits) arm64: dts: renesas: r8a774c0: Fix MSIOF1 DMA channels arm64: dts: renesas: r8a77990: Fix MSIOF1 DMA channels arm64: dts: renesas: r8a77990: Add DRIF support ARM: dts: r8a7742-iwg21d-q7-dbcm-ca: Add can0 support to camera DB ARM: dts: r8a7742: Add VSP support arm64: dts: renesas: Drop superfluous pin configuration containers arm64: dts: renesas: r8a77961: salvator-xs: Add HDMI Sound support arm64: dts: renesas: r8a77961: salvator-xs: Add HDMI Display support arm64: dts: renesas: r8a77961: Add HDMI device nodes arm64: dts: renesas: r8a77961: Add DU device nodes arm64: dts: renesas: r8a77961: Add VSP device nodes arm64: dts: renesas: r8a77961: Add FCP device nodes arm64: dts: renesas: Fix pin controller node names ARM: dts: renesas: Fix pin controller node names arm64: dts: renesas: Add Renesas Falcon boards support arm64: dts: renesas: Add Renesas R8A779A0 SoC support ARM: dts: r8a7742-iwg21d-q7: Enable SD2 LED indication ARM: dts: r8a7742-iwg21d-q7: Add can1 support to carrier board ARM: dts: r8a7742-iwg21d-q7: Add SPI NOR support ARM: dts: r8a7742: Add VIN DT nodes ... Link: https://lore.kernel.org/r/20200918124800.15555-2-geert+renesas@glider.be Signed-off-by: Olof Johansson <olof@lixom.net>
2020-09-26	media: v4l2: extend the CSC API to subdevice.	Dafna Hirschfeld
	This patch extends the CSC API in video devices to be supported also on sub-devices. The flag V4L2_MBUS_FRAMEFMT_SET_CSC set by the application when calling VIDIOC_SUBDEV_S_FMT ioctl. The flags: V4L2_SUBDEV_MBUS_CODE_CSC_COLORSPACE, V4L2_SUBDEV_MBUS_CODE_CSC_XFER_FUNC, V4L2_SUBDEV_MBUS_CODE_CSC_YCBCR_ENC/V4L2_SUBDEV_MBUS_CODE_CSC_HSV_ENC V4L2_SUBDEV_MBUS_CODE_CSC_QUANTIZATION are set by the driver in the VIDIOC_SUBDEV_ENUM_MBUS_CODE ioctl. New 'flags' fields were added to the structs v4l2_subdev_mbus_code_enum, v4l2_mbus_framefmt which are borrowed from the 'reserved' field The patch also replaces the 'ycbcr_enc' field in 'struct v4l2_mbus_framefmt' with a union that includes 'hsv_enc' Signed-off-by: Dafna Hirschfeld <dafna.hirschfeld@collabora.com> Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl> Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
2020-09-26	media: vivid: Add support to the CSC API	Dafna Hirschfeld
	The CSC API (Colorspace conversion) allows userspace to try to configure the colorspace, transfer function, Y'CbCr/HSV encoding and the quantization for capture devices. This patch adds support to the CSC API in vivid. Using the CSC API, userspace is allowed to do the following: - Set the colorspace. - Set the xfer_func. - Set the ycbcr_enc function for YUV formats. - Set the hsv_enc function for HSV formats - Set the quantization for YUV and RGB formats. Signed-off-by: Dafna Hirschfeld <dafna.hirschfeld@collabora.com> Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl> Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
2020-09-26	media: v4l2: add support for colorspace conversion API (CSC) for video capture	Dafna Hirschfeld
	For video capture it is the driver that reports the colorspace, transfer function, Y'CbCr/HSV encoding and quantization range used by the video, and there is no way to request something different, even though many HDTV receivers have some sort of colorspace conversion capabilities. For output video this feature already exists since the application specifies this information for the video format it will send out, and the transmitter will enable any available CSC if a format conversion has to be performed in order to match the capabilities of the sink. For video capture we propose adding new v4l2_pix_format flag: V4L2_PIX_FMT_FLAG_SET_CSC. The flag is set by the application, the driver will interpret the colorspace, xfer_func, ycbcr_enc/hsv_enc and quantization fields as the requested colorspace information and will attempt to do the conversion it supports. Drivers set the flags V4L2_FMT_FLAG_CSC_COLORSPACE, V4L2_FMT_FLAG_CSC_XFER_FUNC, V4L2_FMT_FLAG_CSC_YCBCR_ENC/V4L2_FMT_FLAG_CSC_HSV_ENC, V4L2_FMT_FLAG_CSC_QUANTIZATION, in the flags field of the struct v4l2_fmtdesc during enumeration to indicate that they support colorspace conversion for the respective field. Drivers do not have to actually look at the flags. If the flags are not set, then the fields 'colorspace', 'xfer_func', 'ycbcr_enc/hsv_enc', and 'quantization' are set to the default values by the core, i.e. just pass on the received format without conversion. Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com> Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de> Signed-off-by: Dafna Hirschfeld <dafna.hirschfeld@collabora.com> Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
2020-09-25	devlink: introduce flash update overwrite mask	Jacob Keller
	Sections of device flash may contain settings or device identifying information. When performing a flash update, it is generally expected that these settings and identifiers are not overwritten. However, it may sometimes be useful to allow overwriting these fields when performing a flash update. Some examples include, 1) customizing the initial device config on first programming, such as overwriting default device identifying information, or 2) reverting a device configuration to known good state provided in the new firmware image, or 3) in case it is suspected that current firmware logic for managing the preservation of fields during an update is broken. Although some devices are able to completely separate these types of settings and fields into separate components, this is not true for all hardware. To support controlling this behavior, a new DEVLINK_ATTR_FLASH_UPDATE_OVERWRITE_MASK is defined. This is an nla_bitfield32 which will define what subset of fields in a component should be overwritten during an update. If no bits are specified, or of the overwrite mask is not provided, then an update should not overwrite anything, and should maintain the settings and identifiers as they are in the previous image. If the overwrite mask has the DEVLINK_FLASH_OVERWRITE_SETTINGS bit set, then the device should be configured to overwrite any of the settings in the requested component with settings found in the provided image. Similarly, if the DEVLINK_FLASH_OVERWRITE_IDENTIFIERS bit is set, the device should be configured to overwrite any device identifiers in the requested component with the identifiers from the image. Multiple overwrite modes may be combined to indicate that a combination of the set of fields that should be overwritten. Drivers which support the new overwrite mask must set the DEVLINK_SUPPORT_FLASH_UPDATE_OVERWRITE_MASK in the supported_flash_update_params field of their devlink_ops. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-25	devlink: convert flash_update to use params structure	Jacob Keller
	The devlink core recently gained support for checking whether the driver supports a flash_update parameter, via `supported_flash_update_params`. However, parameters are specified as function arguments. Adding a new parameter still requires modifying the signature of the .flash_update callback in all drivers. Convert the .flash_update function to take a new `struct devlink_flash_update_params` instead. By using this structure, and the `supported_flash_update_params` bit field, a new parameter to flash_update can be added without requiring modification to existing drivers. As before, all parameters except file_name will require driver opt-in. Because file_name is a necessary field to for the flash_update to make sense, no "SUPPORTED" bitflag is provided and it is always considered valid. All future additional parameters will require a new bit in the supported_flash_update_params bitfield. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Cc: Jiri Pirko <jiri@mellanox.com> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Michael Chan <michael.chan@broadcom.com> Cc: Bin Luo <luobin9@huawei.com> Cc: Saeed Mahameed <saeedm@mellanox.com> Cc: Leon Romanovsky <leon@kernel.org> Cc: Ido Schimmel <idosch@mellanox.com> Cc: Danielle Ratson <danieller@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-25	devlink: check flash_update parameter support in net core	Jacob Keller
	When implementing .flash_update, drivers which do not support per-component update are manually checking the component parameter to verify that it is NULL. Without this check, the driver might accept an update request with a component specified even though it will not honor such a request. Instead of having each driver check this, move the logic into net/core/devlink.c, and use a new `supported_flash_update_params` field in the devlink_ops. Drivers which will support per-component update must now specify this by setting DEVLINK_SUPPORT_FLASH_UPDATE_COMPONENT in the supported_flash_update_params in their devlink_ops. This helps ensure that drivers do not forget to check for a NULL component if they do not support per-component update. This also enables a slightly better error message by enabling the core stack to set the netlink bad attribute message to indicate precisely the unsupported attribute in the message. Going forward, any new additional parameter to flash update will require a bit in the supported_flash_update_params bitfield. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Cc: Jiri Pirko <jiri@mellanox.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Michael Chan <michael.chan@broadcom.com> Cc: Bin Luo <luobin9@huawei.com> Cc: Saeed Mahameed <saeedm@mellanox.com> Cc: Leon Romanovsky <leon@kernel.org> Cc: Ido Schimmel <idosch@mellanox.com> Cc: Danielle Ratson <danieller@mellanox.com> Cc: Shannon Nelson <snelson@pensando.io> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-25	bpf: Add comment to document BTF type PTR_TO_BTF_ID_OR_NULL	John Fastabend
	The meaning of PTR_TO_BTF_ID_OR_NULL differs slightly from other types denoted with the *_OR_NULL type. For example the types PTR_TO_SOCKET and PTR_TO_SOCKET_OR_NULL can be used for branch analysis because the type PTR_TO_SOCKET is guaranteed to _not_ have a null value. In contrast PTR_TO_BTF_ID and BTF_TO_BTF_ID_OR_NULL have slightly different meanings. A PTR_TO_BTF_TO_ID may be a pointer to NULL value, but it is safe to read this pointer in the program context because the program context will handle any faults. The fallout is for PTR_TO_BTF_ID the verifier can assume reads are safe, but can not use the type in branch analysis. Additionally, authors need to be extra careful when passing PTR_TO_BTF_ID into helpers. In general helpers consuming type PTR_TO_BTF_ID will need to assume it may be null. Seeing the above is not obvious to readers without the back knowledge lets add a comment in the type definition. Editorial comment, as networking and tracing programs get closer and more tightly merged we may need to consider a new type that we can ensure is non-null for branch analysis and also passing into helpers. Signed-off-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Lorenz Bauer <lmb@cloudflare.com>
2020-09-25	net: stmmac: Add option for VLAN filter fail queue enable	Chuah, Kim Tatt
	Add option in plat_stmmacenet_data struct to enable VLAN Filter Fail Queuing. This option allows packets that fail VLAN filter to be routed to a specific Rx queue when Receive All is also set. When this option is enabled: - Enable VFFQ only when entering promiscuous mode, because Receive All will pass up all rx packets that failed address filtering (similar to promiscuous mode). - VLAN-promiscuous mode is never entered to allow rx packet to fail VLAN filters and get routed to selected VFFQ Rx queue. Reviewed-by: Voon Weifeng <weifeng.voon@intel.com> Reviewed-by: Ong Boon Leong <boon.leong.ong@intel.com> Signed-off-by: Chuah, Kim Tatt <kim.tatt.chuah@intel.com> Signed-off-by: Ong Boon Leong <boon.leong.ong@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-25	mm/page_ref: Convert the open coded tracepoint enabled to the new helper	Steven Rostedt (VMware)
	As more use cases of checking if a tracepoint is enabled in a header are coming to fruition, a helper macro, tracepoint_enabled(), has been added to check if a tracepoint is enabled or not, and can be used with minimal header requirements (avoid "include hell"). Convert the page_ref logic over to the new helper macro. Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Michal Nazarewicz <mina86@mina86.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Minchan Kim <minchan@kernel.org> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com> Cc: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-09-25	tracepoints: Add helper to test if tracepoint is enabled in a header	Steven Rostedt (VMware)
	As tracepoints are discouraged from being added in a header because it can cause side effects if other tracepoints are in headers, as well as bloat the kernel as the trace_<tracepoint>() function is not a small inline, the common workaround is to add a function call that calls a wrapper function in a C file that then calls the tracepoint. But as function calls add overhead, this function should only be called when the tracepoint in question is enabled. To get around this overhead, a static_branch can be used to only have the tracepoint wrapper get called when the tracepoint is enabled. Add a tracepoint_enabled(tp) macro that gets passed the name of the tracepoint, and this becomes a static_branch that is enabled when the tracepoint is enabled and is a nop when the tracepoint is disabled. Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-09-25	SUNRPC/NFSD: Implement xdr_reserve_space_vec()	Anna Schumaker
	Reserving space for a large READ payload requires special handling when reserving space in the xdr buffer pages. One problem we can have is use of the scratch buffer, which is used to get a pointer to a contiguous region of data up to PAGE_SIZE. When using the scratch buffer, calls to xdr_commit_encode() shift the data to it's proper alignment in the xdr buffer. If we've reserved several pages in a vector, then this could potentially invalidate earlier pointers and result in incorrect READ data being sent to the client. I get around this by looking at the amount of space left in the current page, and never reserve more than that for each entry in the read vector. This lets us place data directly where it needs to go in the buffer pages. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2020-09-25	efi: Add definition of EFI_MEMORY_CPU_CRYPTO and ability to report it	Ard Biesheuvel
	Incorporate the definition of EFI_MEMORY_CPU_CRYPTO from the UEFI specification v2.8, and wire it into our memory map dumping routine as well. To make a bit of space in the output buffer, which is provided by the various callers, shorten the descriptive names of the memory types. Reviewed-by: Laszlo Ersek <lersek@redhat.com> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2020-09-25	bpf: Change bpf_sk_assign to accept ARG_PTR_TO_BTF_ID_SOCK_COMMON	Martin KaFai Lau
	This patch changes the bpf_sk_assign() to take ARG_PTR_TO_BTF_ID_SOCK_COMMON such that they will work with the pointer returned by the bpf_skc_to_*() helpers also. The bpf_sk_lookup_assign() is taking ARG_PTR_TO_SOCKET_"OR_NULL". Meaning it specifically takes a literal NULL. ARG_PTR_TO_BTF_ID_SOCK_COMMON does not allow a literal NULL, so another ARG type is required for this purpose and another follow-up patch can be used if there is such need. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200925000415.3857374-1-kafai@fb.com
2020-09-25	bpf: Change bpf_tcp_*_syncookie to accept ARG_PTR_TO_BTF_ID_SOCK_COMMON	Martin KaFai Lau
	This patch changes the bpf_tcp__syncookie() to take ARG_PTR_TO_BTF_ID_SOCK_COMMON such that they will work with the pointer returned by the bpf_skc_to_() helpers also. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Lorenz Bauer <lmb@cloudflare.com> Link: https://lore.kernel.org/bpf/20200925000409.3856725-1-kafai@fb.com
2020-09-25	bpf: Change bpf_sk_storage_*() to accept ARG_PTR_TO_BTF_ID_SOCK_COMMON	Martin KaFai Lau
	This patch changes the bpf_sk_storage_() to take ARG_PTR_TO_BTF_ID_SOCK_COMMON such that they will work with the pointer returned by the bpf_skc_to_() helpers also. A micro benchmark has been done on a "cgroup_skb/egress" bpf program which does a bpf_sk_storage_get(). It was driven by netperf doing a 4096 connected UDP_STREAM test with 64bytes packet. The stats from "kernel.bpf_stats_enabled" shows no meaningful difference. The sk_storage_get_btf_proto, sk_storage_delete_btf_proto, btf_sk_storage_get_proto, and btf_sk_storage_delete_proto are no longer needed, so they are removed. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Lorenz Bauer <lmb@cloudflare.com> Link: https://lore.kernel.org/bpf/20200925000402.3856307-1-kafai@fb.com
2020-09-25	bpf: Change bpf_sk_release and bpf_sk_*cgroup_id to accept ↵	Martin KaFai Lau
	ARG_PTR_TO_BTF_ID_SOCK_COMMON The previous patch allows the networking bpf prog to use the bpf_skc_to_() helpers to get a PTR_TO_BTF_ID socket pointer, e.g. "struct tcp_sock ". It allows the bpf prog to read all the fields of the tcp_sock. This patch changes the bpf_sk_release() and bpf_sk_cgroup_id() to take ARG_PTR_TO_BTF_ID_SOCK_COMMON such that they will work with the pointer returned by the bpf_skc_to_() helpers also. For example, the following will work: sk = bpf_skc_lookup_tcp(skb, tuple, tuplen, BPF_F_CURRENT_NETNS, 0); if (!sk) return; tp = bpf_skc_to_tcp_sock(sk); if (!tp) { bpf_sk_release(sk); return; } lsndtime = tp->lsndtime; /* Pass tp to bpf_sk_release() will also work / bpf_sk_release(tp); Since PTR_TO_BTF_ID could be NULL, the helper taking ARG_PTR_TO_BTF_ID_SOCK_COMMON has to check for NULL at runtime. A btf_id of "struct sock" may not always mean a fullsock. Regardless the helper's running context may get a non-fullsock or not, considering fullsock check/handling is pretty cheap, it is better to keep the same verifier expectation on helper that takes ARG_PTR_TO_BTF_ID will be able to handle the minisock situation. In the bpf_sk_cgroup_id() case, it will try to get a fullsock by using sk_to_full_sk() as its skb variant bpf_sk"b"_cgroup_id() has already been doing. bpf_sk_release can already handle minisock, so nothing special has to be done. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200925000356.3856047-1-kafai@fb.com
2020-09-25	bpf: Enable bpf_skc_to_* sock casting helper to networking prog type	Martin KaFai Lau
	There is a constant need to add more fields into the bpf_tcp_sock for the bpf programs running at tc, sock_ops...etc. A current workaround could be to use bpf_probe_read_kernel(). However, other than making another helper call for reading each field and missing CO-RE, it is also not as intuitive to use as directly reading "tp->lsndtime" for example. While already having perfmon cap to do bpf_probe_read_kernel(), it will be much easier if the bpf prog can directly read from the tcp_sock. This patch tries to do that by using the existing casting-helpers bpf_skc_to_() whose func_proto returns a btf_id. For example, the func_proto of bpf_skc_to_tcp_sock returns the btf_id of the kernel "struct tcp_sock". These helpers are also added to is_ptr_cast_function(). It ensures the returning reg (BPF_REF_0) will also carries the ref_obj_id. That will keep the ref-tracking works properly. The bpf_skc_to_ helpers are made available to most of the bpf prog types in filter.c. The bpf_skc_to_* helpers will be limited by perfmon cap. This patch adds a ARG_PTR_TO_BTF_ID_SOCK_COMMON. The helper accepting this arg can accept a btf-id-ptr (PTR_TO_BTF_ID + &btf_sock_ids[BTF_SOCK_TYPE_SOCK_COMMON]) or a legacy-ctx-convert-skc-ptr (PTR_TO_SOCK_COMMON). The bpf_skc_to_() helpers are changed to take ARG_PTR_TO_BTF_ID_SOCK_COMMON such that they will accept pointer obtained from skb->sk. Instead of specifying both arg_type and arg_btf_id in the same func_proto which is how the current ARG_PTR_TO_BTF_ID does, the arg_btf_id of the new ARG_PTR_TO_BTF_ID_SOCK_COMMON is specified in the compatible_reg_types[] in verifier.c. The reason is the arg_btf_id is always the same. Discussion in this thread: https://lore.kernel.org/bpf/20200922070422.1917351-1-kafai@fb.com/ The ARG_PTR_TO_BTF_ID_ part gives a clear expectation that the helper is expecting a PTR_TO_BTF_ID which could be NULL. This is the same behavior as the existing helper taking ARG_PTR_TO_BTF_ID. The _SOCK_COMMON part means the helper is also expecting the legacy SOCK_COMMON pointer. By excluding the _OR_NULL part, the bpf prog cannot call helper with a literal NULL which doesn't make sense in most cases. e.g. bpf_skc_to_tcp_sock(NULL) will be rejected. All PTR_TO__OR_NULL reg has to do a NULL check first before passing into the helper or else the bpf prog will be rejected. This behavior is nothing new and consistent with the current expectation during bpf-prog-load. [ ARG_PTR_TO_BTF_ID_SOCK_COMMON will be used to replace ARG_PTR_TO_SOCK* of other existing helpers later such that those existing helpers can take the PTR_TO_BTF_ID returned by the bpf_skc_to_() helpers. The only special case is bpf_sk_lookup_assign() which can accept a literal NULL ptr. It has to be handled specially in another follow up patch if there is a need (e.g. by renaming ARG_PTR_TO_SOCKET_OR_NULL to ARG_PTR_TO_BTF_ID_SOCK_COMMON_OR_NULL). ] [ When converting the older helpers that take ARG_PTR_TO_SOCK in the later patch, if the kernel does not support BTF, ARG_PTR_TO_BTF_ID_SOCK_COMMON will behave like ARG_PTR_TO_SOCK_COMMON because no reg->type could have PTR_TO_BTF_ID in this case. It is not a concern for the newer-btf-only helper like the bpf_skc_to_*() here though because these helpers must require BTF vmlinux to begin with. ] Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20200925000350.3855720-1-kafai@fb.com
2020-09-25	Bluetooth: L2CAP: Fix calling sk_filter on non-socket based channel	Luiz Augusto von Dentz
	Only sockets will have the chan->data set to an actual sk, channels like A2MP would have its own data which would likely cause a crash when calling sk_filter, in order to fix this a new callback has been introduced so channels can implement their own filtering if necessary. Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2020-09-25	ACPI: battery: include linux/power_supply.h	Barnabás Pőcze
	acpi/battery.h uses 'struct power_supply *', but fails to include/create any declaration of the type. Include linux/ power_supply.h to fix that. Signed-off-by: Barnabás Pőcze <pobrn@protonmail.com> [ rjw: Subject edit ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-09-25	compat.h: fix a spelling error in <linux/compat.h>	Christoph Hellwig
	There is no compat_sys_readv64v2 syscall, only a compat_sys_preadv64v2 one. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-09-25	genirq: Add stub for set_handle_irq() when !GENERIC_IRQ_MULTI_HANDLER	Zhen Lei
	In order to avoid compilation errors when a driver references set_handle_irq(), but that the architecture doesn't select GENERIC_IRQ_MULTI_HANDLER, add a stub function that will just WARN_ON_ONCE() if ever used. Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com> [maz: commit message] Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20200924071754.4509-2-thunder.leizhen@huawei.com
2020-09-25	soc: mediatek: cmdq: add clear option in cmdq_pkt_wfe api	Dennis YC Hsieh
	Add clear parameter to let client decide if event should be clear to 0 after GCE receive it. Signed-off-by: Dennis YC Hsieh <dennis-yc.hsieh@mediatek.com> Acked-by: Chun-Kuang Hu <chunkuang.hu@kernel.org> Link: https://lore.kernel.org/r/1594136714-11650-9-git-send-email-dennis-yc.hsieh@mediatek.com [mb: fix commit message] Signed-off-by: Matthias Brugger <matthias.bgg@gmail.com>
2020-09-25	soc: mediatek: cmdq: add jump function	Dennis YC Hsieh
	Add jump function so that client can jump to any address which contains instruction. Signed-off-by: Dennis YC Hsieh <dennis-yc.hsieh@mediatek.com> Link: https://lore.kernel.org/r/1594136714-11650-8-git-send-email-dennis-yc.hsieh@mediatek.com Signed-off-by: Matthias Brugger <matthias.bgg@gmail.com>
2020-09-25	soc: mediatek: cmdq: add write_s_mask value function	Dennis YC Hsieh
	add write_s_mask_value function in cmdq helper functions which writes a constant value to address with mask and large dma access support. Signed-off-by: Dennis YC Hsieh <dennis-yc.hsieh@mediatek.com> Link: https://lore.kernel.org/r/1594136714-11650-7-git-send-email-dennis-yc.hsieh@mediatek.com Signed-off-by: Matthias Brugger <matthias.bgg@gmail.com>
2020-09-25	soc: mediatek: cmdq: add write_s value function	Dennis YC Hsieh
	add write_s function in cmdq helper functions which writes a constant value to address with large dma access support. Signed-off-by: Dennis YC Hsieh <dennis-yc.hsieh@mediatek.com> Link: https://lore.kernel.org/r/1594136714-11650-6-git-send-email-dennis-yc.hsieh@mediatek.com Signed-off-by: Matthias Brugger <matthias.bgg@gmail.com>
2020-09-25	soc: mediatek: cmdq: add read_s function	Dennis YC Hsieh
	Add read_s function in cmdq helper functions which support read value from register or dma physical address into gce internal register. Signed-off-by: Dennis YC Hsieh <dennis-yc.hsieh@mediatek.com> Link: https://lore.kernel.org/r/1594136714-11650-5-git-send-email-dennis-yc.hsieh@mediatek.com Signed-off-by: Matthias Brugger <matthias.bgg@gmail.com>