summaryrefslogtreecommitdiff
path: root/include
AgeCommit message (Collapse)Author
2017-06-30Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-nextDavid S. Miller
Pablo Neira Ayuso says: ==================== Netfilter updates for net-next The following patchset contains Netfilter updates for your net-next tree. This batch contains connection tracking updates for the cleanup iteration path, patches from Florian Westphal: X) Skip unconfirmed conntracks in nf_ct_iterate_cleanup_net(), just set dying bit to let the CPU release them. X) Add nf_ct_iterate_destroy() to be used on module removal, to kill conntrack from all namespace. X) Restart iteration on hashtable resizing, since both may occur at the same time. X) Use the new nf_ct_iterate_destroy() to remove conntrack with NAT mapping on module removal. X) Use nf_ct_iterate_destroy() to remove conntrack entries helper module removal, from Liping Zhang. X) Use nf_ct_iterate_cleanup_net() to remove the timeout extension if user requests this, also from Liping. X) Add net_ns_barrier() and use it from FTP helper, so make sure no concurrent namespace removal happens at the same time while the helper module is being removed. X) Use NFPROTO_MAX in layer 3 conntrack protocol array, to reduce module size. Same thing in nf_tables. Updates for the nf_tables infrastructure: X) Prepare usage of the extended ACK reporting infrastructure for nf_tables. X) Remove unnecessary forward declaration in nf_tables hash set. X) Skip set size estimation if number of element is not specified. X) Changes to accomodate a (faster) unresizable hash set implementation, for anonymous sets and dynamic size fixed sets with no timeouts. X) Faster lookup function for unresizable hash table for 2 and 4 bytes key. And, finally, a bunch of asorted small updates and cleanups: X) Do not hold reference to netdev from ipt_CLUSTER, instead subscribe to device events and look up for index from the packet path, this is fixing an issue that is present since the very beginning, patch from Xin Long. X) Use nf_register_net_hook() in ipt_CLUSTER, from Florian Westphal. X) Use ebt_invalid_target() whenever possible in the ebtables tree, from Gao Feng. X) Calm down compilation warning in nf_dup infrastructure, patch from stephen hemminger. X) Statify functions in nftables rt expression, also from stephen. X) Update Makefile to use canonical method to specify nf_tables-objs. From Jike Song. X) Use nf_conntrack_helpers_register() in amanda and H323. X) Space cleanup for ctnetlink, from linzhang. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-30ASoC: dapm: Add new widget type for constructing DAPM graphs on DSPs.Liam Girdwood
Add some DAPM widget types to better support the construction of DAPM graphs within DSPs. Signed-off-by: Liam Girdwood <liam.r.girdwood@linux.intel.com> Signed-off-by: Mark Brown <broonie@kernel.org>
2017-06-30Merge tag 'kvmarm-for-4.13' of ↵Paolo Bonzini
git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/ARM updates for 4.13 - vcpu request overhaul - allow timer and PMU to have their interrupt number selected from userspace - workaround for Cavium erratum 30115 - handling of memory poisonning - the usual crop of fixes and cleanups Conflicts: arch/s390/include/asm/kvm_host.h
2017-06-30nanosleep: Use get_timespec64() and put_timespec64()Deepa Dinamani
Usage of these apis and their compat versions makes the syscalls: clock_nanosleep and nanosleep and their compat implementations simpler. This is a preparatory patch to isolate data conversions to struct timespec64 at userspace boundaries. This helps contain the changes needed to transition to new y2038 safe types. Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2017-06-30ieee80211: update public action codesPeter Oh
Update Public Action field values as updated in IEEE Std 802.11-2016, so that modules/drivers can refer it. Signed-off-by: Peter Oh <peter.oh@bowerswilkins.com> Reviewed-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2017-06-30nl80211: Don't verify owner_nlportid on NAN commandsAndrei Otcheretianski
If NAN interface is created with NL80211_ATTR_SOCKET_OWNER, the socket that is used to create the interface is used for all NAN operations and reporting NAN events. However, it turns out that sending commands and receiving events on the same socket is not possible in a completely race-free way: If the socket buffer is overflowed by the events, the command response will not be sent. In that case the caller will block forever on recv. Using non-blocking socket for commands is more complicated and still the command response or ack may not be received. So, keep unicasting NAN events to the interface creator, but allow using a different socket for commands. Signed-off-by: Andrei Otcheretianski <andrei.otcheretianski@intel.com> Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Reviewed-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2017-06-29iov_iter/hardening: move object size checks to inlined partAl Viro
There we actually have useful information about object sizes. Note: this patch has them done for all iov_iter flavours. Right now we do them twice in iovec case, but that'll change very shortly. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2017-06-29copy_{to,from}_user(): consolidate object size checksAl Viro
... and move them into thread_info.h, next to check_object_size() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2017-06-29copy_{from,to}_user(): move kasan checks and might_fault() out-of-lineAl Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2017-06-29fs: implement vfs_iter_write using do_iter_writeChristoph Hellwig
De-dupliate some code and allow for passing the flags argument to vfs_iter_write. Additionally it now properly updates timestamps. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2017-06-29fs: implement vfs_iter_read using do_iter_readChristoph Hellwig
De-dupliate some code and allow for passing the flags argument to vfs_iter_read. Additional it properly updates atime now. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2017-06-29Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds
Pull networking fixes from David Miller: 1) Need to access netdev->num_rx_queues behind an accessor in netvsc driver otherwise the build breaks with some configs, from Arnd Bergmann. 2) Add dummy xfrm_dev_event() so that build doesn't fail when CONFIG_XFRM_OFFLOAD is not set. From Hangbin Liu. 3) Don't OOPS when pfkey_msg2xfrm_state() signals an erros, from Dan Carpenter. 4) Fix MCDI command size for filter operations in sfc driver, from Martin Habets. 5) Fix UFO segmenting so that we don't calculate incorrect checksums, from Michal Kubecek. 6) When ipv6 datagram connects fail, reset destination address and port. From Wei Wang. 7) TCP disconnect must reset the cached receive DST, from WANG Cong. 8) Fix sign extension bug on 32-bit in dev_get_stats(), from Eric Dumazet. 9) fman driver has to depend on HAS_DMA, from Madalin Bucur. 10) Fix bpf pointer leak with xadd in verifier, from Daniel Borkmann. 11) Fix negative page counts with GFO, from Michal Kubecek. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (41 commits) sfc: fix attempt to translate invalid filter ID net: handle NAPI_GRO_FREE_STOLEN_HEAD case also in napi_frags_finish() bpf: prevent leaking pointer via xadd on unpriviledged arcnet: com20020-pci: add missing pdev setup in netdev structure arcnet: com20020-pci: fix dev_id calculation arcnet: com20020: remove needless base_addr assignment Trivial fix to spelling mistake in arc_printk message arcnet: change irq handler to lock irqsave rocker: move dereference before free mlxsw: spectrum_router: Fix NULL pointer dereference net: sched: Fix one possible panic when no destroy callback virtio-net: serialize tx routine during reset net: usb: asix88179_178a: Add support for the Belkin B2B128 fsl/fman: add dependency on HAS_DMA net: prevent sign extension in dev_get_stats() tcp: reset sk_rx_dst in tcp_disconnect() net: ipv6: reset daddr and dport in sk if connect() fails bnx2x: Don't log mc removal needlessly bnxt_en: Fix netpoll handling. bnxt_en: Add missing logic to handle TPA end error conditions. ...
2017-06-29libnvdimm, btt: BTT updates for UEFI 2.7 formatVishal Verma
The UEFI 2.7 specification defines an updated BTT metadata format, bumping the revision to 2.0. Add support for the new format, while retaining compatibility for the old 1.1 format. Cc: Toshi Kani <toshi.kani@hpe.com> Cc: Linda Knippers <linda.knippers@hpe.com> Cc: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Vishal Verma <vishal.l.verma@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2017-06-29bpf: Add syscall lookup support for fd array and htabMartin KaFai Lau
This patch allows userspace to do BPF_MAP_LOOKUP_ELEM on BPF_MAP_TYPE_PROG_ARRAY, BPF_MAP_TYPE_ARRAY_OF_MAPS and BPF_MAP_TYPE_HASH_OF_MAPS. The lookup returns a prog-id or map-id to the userspace. The userspace can then use the BPF_PROG_GET_FD_BY_ID or BPF_MAP_GET_FD_BY_ID to get a fd. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-29drm/amdgpu: Fix the exported always on CU bitmapFlora Cui
Newer asics with 4 SEs are not able to fit the entire bitmask in the original field, use an array instead. v2: keep cu_ao_mask for backward compatibility. Signed-off-by: Flora Cui <Flora.Cui@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-29Merge tag 'mlx5-updates-2017-06-27' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== mlx5-updates-2017-06-27 (Innova IPsec offload support) This patchset adds support for Innova IPSec network interface card. About Innova device: -------------------- Innova is a network card with a ConnectX chip and an FPGA chip as a bump-on-the-wire. Internal +----------+ Link +-----------------+ | +--------------+ FPGA | +------+ | ConnectX | | Shell +--+ QSFP | | +--------------+ +-------+ | | Port | +----------+ I2C | | SBU | | +------+ | +-------+ | +--+----------+---+ | | +--+--+ +---+---+ | DDR | | Flash | +-----+ +-------+ The FPGA synthesized logic is loaded from dedicated flash storage and has access to its own dedicated DDR RAM. The ConnectX chip firmware programs the FPGA by accessing its configuration space over either the slow internal I2C link or the high-speed internal link. The FPGA logic is divided into a "Shell" and a "Sandbox Unit" (SBU). mlx5_core driver (with CONFIG_MLX5_FPGA) handles all shell functionality, while other components may handle the various SBU functionalities. The driver opens high-speed reliable communication channels with the shell and the SBU over the internal link. These channels may be used for high-bandwidth configuration or for SBU-specific out-of-band data paths. About Innova IPSec device: -------------------------- Innova IPSec is a network card that allows offloading IPSec cryptography operations from the host CPU to the NIC. It is an Innova card with an IPSec SBU. The hardware keeps the database of IPSec Security Associations (SADB) in the FPGA's DDR memory. Internal +----------+ Link +-----------------+ | +--------------+ FPGA | +------+ | ConnectX | | Shell +--+ QSFP | | +--------------+ +-------+ | | Port | +----------+ Internal I2C | | IPSec | | +------+ | | SBU | | | +-------+ | +--+----------+---+ | | +--+--+ +---+---+ | DDR | | | | | | Flash | |SADB | | | +-----+ +-------+ Modes and ciphers: Currently the following modes and ciphers are supported: IPv4 and IPv6 ESP tunnel and transport modes AES 128 and 256 bit encryption, with GCM authentication (RFC4106) IV is generated using seqiv, in sync with Linux's geniv. More modes and ciphers may be added later. Notes: In the future similar functionality will be included in a single-chip NIC. About the driver: ----------------- Patches 1-4 prepare some existing driver code for the new feature: * Add support for reserved GIDs in the hardware GID table * Allow multiple modules to enable hardware RoCE support independently Patches 5-6 define structs and helper functions for QP work-queues. Patches 7-11 add various FPGA-related features required for Innova. IPSec. Patch 12 adds abstraction layer for Mellanox IPSec-offload capable devices. atches 13-16 add IPSec offload support to the mlx5 netdevice. This driver services the new IPSec offload API introduced in commit d77e38e612a0 ("xfrm: Add an IPsec hardware offloading API") Configuration Path: If Innova IPSec device is detected, the mlx5e netdevice gets the new NETIF_F_HW_ESP feature and the xdo callbacks, indicating ESP offload capabilities, and also the matching TX checksum and GSO features. The driver configures offloaded Security Associations (SAs) by sending an ADD_SA or DEL_SA message to the IPSec SBU, which updates the SADB in DDR. These messages and their responses are sent over a high-speed channel. Counters for ethtool are retrieved by the driver from the SBU. Data path: On receive path, the SBU decrypts ESP packets which match the offloaded SADB, but keeps them encapsulated. The SBU injects metadata (Mellanox owned ethertype) indicating that crypto-offload has taken place, the SA with which it was done, and the authentication result. The ConnectX chip performs RX checksum offload on the packet, and RSS using the ESP SPI value. The driver detects the special ethertype, and attaches a struct secpath to the RX SKB, including flags to indicate that crypto offload took place, the authentication result, and which xfrm_state was used for decryption, in the olen and ovec members. The RX SKB may have useful CHECKSUM_COMPLETE. A separate patchset will add support for that in the xfrm stack. On transmit path, the stack encapsulates the packet but does not encrypt it, and indicates in the SKB's secpath that crypto offload is to be performed and the SA to use to do so. The driver avoids performing crypto-offload for ESP fragments, and packets with IP options, as the SBU cannot currently do that. For eligible packets, the driver prepends a special ethertype with metadata instructing the hardware to perform crypto offload. The stack builds regular (non-GSO) SKBs so that they contain a placeholder for the ESP trailer. The driver trims it off, because the SBU automatically appends the trailer for offloaded packets. The ConnectX chip performs TX checksum offload on inner UDP or TCP packets, and GSO for TCP packets (duplicating the prepended metadata). The segmented packets then undergo encryption in the SBU before going on the wire. Performance: We measure single stream of TCP on Intel(R) Xeon(R) CPU E5-2643 v2 @3.50GHz Using AES-NI with ESP GSO we get constant 4.1 Gbps. Using crypto offload we get constant 18 Gbps. Note that these numbers require CHECKSUM_COMPLETE support in XFRM, which we submit separately. - Ilan Tayari ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-29libnvdimm, pmem: disable dax flushing when pmem is fronting a volatile regionDan Williams
The pmem driver attaches to both persistent and volatile memory ranges advertised by the ACPI NFIT. When the region is volatile it is redundant to spend cycles flushing caches at fsync(). Check if the hosting region is volatile and do not set dax_write_cache() if it is. Cc: Jan Kara <jack@suse.cz> Cc: Jeff Moyer <jmoyer@redhat.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Matthew Wilcox <mawilcox@microsoft.com> Cc: Ross Zwisler <ross.zwisler@linux.intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2017-06-29libnvdimm, pmem, dax: export a cache control attributeDan Williams
The dax_flush() operation can be turned into a nop on platforms where firmware arranges for cpu caches to be flushed on a power-fail event. The ACPI 6.2 specification defines a mechanism for the platform to indicate this capability so the kernel can select the proper default. However, for other platforms, the administrator must toggle this setting manually. Given this flush setting is a dax-specific mechanism we advertise it through a 'dax' attribute group hanging off a host device. For example, a 'pmem0' block-device gets a 'dax' sysfs-subdirectory with a 'write_cache' attribute to control response to dax cache flush requests. This is similar to the 'queue/write_cache' attribute that appears under block devices. Cc: Jan Kara <jack@suse.cz> Cc: Jeff Moyer <jmoyer@redhat.com> Cc: Matthew Wilcox <mawilcox@microsoft.com> Cc: Ross Zwisler <ross.zwisler@linux.intel.com> Suggested-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2017-06-29Merge tag 'actions-drivers-for-4.13' of ↵Arnd Bergmann
git://git.kernel.org/pub/scm/linux/kernel/git/afaerber/linux-actions into next/drivers Pull "Actions Semi SoC drivers for 4.13" from Andreas Färber: This adds clock source and power domain drivers for S500/S900. * tag 'actions-drivers-for-4.13' of git://git.kernel.org/pub/scm/linux/kernel/git/afaerber/linux-actions: soc: actions: owl-sps: Factor out owl_sps_set_pg() for power-gating soc: actions: Add Owl SPS dt-bindings: power: Add Owl SPS power domains clocksource: owl: Add S900 support clocksource: Add Owl timer
2017-06-29Merge tag 'actions-arm-soc+sps-for-4.13' of ↵Arnd Bergmann
git://git.kernel.org/pub/scm/linux/kernel/git/afaerber/linux-actions into next/soc Pull "Actions Semi ARM SoC for v4.13 #2" from Andreas Färber: This adds SMP code to bring up the remaining S500 CPU cores by reusing a helper factored out of the SPS power domains driver. * tag 'actions-arm-soc+sps-for-4.13' of git://git.kernel.org/pub/scm/linux/kernel/git/afaerber/linux-actions: ARM: owl: smp: Implement SPS power-gating for CPU2 and CPU3 soc: actions: owl-sps: Factor out owl_sps_set_pg() for power-gating soc: actions: Add Owl SPS dt-bindings: power: Add Owl SPS power domains
2017-06-29Merge tag 'amlogic-dt64-2' of ↵Arnd Bergmann
git://git.kernel.org/pub/scm/linux/kernel/git/khilman/linux-amlogic into next/dt64 Pull "Amlogic 64-bit DT changes for v4.13 (round 2)" from Kevin Hilman: - support new SPI controller driver - several more leaf clocks exposed to DT - New board: S905x LibreTech CC board * tag 'amlogic-dt64-2' of git://git.kernel.org/pub/scm/linux/kernel/git/khilman/linux-amlogic: ARM64: dts: meson-gxl: Add Libre Technology CC support dt-bindings: arm: amlogic: Add Libre Technology CC board dt-bindings: add Libre Technology vendor prefix ARM64: dts: meson-gx: Add SPICC nodes clk: meson-gxbb: un-export the CPU clock clk: meson-gxbb: expose UART clocks clk: meson-gxbb: expose SPICC gate clk: meson-gxbb: expose spdif master clock clk: meson-gxbb: expose i2s master clock clk: meson-gxbb: expose spdif clock gates
2017-06-29Merge tag 'amlogic-dt-2' of ↵Arnd Bergmann
git://git.kernel.org/pub/scm/linux/kernel/git/khilman/linux-amlogic into next/dt Merge "Amlogic 32-bit DT changes for v4.13 (round 2)" from Kevin Hilman: - greatly expands DT clock support for meson8b * tag 'amlogic-dt-2' of git://git.kernel.org/pub/scm/linux/kernel/git/khilman/linux-amlogic: (22 commits) ARM: dts: meson: use the real ethernet clock on Meson8 and Meson8b ARM: dts: meson8b: add the SCU device node ARM: dts: meson: add USB support on Meson8 and Meson8b ARM: dts: meson: add the hardware random number generator ARM: dts: meson8: add reserved memory zones ARM: dts: meson: add the SAR ADC ARM: dts: meson8: add the pins for the SDIO controller ARM: dts: meson8: add the PWM_E and PWM_F pins ARM: dts: meson: use GIC_SPI and IRQ_TYPE_EDGE_RISING macros ARM: dts: meson: use C preprocessor friendly include syntax ARM: dts: meson8: fix the IR receiver pins clk: meson8b: export the ethernet gate clock clk: meson8b: export the USB clocks clk: meson8b: export the gate clock for the HW random number generator clk: meson8b: export the SDIO clock clk: meson8b: export the SAR ADC clocks clk: meson-gxbb: un-export the CPU clock clk: meson-gxbb: expose UART clocks clk: meson-gxbb: expose SPICC gate clk: meson-gxbb: expose spdif master clock ...
2017-06-29sd: add support for TCG OPAL self encrypting disksChristoph Hellwig
Just wire up the generic TCG OPAL infrastructure to the SCSI disk driver and the Security In/Out commands. Note that I don't know of any actual SCSI disks that do support TCG OPAL, but this is required to support ATA disks through libata. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2017-06-29Merge tag 'sh-pfc-for-v4.13-tag2' of ↵Linus Walleij
git://git.kernel.org/pub/scm/linux/kernel/git/geert/renesas-drivers into devel pinctrl: sh-pfc: Updates for v4.13 (take two) - Add SCIF1 and SCIF2 pin groups for R-Car V2H, - Add EtherAVB, DU parallel RGB output, and PWM pin groups for R-Car H3 ES2.0, - Add pin and gpio controller support for RZ/A1.
2017-06-29pinctrl: generic: Add output-enable propertyJacopo Mondi
Add output-enable generic pin configuration property. This properties allows enabling/disabling pin's output capabilities without actually driving any value on the line. Acked-by: Rob Herring <robh@kernel.org> [Added inline elaborations on buffer enabling/disabling] Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2017-06-29Merge tag 'v4.12-rc7' into develLinus Walleij
Linux 4.12-rc7
2017-06-29ras: mark stub functions as 'inline'Arnd Bergmann
With CONFIG_RAS disabled, we get two harmless warnings about unused functions: include/linux/ras.h:37:13: error: 'log_arm_hw_error' defined but not used [-Werror=unused-function] static void log_arm_hw_error(struct cper_sec_proc_arm *err) { return; } include/linux/ras.h:33:13: error: 'log_non_standard_event' defined but not used [-Werror=unused-function] static void log_non_standard_event(const guid_t *sec_type, Clearly these are meant to be 'inline', like the other stubs in the same header. Fixes: 297b64c74385 ("ras: acpi / apei: generate trace event for unrecognized CPER section") Fixes: e9279e83ad1f ("trace, ras: add ARM processor error trace event") Acked-by: Borislav Petkov <bp@suse.de> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-06-28block: provide bio_uninit() free freeing integrity/task associationsJens Axboe
Wen reports significant memory leaks with DIF and O_DIRECT: "With nvme devive + T10 enabled, On a system it has 256GB and started logging /proc/meminfo & /proc/slabinfo for every minute and in an hour it increased by 15968128 kB or ~15+GB.. Approximately 256 MB / minute leaking. /proc/meminfo | grep SUnreclaim... SUnreclaim: 6752128 kB SUnreclaim: 6874880 kB SUnreclaim: 7238080 kB .... SUnreclaim: 22307264 kB SUnreclaim: 22485888 kB SUnreclaim: 22720256 kB When testcases with T10 enabled call into __blkdev_direct_IO_simple, code doesn't free memory allocated by bio_integrity_alloc. The patch fixes the issue. HTX has been run with +60 hours without failure." Since __blkdev_direct_IO_simple() allocates the bio on the stack, it doesn't go through the regular bio free. This means that any ancillary data allocated with the bio through the stack is not freed. Hence, we can leak the integrity data associated with the bio, if the device is using DIF/DIX. Fix this by providing a bio_uninit() and export it, so that we can use it to free this data. Note that this is a minimal fix for this issue. Any current user of bio's that are allocated outside of bio_alloc_bioset() suffers from this issue, most notably some drivers. We will fix those in a more comprehensive patch for 4.13. This also means that the commit marked as being fixed by this isn't the real culprit, it's just the most obvious one out there. Fixes: 542ff7bf18c6 ("block: new direct I/O implementation") Reported-by: Wen Xiong <wenxiong@linux.vnet.ibm.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-06-28blk-mq: Create hctx for each present CPUChristoph Hellwig
Currently we only create hctx for online CPUs, which can lead to a lot of churn due to frequent soft offline / online operations. Instead allocate one for each present CPU to avoid this and dramatically simplify the code. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jens Axboe <axboe@kernel.dk> Cc: Keith Busch <keith.busch@intel.com> Cc: linux-block@vger.kernel.org Cc: linux-nvme@lists.infradead.org Link: http://lkml.kernel.org/r/20170626102058.10200-3-hch@lst.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2017-06-28PCI: Make pci_register_host_bridge() PCI core internalLorenzo Pieralisi
With the introduction of pci_scan_root_bus_bridge() there is no need to export pci_register_host_bridge() to other kernel subsystems other than the PCI compilation unit that needs it. Make pci_register_host_bridge() static to its compilation unit and convert the existing drivers usage over to pci_scan_root_bus_bridge(). Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: Arnd Bergmann <arnd@arndb.de>
2017-06-28PCI: Add pci_scan_root_bus_bridge() interfaceLorenzo Pieralisi
The current pci_scan_root_bus() interface is made up of two main code paths: - pci_create_root_bus() - pci_scan_child_bus() pci_create_root_bus() is a wrapper function that allows to create a struct pci_host_bridge structure, initialize it with the passed parameters and register it with the kernel. As the struct pci_host_bridge require additional struct members, pci_create_root_bus() parameters list has grown in time, making it unwieldy to add further parameters to it in case the struct pci_host_bridge gains more members fields to augment its functionality. Since PCI core code provides functions to allocate struct pci_host_bridge, instead of forcing the pci_create_root_bus() interface to add new parameters to cater for new struct pci_host_bridge functionality, it is more suitable to add an interface in PCI core code to scan a PCI bus straight from a struct pci_host_bridge created and customized by each specific PCI host controller driver. Add a pci_scan_root_bus_bridge() function to allow PCI host controller drivers to create and initialize struct pci_host_bridge and scan the resulting bus. Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: Arnd Bergmann <arnd@arndb.de>
2017-06-28PCI: Add devm_pci_alloc_host_bridge() interfaceLorenzo Pieralisi
Struct pci_host_bridge can be allocated by PCI host bridge drivers which usually allocate and map memory through devm managed interfaces. Add a devm version for the pci_alloc_host_bridge() interface to simplify PCI host controller driver porting and simplify the driver failure paths. Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: Arnd Bergmann <arnd@arndb.de>
2017-06-28PCI: Add pci_free_host_bridge() interfaceLorenzo Pieralisi
Commit a52d1443bba1 ("PCI: Export host bridge registration interface") exported the pci_alloc_host_bridge() interface so that PCI host controllers drivers can make use of it. Introduce pci_alloc_host_bridge() kernel counterpart to free the host bridge data structures, pci_free_host_bridge(), export it and update kernel functions releasing host bridge objects allocated memory to make use of it. Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: Arnd Bergmann <arnd@arndb.de>
2017-06-28vfio: New external user group/file matchAlex Williamson
At the point where the kvm-vfio pseudo device wants to release its vfio group reference, we can't always acquire a new reference to make that happen. The group can be in a state where we wouldn't allow a new reference to be added. This new helper function allows a caller to match a file to a group to facilitate this. Given a file and group, report if they match. Thus the caller needs to already have a group reference to match to the file. This allows the deletion of a group without acquiring a new reference. Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Tested-by: Eric Auger <eric.auger@redhat.com> Cc: stable@vger.kernel.org
2017-06-28regmap: irq: add chip option mask_writeonlyMichael Grzeschik
Some irq controllers have writeonly/multipurpose register layouts. In those cases we read invalid data back. Here we add the option mask_writeonly as masking option. Signed-off-by: Michael Grzeschik <m.grzeschik@pengutronix.de> Signed-off-by: Mark Brown <broonie@kernel.org>
2017-06-28cgroup: implement "nsdelegate" mount optionTejun Heo
Currently, cgroup only supports delegation to !root users and cgroup namespaces don't get any special treatments. This limits the usefulness of cgroup namespaces as they by themselves can't be safe delegation boundaries. A process inside a cgroup can change the resource control knobs of the parent in the namespace root and may move processes in and out of the namespace if cgroups outside its namespace are visible somehow. This patch adds a new mount option "nsdelegate" which makes cgroup namespaces delegation boundaries. If set, cgroup behaves as if write permission based delegation took place at namespace boundaries - writes to the resource control knobs from the namespace root are denied and migration crossing the namespace boundary aren't allowed from inside the namespace. This allows cgroup namespace to function as a delegation boundary by itself. v2: Silently ignore nsdelegate specified on !init mounts. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Aravind Anbudurai <aru7@fb.com> Cc: Serge Hallyn <serge@hallyn.com> Cc: Eric Biederman <ebiederm@xmission.com>
2017-06-28svcrdma: Remove svc_rdma_marshal.cChuck Lever
svc_rdma_marshal.c has one remaining exported function -- svc_rdma_xdr_decode_req -- and it has a single call site. Take the same approach as the sendto path, and move this function into the source file where it is called. This is a refactoring change only. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2017-06-28ASoC: dwc: Added a quirk DW_I2S_QUIRK_16BIT_IDX_OVERRIDE to dwc driverVijendar Mukunda
Added quirk DW_I2S_QUIRK_16BIT_IDX_OVERRIDE to Designware driver. This quirk will set idx value to 1. By setting this quirk, it will override supported format as 16 bit resolution and bus width as 2 Bytes. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Vijendar Mukunda <Vijendar.Mukunda@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Mark Brown <broonie@kernel.org>
2017-06-28Merge tag 'v4.12-rc5' into nfsd treeJ. Bruce Fields
Update to get f0c3192ceee3 "virtio_net: lower limit on buffer size". That bug was interfering with my nfsd testing.
2017-06-28ASoC: rt5645: add inv_jd1_1 flagBard Liao
The flag will invert jd1_1 status. Which will be used if the jack connector is normal closed. Signed-off-by: Bard Liao <bardliao@realtek.com> Tested-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Mark Brown <broonie@kernel.org>
2017-06-28ASoC: rt5645: rename jd_invert flag in platform dataBard Liao
The jd_invert flag is actually used for level triggered IRQ. Rename it to let code more readable. Signed-off-by: Bard Liao <bardliao@realtek.com> Tested-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Mark Brown <broonie@kernel.org>
2017-06-28locking/refcount: Create unchecked atomic_t implementationKees Cook
Many subsystems will not use refcount_t unless there is a way to build the kernel so that there is no regression in speed compared to atomic_t. This adds CONFIG_REFCOUNT_FULL to enable the full refcount_t implementation which has the validation but is slightly slower. When not enabled, refcount_t uses the basic unchecked atomic_t routines, which results in no code changes compared to just using atomic_t directly. Signed-off-by: Kees Cook <keescook@chromium.org> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Christoph Hellwig <hch@infradead.org> Cc: David S. Miller <davem@davemloft.net> Cc: David Windsor <dwindsor@gmail.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Elena Reshetova <elena.reshetova@intel.com> Cc: Eric Biggers <ebiggers3@gmail.com> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Hans Liljestrand <ishkamiel@gmail.com> Cc: James Bottomley <James.Bottomley@hansenpartnership.com> Cc: Jann Horn <jannh@google.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Manfred Spraul <manfred@colorfullife.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Serge E. Hallyn <serge@hallyn.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: arozansk@redhat.com Cc: axboe@kernel.dk Cc: linux-arch <linux-arch@vger.kernel.org> Link: http://lkml.kernel.org/r/20170621200026.GA115679@beast Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-06-28nvme: use a single NVME_AQ_DEPTH and relax it to 32Sagi Grimberg
No need to differentiate fabrics from pci/loop, also lower it to 32 as we don't really need 256 inflight admin commands. Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Max Gurtovoy <maxg@mellanox.com> Signed-off-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-06-28dma-mapping: replace dmam_alloc_noncoherent with dmam_alloc_attrsChristoph Hellwig
dmam_alloc_noncoherent is a trivial wrapper around dmam_alloc_attrs, that hardcodes one particular flag. Make the devres code more flexible by allowing the callers to pass arbitrary flags. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Tejun Heo <tj@kernel.org>
2017-06-28dma-mapping: remove dmam_free_noncoherentChristoph Hellwig
This function was never used since it was added. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Tejun Heo <tj@kernel.org>
2017-06-28dma-mapping: remove the set_dma_mask methodChristoph Hellwig
Signed-off-by: Christoph Hellwig <hch@lst.de>
2017-06-28dma-mapping: remove HAVE_ARCH_DMA_SUPPORTEDChristoph Hellwig
Signed-off-by: Christoph Hellwig <hch@lst.de>
2017-06-28dma-mapping: remove DMA_ERROR_CODEChristoph Hellwig
And update the documentation - dma_mapping_error has been supported everywhere for a long time. Signed-off-by: Christoph Hellwig <hch@lst.de>
2017-06-28spin loop primitives for busy waitingNicholas Piggin
Current busy-wait loops are implemented by repeatedly calling cpu_relax() to give an arch option for a low-latency option to improve power and/or SMT resource contention. This poses some difficulties for powerpc, which has SMT priority setting instructions (priorities determine how ifetch cycles are apportioned). powerpc's cpu_relax() is implemented by setting a low priority then setting normal priority. This has several problems: - Changing thread priority can have some execution cost and potential impact to other threads in the core. It's inefficient to execute them every time around a busy-wait loop. - Depending on implementation details, a `low ; medium` sequence may not have much if any affect. Some software with similar pattern actually inserts a lot of nops between, in order to cause a few fetch cycles with the low priority. - The busy-wait loop runs with regular priority. This might only be a few fetch cycles, but if there are several threads running such loops, they could cause a noticable impact on a non-idle thread. Implement spin_begin, spin_end primitives that can be used around busy wait loops, which default to no-ops. And spin_cpu_relax which defaults to cpu_relax. This will allow architectures to hook the entry and exit of busy-wait loops, and will allow powerpc to set low SMT priority at entry, and normal priority at exit. Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-06-28Merge branches 'iommu/fixes', 'arm/rockchip', 'arm/renesas', 'arm/smmu', ↵Joerg Roedel
'arm/core', 'x86/vt-d', 'x86/amd', 's390' and 'core' into next