summaryrefslogtreecommitdiff
path: root/include
AgeCommit message (Collapse)Author
2017-06-29libnvdimm, btt: BTT updates for UEFI 2.7 formatVishal Verma
The UEFI 2.7 specification defines an updated BTT metadata format, bumping the revision to 2.0. Add support for the new format, while retaining compatibility for the old 1.1 format. Cc: Toshi Kani <toshi.kani@hpe.com> Cc: Linda Knippers <linda.knippers@hpe.com> Cc: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Vishal Verma <vishal.l.verma@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2017-06-29bpf: Add syscall lookup support for fd array and htabMartin KaFai Lau
This patch allows userspace to do BPF_MAP_LOOKUP_ELEM on BPF_MAP_TYPE_PROG_ARRAY, BPF_MAP_TYPE_ARRAY_OF_MAPS and BPF_MAP_TYPE_HASH_OF_MAPS. The lookup returns a prog-id or map-id to the userspace. The userspace can then use the BPF_PROG_GET_FD_BY_ID or BPF_MAP_GET_FD_BY_ID to get a fd. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-29drm/amdgpu: Fix the exported always on CU bitmapFlora Cui
Newer asics with 4 SEs are not able to fit the entire bitmask in the original field, use an array instead. v2: keep cu_ao_mask for backward compatibility. Signed-off-by: Flora Cui <Flora.Cui@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-29Merge tag 'mlx5-updates-2017-06-27' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== mlx5-updates-2017-06-27 (Innova IPsec offload support) This patchset adds support for Innova IPSec network interface card. About Innova device: -------------------- Innova is a network card with a ConnectX chip and an FPGA chip as a bump-on-the-wire. Internal +----------+ Link +-----------------+ | +--------------+ FPGA | +------+ | ConnectX | | Shell +--+ QSFP | | +--------------+ +-------+ | | Port | +----------+ I2C | | SBU | | +------+ | +-------+ | +--+----------+---+ | | +--+--+ +---+---+ | DDR | | Flash | +-----+ +-------+ The FPGA synthesized logic is loaded from dedicated flash storage and has access to its own dedicated DDR RAM. The ConnectX chip firmware programs the FPGA by accessing its configuration space over either the slow internal I2C link or the high-speed internal link. The FPGA logic is divided into a "Shell" and a "Sandbox Unit" (SBU). mlx5_core driver (with CONFIG_MLX5_FPGA) handles all shell functionality, while other components may handle the various SBU functionalities. The driver opens high-speed reliable communication channels with the shell and the SBU over the internal link. These channels may be used for high-bandwidth configuration or for SBU-specific out-of-band data paths. About Innova IPSec device: -------------------------- Innova IPSec is a network card that allows offloading IPSec cryptography operations from the host CPU to the NIC. It is an Innova card with an IPSec SBU. The hardware keeps the database of IPSec Security Associations (SADB) in the FPGA's DDR memory. Internal +----------+ Link +-----------------+ | +--------------+ FPGA | +------+ | ConnectX | | Shell +--+ QSFP | | +--------------+ +-------+ | | Port | +----------+ Internal I2C | | IPSec | | +------+ | | SBU | | | +-------+ | +--+----------+---+ | | +--+--+ +---+---+ | DDR | | | | | | Flash | |SADB | | | +-----+ +-------+ Modes and ciphers: Currently the following modes and ciphers are supported: IPv4 and IPv6 ESP tunnel and transport modes AES 128 and 256 bit encryption, with GCM authentication (RFC4106) IV is generated using seqiv, in sync with Linux's geniv. More modes and ciphers may be added later. Notes: In the future similar functionality will be included in a single-chip NIC. About the driver: ----------------- Patches 1-4 prepare some existing driver code for the new feature: * Add support for reserved GIDs in the hardware GID table * Allow multiple modules to enable hardware RoCE support independently Patches 5-6 define structs and helper functions for QP work-queues. Patches 7-11 add various FPGA-related features required for Innova. IPSec. Patch 12 adds abstraction layer for Mellanox IPSec-offload capable devices. atches 13-16 add IPSec offload support to the mlx5 netdevice. This driver services the new IPSec offload API introduced in commit d77e38e612a0 ("xfrm: Add an IPsec hardware offloading API") Configuration Path: If Innova IPSec device is detected, the mlx5e netdevice gets the new NETIF_F_HW_ESP feature and the xdo callbacks, indicating ESP offload capabilities, and also the matching TX checksum and GSO features. The driver configures offloaded Security Associations (SAs) by sending an ADD_SA or DEL_SA message to the IPSec SBU, which updates the SADB in DDR. These messages and their responses are sent over a high-speed channel. Counters for ethtool are retrieved by the driver from the SBU. Data path: On receive path, the SBU decrypts ESP packets which match the offloaded SADB, but keeps them encapsulated. The SBU injects metadata (Mellanox owned ethertype) indicating that crypto-offload has taken place, the SA with which it was done, and the authentication result. The ConnectX chip performs RX checksum offload on the packet, and RSS using the ESP SPI value. The driver detects the special ethertype, and attaches a struct secpath to the RX SKB, including flags to indicate that crypto offload took place, the authentication result, and which xfrm_state was used for decryption, in the olen and ovec members. The RX SKB may have useful CHECKSUM_COMPLETE. A separate patchset will add support for that in the xfrm stack. On transmit path, the stack encapsulates the packet but does not encrypt it, and indicates in the SKB's secpath that crypto offload is to be performed and the SA to use to do so. The driver avoids performing crypto-offload for ESP fragments, and packets with IP options, as the SBU cannot currently do that. For eligible packets, the driver prepends a special ethertype with metadata instructing the hardware to perform crypto offload. The stack builds regular (non-GSO) SKBs so that they contain a placeholder for the ESP trailer. The driver trims it off, because the SBU automatically appends the trailer for offloaded packets. The ConnectX chip performs TX checksum offload on inner UDP or TCP packets, and GSO for TCP packets (duplicating the prepended metadata). The segmented packets then undergo encryption in the SBU before going on the wire. Performance: We measure single stream of TCP on Intel(R) Xeon(R) CPU E5-2643 v2 @3.50GHz Using AES-NI with ESP GSO we get constant 4.1 Gbps. Using crypto offload we get constant 18 Gbps. Note that these numbers require CHECKSUM_COMPLETE support in XFRM, which we submit separately. - Ilan Tayari ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-29libnvdimm, pmem: disable dax flushing when pmem is fronting a volatile regionDan Williams
The pmem driver attaches to both persistent and volatile memory ranges advertised by the ACPI NFIT. When the region is volatile it is redundant to spend cycles flushing caches at fsync(). Check if the hosting region is volatile and do not set dax_write_cache() if it is. Cc: Jan Kara <jack@suse.cz> Cc: Jeff Moyer <jmoyer@redhat.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Matthew Wilcox <mawilcox@microsoft.com> Cc: Ross Zwisler <ross.zwisler@linux.intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2017-06-29libnvdimm, pmem, dax: export a cache control attributeDan Williams
The dax_flush() operation can be turned into a nop on platforms where firmware arranges for cpu caches to be flushed on a power-fail event. The ACPI 6.2 specification defines a mechanism for the platform to indicate this capability so the kernel can select the proper default. However, for other platforms, the administrator must toggle this setting manually. Given this flush setting is a dax-specific mechanism we advertise it through a 'dax' attribute group hanging off a host device. For example, a 'pmem0' block-device gets a 'dax' sysfs-subdirectory with a 'write_cache' attribute to control response to dax cache flush requests. This is similar to the 'queue/write_cache' attribute that appears under block devices. Cc: Jan Kara <jack@suse.cz> Cc: Jeff Moyer <jmoyer@redhat.com> Cc: Matthew Wilcox <mawilcox@microsoft.com> Cc: Ross Zwisler <ross.zwisler@linux.intel.com> Suggested-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2017-06-29Merge tag 'actions-drivers-for-4.13' of ↵Arnd Bergmann
git://git.kernel.org/pub/scm/linux/kernel/git/afaerber/linux-actions into next/drivers Pull "Actions Semi SoC drivers for 4.13" from Andreas Färber: This adds clock source and power domain drivers for S500/S900. * tag 'actions-drivers-for-4.13' of git://git.kernel.org/pub/scm/linux/kernel/git/afaerber/linux-actions: soc: actions: owl-sps: Factor out owl_sps_set_pg() for power-gating soc: actions: Add Owl SPS dt-bindings: power: Add Owl SPS power domains clocksource: owl: Add S900 support clocksource: Add Owl timer
2017-06-29Merge tag 'actions-arm-soc+sps-for-4.13' of ↵Arnd Bergmann
git://git.kernel.org/pub/scm/linux/kernel/git/afaerber/linux-actions into next/soc Pull "Actions Semi ARM SoC for v4.13 #2" from Andreas Färber: This adds SMP code to bring up the remaining S500 CPU cores by reusing a helper factored out of the SPS power domains driver. * tag 'actions-arm-soc+sps-for-4.13' of git://git.kernel.org/pub/scm/linux/kernel/git/afaerber/linux-actions: ARM: owl: smp: Implement SPS power-gating for CPU2 and CPU3 soc: actions: owl-sps: Factor out owl_sps_set_pg() for power-gating soc: actions: Add Owl SPS dt-bindings: power: Add Owl SPS power domains
2017-06-29Merge tag 'amlogic-dt64-2' of ↵Arnd Bergmann
git://git.kernel.org/pub/scm/linux/kernel/git/khilman/linux-amlogic into next/dt64 Pull "Amlogic 64-bit DT changes for v4.13 (round 2)" from Kevin Hilman: - support new SPI controller driver - several more leaf clocks exposed to DT - New board: S905x LibreTech CC board * tag 'amlogic-dt64-2' of git://git.kernel.org/pub/scm/linux/kernel/git/khilman/linux-amlogic: ARM64: dts: meson-gxl: Add Libre Technology CC support dt-bindings: arm: amlogic: Add Libre Technology CC board dt-bindings: add Libre Technology vendor prefix ARM64: dts: meson-gx: Add SPICC nodes clk: meson-gxbb: un-export the CPU clock clk: meson-gxbb: expose UART clocks clk: meson-gxbb: expose SPICC gate clk: meson-gxbb: expose spdif master clock clk: meson-gxbb: expose i2s master clock clk: meson-gxbb: expose spdif clock gates
2017-06-29Merge tag 'amlogic-dt-2' of ↵Arnd Bergmann
git://git.kernel.org/pub/scm/linux/kernel/git/khilman/linux-amlogic into next/dt Merge "Amlogic 32-bit DT changes for v4.13 (round 2)" from Kevin Hilman: - greatly expands DT clock support for meson8b * tag 'amlogic-dt-2' of git://git.kernel.org/pub/scm/linux/kernel/git/khilman/linux-amlogic: (22 commits) ARM: dts: meson: use the real ethernet clock on Meson8 and Meson8b ARM: dts: meson8b: add the SCU device node ARM: dts: meson: add USB support on Meson8 and Meson8b ARM: dts: meson: add the hardware random number generator ARM: dts: meson8: add reserved memory zones ARM: dts: meson: add the SAR ADC ARM: dts: meson8: add the pins for the SDIO controller ARM: dts: meson8: add the PWM_E and PWM_F pins ARM: dts: meson: use GIC_SPI and IRQ_TYPE_EDGE_RISING macros ARM: dts: meson: use C preprocessor friendly include syntax ARM: dts: meson8: fix the IR receiver pins clk: meson8b: export the ethernet gate clock clk: meson8b: export the USB clocks clk: meson8b: export the gate clock for the HW random number generator clk: meson8b: export the SDIO clock clk: meson8b: export the SAR ADC clocks clk: meson-gxbb: un-export the CPU clock clk: meson-gxbb: expose UART clocks clk: meson-gxbb: expose SPICC gate clk: meson-gxbb: expose spdif master clock ...
2017-06-29sd: add support for TCG OPAL self encrypting disksChristoph Hellwig
Just wire up the generic TCG OPAL infrastructure to the SCSI disk driver and the Security In/Out commands. Note that I don't know of any actual SCSI disks that do support TCG OPAL, but this is required to support ATA disks through libata. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2017-06-29Merge tag 'sh-pfc-for-v4.13-tag2' of ↵Linus Walleij
git://git.kernel.org/pub/scm/linux/kernel/git/geert/renesas-drivers into devel pinctrl: sh-pfc: Updates for v4.13 (take two) - Add SCIF1 and SCIF2 pin groups for R-Car V2H, - Add EtherAVB, DU parallel RGB output, and PWM pin groups for R-Car H3 ES2.0, - Add pin and gpio controller support for RZ/A1.
2017-06-29pinctrl: generic: Add output-enable propertyJacopo Mondi
Add output-enable generic pin configuration property. This properties allows enabling/disabling pin's output capabilities without actually driving any value on the line. Acked-by: Rob Herring <robh@kernel.org> [Added inline elaborations on buffer enabling/disabling] Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2017-06-29Merge tag 'v4.12-rc7' into develLinus Walleij
Linux 4.12-rc7
2017-06-29ras: mark stub functions as 'inline'Arnd Bergmann
With CONFIG_RAS disabled, we get two harmless warnings about unused functions: include/linux/ras.h:37:13: error: 'log_arm_hw_error' defined but not used [-Werror=unused-function] static void log_arm_hw_error(struct cper_sec_proc_arm *err) { return; } include/linux/ras.h:33:13: error: 'log_non_standard_event' defined but not used [-Werror=unused-function] static void log_non_standard_event(const guid_t *sec_type, Clearly these are meant to be 'inline', like the other stubs in the same header. Fixes: 297b64c74385 ("ras: acpi / apei: generate trace event for unrecognized CPER section") Fixes: e9279e83ad1f ("trace, ras: add ARM processor error trace event") Acked-by: Borislav Petkov <bp@suse.de> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-06-28block: provide bio_uninit() free freeing integrity/task associationsJens Axboe
Wen reports significant memory leaks with DIF and O_DIRECT: "With nvme devive + T10 enabled, On a system it has 256GB and started logging /proc/meminfo & /proc/slabinfo for every minute and in an hour it increased by 15968128 kB or ~15+GB.. Approximately 256 MB / minute leaking. /proc/meminfo | grep SUnreclaim... SUnreclaim: 6752128 kB SUnreclaim: 6874880 kB SUnreclaim: 7238080 kB .... SUnreclaim: 22307264 kB SUnreclaim: 22485888 kB SUnreclaim: 22720256 kB When testcases with T10 enabled call into __blkdev_direct_IO_simple, code doesn't free memory allocated by bio_integrity_alloc. The patch fixes the issue. HTX has been run with +60 hours without failure." Since __blkdev_direct_IO_simple() allocates the bio on the stack, it doesn't go through the regular bio free. This means that any ancillary data allocated with the bio through the stack is not freed. Hence, we can leak the integrity data associated with the bio, if the device is using DIF/DIX. Fix this by providing a bio_uninit() and export it, so that we can use it to free this data. Note that this is a minimal fix for this issue. Any current user of bio's that are allocated outside of bio_alloc_bioset() suffers from this issue, most notably some drivers. We will fix those in a more comprehensive patch for 4.13. This also means that the commit marked as being fixed by this isn't the real culprit, it's just the most obvious one out there. Fixes: 542ff7bf18c6 ("block: new direct I/O implementation") Reported-by: Wen Xiong <wenxiong@linux.vnet.ibm.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-06-28blk-mq: Create hctx for each present CPUChristoph Hellwig
Currently we only create hctx for online CPUs, which can lead to a lot of churn due to frequent soft offline / online operations. Instead allocate one for each present CPU to avoid this and dramatically simplify the code. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jens Axboe <axboe@kernel.dk> Cc: Keith Busch <keith.busch@intel.com> Cc: linux-block@vger.kernel.org Cc: linux-nvme@lists.infradead.org Link: http://lkml.kernel.org/r/20170626102058.10200-3-hch@lst.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2017-06-28PCI: Make pci_register_host_bridge() PCI core internalLorenzo Pieralisi
With the introduction of pci_scan_root_bus_bridge() there is no need to export pci_register_host_bridge() to other kernel subsystems other than the PCI compilation unit that needs it. Make pci_register_host_bridge() static to its compilation unit and convert the existing drivers usage over to pci_scan_root_bus_bridge(). Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: Arnd Bergmann <arnd@arndb.de>
2017-06-28PCI: Add pci_scan_root_bus_bridge() interfaceLorenzo Pieralisi
The current pci_scan_root_bus() interface is made up of two main code paths: - pci_create_root_bus() - pci_scan_child_bus() pci_create_root_bus() is a wrapper function that allows to create a struct pci_host_bridge structure, initialize it with the passed parameters and register it with the kernel. As the struct pci_host_bridge require additional struct members, pci_create_root_bus() parameters list has grown in time, making it unwieldy to add further parameters to it in case the struct pci_host_bridge gains more members fields to augment its functionality. Since PCI core code provides functions to allocate struct pci_host_bridge, instead of forcing the pci_create_root_bus() interface to add new parameters to cater for new struct pci_host_bridge functionality, it is more suitable to add an interface in PCI core code to scan a PCI bus straight from a struct pci_host_bridge created and customized by each specific PCI host controller driver. Add a pci_scan_root_bus_bridge() function to allow PCI host controller drivers to create and initialize struct pci_host_bridge and scan the resulting bus. Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: Arnd Bergmann <arnd@arndb.de>
2017-06-28PCI: Add devm_pci_alloc_host_bridge() interfaceLorenzo Pieralisi
Struct pci_host_bridge can be allocated by PCI host bridge drivers which usually allocate and map memory through devm managed interfaces. Add a devm version for the pci_alloc_host_bridge() interface to simplify PCI host controller driver porting and simplify the driver failure paths. Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: Arnd Bergmann <arnd@arndb.de>
2017-06-28PCI: Add pci_free_host_bridge() interfaceLorenzo Pieralisi
Commit a52d1443bba1 ("PCI: Export host bridge registration interface") exported the pci_alloc_host_bridge() interface so that PCI host controllers drivers can make use of it. Introduce pci_alloc_host_bridge() kernel counterpart to free the host bridge data structures, pci_free_host_bridge(), export it and update kernel functions releasing host bridge objects allocated memory to make use of it. Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: Arnd Bergmann <arnd@arndb.de>
2017-06-28vfio: New external user group/file matchAlex Williamson
At the point where the kvm-vfio pseudo device wants to release its vfio group reference, we can't always acquire a new reference to make that happen. The group can be in a state where we wouldn't allow a new reference to be added. This new helper function allows a caller to match a file to a group to facilitate this. Given a file and group, report if they match. Thus the caller needs to already have a group reference to match to the file. This allows the deletion of a group without acquiring a new reference. Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Tested-by: Eric Auger <eric.auger@redhat.com> Cc: stable@vger.kernel.org
2017-06-28regmap: irq: add chip option mask_writeonlyMichael Grzeschik
Some irq controllers have writeonly/multipurpose register layouts. In those cases we read invalid data back. Here we add the option mask_writeonly as masking option. Signed-off-by: Michael Grzeschik <m.grzeschik@pengutronix.de> Signed-off-by: Mark Brown <broonie@kernel.org>
2017-06-28cgroup: implement "nsdelegate" mount optionTejun Heo
Currently, cgroup only supports delegation to !root users and cgroup namespaces don't get any special treatments. This limits the usefulness of cgroup namespaces as they by themselves can't be safe delegation boundaries. A process inside a cgroup can change the resource control knobs of the parent in the namespace root and may move processes in and out of the namespace if cgroups outside its namespace are visible somehow. This patch adds a new mount option "nsdelegate" which makes cgroup namespaces delegation boundaries. If set, cgroup behaves as if write permission based delegation took place at namespace boundaries - writes to the resource control knobs from the namespace root are denied and migration crossing the namespace boundary aren't allowed from inside the namespace. This allows cgroup namespace to function as a delegation boundary by itself. v2: Silently ignore nsdelegate specified on !init mounts. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Aravind Anbudurai <aru7@fb.com> Cc: Serge Hallyn <serge@hallyn.com> Cc: Eric Biederman <ebiederm@xmission.com>
2017-06-28svcrdma: Remove svc_rdma_marshal.cChuck Lever
svc_rdma_marshal.c has one remaining exported function -- svc_rdma_xdr_decode_req -- and it has a single call site. Take the same approach as the sendto path, and move this function into the source file where it is called. This is a refactoring change only. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2017-06-28ASoC: dwc: Added a quirk DW_I2S_QUIRK_16BIT_IDX_OVERRIDE to dwc driverVijendar Mukunda
Added quirk DW_I2S_QUIRK_16BIT_IDX_OVERRIDE to Designware driver. This quirk will set idx value to 1. By setting this quirk, it will override supported format as 16 bit resolution and bus width as 2 Bytes. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Vijendar Mukunda <Vijendar.Mukunda@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Mark Brown <broonie@kernel.org>
2017-06-28Merge tag 'v4.12-rc5' into nfsd treeJ. Bruce Fields
Update to get f0c3192ceee3 "virtio_net: lower limit on buffer size". That bug was interfering with my nfsd testing.
2017-06-28ASoC: rt5645: add inv_jd1_1 flagBard Liao
The flag will invert jd1_1 status. Which will be used if the jack connector is normal closed. Signed-off-by: Bard Liao <bardliao@realtek.com> Tested-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Mark Brown <broonie@kernel.org>
2017-06-28ASoC: rt5645: rename jd_invert flag in platform dataBard Liao
The jd_invert flag is actually used for level triggered IRQ. Rename it to let code more readable. Signed-off-by: Bard Liao <bardliao@realtek.com> Tested-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Mark Brown <broonie@kernel.org>
2017-06-28locking/refcount: Create unchecked atomic_t implementationKees Cook
Many subsystems will not use refcount_t unless there is a way to build the kernel so that there is no regression in speed compared to atomic_t. This adds CONFIG_REFCOUNT_FULL to enable the full refcount_t implementation which has the validation but is slightly slower. When not enabled, refcount_t uses the basic unchecked atomic_t routines, which results in no code changes compared to just using atomic_t directly. Signed-off-by: Kees Cook <keescook@chromium.org> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Christoph Hellwig <hch@infradead.org> Cc: David S. Miller <davem@davemloft.net> Cc: David Windsor <dwindsor@gmail.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Elena Reshetova <elena.reshetova@intel.com> Cc: Eric Biggers <ebiggers3@gmail.com> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Hans Liljestrand <ishkamiel@gmail.com> Cc: James Bottomley <James.Bottomley@hansenpartnership.com> Cc: Jann Horn <jannh@google.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Manfred Spraul <manfred@colorfullife.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Serge E. Hallyn <serge@hallyn.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: arozansk@redhat.com Cc: axboe@kernel.dk Cc: linux-arch <linux-arch@vger.kernel.org> Link: http://lkml.kernel.org/r/20170621200026.GA115679@beast Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-06-28nvme: use a single NVME_AQ_DEPTH and relax it to 32Sagi Grimberg
No need to differentiate fabrics from pci/loop, also lower it to 32 as we don't really need 256 inflight admin commands. Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Max Gurtovoy <maxg@mellanox.com> Signed-off-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-06-28dma-mapping: replace dmam_alloc_noncoherent with dmam_alloc_attrsChristoph Hellwig
dmam_alloc_noncoherent is a trivial wrapper around dmam_alloc_attrs, that hardcodes one particular flag. Make the devres code more flexible by allowing the callers to pass arbitrary flags. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Tejun Heo <tj@kernel.org>
2017-06-28dma-mapping: remove dmam_free_noncoherentChristoph Hellwig
This function was never used since it was added. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Tejun Heo <tj@kernel.org>
2017-06-28dma-mapping: remove the set_dma_mask methodChristoph Hellwig
Signed-off-by: Christoph Hellwig <hch@lst.de>
2017-06-28dma-mapping: remove HAVE_ARCH_DMA_SUPPORTEDChristoph Hellwig
Signed-off-by: Christoph Hellwig <hch@lst.de>
2017-06-28dma-mapping: remove DMA_ERROR_CODEChristoph Hellwig
And update the documentation - dma_mapping_error has been supported everywhere for a long time. Signed-off-by: Christoph Hellwig <hch@lst.de>
2017-06-28spin loop primitives for busy waitingNicholas Piggin
Current busy-wait loops are implemented by repeatedly calling cpu_relax() to give an arch option for a low-latency option to improve power and/or SMT resource contention. This poses some difficulties for powerpc, which has SMT priority setting instructions (priorities determine how ifetch cycles are apportioned). powerpc's cpu_relax() is implemented by setting a low priority then setting normal priority. This has several problems: - Changing thread priority can have some execution cost and potential impact to other threads in the core. It's inefficient to execute them every time around a busy-wait loop. - Depending on implementation details, a `low ; medium` sequence may not have much if any affect. Some software with similar pattern actually inserts a lot of nops between, in order to cause a few fetch cycles with the low priority. - The busy-wait loop runs with regular priority. This might only be a few fetch cycles, but if there are several threads running such loops, they could cause a noticable impact on a non-idle thread. Implement spin_begin, spin_end primitives that can be used around busy wait loops, which default to no-ops. And spin_cpu_relax which defaults to cpu_relax. This will allow architectures to hook the entry and exit of busy-wait loops, and will allow powerpc to set low SMT priority at entry, and normal priority at exit. Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-06-28Merge branches 'iommu/fixes', 'arm/rockchip', 'arm/renesas', 'arm/smmu', ↵Joerg Roedel
'arm/core', 'x86/vt-d', 'x86/amd', 's390' and 'core' into next
2017-06-27scsi: sas: scsi_queue_work can fail, so make callers awareJohannes Thumshirn
libsas uses scsi_queue_work() to queue its internal event notifications. scsi_queue_work() can return -EINVAL if the work queue doesn't exist and it does call queue_work() which can return false if the work is already queued. Make the SAS event code capable of returning errors up to the caller, which is handy when changing to dynamically allocated work in libsas as well, as discussed here: https://lkml.org/lkml/2017/6/14/121. [mkp: fixed typo] Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-06-28PM / core: Drop run_wake flag from struct dev_pm_infoRafael J. Wysocki
The run_wake flag in struct dev_pm_info is used to indicate whether or not the device is capable of generating remote wakeup signals at run time (or in the system working state), but the distinction between runtime remote wakeup and system wakeup signaling has always been rather artificial. The only practical reason for it to exist at the core level was that ACPI and PCI treated those two cases differently, but that's not the case any more after recent changes. For this reason, get rid of the run_wake flag and, when applicable, use device_set_wakeup_capable() and device_can_wakeup() instead of device_set_run_wake() and device_run_wake(), respectively. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com>
2017-06-28PCI / PM: Simplify device wakeup settings codeRafael J. Wysocki
After previous changes it is not necessary to distinguish between device wakeup for run time and device wakeup from system sleep states any more, so rework the PCI device wakeup settings code accordingly. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com>
2017-06-28PCI / PM: Drop pme_interrupt flag from struct pci_devRafael J. Wysocki
The pme_interrupt flag in struct pci_dev is set when PMEs generated by the device are going to be signaled via root port PME interrupts. Ironically enough, that information is only used by the code setting up device wakeup through ACPI which returns as soon as it sees the pme_interrupt flag set while setting up "remote runtime wakeup". That is questionable, however, because in theory there may be PCIe devices using out-of-band PME signaling under root ports handled by the native PME code or devices requiring wakeup power setup to be carried out by AML. For such devices, ACPI wakeup should be invoked regardless of whether or not native PME signaling is used in general. For this reason, drop the pme_interrupt flag and rework the code using it which then allows the ACPI-based device wakeup handling in PCI to be consolidated to use one code path for both "runtime remote wakeup" and system wakeup (from sleep states). Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com>
2017-06-28ACPI / PM: Consolidate device wakeup settings codeRafael J. Wysocki
Currently, there are two separate ways of handling device wakeup settings in the ACPI core, depending on whether this is runtime wakeup or system wakeup (from sleep states). However, after the previous commit eliminating the run_wake ACPI device wakeup flag, there is no difference between the two any more at the ACPI level, so they can be combined. For this reason, introduce acpi_pm_set_device_wakeup() to replace both acpi_pm_device_run_wake() and acpi_pm_device_sleep_wake() and make it check the ACPI device object's wakeup.valid flag to determine whether or not the device can be set up to generate wakeup signals. Also notice that zpodd_enable/disable_run_wake() only call device_set_run_wake() because acpi_pm_device_run_wake() called device_run_wake(), which is not done by acpi_pm_set_device_wakeup(), so drop the now redundant device_set_run_wake() calls from there. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com>
2017-06-28ACPI / PM: Drop run_wake from struct acpi_device_wakeup_flagsRafael J. Wysocki
The run_wake flag in struct acpi_device_wakeup_flags stores the information on whether or not the device can generate wakeup signals at run time, but in ACPI that really is equivalent to being able to generate wakeup signals at all. In fact, run_wake will always be set after successful executeion of acpi_setup_gpe_for_wake(), but if that fails, the device will not be able to use a wakeup GPE at all, so it won't be able to wake up the systems from sleep states too. Hence, run_wake actually means that the device is capable of triggering wakeup and so it is equivalent to the valid flag. For this reason, drop run_wake from struct acpi_device_wakeup_flags and make sure that the valid flag is only set if acpi_setup_gpe_for_wake() has been successful. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com>
2017-06-27dax: remove default copy_from_iter fallbackDan Williams
Require all dax-drivers to register a ->copy_from_iter() operation so that it is clear which dax_operations are optional and which must be implemented for filesystem-dax to operate. Cc: Gerald Schaefer <gerald.schaefer@de.ibm.com> Suggested-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2017-06-27x86, libnvdimm, pmem: remove global pmem apiDan Williams
Now that all callers of the pmem api have been converted to dax helpers that call back to the pmem driver, we can remove include/linux/pmem.h and asm/pmem.h. Cc: <x86@kernel.org> Cc: Jeff Moyer <jmoyer@redhat.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Toshi Kani <toshi.kani@hpe.com> Cc: Oliver O'Halloran <oohall@gmail.com> Cc: Ross Zwisler <ross.zwisler@linux.intel.com> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2017-06-27x86, libnvdimm, pmem: move arch_invalidate_pmem() to libnvdimmDan Williams
Kill this globally defined wrapper and move to libnvdimm so that we can ultimately remove include/linux/pmem.h and asm/pmem.h. Cc: <x86@kernel.org> Cc: Jeff Moyer <jmoyer@redhat.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Christoph Hellwig <hch@lst.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Matthew Wilcox <mawilcox@microsoft.com> Cc: Ross Zwisler <ross.zwisler@linux.intel.com> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2017-06-27switchtec: Add "running" status flag to fw partition info ioctlLogan Gunthorpe
This flag lets userspace know which firmware partitions are currently in use as opposed to just active. "Active" means they will be in use for the next reboot, whereas "running" means they are currently in use. If an old kernel is in use, or the firmware doesn't support these fields, the new flag will not be set in the output. Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Kurt Schwemmer <kurt.schwemmer@microsemi.com>
2017-06-27ACPICA: Use designated initializersKees Cook
The struct layout randomization plugin detects and randomizes any structs that contain only function pointers. Once layout is randomized, all initialization must be designated or the compiler will misalign the assignments. This switches all the ACPICA function pointer struct to use designated initializers, using the proposed upstream ACPICA macro: https://github.com/acpica/acpica/pull/248/ Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-06-27Merge back ACPICA material for v4.13.Rafael J. Wysocki