summaryrefslogtreecommitdiff
path: root/include/linux
AgeCommit message (Collapse)Author
2017-07-01net: convert nf_bridge_info.use from atomic_t to refcount_tReshetova, Elena
refcount_t type and corresponding API should be used instead of atomic_t when the variable is used as a reference counter. This allows to avoid accidental refcounter overflows that might lead to use-after-free situations. Signed-off-by: Elena Reshetova <elena.reshetova@intel.com> Signed-off-by: Hans Liljestrand <ishkamiel@gmail.com> Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: David Windsor <dwindsor@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-30hashtable: remove repeated phrase from a commentJakub Kicinski
"in a rcu enabled hashtable" is repeated twice in a comment. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-06-30task_struct: Allow randomized layoutKees Cook
This marks most of the layout of task_struct as randomizable, but leaves thread_info and scheduler state untouched at the start, and thread_struct untouched at the end. Other parts of the kernel use unnamed structures, but the 0-day builder using gcc-4.4 blows up on static initializers. Officially, it's documented as only working on gcc 4.6 and later, which further confuses me: https://gcc.gnu.org/wiki/C11Status The structure layout randomization already requires gcc 4.7, but instead of depending on the plugin being enabled, just check the gcc versions for wider build testing. At Linus's suggestion, the marking is hidden in a macro to reduce how ugly it looks. Additionally, indenting is left unchanged since it would make things harder to read. Randomization of task_struct is modified from Brad Spengler/PaX Team's code in the last public patch of grsecurity/PaX based on my understanding of the code. Changes or omissions from the original code are mine and don't reflect the original grsecurity/PaX code. Cc: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Kees Cook <keescook@chromium.org>
2017-06-30randstruct: Mark various structs for randomizationKees Cook
This marks many critical kernel structures for randomization. These are structures that have been targeted in the past in security exploits, or contain functions pointers, pointers to function pointer tables, lists, workqueues, ref-counters, credentials, permissions, or are otherwise sensitive. This initial list was extracted from Brad Spengler/PaX Team's code in the last public patch of grsecurity/PaX based on my understanding of the code. Changes or omissions from the original code are mine and don't reflect the original grsecurity/PaX code. Left out of this list is task_struct, which requires special handling and will be covered in a subsequent patch. Signed-off-by: Kees Cook <keescook@chromium.org>
2017-06-30Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
A set of overlapping changes in macvlan and the rocker driver, nothing serious. Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-30Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-nextDavid S. Miller
Pablo Neira Ayuso says: ==================== Netfilter updates for net-next The following patchset contains Netfilter updates for your net-next tree. This batch contains connection tracking updates for the cleanup iteration path, patches from Florian Westphal: X) Skip unconfirmed conntracks in nf_ct_iterate_cleanup_net(), just set dying bit to let the CPU release them. X) Add nf_ct_iterate_destroy() to be used on module removal, to kill conntrack from all namespace. X) Restart iteration on hashtable resizing, since both may occur at the same time. X) Use the new nf_ct_iterate_destroy() to remove conntrack with NAT mapping on module removal. X) Use nf_ct_iterate_destroy() to remove conntrack entries helper module removal, from Liping Zhang. X) Use nf_ct_iterate_cleanup_net() to remove the timeout extension if user requests this, also from Liping. X) Add net_ns_barrier() and use it from FTP helper, so make sure no concurrent namespace removal happens at the same time while the helper module is being removed. X) Use NFPROTO_MAX in layer 3 conntrack protocol array, to reduce module size. Same thing in nf_tables. Updates for the nf_tables infrastructure: X) Prepare usage of the extended ACK reporting infrastructure for nf_tables. X) Remove unnecessary forward declaration in nf_tables hash set. X) Skip set size estimation if number of element is not specified. X) Changes to accomodate a (faster) unresizable hash set implementation, for anonymous sets and dynamic size fixed sets with no timeouts. X) Faster lookup function for unresizable hash table for 2 and 4 bytes key. And, finally, a bunch of asorted small updates and cleanups: X) Do not hold reference to netdev from ipt_CLUSTER, instead subscribe to device events and look up for index from the packet path, this is fixing an issue that is present since the very beginning, patch from Xin Long. X) Use nf_register_net_hook() in ipt_CLUSTER, from Florian Westphal. X) Use ebt_invalid_target() whenever possible in the ebtables tree, from Gao Feng. X) Calm down compilation warning in nf_dup infrastructure, patch from stephen hemminger. X) Statify functions in nftables rt expression, also from stephen. X) Update Makefile to use canonical method to specify nf_tables-objs. From Jike Song. X) Use nf_conntrack_helpers_register() in amanda and H323. X) Space cleanup for ctnetlink, from linzhang. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-30Merge tag 'kvmarm-for-4.13' of ↵Paolo Bonzini
git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/ARM updates for 4.13 - vcpu request overhaul - allow timer and PMU to have their interrupt number selected from userspace - workaround for Cavium erratum 30115 - handling of memory poisonning - the usual crop of fixes and cleanups Conflicts: arch/s390/include/asm/kvm_host.h
2017-06-30nanosleep: Use get_timespec64() and put_timespec64()Deepa Dinamani
Usage of these apis and their compat versions makes the syscalls: clock_nanosleep and nanosleep and their compat implementations simpler. This is a preparatory patch to isolate data conversions to struct timespec64 at userspace boundaries. This helps contain the changes needed to transition to new y2038 safe types. Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2017-06-30ieee80211: update public action codesPeter Oh
Update Public Action field values as updated in IEEE Std 802.11-2016, so that modules/drivers can refer it. Signed-off-by: Peter Oh <peter.oh@bowerswilkins.com> Reviewed-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2017-06-29iov_iter/hardening: move object size checks to inlined partAl Viro
There we actually have useful information about object sizes. Note: this patch has them done for all iov_iter flavours. Right now we do them twice in iovec case, but that'll change very shortly. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2017-06-29copy_{to,from}_user(): consolidate object size checksAl Viro
... and move them into thread_info.h, next to check_object_size() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2017-06-29copy_{from,to}_user(): move kasan checks and might_fault() out-of-lineAl Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2017-06-29fs: implement vfs_iter_write using do_iter_writeChristoph Hellwig
De-dupliate some code and allow for passing the flags argument to vfs_iter_write. Additionally it now properly updates timestamps. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2017-06-29fs: implement vfs_iter_read using do_iter_readChristoph Hellwig
De-dupliate some code and allow for passing the flags argument to vfs_iter_read. Additional it properly updates atime now. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2017-06-29libnvdimm, btt: BTT updates for UEFI 2.7 formatVishal Verma
The UEFI 2.7 specification defines an updated BTT metadata format, bumping the revision to 2.0. Add support for the new format, while retaining compatibility for the old 1.1 format. Cc: Toshi Kani <toshi.kani@hpe.com> Cc: Linda Knippers <linda.knippers@hpe.com> Cc: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Vishal Verma <vishal.l.verma@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2017-06-29bpf: Add syscall lookup support for fd array and htabMartin KaFai Lau
This patch allows userspace to do BPF_MAP_LOOKUP_ELEM on BPF_MAP_TYPE_PROG_ARRAY, BPF_MAP_TYPE_ARRAY_OF_MAPS and BPF_MAP_TYPE_HASH_OF_MAPS. The lookup returns a prog-id or map-id to the userspace. The userspace can then use the BPF_PROG_GET_FD_BY_ID or BPF_MAP_GET_FD_BY_ID to get a fd. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-29libnvdimm, pmem: disable dax flushing when pmem is fronting a volatile regionDan Williams
The pmem driver attaches to both persistent and volatile memory ranges advertised by the ACPI NFIT. When the region is volatile it is redundant to spend cycles flushing caches at fsync(). Check if the hosting region is volatile and do not set dax_write_cache() if it is. Cc: Jan Kara <jack@suse.cz> Cc: Jeff Moyer <jmoyer@redhat.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Matthew Wilcox <mawilcox@microsoft.com> Cc: Ross Zwisler <ross.zwisler@linux.intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2017-06-29libnvdimm, pmem, dax: export a cache control attributeDan Williams
The dax_flush() operation can be turned into a nop on platforms where firmware arranges for cpu caches to be flushed on a power-fail event. The ACPI 6.2 specification defines a mechanism for the platform to indicate this capability so the kernel can select the proper default. However, for other platforms, the administrator must toggle this setting manually. Given this flush setting is a dax-specific mechanism we advertise it through a 'dax' attribute group hanging off a host device. For example, a 'pmem0' block-device gets a 'dax' sysfs-subdirectory with a 'write_cache' attribute to control response to dax cache flush requests. This is similar to the 'queue/write_cache' attribute that appears under block devices. Cc: Jan Kara <jack@suse.cz> Cc: Jeff Moyer <jmoyer@redhat.com> Cc: Matthew Wilcox <mawilcox@microsoft.com> Cc: Ross Zwisler <ross.zwisler@linux.intel.com> Suggested-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2017-06-29Merge tag 'actions-drivers-for-4.13' of ↵Arnd Bergmann
git://git.kernel.org/pub/scm/linux/kernel/git/afaerber/linux-actions into next/drivers Pull "Actions Semi SoC drivers for 4.13" from Andreas Färber: This adds clock source and power domain drivers for S500/S900. * tag 'actions-drivers-for-4.13' of git://git.kernel.org/pub/scm/linux/kernel/git/afaerber/linux-actions: soc: actions: owl-sps: Factor out owl_sps_set_pg() for power-gating soc: actions: Add Owl SPS dt-bindings: power: Add Owl SPS power domains clocksource: owl: Add S900 support clocksource: Add Owl timer
2017-06-29Merge tag 'actions-arm-soc+sps-for-4.13' of ↵Arnd Bergmann
git://git.kernel.org/pub/scm/linux/kernel/git/afaerber/linux-actions into next/soc Pull "Actions Semi ARM SoC for v4.13 #2" from Andreas Färber: This adds SMP code to bring up the remaining S500 CPU cores by reusing a helper factored out of the SPS power domains driver. * tag 'actions-arm-soc+sps-for-4.13' of git://git.kernel.org/pub/scm/linux/kernel/git/afaerber/linux-actions: ARM: owl: smp: Implement SPS power-gating for CPU2 and CPU3 soc: actions: owl-sps: Factor out owl_sps_set_pg() for power-gating soc: actions: Add Owl SPS dt-bindings: power: Add Owl SPS power domains
2017-06-29pinctrl: generic: Add output-enable propertyJacopo Mondi
Add output-enable generic pin configuration property. This properties allows enabling/disabling pin's output capabilities without actually driving any value on the line. Acked-by: Rob Herring <robh@kernel.org> [Added inline elaborations on buffer enabling/disabling] Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2017-06-29Merge tag 'v4.12-rc7' into develLinus Walleij
Linux 4.12-rc7
2017-06-29ras: mark stub functions as 'inline'Arnd Bergmann
With CONFIG_RAS disabled, we get two harmless warnings about unused functions: include/linux/ras.h:37:13: error: 'log_arm_hw_error' defined but not used [-Werror=unused-function] static void log_arm_hw_error(struct cper_sec_proc_arm *err) { return; } include/linux/ras.h:33:13: error: 'log_non_standard_event' defined but not used [-Werror=unused-function] static void log_non_standard_event(const guid_t *sec_type, Clearly these are meant to be 'inline', like the other stubs in the same header. Fixes: 297b64c74385 ("ras: acpi / apei: generate trace event for unrecognized CPER section") Fixes: e9279e83ad1f ("trace, ras: add ARM processor error trace event") Acked-by: Borislav Petkov <bp@suse.de> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-06-28block: provide bio_uninit() free freeing integrity/task associationsJens Axboe
Wen reports significant memory leaks with DIF and O_DIRECT: "With nvme devive + T10 enabled, On a system it has 256GB and started logging /proc/meminfo & /proc/slabinfo for every minute and in an hour it increased by 15968128 kB or ~15+GB.. Approximately 256 MB / minute leaking. /proc/meminfo | grep SUnreclaim... SUnreclaim: 6752128 kB SUnreclaim: 6874880 kB SUnreclaim: 7238080 kB .... SUnreclaim: 22307264 kB SUnreclaim: 22485888 kB SUnreclaim: 22720256 kB When testcases with T10 enabled call into __blkdev_direct_IO_simple, code doesn't free memory allocated by bio_integrity_alloc. The patch fixes the issue. HTX has been run with +60 hours without failure." Since __blkdev_direct_IO_simple() allocates the bio on the stack, it doesn't go through the regular bio free. This means that any ancillary data allocated with the bio through the stack is not freed. Hence, we can leak the integrity data associated with the bio, if the device is using DIF/DIX. Fix this by providing a bio_uninit() and export it, so that we can use it to free this data. Note that this is a minimal fix for this issue. Any current user of bio's that are allocated outside of bio_alloc_bioset() suffers from this issue, most notably some drivers. We will fix those in a more comprehensive patch for 4.13. This also means that the commit marked as being fixed by this isn't the real culprit, it's just the most obvious one out there. Fixes: 542ff7bf18c6 ("block: new direct I/O implementation") Reported-by: Wen Xiong <wenxiong@linux.vnet.ibm.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-06-28blk-mq: Create hctx for each present CPUChristoph Hellwig
Currently we only create hctx for online CPUs, which can lead to a lot of churn due to frequent soft offline / online operations. Instead allocate one for each present CPU to avoid this and dramatically simplify the code. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jens Axboe <axboe@kernel.dk> Cc: Keith Busch <keith.busch@intel.com> Cc: linux-block@vger.kernel.org Cc: linux-nvme@lists.infradead.org Link: http://lkml.kernel.org/r/20170626102058.10200-3-hch@lst.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2017-06-28PCI: Make pci_register_host_bridge() PCI core internalLorenzo Pieralisi
With the introduction of pci_scan_root_bus_bridge() there is no need to export pci_register_host_bridge() to other kernel subsystems other than the PCI compilation unit that needs it. Make pci_register_host_bridge() static to its compilation unit and convert the existing drivers usage over to pci_scan_root_bus_bridge(). Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: Arnd Bergmann <arnd@arndb.de>
2017-06-28PCI: Add pci_scan_root_bus_bridge() interfaceLorenzo Pieralisi
The current pci_scan_root_bus() interface is made up of two main code paths: - pci_create_root_bus() - pci_scan_child_bus() pci_create_root_bus() is a wrapper function that allows to create a struct pci_host_bridge structure, initialize it with the passed parameters and register it with the kernel. As the struct pci_host_bridge require additional struct members, pci_create_root_bus() parameters list has grown in time, making it unwieldy to add further parameters to it in case the struct pci_host_bridge gains more members fields to augment its functionality. Since PCI core code provides functions to allocate struct pci_host_bridge, instead of forcing the pci_create_root_bus() interface to add new parameters to cater for new struct pci_host_bridge functionality, it is more suitable to add an interface in PCI core code to scan a PCI bus straight from a struct pci_host_bridge created and customized by each specific PCI host controller driver. Add a pci_scan_root_bus_bridge() function to allow PCI host controller drivers to create and initialize struct pci_host_bridge and scan the resulting bus. Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: Arnd Bergmann <arnd@arndb.de>
2017-06-28PCI: Add devm_pci_alloc_host_bridge() interfaceLorenzo Pieralisi
Struct pci_host_bridge can be allocated by PCI host bridge drivers which usually allocate and map memory through devm managed interfaces. Add a devm version for the pci_alloc_host_bridge() interface to simplify PCI host controller driver porting and simplify the driver failure paths. Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: Arnd Bergmann <arnd@arndb.de>
2017-06-28PCI: Add pci_free_host_bridge() interfaceLorenzo Pieralisi
Commit a52d1443bba1 ("PCI: Export host bridge registration interface") exported the pci_alloc_host_bridge() interface so that PCI host controllers drivers can make use of it. Introduce pci_alloc_host_bridge() kernel counterpart to free the host bridge data structures, pci_free_host_bridge(), export it and update kernel functions releasing host bridge objects allocated memory to make use of it. Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: Arnd Bergmann <arnd@arndb.de>
2017-06-28vfio: New external user group/file matchAlex Williamson
At the point where the kvm-vfio pseudo device wants to release its vfio group reference, we can't always acquire a new reference to make that happen. The group can be in a state where we wouldn't allow a new reference to be added. This new helper function allows a caller to match a file to a group to facilitate this. Given a file and group, report if they match. Thus the caller needs to already have a group reference to match to the file. This allows the deletion of a group without acquiring a new reference. Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Tested-by: Eric Auger <eric.auger@redhat.com> Cc: stable@vger.kernel.org
2017-06-28regmap: irq: add chip option mask_writeonlyMichael Grzeschik
Some irq controllers have writeonly/multipurpose register layouts. In those cases we read invalid data back. Here we add the option mask_writeonly as masking option. Signed-off-by: Michael Grzeschik <m.grzeschik@pengutronix.de> Signed-off-by: Mark Brown <broonie@kernel.org>
2017-06-28cgroup: implement "nsdelegate" mount optionTejun Heo
Currently, cgroup only supports delegation to !root users and cgroup namespaces don't get any special treatments. This limits the usefulness of cgroup namespaces as they by themselves can't be safe delegation boundaries. A process inside a cgroup can change the resource control knobs of the parent in the namespace root and may move processes in and out of the namespace if cgroups outside its namespace are visible somehow. This patch adds a new mount option "nsdelegate" which makes cgroup namespaces delegation boundaries. If set, cgroup behaves as if write permission based delegation took place at namespace boundaries - writes to the resource control knobs from the namespace root are denied and migration crossing the namespace boundary aren't allowed from inside the namespace. This allows cgroup namespace to function as a delegation boundary by itself. v2: Silently ignore nsdelegate specified on !init mounts. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Aravind Anbudurai <aru7@fb.com> Cc: Serge Hallyn <serge@hallyn.com> Cc: Eric Biederman <ebiederm@xmission.com>
2017-06-28svcrdma: Remove svc_rdma_marshal.cChuck Lever
svc_rdma_marshal.c has one remaining exported function -- svc_rdma_xdr_decode_req -- and it has a single call site. Take the same approach as the sendto path, and move this function into the source file where it is called. This is a refactoring change only. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2017-06-28Merge tag 'v4.12-rc5' into nfsd treeJ. Bruce Fields
Update to get f0c3192ceee3 "virtio_net: lower limit on buffer size". That bug was interfering with my nfsd testing.
2017-06-28locking/refcount: Create unchecked atomic_t implementationKees Cook
Many subsystems will not use refcount_t unless there is a way to build the kernel so that there is no regression in speed compared to atomic_t. This adds CONFIG_REFCOUNT_FULL to enable the full refcount_t implementation which has the validation but is slightly slower. When not enabled, refcount_t uses the basic unchecked atomic_t routines, which results in no code changes compared to just using atomic_t directly. Signed-off-by: Kees Cook <keescook@chromium.org> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Christoph Hellwig <hch@infradead.org> Cc: David S. Miller <davem@davemloft.net> Cc: David Windsor <dwindsor@gmail.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Elena Reshetova <elena.reshetova@intel.com> Cc: Eric Biggers <ebiggers3@gmail.com> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Hans Liljestrand <ishkamiel@gmail.com> Cc: James Bottomley <James.Bottomley@hansenpartnership.com> Cc: Jann Horn <jannh@google.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Manfred Spraul <manfred@colorfullife.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Serge E. Hallyn <serge@hallyn.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: arozansk@redhat.com Cc: axboe@kernel.dk Cc: linux-arch <linux-arch@vger.kernel.org> Link: http://lkml.kernel.org/r/20170621200026.GA115679@beast Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-06-28nvme: use a single NVME_AQ_DEPTH and relax it to 32Sagi Grimberg
No need to differentiate fabrics from pci/loop, also lower it to 32 as we don't really need 256 inflight admin commands. Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Max Gurtovoy <maxg@mellanox.com> Signed-off-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-06-28dma-mapping: replace dmam_alloc_noncoherent with dmam_alloc_attrsChristoph Hellwig
dmam_alloc_noncoherent is a trivial wrapper around dmam_alloc_attrs, that hardcodes one particular flag. Make the devres code more flexible by allowing the callers to pass arbitrary flags. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Tejun Heo <tj@kernel.org>
2017-06-28dma-mapping: remove dmam_free_noncoherentChristoph Hellwig
This function was never used since it was added. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Tejun Heo <tj@kernel.org>
2017-06-28dma-mapping: remove the set_dma_mask methodChristoph Hellwig
Signed-off-by: Christoph Hellwig <hch@lst.de>
2017-06-28dma-mapping: remove HAVE_ARCH_DMA_SUPPORTEDChristoph Hellwig
Signed-off-by: Christoph Hellwig <hch@lst.de>
2017-06-28dma-mapping: remove DMA_ERROR_CODEChristoph Hellwig
And update the documentation - dma_mapping_error has been supported everywhere for a long time. Signed-off-by: Christoph Hellwig <hch@lst.de>
2017-06-28spin loop primitives for busy waitingNicholas Piggin
Current busy-wait loops are implemented by repeatedly calling cpu_relax() to give an arch option for a low-latency option to improve power and/or SMT resource contention. This poses some difficulties for powerpc, which has SMT priority setting instructions (priorities determine how ifetch cycles are apportioned). powerpc's cpu_relax() is implemented by setting a low priority then setting normal priority. This has several problems: - Changing thread priority can have some execution cost and potential impact to other threads in the core. It's inefficient to execute them every time around a busy-wait loop. - Depending on implementation details, a `low ; medium` sequence may not have much if any affect. Some software with similar pattern actually inserts a lot of nops between, in order to cause a few fetch cycles with the low priority. - The busy-wait loop runs with regular priority. This might only be a few fetch cycles, but if there are several threads running such loops, they could cause a noticable impact on a non-idle thread. Implement spin_begin, spin_end primitives that can be used around busy wait loops, which default to no-ops. And spin_cpu_relax which defaults to cpu_relax. This will allow architectures to hook the entry and exit of busy-wait loops, and will allow powerpc to set low SMT priority at entry, and normal priority at exit. Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-06-28Merge branches 'iommu/fixes', 'arm/rockchip', 'arm/renesas', 'arm/smmu', ↵Joerg Roedel
'arm/core', 'x86/vt-d', 'x86/amd', 's390' and 'core' into next
2017-06-28PM / core: Drop run_wake flag from struct dev_pm_infoRafael J. Wysocki
The run_wake flag in struct dev_pm_info is used to indicate whether or not the device is capable of generating remote wakeup signals at run time (or in the system working state), but the distinction between runtime remote wakeup and system wakeup signaling has always been rather artificial. The only practical reason for it to exist at the core level was that ACPI and PCI treated those two cases differently, but that's not the case any more after recent changes. For this reason, get rid of the run_wake flag and, when applicable, use device_set_wakeup_capable() and device_can_wakeup() instead of device_set_run_wake() and device_run_wake(), respectively. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com>
2017-06-28PCI / PM: Simplify device wakeup settings codeRafael J. Wysocki
After previous changes it is not necessary to distinguish between device wakeup for run time and device wakeup from system sleep states any more, so rework the PCI device wakeup settings code accordingly. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com>
2017-06-28PCI / PM: Drop pme_interrupt flag from struct pci_devRafael J. Wysocki
The pme_interrupt flag in struct pci_dev is set when PMEs generated by the device are going to be signaled via root port PME interrupts. Ironically enough, that information is only used by the code setting up device wakeup through ACPI which returns as soon as it sees the pme_interrupt flag set while setting up "remote runtime wakeup". That is questionable, however, because in theory there may be PCIe devices using out-of-band PME signaling under root ports handled by the native PME code or devices requiring wakeup power setup to be carried out by AML. For such devices, ACPI wakeup should be invoked regardless of whether or not native PME signaling is used in general. For this reason, drop the pme_interrupt flag and rework the code using it which then allows the ACPI-based device wakeup handling in PCI to be consolidated to use one code path for both "runtime remote wakeup" and system wakeup (from sleep states). Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com>
2017-06-27dax: remove default copy_from_iter fallbackDan Williams
Require all dax-drivers to register a ->copy_from_iter() operation so that it is clear which dax_operations are optional and which must be implemented for filesystem-dax to operate. Cc: Gerald Schaefer <gerald.schaefer@de.ibm.com> Suggested-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2017-06-27x86, libnvdimm, pmem: remove global pmem apiDan Williams
Now that all callers of the pmem api have been converted to dax helpers that call back to the pmem driver, we can remove include/linux/pmem.h and asm/pmem.h. Cc: <x86@kernel.org> Cc: Jeff Moyer <jmoyer@redhat.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Toshi Kani <toshi.kani@hpe.com> Cc: Oliver O'Halloran <oohall@gmail.com> Cc: Ross Zwisler <ross.zwisler@linux.intel.com> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2017-06-27x86, libnvdimm, pmem: move arch_invalidate_pmem() to libnvdimmDan Williams
Kill this globally defined wrapper and move to libnvdimm so that we can ultimately remove include/linux/pmem.h and asm/pmem.h. Cc: <x86@kernel.org> Cc: Jeff Moyer <jmoyer@redhat.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Christoph Hellwig <hch@lst.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Matthew Wilcox <mawilcox@microsoft.com> Cc: Ross Zwisler <ross.zwisler@linux.intel.com> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2017-06-27block: remove the queue_bounce_pfn helperChristoph Hellwig
Only used inside the bounce code, and opencoding it makes it more obvious what is going on. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>