summaryrefslogtreecommitdiff
path: root/include
AgeCommit message (Collapse)Author
2016-02-20regmap: merge regmap_field_update_bits() into macroKuninori Morimoto
This patch merges regmap_field_update_bits() into macro by using regmap_field_update_bits_base(). Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Signed-off-by: Mark Brown <broonie@kernel.org>
2016-02-20regmap: merge regmap_field_write() into macroKuninori Morimoto
This patch merges regmap_field_write() into macro by using regmap_field_update_bits_base(). Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Signed-off-by: Mark Brown <broonie@kernel.org>
2016-02-20regmap: add regmap_field_update_bits_base()Kuninori Morimoto
This patch adds new regmap_field_update_bits_base() which is using regmap_update_bits_base(). Current regmap_field_xxx() can be merged into it by macro. Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Signed-off-by: Mark Brown <broonie@kernel.org>
2016-02-20regmap: merge regmap_update_bits_check_async() into macroKuninori Morimoto
Current regmap has many similar update functions like below, but the difference is very few. regmap_update_bits() regmap_update_bits_async() regmap_update_bits_check() regmap_update_bits_check_async() Furthermore, we can add *force* write option too in the future. This patch merges regmap_update_bits_check_async() into macro by using regmap_update_bits_base(). Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Signed-off-by: Mark Brown <broonie@kernel.org>
2016-02-20regmap: merge regmap_update_bits_check() into macroKuninori Morimoto
Current regmap has many similar update functions like below, but the difference is very few. regmap_update_bits() regmap_update_bits_async() regmap_update_bits_check() regmap_update_bits_check_async() Furthermore, we can add *force* write option too in the future. This patch merges regmap_update_bits_check() into macro by using regmap_update_bits_base(). Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Signed-off-by: Mark Brown <broonie@kernel.org>
2016-02-20regmap: merge regmap_update_bits_async() into macroKuninori Morimoto
Current regmap has many similar update functions like below, but the difference is very few. regmap_update_bits() regmap_update_bits_async() regmap_update_bits_check() regmap_update_bits_check_async() Furthermore, we can add *force* write option too in the future. This patch merges regmap_update_bits_async() into macro by using regmap_update_bits_base(). Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Signed-off-by: Mark Brown <broonie@kernel.org>
2016-02-20regmap: merge regmap_update_bits() into macroKuninori Morimoto
Current regmap has many similar update functions like below, but the difference is very few. regmap_update_bits() regmap_update_bits_async() regmap_update_bits_check() regmap_update_bits_check_async() Furthermore, we can add *force* write option too in the future. This patch merges regmap_update_bits() into macro by using regmap_update_bits_base(). Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Signed-off-by: Mark Brown <broonie@kernel.org>
2016-02-20regmap: add regmap_update_bits_base()Kuninori Morimoto
Current regmap has many similar update functions like below, but the difference is very few. regmap_update_bits() regmap_update_bits_async() regmap_update_bits_check() regmap_update_bits_check_async() Furthermore, we can add *force* write option too in the future. This patch adds new regmap_update_bits_base() which is feature merged function. Above functions can be merged into it by macro. Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Signed-off-by: Mark Brown <broonie@kernel.org>
2016-02-19block: Add blk_set_runtime_active()Mika Westerberg
If block device is left runtime suspended during system suspend, resume hook of the driver typically corrects runtime PM status of the device back to "active" after it is resumed. However, this is not enough as queue's runtime PM status is still "suspended". As long as it is in this state blk_pm_peek_request() returns NULL and thus prevents new requests to be processed. Add new function blk_set_runtime_active() that can be used to force the queue status back to "active" as needed. Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com> Acked-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Tejun Heo <tj@kernel.org>
2016-02-19[media] v4l: vsp1: Implement atomic update for the DRM driverLaurent Pinchart
Add two API functions named vsp1_du_atomic_begin() and vsp1_du_atomic_flush() to signal the start and end of an atomic update. The vsp1_du_setup_rpf() function is renamed to vsp1_du_atomic_update() for consistency. With this new API, the driver will reprogram all modified inputs atomically before restarting the video stream. Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
2016-02-19[media] v4l: vsp1: Add VSP+DU supportLaurent Pinchart
Implement internal control of the VSP pipeline to be used by the DU DRM/KMS driver when using the VSP as an internal composer handled through DRM/KMS only. Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
2016-02-19[media] v4l: Add YUV 4:2:2 and YUV 4:4:4 tri-planar non-contiguous formatsLaurent Pinchart
The formats use three planes through the multiplanar API, allowing for non-contiguous planes in memory. Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com> Acked-by: Hans Verkuil <hans.verkuil@cisco.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
2016-02-19[media] v4l: add V4L2_CID_MPEG_VIDEO_FORCE_KEY_FRAMEWu-Cheng Li
Some drivers also need a control like V4L2_CID_MPEG_MFC51_VIDEO_FORCE_FRAME_TYPE to force an encoder key frame. Add a general V4L2_CID_MPEG_VIDEO_FORCE_KEY_FRAME so the new drivers and applications can use it. Signed-off-by: Wu-Cheng Li <wuchengli@chromium.org> Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
2016-02-19pinctrl: mediatek: Modify pinctrl bindings for mt7623John Crispin
Signed-off-by: John Crispin <blogic@openwrt.org> Cc: devicetree@vger.kernel.org Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2016-02-19gpio: add userspace ABI for GPIO line informationLinus Walleij
This adds a GPIO line ABI for getting name, label and a few select flags from the kernel. This hides the kernel internals and only tells userspace what it may need to know: the different in-kernel consumers are masked behind the flag "kernel" and that is all userspace needs to know. However electric characteristics like active low, open drain etc are reflected to userspace, as this is important information. We provide information on all lines on all chips, later on we will likely add a flag for the chardev consumer so we can filter and display only the lines userspace actually uses in e.g. lsgpio, but then we first need an ABI for userspace to grab and use (get/set/select direction) a GPIO line. Sample output from "lsgpio" on ux500: GPIO chip: gpiochip7, "8011e000.gpio", 32 GPIO lines line 0: unnamed unlabeled line 1: unnamed unlabeled (...) line 25: unnamed "SFH7741 Proximity Sensor" [kernel output open-drain] line 26: unnamed unlabeled (...) Tested-by: Michael Welling <mwelling@ieee.org> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2016-02-19gpio: store reflect the label to userspaceLinus Walleij
The gpio_chip label is useful for userspace to understand what kind of GPIO chip it is dealing with. Let's store a copy of this label in the gpio_device, add it to the struct passed to userspace for GPIO_GET_CHIPINFO_IOCTL and modify lsgpio to show it. Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2016-02-19ARM: dts: r8a7794: add EtherAVB clockSergei Shtylyov
Add the EtherAVB clock to the R8A7794 device tree. Based on the commit eaa870b30553 ("ARM: shmobile: r8a7791: add EtherAVB clock"). Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
2016-02-19drm/atomic: Allow for holes in connector state, v2.Maarten Lankhorst
Because we record connector_mask using 1 << drm_connector_index now the connector_mask should stay the same even when other connectors are removed. This was not the case with MST, in that case when removing a connector all other connectors may change their index. This is fixed by waiting until the first get_connector_state to allocate connector_state, and force reallocation when state is too small. As a side effect connector arrays no longer have to be preallocated, and can be allocated on first use which means a less allocations in the page flip only path. Changes since v1: - Whitespace. (Ville) - Call ida_remove when destroying the connector. (Ville) - u32 alloc -> int. (Ville) Fixes: 14de6c44d149 ("drm/atomic: Remove drm_atomic_connectors_for_crtc.") Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Lyude <cpaul@redhat.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-02-19Merge tag 'topic/drm-misc-2016-02-18' of ↵Dave Airlie
git://anongit.freedesktop.org/drm-intel into drm-next Misc stuff all over: - more mode_fixup removal from Carlos, there's another final pile still left. - final bits of vgaswitcheroo from Lukas for apple gmux, we're still discussing an api cleanup patch to make it a bit more abuse-safe as a follow-up - dp aux interface for userspace for tools&tests from Rafael Antognolli - actual interface parts for dma-buf flushing for userspace mmap - few small bits all over - vgaswitcheroo support for apple gmux from Lukas Wunner - checks for ->mode_fixup in non-atomic helpers from Carlos Palminha, plus removing dummy funcs from drivers. Carlos promised to follow up with more, since there's lots more silly dummy functions around. - dma-buf patches from Tiago, except the ioctl itself (that needed a respin to address review from David Herrmann) - encoder mask for atomic from Maarten - bunch of random things all over. * tag 'topic/drm-misc-2016-02-18' of git://anongit.freedesktop.org/drm-intel: (57 commits) drm/udl: Use module_usb_driver drm: fixes crct set_mode when crtc mode_fixup is null. drm/tilcdc: removed optional dummy encoder mode_fixup function. drm/sti: removed optional dummy encoder mode_fixup function. drm/rockchip: removed optional dummy encoder mode_fixup function. drm/qxl: removed optional dummy encoder mode_fixup function. drm/mgag200: removed optional dummy encoder mode_fixup function. drm/msm/mdp: removed optional dummy encoder mode_fixup function. drm/imx: removed optional dummy encoder mode_fixup function. drm/gma500: removed optional dummy encoder mode_fixup function. drm/radeon: removed optional dummy encoder mode_fixup function. drm/cirrus: removed optional dummy encoder mode_fixup function. drm/bochs: removed optional dummy encoder mode_fixup function. drm/ast: removed optional dummy encoder mode_fixup function. drm/amdgpu: removed optional dummy encoder mode_fixup function. drm/exynos: removed optional dummy encoder mode_fixup function. drm/udl: removed optional dummy encoder mode_fixup function. drm/virtio: removed optional dummy encoder mode_fixup function. drm/fb_helper: Use add_one_connector in add_all_connectors. drm/fb_helper: Use correct allocation count for arrays. ...
2016-02-18Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jikos/livepatching Pull livepatching fixes from Jiri Kosina: - regression (from 4.4) fix for ordering issue, introduced by an earlier ftrace change, that broke live patching of modules. The fix replaces the ftrace module notifier by direct call in order to make the ordering guaranteed and well-defined. The patch, from Jessica Yu, has been acked both by Steven and Rusty - error message fix from Miroslav Benes * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/livepatching: ftrace/module: remove ftrace module notifier livepatch: change the error message in asm/livepatch.h header files
2016-02-18Revert "fsnotify: destroy marks with call_srcu instead of dedicated thread"Jeff Layton
This reverts commit c510eff6beba ("fsnotify: destroy marks with call_srcu instead of dedicated thread"). Eryu reported that he was seeing some OOM kills kick in when running a testcase that adds and removes inotify marks on a file in a tight loop. The above commit changed the code to use call_srcu to clean up the marks. While that does (in principle) work, the srcu callback job is limited to cleaning up entries in small batches and only once per jiffy. It's easily possible to overwhelm that machinery with too many call_srcu callbacks, and Eryu's reproduer did just that. There's also another potential problem with using call_srcu here. While you can obviously sleep while holding the srcu_read_lock, the callbacks run under local_bh_disable, so you can't sleep there. It's possible when putting the last reference to the fsnotify_mark that we'll end up putting a chain of references including the fsnotify_group, uid, and associated keys. While I don't see any obvious ways that that could occurs, it's probably still best to avoid using call_srcu here after all. This patch reverts the above patch. A later patch will take a different approach to eliminated the dedicated thread here. Signed-off-by: Jeff Layton <jeff.layton@primarydata.com> Reported-by: Eryu Guan <guaneryu@gmail.com> Tested-by: Eryu Guan <guaneryu@gmail.com> Cc: Jan Kara <jack@suse.com> Cc: Eric Paris <eparis@parisplace.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-02-19gpio: allow setting ARCH_NR_GPIOS from KconfigArnd Bergmann
The ARM version of asm/gpio.h basically just contains the same definitions as the gpiolib version, with the exception of ARCH_NR_GPIOS. This adds the option for overriding the constant through Kconfig to the architecture-independent header, so we can remove the ARM specific file later. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2016-02-19gpio: remove broken irq_to_gpio() interfaceArnd Bergmann
gpiolib has removed the irq_to_gpio() API several years ago, but the global header still provided a non-working stub. To prevent new users of this broken function from showing up, let's remove the stubs as well. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2016-02-18PCI: Update VPD definitionsHannes Reinecke
The 'end' tag is actually 0x0f; it's the representation as a small resource data type tag that's 0x78 (i.e., shifted by 3). Correct PCI_VPD_STIN_END and PCI_VPD_SRDT_END accordingly. Also, add helper functions to extract the resource data type tags for both large and small resource data types. [bhelgaas: changelog] Tested-by: Shane Seymour <shane.seymour@hpe.com> Tested-by: Babu Moger <babu.moger@oracle.com> Signed-off-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: Alexander Duyck <alexander.duyck@gmail.com>
2016-02-18vfs: define kernel_read_file_id enumerationMimi Zohar
To differentiate between the kernel_read_file() callers, this patch defines a new enumeration named kernel_read_file_id and includes the caller identifier as an argument. Subsequent patches define READING_KEXEC_IMAGE, READING_KEXEC_INITRAMFS, READING_FIRMWARE, READING_MODULE, and READING_POLICY. Changelog v3: - Replace the IMA specific enumeration with a generic one. Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com> Acked-by: Kees Cook <keescook@chromium.org> Acked-by: Luis R. Rodriguez <mcgrof@kernel.org> Cc: Al Viro <viro@zeniv.linux.org.uk>
2016-02-18vfs: define a generic function to read a file from the kernelMimi Zohar
For a while it was looked down upon to directly read files from Linux. These days there exists a few mechanisms in the kernel that do just this though to load a file into a local buffer. There are minor but important checks differences on each. This patch set is the first attempt at resolving some of these differences. This patch introduces a common function for reading files from the kernel with the corresponding security post-read hook and function. Changelog v4+: - export security_kernel_post_read_file() - Fengguang Wu v3: - additional bounds checking - Luis v2: - To simplify patch review, re-ordered patches Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com> Reviewed-by: Luis R. Rodriguez <mcgrof@suse.com> Acked-by: Kees Cook <keescook@chromium.org> Cc: Al Viro <viro@zeniv.linux.org.uk>
2016-02-18qed: Lay infrastructure for vlan filtering offloadYuval Mintz
Today, interfaces are working in vlan-promisc mode; But once vlan filtering offloaded would be supported, we'll need a method to control it directly [e.g., when setting device to PROMISC, or when running out of vlan credits]. This adds the necessary API for L2 client to manually choose whether to accept all vlans or only those for which filters were configured. Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-18net: Optimize local checksum offloadAlexander Duyck
This patch takes advantage of several assumptions we can make about the headers of the frame in order to reduce overall processing overhead for computing the outer header checksum. First we can assume the entire header is in the region pointed to by skb->head as this is what csum_start is based on. Second, as a result of our first assumption, we can just call csum_partial instead of making a call to skb_checksum which would end up having to configure things so that we could walk through the frags list. Signed-off-by: Alexander Duyck <aduyck@mirantis.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-18ipv6: Annotate change of locking mechanism for np->optBenjamin Poirier
follows up commit 45f6fad84cc3 ("ipv6: add complete rcu protection around np->opt") which added mixed rcu/refcount protection to np->opt. Given the current implementation of rcu_pointer_handoff(), this has no effect at runtime. Signed-off-by: Benjamin Poirier <bpoirier@suse.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-18iptunnel: scrub packet in iptunnel_pull_headerJiri Benc
Part of skb_scrub_packet was open coded in iptunnel_pull_header. Let it call skb_scrub_packet directly instead. Signed-off-by: Jiri Benc <jbenc@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-18vxlan: tun_id is 64bit, not 32bitJiri Benc
The tun_id field in struct ip_tunnel_key is __be64, not __be32. We need to convert the vni to tun_id correctly. Fixes: 54bfd872bf16 ("vxlan: keep flags and vni in network byte order") Reported-by: Paolo Abeni <pabeni@redhat.com> Tested-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Jiri Benc <jbenc@redhat.com> Acked-by: Thadeu Lima de Souza Cascardo <cascardo@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-18mm/core, x86/mm/pkeys: Add execute-only protection keys supportDave Hansen
Protection keys provide new page-based protection in hardware. But, they have an interesting attribute: they only affect data accesses and never affect instruction fetches. That means that if we set up some memory which is set as "access-disabled" via protection keys, we can still execute from it. This patch uses protection keys to set up mappings to do just that. If a user calls: mmap(..., PROT_EXEC); or mprotect(ptr, sz, PROT_EXEC); (note PROT_EXEC-only without PROT_READ/WRITE), the kernel will notice this, and set a special protection key on the memory. It also sets the appropriate bits in the Protection Keys User Rights (PKRU) register so that the memory becomes unreadable and unwritable. I haven't found any userspace that does this today. With this facility in place, we expect userspace to move to use it eventually. Userspace _could_ start doing this today. Any PROT_EXEC calls get converted to PROT_READ inside the kernel, and would transparently be upgraded to "true" PROT_EXEC with this code. IOW, userspace never has to do any PROT_EXEC runtime detection. This feature provides enhanced protection against leaking executable memory contents. This helps thwart attacks which are attempting to find ROP gadgets on the fly. But, the security provided by this approach is not comprehensive. The PKRU register which controls access permissions is a normal user register writable from unprivileged userspace. An attacker who can execute the 'wrpkru' instruction can easily disable the protection provided by this feature. The protection key that is used for execute-only support is permanently dedicated at compile time. This is fine for now because there is currently no API to set a protection key other than this one. Despite there being a constant PKRU value across the entire system, we do not set it unless this feature is in use in a process. That is to preserve the PKRU XSAVE 'init state', which can lead to faster context switches. PKRU *is* a user register and the kernel is modifying it. That means that code doing: pkru = rdpkru() pkru |= 0x100; mmap(..., PROT_EXEC); wrpkru(pkru); could lose the bits in PKRU that enforce execute-only permissions. To avoid this, we suggest avoiding ever calling mmap() or mprotect() when the PKRU value is expected to be unstable. Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Andy Lutomirski <luto@kernel.org> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Borislav Petkov <bp@suse.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Chen Gang <gang.chen.5i5j@gmail.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Dave Chinner <david@fromorbit.com> Cc: Dave Hansen <dave@sr71.net> Cc: David Hildenbrand <dahi@linux.vnet.ibm.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Konstantin Khlebnikov <koct9i@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mel Gorman <mgorman@suse.de> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Piotr Kwapulinski <kwapulinski.piotr@gmail.com> Cc: Rik van Riel <riel@redhat.com> Cc: Stephen Smalley <sds@tycho.nsa.gov> Cc: Vladimir Murzin <vladimir.murzin@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: keescook@google.com Cc: linux-kernel@vger.kernel.org Cc: linux-mm@kvack.org Link: http://lkml.kernel.org/r/20160212210240.CB4BB5CA@viggo.jf.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-02-18x86/mm/pkeys: Allow kernel to modify user pkey rights registerDave Hansen
The Protection Key Rights for User memory (PKRU) is a 32-bit user-accessible register. It contains two bits for each protection key: one to write-disable (WD) access to memory covered by the key and another to access-disable (AD). Userspace can read/write the register with the RDPKRU and WRPKRU instructions. But, the register is saved and restored with the XSAVE family of instructions, which means we have to treat it like a floating point register. The kernel needs to write to the register if it wants to implement execute-only memory or if it implements a system call to change PKRU. To do this, we need to create a 'pkru_state' buffer, read the old contents in to it, modify it, and then tell the FPU code that there is modified data in there so it can (possibly) move the buffer back in to the registers. This uses the fpu__xfeature_set_state() function that we defined in the previous patch. Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave@sr71.net> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: linux-mm@kvack.org Link: http://lkml.kernel.org/r/20160212210236.0BE13217@viggo.jf.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-02-18mm/core, x86/mm/pkeys: Add arch_validate_pkey()Dave Hansen
The syscall-level code is passed a protection key and need to return an appropriate error code if the protection key is bogus. We will be using this in subsequent patches. Note that this also begins a series of arch-specific calls that we need to expose in otherwise arch-independent code. We create a linux/pkeys.h header where we will put *all* the stubs for these functions. Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave@sr71.net> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: linux-mm@kvack.org Link: http://lkml.kernel.org/r/20160212210232.774EEAAB@viggo.jf.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-02-18mm/core, arch, powerpc: Pass a protection key in to calc_vm_flag_bits()Dave Hansen
This plumbs a protection key through calc_vm_flag_bits(). We could have done this in calc_vm_prot_bits(), but I did not feel super strongly which way to go. It was pretty arbitrary which one to use. Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Arve Hjønnevåg <arve@android.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Chen Gang <gang.chen.5i5j@gmail.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Dave Chinner <david@fromorbit.com> Cc: Dave Hansen <dave@sr71.net> Cc: David Airlie <airlied@linux.ie> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Geliang Tang <geliangtang@163.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Konstantin Khlebnikov <koct9i@gmail.com> Cc: Leon Romanovsky <leon@leon.nu> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Masahiro Yamada <yamada.masahiro@socionext.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Paul Gortmaker <paul.gortmaker@windriver.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Riley Andrews <riandrews@android.com> Cc: Vladimir Davydov <vdavydov@virtuozzo.com> Cc: devel@driverdev.osuosl.org Cc: linux-api@vger.kernel.org Cc: linux-arch@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: linux-mm@kvack.org Cc: linuxppc-dev@lists.ozlabs.org Link: http://lkml.kernel.org/r/20160212210231.E6F1F0D6@viggo.jf.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-02-18mm/core, x86/mm/pkeys: Differentiate instruction fetchesDave Hansen
As discussed earlier, we attempt to enforce protection keys in software. However, the code checks all faults to ensure that they are not violating protection key permissions. It was assumed that all faults are either write faults where we check PKRU[key].WD (write disable) or read faults where we check the AD (access disable) bit. But, there is a third category of faults for protection keys: instruction faults. Instruction faults never run afoul of protection keys because they do not affect instruction fetches. So, plumb the PF_INSTR bit down in to the arch_vma_access_permitted() function where we do the protection key checks. We also add a new FAULT_FLAG_INSTRUCTION. This is because handle_mm_fault() is not passed the architecture-specific error_code where we keep PF_INSTR, so we need to encode the instruction fetch information in to the arch-generic fault flags. Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave@sr71.net> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: linux-mm@kvack.org Link: http://lkml.kernel.org/r/20160212210224.96928009@viggo.jf.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-02-18mm/core: Do not enforce PKEY permissions on remote mm accessDave Hansen
We try to enforce protection keys in software the same way that we do in hardware. (See long example below). But, we only want to do this when accessing our *own* process's memory. If GDB set PKRU[6].AD=1 (disable access to PKEY 6), then tried to PTRACE_POKE a target process which just happened to have some mprotect_pkey(pkey=6) memory, we do *not* want to deny the debugger access to that memory. PKRU is fundamentally a thread-local structure and we do not want to enforce it on access to _another_ thread's data. This gets especially tricky when we have workqueues or other delayed-work mechanisms that might run in a random process's context. We can check that we only enforce pkeys when operating on our *own* mm, but delayed work gets performed when a random user context is active. We might end up with a situation where a delayed-work gup fails when running randomly under its "own" task but succeeds when running under another process. We want to avoid that. To avoid that, we use the new GUP flag: FOLL_REMOTE and add a fault flag: FAULT_FLAG_REMOTE. They indicate that we are walking an mm which is not guranteed to be the same as current->mm and should not be subject to protection key enforcement. Thanks to Jerome Glisse for pointing out this scenario. Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: Alexey Kardashevskiy <aik@ozlabs.ru> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Boaz Harrosh <boaz@plexistor.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Dave Chinner <dchinner@redhat.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Gibson <david@gibson.dropbear.id.au> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Dominik Dingel <dingel@linux.vnet.ibm.com> Cc: Dominik Vogt <vogt@linux.vnet.ibm.com> Cc: Eric B Munson <emunson@akamai.com> Cc: Geliang Tang <geliangtang@163.com> Cc: Guan Xuetao <gxt@mprc.pku.edu.cn> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Hugh Dickins <hughd@google.com> Cc: Jan Kara <jack@suse.cz> Cc: Jason Low <jason.low2@hp.com> Cc: Jerome Marchand <jmarchan@redhat.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Konstantin Khlebnikov <koct9i@gmail.com> Cc: Laurent Dufour <ldufour@linux.vnet.ibm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Matthew Wilcox <willy@linux.intel.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Hocko <mhocko@suse.com> Cc: Mikulas Patocka <mpatocka@redhat.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Sasha Levin <sasha.levin@oracle.com> Cc: Shachar Raindel <raindel@mellanox.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Xie XiuQi <xiexiuqi@huawei.com> Cc: iommu@lists.linux-foundation.org Cc: linux-arch@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: linux-mm@kvack.org Cc: linux-s390@vger.kernel.org Cc: linuxppc-dev@lists.ozlabs.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-02-18spi: docbook: fix parsing errorMartin Sperl
Fixes docbook parsing error because documentation is not directly followed by the structure, but typedef used in structure. Reordering should solve that issue. Signed-off-by: Martin Sperl <kernel@martin.sperl.org> Signed-off-by: Mark Brown <broonie@kernel.org>
2016-02-18nfnetlink: Revert "nfnetlink: add support for memory mapped netlink"Florian Westphal
reverts commit 3ab1f683bf8b ("nfnetlink: add support for memory mapped netlink")' Like previous commits in the series, remove wrappers that are not needed after mmapped netlink removal. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-18nfnetlink: remove nfnetlink_alloc_skbFlorian Westphal
Following mmapped netlink removal this code can be simplified by removing the alloc wrapper. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-18Revert "genl: Add genlmsg_new_unicast() for unicast message allocation"Florian Westphal
This reverts commit bb9b18fb55b0 ("genl: Add genlmsg_new_unicast() for unicast message allocation")'. Nothing wrong with it; its no longer needed since this was only for mmapped netlink support. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-18netlink: remove mmapped netlink supportFlorian Westphal
mmapped netlink has a number of unresolved issues: - TX zerocopy support had to be disabled more than a year ago via commit 4682a0358639b29cf ("netlink: Always copy on mmap TX.") because the content of the mmapped area can change after netlink attribute validation but before message processing. - RX support was implemented mainly to speed up nfqueue dumping packet payload to userspace. However, since commit ae08ce0021087a5d812d2 ("netfilter: nfnetlink_queue: zero copy support") we avoid one copy with the socket-based interface too (via the skb_zerocopy helper). The other problem is that skbs attached to mmaped netlink socket behave different from normal skbs: - they don't have a shinfo area, so all functions that use skb_shinfo() (e.g. skb_clone) cannot be used. - reserving headroom prevents userspace from seeing the content as it expects message to start at skb->head. See for instance commit aa3a022094fa ("netlink: not trim skb for mmaped socket when dump"). - skbs handed e.g. to netlink_ack must have non-NULL skb->sk, else we crash because it needs the sk to check if a tx ring is attached. Also not obvious, leads to non-intuitive bug fixes such as 7c7bdf359 ("netfilter: nfnetlink: use original skbuff when acking batches"). mmaped netlink also didn't play nicely with the skb_zerocopy helper used by nfqueue and openvswitch. Daniel Borkmann fixed this via commit 6bb0fef489f6 ("netlink, mmap: fix edge-case leakages in nf queue zero-copy")' but at the cost of also needing to provide remaining length to the allocation function. nfqueue also has problems when used with mmaped rx netlink: - mmaped netlink doesn't allow use of nfqueue batch verdict messages. Problem is that in the mmap case, the allocation time also determines the ordering in which the frame will be seen by userspace (A allocating before B means that A is located in earlier ring slot, but this also means that B might get a lower sequence number then A since seqno is decided later. To fix this we would need to extend the spinlocked region to also cover the allocation and message setup which isn't desirable. - nfqueue can now be configured to queue large (GSO) skbs to userspace. Queing GSO packets is faster than having to force a software segmentation in the kernel, so this is a desirable option. However, with a mmap based ring one has to use 64kb per ring slot element, else mmap has to fall back to the socket path (NL_MMAP_STATUS_COPY) for all large packets. To use the mmap interface, userspace not only has to probe for mmap netlink support, it also has to implement a recv/socket receive path in order to handle messages that exceed the size of an rx ring element. Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Ken-ichirou MATSUZAWA <chamaken@gmail.com> Cc: Pablo Neira Ayuso <pablo@netfilter.org> Cc: Patrick McHardy <kaber@trash.net> Cc: Thomas Graf <tgraf@suug.ch> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-18tcp/dccp: fix another race at listener dismantleEric Dumazet
Ilya reported following lockdep splat: kernel: ========================= kernel: [ BUG: held lock freed! ] kernel: 4.5.0-rc1-ceph-00026-g5e0a311 #1 Not tainted kernel: ------------------------- kernel: swapper/5/0 is freeing memory ffff880035c9d200-ffff880035c9dbff, with a lock still held there! kernel: (&(&queue->rskq_lock)->rlock){+.-...}, at: [<ffffffff816f6a88>] inet_csk_reqsk_queue_add+0x28/0xa0 kernel: 4 locks held by swapper/5/0: kernel: #0: (rcu_read_lock){......}, at: [<ffffffff8169ef6b>] netif_receive_skb_internal+0x4b/0x1f0 kernel: #1: (rcu_read_lock){......}, at: [<ffffffff816e977f>] ip_local_deliver_finish+0x3f/0x380 kernel: #2: (slock-AF_INET){+.-...}, at: [<ffffffff81685ffb>] sk_clone_lock+0x19b/0x440 kernel: #3: (&(&queue->rskq_lock)->rlock){+.-...}, at: [<ffffffff816f6a88>] inet_csk_reqsk_queue_add+0x28/0xa0 To properly fix this issue, inet_csk_reqsk_queue_add() needs to return to its callers if the child as been queued into accept queue. We also need to make sure listener is still there before calling sk->sk_data_ready(), by holding a reference on it, since the reference carried by the child can disappear as soon as the child is put on accept queue. Reported-by: Ilya Dryomov <idryomov@gmail.com> Fixes: ebb516af60e1 ("tcp/dccp: fix race at listener dismantle phase") Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-18route: check and remove route cache when we get routeXin Long
Since the gc of ipv4 route was removed, the route cached would has no chance to be removed, and even it has been timeout, it still could be used, cause no code to check it's expires. Fix this issue by checking and removing route cache when we get route. Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-18crypto: public_key: remove MPIs from public_key_signature structTadeusz Struk
After digsig_asymmetric.c is converted the MPIs can be now safely removed from the public_key_signature structure. Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David Howells <dhowells@redhat.com>
2016-02-18mm/gup, x86/mm/pkeys: Check VMAs and PTEs for protection keysDave Hansen
Today, for normal faults and page table walks, we check the VMA and/or PTE to ensure that it is compatible with the action. For instance, if we get a write fault on a non-writeable VMA, we SIGSEGV. We try to do the same thing for protection keys. Basically, we try to make sure that if a user does this: mprotect(ptr, size, PROT_NONE); *ptr = foo; they see the same effects with protection keys when they do this: mprotect(ptr, size, PROT_READ|PROT_WRITE); set_pkey(ptr, size, 4); wrpkru(0xffffff3f); // access disable pkey 4 *ptr = foo; The state to do that checking is in the VMA, but we also sometimes have to do it on the page tables only, like when doing a get_user_pages_fast() where we have no VMA. We add two functions and expose them to generic code: arch_pte_access_permitted(pte_flags, write) arch_vma_access_permitted(vma, write) These are, of course, backed up in x86 arch code with checks against the PTE or VMA's protection key. But, there are also cases where we do not want to respect protection keys. When we ptrace(), for instance, we do not want to apply the tracer's PKRU permissions to the PTEs from the process being traced. Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: Alexey Kardashevskiy <aik@ozlabs.ru> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Andy Lutomirski <luto@kernel.org> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Boaz Harrosh <boaz@plexistor.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Dave Hansen <dave@sr71.net> Cc: David Gibson <david@gibson.dropbear.id.au> Cc: David Hildenbrand <dahi@linux.vnet.ibm.com> Cc: David Vrabel <david.vrabel@citrix.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Dominik Dingel <dingel@linux.vnet.ibm.com> Cc: Dominik Vogt <vogt@linux.vnet.ibm.com> Cc: Guan Xuetao <gxt@mprc.pku.edu.cn> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Hugh Dickins <hughd@google.com> Cc: Jason Low <jason.low2@hp.com> Cc: Jerome Marchand <jmarchan@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Laurent Dufour <ldufour@linux.vnet.ibm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Matthew Wilcox <willy@linux.intel.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Hocko <mhocko@suse.com> Cc: Mikulas Patocka <mpatocka@redhat.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Sasha Levin <sasha.levin@oracle.com> Cc: Shachar Raindel <raindel@mellanox.com> Cc: Stephen Smalley <sds@tycho.nsa.gov> Cc: Toshi Kani <toshi.kani@hpe.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: linux-arch@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: linux-mm@kvack.org Cc: linux-s390@vger.kernel.org Cc: linuxppc-dev@lists.ozlabs.org Link: http://lkml.kernel.org/r/20160212210219.14D5D715@viggo.jf.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-02-18signals, pkeys: Notify userspace about protection key faultsDave Hansen
A protection key fault is very similar to any other access error. There must be a VMA, etc... We even want to take the same action (SIGSEGV) that we do with a normal access fault. However, we do need to let userspace know that something is different. We do this the same way what we did with SEGV_BNDERR with Memory Protection eXtensions (MPX): define a new SEGV code: SEGV_PKUERR. We add a siginfo field: si_pkey that reveals to userspace which protection key was set on the PTE that we faulted on. There is no other easy way for userspace to figure this out. They could parse smaps but that would be a bit cruel. We share space with in siginfo with _addr_bnd. #BR faults from MPX are completely separate from page faults (#PF) that trigger from protection key violations, so we never need both at the same time. Note that _pkey is a 64-bit value. The current hardware only supports 4-bit protection keys. We do this because there is _plenty_ of space in _sigfault and it is possible that future processors would support more than 4 bits of protection keys. The x86 code to actually fill in the siginfo is in the next patch. Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Amanieu d'Antras <amanieu@gmail.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave@sr71.net> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Richard Weinberger <richard@nod.at> Cc: Rik van Riel <riel@redhat.com> Cc: Sasha Levin <sasha.levin@oracle.com> Cc: Vegard Nossum <vegard.nossum@oracle.com> Cc: Vladimir Davydov <vdavydov@parallels.com> Cc: linux-arch@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: linux-mm@kvack.org Link: http://lkml.kernel.org/r/20160212210212.3A9B83AC@viggo.jf.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-02-18x86/mm/pkeys: Add arch-specific VMA protection bitsDave Hansen
Lots of things seem to do: vma->vm_page_prot = vm_get_page_prot(flags); and the ptes get created right from things we pull out of ->vm_page_prot. So it is very convenient if we can store the protection key in flags and vm_page_prot, just like the existing permission bits (_PAGE_RW/PRESENT). It greatly reduces the amount of plumbing and arch-specific hacking we have to do in generic code. This also takes the new PROT_PKEY{0,1,2,3} flags and turns *those* in to VM_ flags for vma->vm_flags. The protection key values are stored in 4 places: 1. "prot" argument to system calls 2. vma->vm_flags, filled from the mmap "prot" 3. vma->vm_page prot, filled from vma->vm_flags 4. the PTE itself. The pseudocode for these for steps are as follows: mmap(PROT_PKEY*) vma->vm_flags = ... | arch_calc_vm_prot_bits(mmap_prot); vma->vm_page_prot = ... | arch_vm_get_page_prot(vma->vm_flags); pte = pfn | vma->vm_page_prot Note that this provides a new definitions for x86: arch_vm_get_page_prot() Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave@sr71.net> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: linux-mm@kvack.org Link: http://lkml.kernel.org/r/20160212210210.FE483A42@viggo.jf.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-02-18mm/core, x86/mm/pkeys: Store protection bits in high VMA flagsDave Hansen
vma->vm_flags is an 'unsigned long', so has space for 32 flags on 32-bit architectures. The high 32 bits are unused on 64-bit platforms. We've steered away from using the unused high VMA bits for things because we would have difficulty supporting it on 32-bit. Protection Keys are not available in 32-bit mode, so there is no concern about supporting this feature in 32-bit mode or on 32-bit CPUs. This patch carves out 4 bits from the high half of vma->vm_flags and allows architectures to set config option to make them available. Sparse complains about these constants unless we explicitly call them "UL". Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Dave Hansen <dave@sr71.net> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jan Kara <jack@suse.cz> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Konstantin Khlebnikov <koct9i@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mel Gorman <mgorman@suse.de> Cc: Michal Hocko <mhocko@suse.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Sasha Levin <sasha.levin@oracle.com> Cc: Valentin Rothberg <valentinrothberg@gmail.com> Cc: Vladimir Davydov <vdavydov@parallels.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Xie XiuQi <xiexiuqi@huawei.com> Cc: linux-kernel@vger.kernel.org Cc: linux-mm@kvack.org Link: http://lkml.kernel.org/r/20160212210208.81AF00D5@viggo.jf.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-02-18Merge branch 'x86/urgent' into x86/asm, to pick up fixesIngo Molnar
Signed-off-by: Ingo Molnar <mingo@kernel.org>