From 667a0a06c99d5291433b869ed35dabdd95ba1453 Mon Sep 17 00:00:00 2001 From: David Vrabel Date: Thu, 18 Dec 2014 14:48:15 +0000 Subject: mm: provide a find_special_page vma operation The optional find_special_page VMA operation is used to lookup the pages backing a VMA. This is useful in cases where the normal mechanisms for finding the page don't work. This is only called if the PTE is special. One use case is a Xen PV guest mapping foreign pages into userspace. In a Xen PV guest, the PTEs contain MFNs so get_user_pages() (for example) must do an MFN to PFN (M2P) lookup before it can get the page. For foreign pages (those owned by another guest) the M2P lookup returns the PFN as seen by the foreign guest (which would be completely the wrong page for the local guest). This cannot be fixed up improving the M2P lookup since one MFN may be mapped onto two or more pages so getting the right page is impossible given just the MFN. Signed-off-by: David Vrabel Acked-by: Andrew Morton --- include/linux/mm.h | 8 ++++++++ 1 file changed, 8 insertions(+) (limited to 'include') diff --git a/include/linux/mm.h b/include/linux/mm.h index 80fc92a49649..9269af7349fe 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -290,6 +290,14 @@ struct vm_operations_struct { /* called by sys_remap_file_pages() to populate non-linear mapping */ int (*remap_pages)(struct vm_area_struct *vma, unsigned long addr, unsigned long size, pgoff_t pgoff); + + /* + * Called by vm_normal_page() for special PTEs to find the + * page for @addr. This is useful if the default behavior + * (using pte_page()) would not find the correct page. + */ + struct page *(*find_special_page)(struct vm_area_struct *vma, + unsigned long addr); }; struct mmu_gather; -- cgit From d8ac3dd41aea245f65465449efc35dd3ac71e91d Mon Sep 17 00:00:00 2001 From: Jennifer Herbert Date: Mon, 5 Jan 2015 13:24:09 +0000 Subject: mm: add 'foreign' alias for the 'pinned' page flag The foreign page flag will be used by Xen guests to mark pages that have grant mappings of frames from other (foreign) guests. The foreign flag is an alias for the existing (Xen-specific) pinned flag. This is safe because pinned is only used on pages used for page tables and these cannot also be foreign. Signed-off-by: Jennifer Herbert Acked-by: Andrew Morton Signed-off-by: David Vrabel --- include/linux/page-flags.h | 5 +++++ 1 file changed, 5 insertions(+) (limited to 'include') diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h index e1f5fcd79792..5ed7bdaf22d5 100644 --- a/include/linux/page-flags.h +++ b/include/linux/page-flags.h @@ -121,8 +121,12 @@ enum pageflags { PG_fscache = PG_private_2, /* page backed by cache */ /* XEN */ + /* Pinned in Xen as a read-only pagetable page. */ PG_pinned = PG_owner_priv_1, + /* Pinned as part of domain save (see xen_mm_pin_all()). */ PG_savepinned = PG_dirty, + /* Has a grant mapping of another (foreign) domain's page. */ + PG_foreign = PG_owner_priv_1, /* SLOB */ PG_slob_free = PG_private, @@ -215,6 +219,7 @@ __PAGEFLAG(Slab, slab) PAGEFLAG(Checked, checked) /* Used by some filesystems */ PAGEFLAG(Pinned, pinned) TESTSCFLAG(Pinned, pinned) /* Xen */ PAGEFLAG(SavePinned, savepinned); /* Xen */ +PAGEFLAG(Foreign, foreign); /* Xen */ PAGEFLAG(Reserved, reserved) __CLEARPAGEFLAG(Reserved, reserved) PAGEFLAG(SwapBacked, swapbacked) __CLEARPAGEFLAG(SwapBacked, swapbacked) __SETPAGEFLAG(SwapBacked, swapbacked) -- cgit From 853d0289340026b30f93fd0e768340221d4e605c Mon Sep 17 00:00:00 2001 From: David Vrabel Date: Mon, 5 Jan 2015 14:13:41 +0000 Subject: xen/grant-table: pre-populate kernel unmap ops for xen_gnttab_unmap_refs() When unmapping grants, instead of converting the kernel map ops to unmap ops on the fly, pre-populate the set of unmap ops. This allows the grant unmap for the kernel mappings to be trivially batched in the future. Signed-off-by: David Vrabel Reviewed-by: Stefano Stabellini --- include/xen/grant_table.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) (limited to 'include') diff --git a/include/xen/grant_table.h b/include/xen/grant_table.h index 3387465b9caa..7235d8f35459 100644 --- a/include/xen/grant_table.h +++ b/include/xen/grant_table.h @@ -167,7 +167,7 @@ int gnttab_map_refs(struct gnttab_map_grant_ref *map_ops, struct gnttab_map_grant_ref *kmap_ops, struct page **pages, unsigned int count); int gnttab_unmap_refs(struct gnttab_unmap_grant_ref *unmap_ops, - struct gnttab_map_grant_ref *kunmap_ops, + struct gnttab_unmap_grant_ref *kunmap_ops, struct page **pages, unsigned int count); /* Perform a batch of grant map/copy operations. Retry every batch slot -- cgit From ff4b156f166b3931894d2a8b5cdba6cdf4da0618 Mon Sep 17 00:00:00 2001 From: David Vrabel Date: Thu, 8 Jan 2015 18:06:01 +0000 Subject: xen/grant-table: add helpers for allocating pages Add gnttab_alloc_pages() and gnttab_free_pages() to allocate/free pages suitable to for granted maps. Signed-off-by: David Vrabel Reviewed-by: Stefano Stabellini --- include/xen/grant_table.h | 3 +++ 1 file changed, 3 insertions(+) (limited to 'include') diff --git a/include/xen/grant_table.h b/include/xen/grant_table.h index 7235d8f35459..949803e20872 100644 --- a/include/xen/grant_table.h +++ b/include/xen/grant_table.h @@ -163,6 +163,9 @@ void gnttab_free_auto_xlat_frames(void); #define gnttab_map_vaddr(map) ((void *)(map.host_virt_addr)) +int gnttab_alloc_pages(int nr_pages, struct page **pages); +void gnttab_free_pages(int nr_pages, struct page **pages); + int gnttab_map_refs(struct gnttab_map_grant_ref *map_ops, struct gnttab_map_grant_ref *kmap_ops, struct page **pages, unsigned int count); -- cgit From 8da7633f168b5428e2cfb7342408b2c44088f5df Mon Sep 17 00:00:00 2001 From: Jennifer Herbert Date: Wed, 24 Dec 2014 14:17:06 +0000 Subject: xen: mark grant mapped pages as foreign Use the "foreign" page flag to mark pages that have a grant map. Use page->private to store information of the grant (the granting domain and the grant reference). Signed-off-by: Jennifer Herbert Reviewed-by: Stefano Stabellini Signed-off-by: David Vrabel --- include/xen/grant_table.h | 20 ++++++++++++++++++++ 1 file changed, 20 insertions(+) (limited to 'include') diff --git a/include/xen/grant_table.h b/include/xen/grant_table.h index 949803e20872..d3bef563e8da 100644 --- a/include/xen/grant_table.h +++ b/include/xen/grant_table.h @@ -45,6 +45,8 @@ #include #include +#include +#include #define GNTTAB_RESERVED_XENSTORE 1 @@ -185,4 +187,22 @@ int gnttab_unmap_refs(struct gnttab_unmap_grant_ref *unmap_ops, void gnttab_batch_map(struct gnttab_map_grant_ref *batch, unsigned count); void gnttab_batch_copy(struct gnttab_copy *batch, unsigned count); + +struct xen_page_foreign { + domid_t domid; + grant_ref_t gref; +}; + +static inline struct xen_page_foreign *xen_page_foreign(struct page *page) +{ + if (!PageForeign(page)) + return NULL; +#if BITS_PER_LONG < 64 + return (struct xen_page_foreign *)page->private; +#else + BUILD_BUG_ON(sizeof(struct xen_page_foreign) > BITS_PER_LONG); + return (struct xen_page_foreign *)&page->private; +#endif +} + #endif /* __ASM_GNTTAB_H__ */ -- cgit From 3f9f1c67572f5e5e6dc84216d48d1480f3c4fcf6 Mon Sep 17 00:00:00 2001 From: Jennifer Herbert Date: Tue, 9 Dec 2014 18:28:37 +0000 Subject: xen/grant-table: add a mechanism to safely unmap pages that are in use Introduce gnttab_unmap_refs_async() that can be used to safely unmap pages that may be in use (ref count > 1). If the pages are in use the unmap is deferred and retried later. This polling is not very clever but it should be good enough if the cases where the delay is necessary are rare. The initial delay is 5 ms and is increased linearly on each subsequent retry (to reduce load if the page is in use for a long time). This is needed to allow block backends using grant mapping to safely use network storage (block or filesystem based such as iSCSI or NFS). The network storage driver may complete a block request whilst there is a queued network packet retry (because the ack from the remote end races with deciding to queue the retry). The pages for the retried packet would be grant unmapped and the network driver (or hardware) would access the unmapped page. Signed-off-by: Jennifer Herbert Acked-by: Stefano Stabellini Signed-off-by: David Vrabel --- include/xen/grant_table.h | 18 ++++++++++++++++++ 1 file changed, 18 insertions(+) (limited to 'include') diff --git a/include/xen/grant_table.h b/include/xen/grant_table.h index d3bef563e8da..143ca5ffab7a 100644 --- a/include/xen/grant_table.h +++ b/include/xen/grant_table.h @@ -60,6 +60,22 @@ struct gnttab_free_callback { u16 count; }; +struct gntab_unmap_queue_data; + +typedef void (*gnttab_unmap_refs_done)(int result, struct gntab_unmap_queue_data *data); + +struct gntab_unmap_queue_data +{ + struct delayed_work gnttab_work; + void *data; + gnttab_unmap_refs_done done; + struct gnttab_unmap_grant_ref *unmap_ops; + struct gnttab_unmap_grant_ref *kunmap_ops; + struct page **pages; + unsigned int count; + unsigned int age; +}; + int gnttab_init(void); int gnttab_suspend(void); int gnttab_resume(void); @@ -174,6 +190,8 @@ int gnttab_map_refs(struct gnttab_map_grant_ref *map_ops, int gnttab_unmap_refs(struct gnttab_unmap_grant_ref *unmap_ops, struct gnttab_unmap_grant_ref *kunmap_ops, struct page **pages, unsigned int count); +void gnttab_unmap_refs_async(struct gntab_unmap_queue_data* item); + /* Perform a batch of grant map/copy operations. Retry every batch slot * for which the hypervisor returns GNTST_eagain. This is typically due -- cgit From 923b2919e2c318ee1c360a2119a14889fd0fcce4 Mon Sep 17 00:00:00 2001 From: David Vrabel Date: Thu, 18 Dec 2014 14:56:54 +0000 Subject: xen/gntdev: mark userspace PTEs as special on x86 PV guests In an x86 PV guest, get_user_pages_fast() on a userspace address range containing foreign mappings does not work correctly because the M2P lookup of the MFN from a userspace PTE may return the wrong page. Force get_user_pages_fast() to fail on such addresses by marking the PTEs as special. If Xen has XENFEAT_gnttab_map_avail_bits (available since at least 4.0), we can do so efficiently in the grant map hypercall. Otherwise, it needs to be done afterwards. This is both inefficient and racy (the mapping is visible to the task before we fixup the PTEs), but will be fine for well-behaved applications that do not use the mapping until after the mmap() system call returns. Guests with XENFEAT_auto_translated_physmap (ARM and x86 HVM or PVH) do not need this since get_user_pages() has always worked correctly for them. Signed-off-by: David Vrabel Reviewed-by: Stefano Stabellini --- include/xen/interface/features.h | 6 ++++++ include/xen/interface/grant_table.h | 7 +++++++ 2 files changed, 13 insertions(+) (limited to 'include') diff --git a/include/xen/interface/features.h b/include/xen/interface/features.h index 131a6ccdba25..6ad3d110bb81 100644 --- a/include/xen/interface/features.h +++ b/include/xen/interface/features.h @@ -41,6 +41,12 @@ /* x86: Does this Xen host support the MMU_PT_UPDATE_PRESERVE_AD hypercall? */ #define XENFEAT_mmu_pt_update_preserve_ad 5 +/* + * If set, GNTTABOP_map_grant_ref honors flags to be placed into guest kernel + * available pte bits. + */ +#define XENFEAT_gnttab_map_avail_bits 7 + /* x86: Does this Xen host support the HVM callback vector type? */ #define XENFEAT_hvm_callback_vector 8 diff --git a/include/xen/interface/grant_table.h b/include/xen/interface/grant_table.h index bcce56439d64..56806bc90c2f 100644 --- a/include/xen/interface/grant_table.h +++ b/include/xen/interface/grant_table.h @@ -525,6 +525,13 @@ DEFINE_GUEST_HANDLE_STRUCT(gnttab_cache_flush); #define _GNTMAP_contains_pte (4) #define GNTMAP_contains_pte (1<<_GNTMAP_contains_pte) +/* + * Bits to be placed in guest kernel available PTE bits (architecture + * dependent; only supported when XENFEAT_gnttab_map_avail_bits is set). + */ +#define _GNTMAP_guest_avail0 (16) +#define GNTMAP_guest_avail_mask ((uint32_t)~0 << _GNTMAP_guest_avail0) + /* * Values for error status returns. All errors are -ve. */ -- cgit