summaryrefslogtreecommitdiff
path: root/arch/s390/include/asm/gmap.h
AgeCommit message (Collapse)Author
2025-05-28KVM: s390: Refactor and split some gmap helpersClaudio Imbrenda
Refactor some gmap functions; move the implementation into a separate file with only helper functions. The new helper functions work on vm addresses, leaving all gmap logic in the gmap functions, which mostly become just wrappers. The whole gmap handling is going to be moved inside KVM soon, but the helper functions need to touch core mm functions, and thus need to stay in the core of kernel. Reviewed-by: Steffen Eiden <seiden@linux.ibm.com> Reviewed-by: Christoph Schlameuss <schlameuss@linux.ibm.com> Acked-by: Janosch Frank <frankja@linux.ibm.com> Link: https://lore.kernel.org/r/20250528095502.226213-4-imbrenda@linux.ibm.com Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Message-ID: <20250528095502.226213-4-imbrenda@linux.ibm.com>
2025-03-14KVM: s390: pv: fix race when making a page secureClaudio Imbrenda
Holding the pte lock for the page that is being converted to secure is needed to avoid races. A previous commit removed the locking, which caused issues. Fix by locking the pte again. Fixes: 5cbe24350b7d ("KVM: s390: move pv gmap functions into kvm") Reported-by: David Hildenbrand <david@redhat.com> Tested-by: David Hildenbrand <david@redhat.com> Reviewed-by: David Hildenbrand <david@redhat.com> [david@redhat.com: replace use of get_locked_pte() with folio_walk_start()] Link: https://lore.kernel.org/r/20250312184912.269414-2-imbrenda@linux.ibm.com Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Message-ID: <20250312184912.269414-2-imbrenda@linux.ibm.com>
2025-01-31KVM: s390: move gmap_shadow_pgt_lookup() into kvmClaudio Imbrenda
Move gmap_shadow_pgt_lookup() from mm/gmap.c into kvm/gaccess.c . Reviewed-by: Steffen Eiden <seiden@linux.ibm.com> Reviewed-by: Janosch Frank <frankja@linux.ibm.com> Reviewed-by: Christoph Schlameuss <schlameuss@linux.ibm.com> Link: https://lore.kernel.org/r/20250123144627.312456-13-imbrenda@linux.ibm.com Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Message-ID: <20250123144627.312456-13-imbrenda@linux.ibm.com>
2025-01-31KVM: s390: stop using lists to keep track of used dat tablesClaudio Imbrenda
Until now, every dat table allocated to map a guest was put in a linked list. The page->lru field of struct page was used to keep track of which pages were being used, and when the gmap is torn down, the list was walked and all pages freed. This patch gets rid of the usage of page->lru. Page tables are now freed by recursively walking the dat table tree. Since s390_unlist_old_asce() becomes useless now, remove it. Acked-by: Steffen Eiden <seiden@linux.ibm.com> Reviewed-by: Janosch Frank <frankja@linux.ibm.com> Reviewed-by: Christoph Schlameuss <schlameuss@linux.ibm.com> Link: https://lore.kernel.org/r/20250123144627.312456-12-imbrenda@linux.ibm.com Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Message-ID: <20250123144627.312456-12-imbrenda@linux.ibm.com>
2025-01-31KVM: s390: move some gmap shadowing functions away from mm/gmap.cClaudio Imbrenda
Move some gmap shadowing functions from mm/gmap.c to kvm/kvm-s390.c and the newly created kvm/gmap-vsie.c This is a step toward removing gmap from mm. Reviewed-by: Janosch Frank <frankja@linux.ibm.com> Reviewed-by: Christoph Schlameuss <schlameuss@linux.ibm.com> Link: https://lore.kernel.org/r/20250123144627.312456-10-imbrenda@linux.ibm.com Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Message-ID: <20250123144627.312456-10-imbrenda@linux.ibm.com>
2025-01-31KVM: s390: get rid of gmap_translate()Claudio Imbrenda
Add gpa_to_hva(), which uses memslots, and use it to replace all uses of gmap_translate(). Reviewed-by: Janosch Frank <frankja@linux.ibm.com> Reviewed-by: Christoph Schlameuss <schlameuss@linux.ibm.com> Link: https://lore.kernel.org/r/20250123144627.312456-9-imbrenda@linux.ibm.com Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Message-ID: <20250123144627.312456-9-imbrenda@linux.ibm.com>
2025-01-31KVM: s390: get rid of gmap_fault()Claudio Imbrenda
All gmap page faults are already handled in kvm by the function kvm_s390_handle_dat_fault(); only few users of gmap_fault remained, all within kvm. Convert those calls to use kvm_s390_handle_dat_fault() instead. Remove gmap_fault() entirely since it has no more users. Acked-by: Janosch Frank <frankja@linux.ibm.com> Reviewed-by: Christoph Schlameuss <schlameuss@linux.ibm.com> Link: https://lore.kernel.org/r/20250123144627.312456-8-imbrenda@linux.ibm.com Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Message-ID: <20250123144627.312456-8-imbrenda@linux.ibm.com>
2025-01-31KVM: s390: move pv gmap functions into kvmClaudio Imbrenda
Move gmap related functions from kernel/uv into kvm. Create a new file to collect gmap-related functions. Reviewed-by: Janosch Frank <frankja@linux.ibm.com> Reviewed-by: Christoph Schlameuss <schlameuss@linux.ibm.com> [fixed unpack_one(), thanks mhartmay@linux.ibm.com] Link: https://lore.kernel.org/r/20250123144627.312456-6-imbrenda@linux.ibm.com Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Message-ID: <20250123144627.312456-6-imbrenda@linux.ibm.com>
2024-11-27s390/mm: Rearrange region-third and segment table entry SW bitsGerald Schaefer
Rearrange region-third and segment table entry SW bits, in order to make room for future encoding of region/segment table swap entries. Also adjust _SEGMENT_ENTRY_GMAP_UC and _SEGMENT_ENTRY_GMAP_IN bits in gmap code. Those should only apply for gmap PMDs, and not really depend on or conflict with host PMD bits, but for consistency also adjust them: - _SEGMENT_ENTRY_GMAP_UC "dirty (migration)" was using the same bit as _SEGMENT_ENTRY_SOFT_DIRTY in the host PMD -> make it use the new SOFT_DIRTY bit 63 (0x0002) - _SEGMENT_ENTRY_GMAP_IN "invalidation notify bit" was using 0x8000, which was an unused bit in the host PMD, that is now used for _SEGMENT_ENTRY_WRITE -> make it use bit 52 (0x0800) instead, which is still unused in the host PMD This is a prerequisite for hugetlbfs PTE_MARKER support on s390, which is needed to fix a regression introduced with commit 8a13897fb0da ("mm: userfaultfd: support UFFDIO_POISON for hugetlbfs"). That commit depends on the availability of swap entries for hugetlbfs, which were not available for s390 so far. Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-10-29s390/mm/gmap: Remove gmap_{en,dis}able()Claudio Imbrenda
Remove gmap_enable(), gmap_disable(), and gmap_get_enabled() since they do not have any users anymore. Suggested-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Link: https://lore.kernel.org/r/20241022120601.167009-8-imbrenda@linux.ibm.com Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-04-18s390/mm: Re-enable the shared zeropage for !PV and !skeys KVM guestsDavid Hildenbrand
commit fa41ba0d08de ("s390/mm: avoid empty zero pages for KVM guests to avoid postcopy hangs") introduced an undesired side effect when combined with memory ballooning and VM migration: memory part of the inflated memory balloon will consume memory. Assuming we have a 100GiB VM and inflated the balloon to 40GiB. Our VM will consume ~60GiB of memory. If we now trigger a VM migration, hypervisors like QEMU will read all VM memory. As s390x does not support the shared zeropage, we'll end up allocating for all previously-inflated memory part of the memory balloon: 50 GiB. So we might easily (unexpectedly) crash the VM on the migration source. Even worse, hypervisors like QEMU optimize for zeropage migration to not consume memory on the migration destination: when migrating a "page full of zeroes", on the migration destination they check whether the target memory is already zero (by reading the destination memory) and avoid writing to the memory to not allocate memory: however, s390x will also allocate memory here, implying that also on the migration destination, we will end up allocating all previously-inflated memory part of the memory balloon. This is especially bad if actual memory overcommit was not desired, when memory ballooning is used for dynamic VM memory resizing, setting aside some memory during boot that can be added later on demand. Alternatives like virtio-mem that would avoid this issue are not yet available on s390x. There could be ways to optimize some cases in user space: before reading memory in an anonymous private mapping on the migration source, check via /proc/self/pagemap if anything is already populated. Similarly check on the migration destination before reading. While that would avoid populating tables full of shared zeropages on all architectures, it's harder to get right and performant, and requires user space changes. Further, with posctopy live migration we must place a page, so there, "avoid touching memory to avoid allocating memory" is not really possible. (Note that a previously we would have falsely inserted shared zeropages into processes using UFFDIO_ZEROPAGE where mm_forbids_zeropage() would have actually forbidden it) PV is currently incompatible with memory ballooning, and in the common case, KVM guests don't make use of storage keys. Instead of zapping zeropages when enabling storage keys / PV, that turned out to be problematic in the past, let's do exactly the same we do with KSM pages: trigger unsharing faults to replace the shared zeropages by proper anonymous folios. What about added latency when enabling storage kes? Having a lot of zeropages in applicable environments (PV, legacy guests, unittests) is unexpected. Further, KSM could today already unshare the zeropages and unmerging KSM pages when enabling storage kets would unshare the KSM-placed zeropages in the same way, resulting in the same latency. [ agordeev: Fixed sparse and checkpatch complaints and error handling ] Reviewed-by: Christian Borntraeger <borntraeger@linux.ibm.com> Tested-by: Christian Borntraeger <borntraeger@linux.ibm.com> Fixes: fa41ba0d08de ("s390/mm: avoid empty zero pages for KVM guests to avoid postcopy hangs") Signed-off-by: David Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/20240411161441.910170-3-david@redhat.com Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2022-07-13KVM: s390: pv: refactor s390_reset_accClaudio Imbrenda
Refactor s390_reset_acc so that it can be reused in upcoming patches. We don't want to hold all the locks used in a walk_page_range for too long, and the destroy page UVC does take some time to complete. Therefore we quickly gather the pages to destroy, and then destroy them without holding all the locks. The new refactored function optionally allows to return early without completing if a fatal signal is pending (and return and appropriate error code). Two wrappers are provided to call the new function. Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Reviewed-by: Janosch Frank <frankja@linux.ibm.com> Reviewed-by: Nico Boehr <nrb@linux.ibm.com> Link: https://lore.kernel.org/r/20220628135619.32410-5-imbrenda@linux.ibm.com Message-Id: <20220628135619.32410-5-imbrenda@linux.ibm.com> Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
2022-07-13KVM: s390: pv: leak the topmost page table when destroy failsClaudio Imbrenda
Each secure guest must have a unique ASCE (address space control element); we must avoid that new guests use the same page for their ASCE, to avoid errors. Since the ASCE mostly consists of the address of the topmost page table (plus some flags), we must not return that memory to the pool unless the ASCE is no longer in use. Only a successful Destroy Secure Configuration UVC will make the ASCE reusable again. If the Destroy Configuration UVC fails, the ASCE cannot be reused for a secure guest (either for the ASCE or for other memory areas). To avoid a collision, it must not be used again. This is a permanent error and the page becomes in practice unusable, so we set it aside and leak it. On failure we already leak other memory that belongs to the ultravisor (i.e. the variable and base storage for a guest) and not leaking the topmost page table was an oversight. This error (and thus the leakage) should not happen unless the hardware is broken or KVM has some unknown serious bug. Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Fixes: 29b40f105ec8d55 ("KVM: s390: protvirt: Add initial vm and cpu lifecycle handling") Reviewed-by: Janosch Frank <frankja@linux.ibm.com> Link: https://lore.kernel.org/r/20220628135619.32410-2-imbrenda@linux.ibm.com Message-Id: <20220628135619.32410-2-imbrenda@linux.ibm.com> Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
2020-09-30s390: remove orphaned function declarationsVasily Gorbik
arch/s390/pci/pci_bus.h: zpci_bus_init - only declaration left after commit 05bc1be6db4b ("s390/pci: create zPCI bus") arch/s390/include/asm/gmap.h: gmap_pte_notify - only declaration left after commit 4be130a08420 ("s390/mm: add shadow gmap support") arch/s390/include/asm/pgalloc.h: rcu_table_freelist_finish - only declaration left after commit 36409f6353fc ("[S390] use generic RCU page-table freeing code") arch/s390/include/asm/tlbflush.h: smp_ptlb_all - only declaration left after commit 5a79859ae0f3 ("s390: remove 31 bit support") arch/s390/include/asm/vtimer.h: init_cpu_vtimer - only declaration left after commit b5f87f15e200 ("s390/idle: consolidate idle functions and definitions") arch/s390/include/asm/pci.h: zpci_debug_info - only declaration left after commit 386aa051fb4b ("s390/pci: remove per device debug attribute") arch/s390/include/asm/vdso.h: vdso_alloc_boot_cpu - only declaration left after commit 4bff8cb54502 ("s390: convert to GENERIC_VDSO") arch/s390/include/asm/smp.h: smp_vcpu_scheduled - only declaration left after commit 67626fadd269 ("s390: enforce CONFIG_SMP") arch/s390/kernel/entry.h: restart_call_handler - only declaration left after commit 8b646bd75908 ("[S390] rework smp code") arch/s390/kernel/entry.h: startup_init_nobss - only declaration left after commit 2e83e0eb85ca ("s390: clean .bss before running uncompressed kernel") arch/s390/kernel/entry.h: s390_early_resume - only declaration left after commit 394216275c7d ("s390: remove broken hibernate / power management support") drivers/s390/char/raw3270.h: raw3270_request_alloc_bootmem - only declaration left after commit 33403dcfcdfd ("[S390] 3270 console: convert from bootmem to slab") drivers/s390/cio/device.h: ccw_device_schedule_sch_unregister - only declaration left after commit 37de53bb5290 ("[S390] cio: introduce ccw device todos") drivers/s390/char/tape.h: tape_hotplug_event - has only declaration since recorded git history. drivers/s390/char/tape.h: tape_oper_handler - has only declaration since recorded git history. drivers/s390/char/tape.h: tape_noper_handler - has only declaration since recorded git history. drivers/s390/char/tape_std.h: tape_std_check_locate - only declaration left after commit 161beff8f40d ("s390/tape: remove tape block leftovers") drivers/s390/char/tape_std.h: tape_std_default_handler - has only declaration since recorded git history. drivers/s390/char/tape_std.h: tape_std_unexpect_uchk_handler - has only declaration since recorded git history. drivers/s390/char/tape_std.h: tape_std_irq - has only declaration since recorded git history. drivers/s390/char/tape_std.h: tape_std_error_recovery - has only declaration since recorded git history. drivers/s390/char/tape_std.h: tape_std_error_recovery_has_failed - has only declaration since recorded git history. drivers/s390/char/tape_std.h: tape_std_error_recovery_succeded - has only declaration since recorded git history. drivers/s390/char/tape_std.h: tape_std_error_recovery_do_retry - has only declaration since recorded git history. drivers/s390/char/tape_std.h: tape_std_error_recovery_read_opposite - has only declaration since recorded git history. drivers/s390/char/tape_std.h: tape_std_error_recovery_HWBUG - has only declaration since recorded git history. Reviewed-by: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2020-02-27KVM: s390/mm: Make pages accessible before destroying the guestChristian Borntraeger
Before we destroy the secure configuration, we better make all pages accessible again. This also happens during reboot, where we reboot into a non-secure guest that then can go again into secure mode. As this "new" secure guest will have a new ID we cannot reuse the old page state. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: David Hildenbrand <david@redhat.com>
2020-02-27KVM: s390: protvirt: Secure memory is not mergeableJanosch Frank
KSM will not work on secure pages, because when the kernel reads a secure page, it will be encrypted and hence no two pages will look the same. Let's mark the guest pages as unmergeable when we transition to secure mode. Signed-off-by: Janosch Frank <frankja@linux.ibm.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> [borntraeger@de.ibm.com: patch merging, splitting, fixing] Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2020-02-27s390/mm: provide memory management functions for protected KVM guestsClaudio Imbrenda
This provides the basic ultravisor calls and page table handling to cope with secure guests: - provide arch_make_page_accessible - make pages accessible after unmapping of secure guests - provide the ultravisor commands convert to/from secure - provide the ultravisor commands pin/unpin shared - provide callbacks to make pages secure (inacccessible) - we check for the expected pin count to only make pages secure if the host is not accessing them - we fence hugetlbfs for secure pages - add missing radix-tree include into gmap.h The basic idea is that a page can have 3 states: secure, normal or shared. The hypervisor can call into a firmware function called ultravisor that allows to change the state of a page: convert from/to secure. The convert from secure will encrypt the page and make it available to the host and host I/O. The convert to secure will remove the host capability to access this page. The design is that on convert to secure we will wait until writeback and page refs are indicating no host usage. At the same time the convert from secure (export to host) will be called in common code when the refcount or the writeback bit is already set. This avoids races between convert from and to secure. Then there is also the concept of shared pages. Those are kind of secure where the host can still access those pages. We need to be notified when the guest "unshares" such a page, basically doing a convert to secure by then. There is a call "pin shared page" that we use instead of convert from secure when possible. We do use PG_arch_1 as an optimization to minimize the convert from secure/pin shared. Several comments have been added in the code to explain the logic in the relevant places. Co-developed-by: Ulrich Weigand <Ulrich.Weigand@de.ibm.com> Signed-off-by: Ulrich Weigand <Ulrich.Weigand@de.ibm.com> Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Acked-by: David Hildenbrand <david@redhat.com> Acked-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> [borntraeger@de.ibm.com: patch merging, splitting, fixing] Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2019-08-21s390/mm: use refcount_t for refcountChuhong Yuan
Reference counters are preferred to use refcount_t instead of atomic_t. This is because the implementation of refcount_t can prevent overflows and detect possible use-after-free. So convert atomic_t ref counters to refcount_t. Link: http://lkml.kernel.org/r/20190808071826.6649-1-hslester96@gmail.com Signed-off-by: Chuhong Yuan <hslester96@gmail.com> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2018-07-30s390/mm: Add huge page dirty sync supportJanosch Frank
To do dirty loging with huge pages, we protect huge pmds in the gmap. When they are written to, we unprotect them and mark them dirty. We introduce the function gmap_test_and_clear_dirty_pmd which handles dirty sync for huge pages. Signed-off-by: Janosch Frank <frankja@linux.ibm.com> Acked-by: David Hildenbrand <david@redhat.com>
2018-07-30s390/mm: Add gmap pmd notification bit settingJanosch Frank
Like for ptes, we also need invalidation notification for pmds, to make sure the guest lowcore pages are always accessible and later addition of shadowed pmds. With PMDs we do not have PGSTEs or some other bits we could use in the host PMD. Instead we pick one of the free bits in the gmap PMD. Every time a host pmd will be invalidated, we will check if the respective gmap PMD has the bit set and in that case fire up the notifier. Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
2018-07-30s390/mm: Abstract gmap notify bit settingJanosch Frank
Currently we use the software PGSTE bits PGSTE_IN_BIT and PGSTE_VSIE_BIT to notify before an invalidation occurs on a prefix page or a VSIE page respectively. Both bits are pgste specific, but are used when protecting a memory range. Let's introduce abstract GMAP_NOTIFY_* bits that will be realized into the respective bits when gmap DAT table entries are protected. Signed-off-by: Janosch Frank <frankja@linux.vnet.ibm.com> Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: David Hildenbrand <david@redhat.com>
2017-11-02License cleanup: add SPDX GPL-2.0 license identifier to files with no licenseGreg Kroah-Hartman
Many source files in the tree are missing licensing information, which makes it harder for compliance tools to determine the correct license. By default all files without license information are under the default license of the kernel, which is GPL version 2. Update the files which contain no license information with the 'GPL-2.0' SPDX license identifier. The SPDX identifier is a legally binding shorthand, which can be used instead of the full boiler plate text. This patch is based on work done by Thomas Gleixner and Kate Stewart and Philippe Ombredanne. How this work was done: Patches were generated and checked against linux-4.14-rc6 for a subset of the use cases: - file had no licensing information it it. - file was a */uapi/* one with no licensing information in it, - file was a */uapi/* one with existing licensing information, Further patches will be generated in subsequent months to fix up cases where non-standard license headers were used, and references to license had to be inferred by heuristics based on keywords. The analysis to determine which SPDX License Identifier to be applied to a file was done in a spreadsheet of side by side results from of the output of two independent scanners (ScanCode & Windriver) producing SPDX tag:value files created by Philippe Ombredanne. Philippe prepared the base worksheet, and did an initial spot review of a few 1000 files. The 4.13 kernel was the starting point of the analysis with 60,537 files assessed. Kate Stewart did a file by file comparison of the scanner results in the spreadsheet to determine which SPDX license identifier(s) to be applied to the file. She confirmed any determination that was not immediately clear with lawyers working with the Linux Foundation. Criteria used to select files for SPDX license identifier tagging was: - Files considered eligible had to be source code files. - Make and config files were included as candidates if they contained >5 lines of source - File already had some variant of a license header in it (even if <5 lines). All documentation files were explicitly excluded. The following heuristics were used to determine which SPDX license identifiers to apply. - when both scanners couldn't find any license traces, file was considered to have no license information in it, and the top level COPYING file license applied. For non */uapi/* files that summary was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 11139 and resulted in the first patch in this series. If that file was a */uapi/* path one, it was "GPL-2.0 WITH Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 WITH Linux-syscall-note 930 and resulted in the second patch in this series. - if a file had some form of licensing information in it, and was one of the */uapi/* ones, it was denoted with the Linux-syscall-note if any GPL family license was found in the file or had no licensing in it (per prior point). Results summary: SPDX license identifier # files ---------------------------------------------------|------ GPL-2.0 WITH Linux-syscall-note 270 GPL-2.0+ WITH Linux-syscall-note 169 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17 LGPL-2.1+ WITH Linux-syscall-note 15 GPL-1.0+ WITH Linux-syscall-note 14 ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5 LGPL-2.0+ WITH Linux-syscall-note 4 LGPL-2.1 WITH Linux-syscall-note 3 ((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3 ((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1 and that resulted in the third patch in this series. - when the two scanners agreed on the detected license(s), that became the concluded license(s). - when there was disagreement between the two scanners (one detected a license but the other didn't, or they both detected different licenses) a manual inspection of the file occurred. - In most cases a manual inspection of the information in the file resulted in a clear resolution of the license that should apply (and which scanner probably needed to revisit its heuristics). - When it was not immediately clear, the license identifier was confirmed with lawyers working with the Linux Foundation. - If there was any question as to the appropriate license identifier, the file was flagged for further research and to be revisited later in time. In total, over 70 hours of logged manual review was done on the spreadsheet to determine the SPDX license identifiers to apply to the source files by Kate, Philippe, Thomas and, in some cases, confirmation by lawyers working with the Linux Foundation. Kate also obtained a third independent scan of the 4.13 code base from FOSSology, and compared selected files where the other two scanners disagreed against that SPDX file, to see if there was new insights. The Windriver scanner is based on an older version of FOSSology in part, so they are related. Thomas did random spot checks in about 500 files from the spreadsheets for the uapi headers and agreed with SPDX license identifier in the files he inspected. For the non-uapi files Thomas did random spot checks in about 15000 files. In initial set of patches against 4.14-rc6, 3 files were found to have copy/paste license identifier errors, and have been fixed to reflect the correct identifier. Additionally Philippe spent 10 hours this week doing a detailed manual inspection and review of the 12,461 patched files from the initial patch version early this week with: - a full scancode scan run, collecting the matched texts, detected license ids and scores - reviewing anything where there was a license detected (about 500+ files) to ensure that the applied SPDX license was correct - reviewing anything where there was no detection but the patch license was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied SPDX license was correct This produced a worksheet with 20 files needing minor correction. This worksheet was then exported into 3 different .csv files for the different types of files to be modified. These .csv files were then reviewed by Greg. Thomas wrote a script to parse the csv files and add the proper SPDX tag to the file, in the format that the file expected. This script was further refined by Greg based on the output to detect more types of files automatically and to distinguish between header and source .c files (which need different comment types.) Finally Greg ran the script using the .csv files to generate the patches. Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-06-20KVM: s390: backup the currently enabled gmap when scheduled outDavid Hildenbrand
Nested virtualization will have to enable own gmaps. Current code would enable the wrong gmap whenever scheduled out and back in, therefore resulting in the wrong gmap being enabled. This patch reenables the last enabled gmap, therefore avoiding having to touch vcpu->arch.gmap when enabling a different gmap. Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-20s390/mm: allow to check if a gmap shadow is validDavid Hildenbrand
It will be very helpful to have a mechanism to check without any locks if a given gmap shadow is still valid and matches the given properties. Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-20s390/mm: support real-space for gmap shadowsDavid Hildenbrand
We can easily support real-space designation just like EDAT1 and EDAT2. So guest2 can provide for guest3 an asce with the real-space control being set. We simply have to allocate the biggest page table possible and fake all levels. There is no protection to consider. If we exceed guest memory, vsie code will inject an addressing exception (via program intercept). In the future, we could limit the fake table level to the gmap page table. As the top level page table can never go away, such gmap shadows will never get unshadowed, we'll have to come up with another way to limit the number of kept gmap shadows. Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-20s390/mm: support EDAT2 for gmap shadowsDavid Hildenbrand
If the guest is enabled for EDAT2, we can easily create shadows for guest2 -> guest3 provided tables that make use of EDAT2. If guest2 references a 2GB page, this memory looks consecutive for guest2, but it does not have to be so for us. Therefore we have to create fake segment and page tables. This works just like EDAT1 support, so page tables are removed when the parent table (r3t table entry) is changed. We don't hve to care about: - ACCF-Validity Control in RTTE - Access-Control Bits in RTTE - Fetch-Protection Bit in RTTE - Common-Region Bit in RTTE Just like for EDAT1, all bits might be dropped and there is no guaranteed that they are active. Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-20s390/mm: support EDAT1 for gmap shadowsDavid Hildenbrand
If the guest is enabled for EDAT1, we can easily create shadows for guest2 -> guest3 provided tables that make use of EDAT1. If guest2 references a 1MB page, this memory looks consecutive for guest2, but it might not be so for us. Therefore we have to create fake page tables. We can easily add that to our existing infrastructure. The invalidation mechanism will make sure that fake page tables are removed when the parent table (sgt table entry) is changed. As EDAT1 also introduced protection on all page table levels, we have to also shadow these correctly. We don't have to care about: - ACCF-Validity Control in STE - Access-Control Bits in STE - Fetch-Protection Bit in STE - Common-Segment Bit in STE As all bits might be dropped and there is no guaranteed that they are active ("unpredictable whether the CPU uses these bits", "may be used"). Without using EDAT1 in the shadow ourselfes (STE-format control == 0), simply shadowing these bits would not be enough. They would be ignored. Please note that we are using the "fake" flag to make this look consistent with further changes (EDAT2, real-space designation support) and don't let the shadow functions handle fc=1 stes. In the future, with huge pages in the host, gmap_shadow_pgt() could simply try to map a huge host page if "fake" is set to one and indicate via return value that no lower fake tables / shadow ptes are required. Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-20s390/mm: prepare for EDAT1/EDAT2 support in gmap shadowDavid Hildenbrand
In preparation for EDAT1/EDAT2 support for gmap shadows, we have to store the requested edat level in the gmap shadow. The edat level used during shadow translation is a property of the gmap shadow. Depending on that level, the gmap shadow will look differently for the same guest tables. We have to store it internally in order to support it later. Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-20s390/mm: fix races on gmap_shadow creationDavid Hildenbrand
Before any thread is allowed to use a gmap_shadow, it has to be fully initialized. However, for invalidation to work properly, we have to register the new gmap_shadow before we protect the parent gmap table. Because locking is tricky, and we have to avoid duplicate gmaps, let's introduce an initialized field, that signalizes other threads if that gmap_shadow can already be used or if they have to retry. Let's properly return errors using ERR_PTR() instead of simply returning NULL, so a caller can properly react on the error. Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-20s390/mm: shadow pages with real guest requested protectionDavid Hildenbrand
We really want to avoid manually handling protection for nested virtualization. By shadowing pages with the protection the guest asked us for, the SIE can handle most protection-related actions for us (e.g. special handling for MVPG) and we can directly forward protection exceptions to the guest. PTEs will now always be shadowed with the correct _PAGE_PROTECT flag. Unshadowing will take care of any guest changes to the parent PTE and any host changes to the host PTE. If the host PTE doesn't have the fitting access rights or is not available, we have to fix it up. Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-20s390/mm: add shadow gmap supportMartin Schwidefsky
For a nested KVM guest the outer KVM host needs to create shadow page tables for the nested guest. This patch adds the basic support to the guest address space (gmap) code. For each guest address space the inner KVM host creates, the first outer KVM host needs to create shadow page tables. The address space is identified by the ASCE loaded into the control register 1 at the time the inner SIE instruction for the second nested KVM guest is executed. The outer KVM host creates the shadow tables starting with the table identified by the ASCE on a on-demand basis. The outer KVM host will get repeated faults for all the shadow tables needed to run the second KVM guest. While a shadow page table for the second KVM guest is active the access to the origin region, segment and page tables needs to be restricted for the first KVM guest. For region and segment and page tables the first KVM guest may read the memory, but write attempt has to lead to an unshadow. This is done using the page invalid and read-only bits in the page table of the first KVM guest. If the first guest re-accesses one of the origin pages of a shadow, it gets a fault and the affected parts of the shadow page table hierarchy needs to be removed again. PGSTE tables don't have to be shadowed, as all interpretation assist can't deal with the invalid bits in the shadow pte being set differently than the original ones provided by the first KVM guest. Many bug fixes and improvements by David Hildenbrand. Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-20s390/mm: add reference counter to gmap structureMartin Schwidefsky
Let's use a reference counter mechanism to control the lifetime of gmap structures. This will be needed for further changes related to gmap shadows. Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-20s390/mm: extended gmap pte notifierMartin Schwidefsky
The current gmap pte notifier forces a pte into to a read-write state. If the pte is invalidated the gmap notifier is called to inform KVM that the mapping will go away. Extend this approach to allow read-write, read-only and no-access as possible target states and call the pte notifier for any change to the pte. This mechanism is used to temporarily set specific access rights for a pte without doing the heavy work of a true mprotect call. Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-20s390/mm: use RCU for gmap notifier list and the per-mm gmap listMartin Schwidefsky
The gmap notifier list and the gmap list in the mm_struct change rarely. Use RCU to optimize the reader of these lists. Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-20s390/kvm: page table invalidation notifierMartin Schwidefsky
Pass an address range to the page table invalidation notifier for KVM. This allows to notify changes that affect a larger virtual memory area, e.g. for 1MB pages. Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-03-08s390/mm: split arch/s390/mm/pgtable.cMartin Schwidefsky
The pgtable.c file is quite big, before it grows any larger split it into pgtable.c, pgalloc.c and gmap.c. In addition move the gmap related header definitions into the new gmap.h header and all of the pgste helpers from pgtable.h to pgtable.c. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>