summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2018-01-24KVM: s390: diagnoses are instructions as wellChristian Borntraeger
Make the diagnose counters also appear as instruction counters. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: Janosch Frank <frankja@linux.vnet.ibm.com> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com>
2018-01-24mt76x2: init: disable all pending tasklets during device removalLorenzo Bianconi
There is a possible race in mt76x2_stop_hardware() since pre_tbtt and dfs tasklets could run during driver cleanup. Fix it disabling all pending tasklets during device removal Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com> Acked-by: Felix Fietkau <nbd@nbd.name> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2018-01-24s390x/mm: simplify gmap_protect_rmap()David Hildenbrand
We never call it with anything but PROT_READ. This is a left over from an old prototype. For creation of shadow page tables, we always only have to protect the original table in guest memory from write accesses, so we can properly invalidate the shadow on writes. Other protections are not needed. Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20180123212618.32611-1-david@redhat.com> Reviewed-by: Janosch Frank <frankja@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2018-01-24mt76x2: dfs: take into account dfs region in mt76x2_dfs_init_params()Lorenzo Bianconi
Do not enable DFS state machine if dfs region is set to NL80211_DFS_UNSET Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2018-01-24mt76x2: dfs: add set_domain handlerLorenzo Bianconi
Add mt76x2_dfs_set_domain routine in order to properly reconfigure pattern detector when DFS domain has been changed Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2018-01-24mt76x2: dfs: avoid tasklet scheduling during mt76x2_dfs_init_params()Lorenzo Bianconi
Substitute tasklet_kill with tasklet_disable/tasklet_enable in order to guarantee dfs tasklet can not be executed during dfs parameter initialization Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2018-01-24mt76: fix transmission of encrypted management framesFelix Fietkau
Hardware encryption seems to break encrypted unicast mgmt tx. Unfortunately the hardware TXWI header does not have a bit to indicate that a frame is software encrypted, so sw-encrypted frames need to use a different WCID. For that to work, the CCMP PN needs to be generated in software, which makes things a bit slower, so only do it for keys that also need to tx management frames. Signed-off-by: Felix Fietkau <nbd@nbd.name> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2018-01-24mt76x2: fix WMM parameter configurationLorenzo Bianconi
Fix hw queue configuration since mt76x2 devices use a reverse queue enumeration respect to mac80211 one: - 0: AC_BE - 1: AC_BK - 2: AC_VI - 3: AC_VO The issue can be reproduced sending two concurrent flow using two separate queues: - VO: 20Mbps UDP traffic - BE: TCP traffic In this scenario the UDP traffic will be blocked by the TCP one. Fix it configuring properly WMM hw queue parameters Fixes: 7bc04215a66b ("mt76: add driver code for MT76x2e") Tested-by: Gaetano Catalli <gaetano.catalli@gmail.com> Signed-off-by: Gaetano Catalli <gaetano.catalli@gmail.com> Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com> Acked-by: Felix Fietkau <nbd@nbd.name> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2018-01-24spi: a3700: Remove endianness swapping for full-duplex transfersMaxime Chevallier
Fixes the following sparse warnings : line 767: warning: incorrect type in assignment (different base types) line 767: expected unsigned int [unsigned] [assigned] [usertype] val_out line 767: got restricted __le32 [usertype] <noident> line 776: warning: cast to restricted __le32 This takes advantage of readl/writel to do the endianness reordering, and removes an extra variable in the function. Fixes: f68a7dcb91b7 ("spi: a3700: Add full-duplex support") Signed-off-by: Maxime Chevallier <maxime.chevallier@smile.fr> Reviewed-by: Gregory CLEMENT <gregory.clement@free-electrons.com> Signed-off-by: Mark Brown <broonie@kernel.org>
2018-01-24spi: a3700: Remove endianness swapping functions when accessing FIFOsMaxime Chevallier
Fixes the following sparse warnings : line 504: warning: incorrect type in assignment (different base types) line 504: expected unsigned int [unsigned] [usertype] val line 504: got restricted __le32 [usertype] <noident> line 527: warning: cast to restricted __le32 This is solved by removing endian-converson functions, since the converted values are going through readl/writel anyway, which take care of the conversion. Fixes: 6fd6fd68c9e2 ("spi: armada-3700: Fix padding when sending not 4-byte aligned data") Signed-off-by: Maxime Chevallier <maxime.chevallier@smile.fr> Reviewed-by: Gregory CLEMENT <gregory.clement@free-electrons.com> Signed-off-by: Mark Brown <broonie@kernel.org>
2018-01-24KVM: s390: add proper locking for CMMA migration bitmapChristian Borntraeger
Some parts of the cmma migration bitmap is already protected with the kvm->lock (e.g. the migration start). On the other hand the read of the cmma bits is not protected against a concurrent free, neither is the emulation of the ESSA instruction. Let's extend the locking to all related ioctls by using the slots lock for - kvm_s390_vm_start_migration - kvm_s390_vm_stop_migration - kvm_s390_set_cmma_bits - kvm_s390_get_cmma_bits In addition to that, we use synchronize_srcu before freeing the migration structure as all users hold kvm->srcu for read. (e.g. the ESSA handler). Reported-by: David Hildenbrand <david@redhat.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Cc: stable@vger.kernel.org # 4.13+ Fixes: 190df4a212a7 (KVM: s390: CMMA tracking, ESSA emulation, migration mode) Reviewed-by: Claudio Imbrenda <imbrenda@linux.vnet.ibm.com> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com>
2018-01-24MIPS: Loongson fix name confict - MEM_RESERVEDHuacai Chen
MEM_RESERVED is used as a value of enum mem_type in include/linux/ edac.h. This will make failure to build for Loongson in some case: for example with CONFIG_RAS enabled. So here rename MEM_RESERVED to SYSTEM_RAM_RESERVED in Loongson code. Signed-off-by: YunQiang Su <yunqiang.su@imgtec.com> Signed-off-by: Huacai Chen <chenhc@lemote.com> Reviewed-by: James Hogan <jhogan@kernel.org> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/17724/ Signed-off-by: James Hogan <jhogan@kernel.org>
2018-01-24KVM: s390: vsie: store guest addresses of satellite blocks in vsie_pageDavid Hildenbrand
This way, the values cannot change, even if another VCPU might try to mess with the nested SCB currently getting executed by another VCPU. We now always use the same gpa for pinning and unpinning a page (for unpinning, it is only relevant to mark the guest page dirty for migration). Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20180116171526.12343-3-david@redhat.com> Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Acked-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2018-01-24KVM: s390: vsie: use READ_ONCE to access some SCB fieldsDavid Hildenbrand
Another VCPU might try to modify the SCB while we are creating the shadow SCB. In general this is no problem - unless the compiler decides to not load values once, but e.g. twice. For us, this is only relevant when checking/working with such values. E.g. the prefix value, the mso, state of transactional execution and addresses of satellite blocks. E.g. if we blindly forward values (e.g. general purpose registers or execution controls after masking), we don't care. Leaving unpin_blocks() untouched for now, will handle it separately. The worst thing right now that I can see would be a missed prefix un/remap (mso, prefix, tx) or using wrong guest addresses. Nothing critical, but let's try to avoid unpredictable behavior. Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20180116171526.12343-2-david@redhat.com> Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Acked-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2018-01-24mmc: mmci: fix error return code in mmci_probe()Wei Yongjun
Fix to return a negative error code from the error handling case instead of 0, as done elsewhere in this function. Fixes: f9bb304ce855 ("mmc: mmci: Add support for setting pad type via pinctrl") Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Reviewed-by: Patrice Chotard <patrice.chotard@st.com> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2018-01-24x86/centaur: Mark TSC invariantdavidwang
Centaur CPU has a constant frequency TSC and that TSC does not stop in C-States. But because the corresponding TSC feature flags are not set for that CPU, the TSC is treated as not constant frequency and assumed to stop in C-States, which makes it an unreliable and unusable clock source. Setting those flags tells the kernel that the TSC is usable, so it will select it over HPET. The effect of this is that reading time stamps (from kernel or user space) will be faster and more efficent. Signed-off-by: davidwang <davidwang@zhaoxin.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: qiyuanwang@zhaoxin.com Cc: linux-pm@vger.kernel.org Cc: brucechang@via-alliance.com Cc: cooperyan@zhaoxin.com Cc: benjaminpan@viatech.com Link: https://lkml.kernel.org/r/1516616057-5158-1-git-send-email-davidwang@zhaoxin.com
2018-01-24x86/microcode: Fix again accessing initrd after having been freedBorislav Petkov
Commit 24c2503255d3 ("x86/microcode: Do not access the initrd after it has been freed") fixed attempts to access initrd from the microcode loader after it has been freed. However, a similar KASAN warning was reported (stack trace edited): smpboot: Booting Node 0 Processor 1 APIC 0x11 ================================================================== BUG: KASAN: use-after-free in find_cpio_data+0x9b5/0xa50 Read of size 1 at addr ffff880035ffd000 by task swapper/1/0 CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.14.8-slack #7 Hardware name: System manufacturer System Product Name/A88X-PLUS, BIOS 3003 03/10/2016 Call Trace: dump_stack print_address_description kasan_report ? find_cpio_data __asan_report_load1_noabort find_cpio_data find_microcode_in_initrd __load_ucode_amd load_ucode_amd_ap load_ucode_ap After some investigation, it turned out that a merge was done using the wrong side to resolve, leading to picking up the previous state, before the 24c2503255d3 fix. Therefore the Fixes tag below contains a merge commit. Revert the mismerge by catching the save_microcode_in_initrd_amd() retval and thus letting the function exit with the last return statement so that initrd_gone can be set to true. Fixes: f26483eaedec ("Merge branch 'x86/urgent' into x86/microcode, to resolve conflicts") Reported-by: <higuita@gmx.net> Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Link: https://bugzilla.kernel.org/show_bug.cgi?id=198295 Link: https://lkml.kernel.org/r/20180123104133.918-2-bp@alien8.de
2018-01-24x86/microcode/intel: Extend BDW late-loading further with LLC size checkJia Zhang
Commit b94b73733171 ("x86/microcode/intel: Extend BDW late-loading with a revision check") reduced the impact of erratum BDF90 for Broadwell model 79. The impact can be reduced further by checking the size of the last level cache portion per core. Tony: "The erratum says the problem only occurs on the large-cache SKUs. So we only need to avoid the update if we are on a big cache SKU that is also running old microcode." For more details, see erratum BDF90 in document #334165 (Intel Xeon Processor E7-8800/4800 v4 Product Family Specification Update) from September 2017. Fixes: b94b73733171 ("x86/microcode/intel: Extend BDW late-loading with a revision check") Signed-off-by: Jia Zhang <zhang.jia@linux.alibaba.com> Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Tony Luck <tony.luck@intel.com> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/1516321542-31161-1-git-send-email-zhang.jia@linux.alibaba.com
2018-01-24perf/x86/amd/power: Do not load AMD power module on !AMD platformsXiao Liang
The AMD power module can be loaded on non AMD platforms, but unload fails with the following Oops: BUG: unable to handle kernel NULL pointer dereference at (null) IP: __list_del_entry_valid+0x29/0x90 Call Trace: perf_pmu_unregister+0x25/0xf0 amd_power_pmu_exit+0x1c/0xd23 [power] SyS_delete_module+0x1a8/0x2b0 ? exit_to_usermode_loop+0x8f/0xb0 entry_SYSCALL_64_fastpath+0x20/0x83 Return -ENODEV instead of 0 from the module init function if the CPU does not match. Fixes: c7ab62bfbe0e ("perf/x86/amd/power: Add AMD accumulated power reporting mechanism") Signed-off-by: Xiao Liang <xiliang@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20180122061252.6394-1-xiliang@redhat.com
2018-01-24irqdomain: Kill CONFIG_IRQ_DOMAIN_DEBUGMarc Zyngier
CONFIG_IRQ_DOMAIN_DEBUG is similar to CONFIG_GENERIC_IRQ_DEBUGFS, just with less information. Spring cleanup time. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Yang Shunyong <shunyong.yang@hxt-semitech.com> Link: https://lkml.kernel.org/r/20180117142647.23622-1-marc.zyngier@arm.com
2018-01-24x86/retpoline: Remove the esp/rsp thunkWaiman Long
It doesn't make sense to have an indirect call thunk with esp/rsp as retpoline code won't work correctly with the stack pointer register. Removing it will help compiler writers to catch error in case such a thunk call is emitted incorrectly. Fixes: 76b043848fd2 ("x86/retpoline: Add initial retpoline support") Suggested-by: Jeff Law <law@redhat.com> Signed-off-by: Waiman Long <longman@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: David Woodhouse <dwmw@amazon.co.uk> Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: Kees Cook <keescook@google.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Jiri Kosina <jikos@kernel.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org> Cc: Paul Turner <pjt@google.com> Link: https://lkml.kernel.org/r/1516658974-27852-1-git-send-email-longman@redhat.com
2018-01-24ASoC: ak4613: call dummy write for PW_MGMT1/3 when PlaybackKuninori Morimoto
Power Down Release Command (PMVR, PMDAC, RSTN, PMDA1-PMDA6) which are located on PW_MGMT1 / PW_MGMT3 register must be write again after at least 5 LRCK cycle or later on each command. Otherwise, Playback volume will be 0dB. Basically, it should be 1. PowerDownRelease by Power Management1 <= call 1.x after 5LRCK 1.x Dummy write to Power Management1 2. PowerDownRelease by Power Management3 <= call 2.x after 5LRCK 2.x Dummy write to Power Management3 To avoid too many dummy write, this patch is merging these. 1. PowerDownRelease by Power Management1 2. PowerDownRelease by Power Management3 <= call after 5LRCK 2.x Dummy write to Power Management1/3 <= merge dummy write This patch adds dummy write when Playback Start timing. Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Signed-off-by: Mark Brown <broonie@kernel.org>
2018-01-24MIPS: bcm47xx: enable ZBOOT supportAaro Koskinen
Enable ZBOOT support. The WRT54GL router's bootloader limits kernel size to 3 MB with the normal load address, which is a bit challenging vmlinux size with modern Linux. A compressed kernel allows booting much bigger kernels. Signed-off-by: Aaro Koskinen <aaro.koskinen@iki.fi> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Hauke Mehrtens <hauke@hauke-m.de> Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/18492/ Signed-off-by: James Hogan <jhogan@kernel.org>
2018-01-24MIPS: Fix trailing semicolonLuis de Bethencourt
The trailing semicolon is an empty statement that does no operation. Removing it since it doesn't do anything. Fixes: d0f0f63ac137 ("MIPS: Rewrite csum_fold to plain C.") Signed-off-by: Luis de Bethencourt <luisbg@kernel.org> Cc: Joe Perches <joe@perches.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/18517/ Signed-off-by: James Hogan <jhogan@kernel.org>
2018-01-24ASoC: soc-pcm: don't call flush_delayed_work() many times in ↵Kuninori Morimoto
soc_pcm_private_free() commit f523acebbb74 ("ASoC: add Component level pcm_new/pcm_free v2") added component level pcm_new/pcm_free, but flush_delayed_work() on soc_pcm_private_free() is called in for_each_rtdcom() loop. It doesn't need to be called many times. This patch moves it out of loop. Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Signed-off-by: Mark Brown <broonie@kernel.org>
2018-01-24ovl: wire up NFS export operationsAmir Goldstein
Now that NFS export operations are implemented, enable overlayfs NFS export support if the "nfs_export" feature is enabled. Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2018-01-24ovl: lookup indexed ancestor of lower dirAmir Goldstein
ovl_lookup_real() in lower layer walks back lower parents to find the topmost indexed parent. If an indexed ancestor is found before reaching lower layer root, ovl_lookup_real() is called recursively with upper layer to walk back from indexed upper to the topmost connected/hashed upper parent (or up to root). ovl_lookup_real() in upper layer then walks forward to connect the topmost upper overlay dir dentry and ovl_lookup_real() in lower layer continues to walk forward to connect the decoded lower overlay dir dentry. Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2018-01-24ovl: lookup connected ancestor of dir in inode cacheAmir Goldstein
Decoding a dir file handle requires walking backward up to layer root and for lower dir also checking the index to see if any of the parents have been copied up. Lookup overlay ancestor dentry in inode/dentry cache by decoded real parents to shortcut looking up all the way back to layer root. Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2018-01-24ovl: hash non-indexed dir by upper inode for NFS exportAmir Goldstein
Non-indexed upper dirs are encoded as upper file handles. When NFS export is enabled, hash non-indexed directory inodes by upper inode, so we can find them in inode cache using the decoded upper inode. When NFS export is disabled, directories are not indexed on copy up, so hash non-indexed directory inodes by origin inode, the same hash key that is used before copy up. Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2018-01-24ovl: decode pure lower dir file handlesAmir Goldstein
Similar to decoding a pure upper dir file handle, decoding a pure lower dir file handle is implemented by looking an overlay dentry of the same path as the pure lower path and verifying that the overlay dentry's real lower matches the decoded real lower file handle. Unlike the case of upper dir file handle, the lookup of overlay path by lower real path can fail or find a mismatched overlay dentry if any of the lower parents have been copied up and renamed. To address this case we will need to check if any of the lower parents are indexed. Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2018-01-24ovl: decode indexed dir file handlesAmir Goldstein
Decoding an indexed dir file handle is done by looking up the file handle in index dir by name and then decoding the upper dir from the index origin file handle. The decoded upper path is used to lookup an overlay dentry of the same path. Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2018-01-24ovl: decode lower file handles of unlinked but open filesAmir Goldstein
Lookup overlay inode in cache by origin inode, so we can decode a file handle of an open file even if the index has a whiteout index entry to mark this overlay inode was unlinked. Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2018-01-24ovl: decode indexed non-dir file handlesAmir Goldstein
Decoding an indexed non-dir file handle is similar to decoding a lower non-dir file handle, but additionally, we lookup the file handle in index dir by name to find the real upper inode. Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2018-01-24ovl: decode lower non-dir file handlesAmir Goldstein
Decoding a lower non-dir file handle is done by decoding the lower dentry from underlying lower fs, finding or allocating an overlay inode that is hashed by the real lower inode and instantiating an overlay dentry with that inode. Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2018-01-24ovl: encode lower file handlesAmir Goldstein
For indexed or lower non-dir, encode a non-connectable lower file handle from origin inode. For indexed or lower dir, when ofs->numlower == 1, encode a lower file handle from lower dir. Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2018-01-24ovl: copy up before encoding non-connectable dir file handleAmir Goldstein
Decoding a merge dir, whose origin's parent is under a redirected lower dir is not always possible. As a simple aproximation, we do not encode lower dir file handles when overlay has multiple lower layers and origin is below the topmost lower layer. We should later relax this condition and copy up only the parent that is under a redirected lower. Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2018-01-24ovl: encode non-indexed upper file handlesAmir Goldstein
We only need to encode origin if there is a chance that the same object was encoded pre copy up and then we need to stay consistent with the same encoding also after copy up. In case a non-pure upper is not indexed, then it was copied up before NFS export support was enabled. In that case, we don't need to worry about staying consistent with pre copy up encoding and we encode an upper file handle. This mitigates the problem that with no index, we cannot find an upper inode from origin inode, so we cannot decode a non-indexed upper from origin file handle. Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2018-01-24ovl: decode connected upper dir file handlesAmir Goldstein
Until this change, we decoded upper file handles by instantiating an overlay dentry from the real upper dentry. This is sufficient to handle pure upper files, but insufficient to handle merge/impure dirs. To that end, if decoded real upper dir is connected and hashed, we lookup an overlay dentry with the same path as the real upper dir. If decoded real upper is non-dir, we instantiate a disconnected overlay dentry as before this change. Because ovl_fh_to_dentry() returns a connected overlay dir dentry, exportfs never needs to call get_parent() and get_name() to reconnect an upper overlay dir. Because connectable non-dir file handles are not supported, exportfs will not be able to use fh_to_parent() and get_name() methods to reconnect a disconnected non-dir to its parent. Therefore, the methods get_parent() and get_name() are implemented just to print out a sanity warning and the method fh_to_parent() is implemented to warn the user that using the 'subtree_check' exportfs option is not supported. An alternative approach could have been to implement instantiating of an overlay directory inode from origin/index and implement get_parent() and get_name() by calling into underlying fs operations and them instantiating the overlay parent dir. The reasons for not choosing the get_parent() approach were: - Obtaining a disconnected overlay dir dentry would requires a delicate re-factoring of ovl_lookup() to get a dentry with overlay parent info. It was preferred to avoid doing that re-factoring unless it was proven worthy. - Going down the path of disconnected dir would mean that the (non trivial) code path of d_splice_alias() could be traveled and that meant writing more tests and introduces race cases that are very hard to hit on purpose. Taking the path of connecting overlay dentry by forward lookup is therefore the safe and boring way to avoid surprises. The culprits of the chosen "connected overlay dentry" approach: - We need to take special care to rename of ancestors while connecting the overlay dentry by real dentry path. These subtleties are usually handled by generic exportfs and VFS code. - In a hypothetical workload, we could end up in a loop trying to connect, interrupted by rename and restarting connect forever. Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2018-01-24ovl: decode pure upper file handlesAmir Goldstein
Decoding an upper file handle is done by decoding the upper dentry from underlying upper fs, finding or allocating an overlay inode that is hashed by the real upper inode and instantiating an overlay dentry with that inode. Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2018-01-24ovl: encode pure upper file handlesAmir Goldstein
Encode overlay file handles as struct ovl_fh containing the file handle encoding of the real upper inode. Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2018-01-24ovl: document NFS exportAmir Goldstein
Document NFS export design. Followup patches will implement this design. Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2018-01-24vfs: factor out helpers d_instantiate_anon() and d_alloc_anon()Miklos Szeredi
Those helpers are going to be used by overlayfs to implement NFS export decode. Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2018-01-24ovl: store 'has_upper' and 'opaque' as bit flagsAmir Goldstein
We need to make some room in struct ovl_entry to store information about redirected ancestors for NFS export, so cram two booleans as bit flags. Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2018-01-24ovl: copy up of disconnected dentriesAmir Goldstein
With NFS export, some operations on decoded file handles (e.g. open, link, setattr, xattr_set) may call copy up with a disconnected non-dir. In this case, we will copy up lower inode to index dir without linking it to upper dir. Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2018-01-24ovl: use d_splice_alias() in place of d_add() in lookupAmir Goldstein
This is required for NFS export. Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2018-01-24ovl: do not pass overlay dentry to ovl_get_inode()Amir Goldstein
This is needed for using ovl_get_inode() for decoding file handles for NFS export. Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2018-01-24ovl: factor out ovl_get_index_fh() helperAmir Goldstein
The helper is needed to lookup an index by file handle for NFS export. Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2018-01-24ovl: whiteout orphan index entries on mountAmir Goldstein
Orphan index entries are non-dir index entries whose union nlink count dropped to zero. With index=on, orphan index entries are removed on mount. With NFS export feature enabled, orphan index entries are replaced with white out index entries to block future open by handle from opening the lower file. When dir index has a stale 'upper' xattr, we assume that the upper dir was removed and we treat the dir index as orphan entry that needs to be whited out or removed. Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2018-01-24ovl: whiteout index when union nlink drops to zeroAmir Goldstein
With NFS export feature enabled, when overlay inode nlink drops to zero, instead of removing the index entry, replace it with a whiteout index entry. This is needed for NFS export in order to prevent future open by handle from opening the lower file directly. Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2018-01-24ovl: cleanup dir index when dir nlink drops to zeroAmir Goldstein
When non-dir index union nlink drops to zero the non-dir index is cleaned. Do the same for directory type index entries when union directory is removed. Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>