summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2022-07-29NFSD: Ensure nf_inode is never dereferencedChuck Lever
The documenting comment for struct nf_file states: /* * A representation of a file that has been opened by knfsd. These are hashed * in the hashtable by inode pointer value. Note that this object doesn't * hold a reference to the inode by itself, so the nf_inode pointer should * never be dereferenced, only used for comparison. */ Replace the two existing dereferences to make the comment always true. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-07-29NFSD: NFSv4 CLOSE should release an nfsd_file immediatelyChuck Lever
The last close of a file should enable other accessors to open and use that file immediately. Leaving the file open in the filecache prevents other users from accessing that file until the filecache garbage-collects the file -- sometimes that takes several seconds. Reported-by: Wang Yugui <wangyugui@e16-tech.com> Link: https://bugzilla.linux-nfs.org/show_bug.cgi?387 Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-07-29NFSD: Move nfsd_file_trace_alloc() tracepointChuck Lever
Avoid recording the allocation of an nfsd_file item that is immediately released because a matching item was already inserted in the hash. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-07-29NFSD: Separate tracepoints for acquire and createChuck Lever
These tracepoints collect different information: the create case does not open a file, so there's no nf_file available. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-07-29NFSD: Clean up unused code after rhashtable conversionChuck Lever
Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-07-29NFSD: Convert the filecache to use rhashtableChuck Lever
Enable the filecache hash table to start small, then grow with the workload. Smaller server deployments benefit because there should be lower memory utilization. Larger server deployments should see improved scaling with the number of open files. Suggested-by: Jeff Layton <jlayton@kernel.org> Suggested-by: Dave Chinner <david@fromorbit.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-07-29NFSD: Set up an rhashtable for the filecacheChuck Lever
Add code to initialize and tear down an rhashtable. The rhashtable is not used yet. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-07-29NFSD: Replace the "init once" mechanismChuck Lever
In a moment, the nfsd_file_hashtbl global will be replaced with an rhashtable. Replace the one or two spots that need to check if the hash table is available. We can easily reuse the SHUTDOWN flag for this purpose. Document that this mechanism relies on callers to hold the nfsd_mutex to prevent init, shutdown, and purging to run concurrently. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-07-29NFSD: Remove nfsd_file::nf_hashvalChuck Lever
The value in this field can always be computed from nf_inode, thus it is no longer used. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-07-29NFSD: nfsd_file_hash_remove can compute hashvalChuck Lever
Remove an unnecessary use of nf_hashval. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-07-29NFSD: Refactor __nfsd_file_close_inode()Chuck Lever
The code that computes the hashval is the same in both callers. To prevent them from going stale, reframe the documenting comments to remove descriptions of the underlying hash table structure, which is about to be replaced. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-07-29NFSD: nfsd_file_unhash can compute hashval from nf->nf_inodeChuck Lever
Remove an unnecessary usage of nf_hashval. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-07-29NFSD: Remove lockdep assertion from unhash_and_release_locked()Chuck Lever
IIUC, holding the hash bucket lock is needed only in nfsd_file_unhash, and there is already a lockdep assertion there. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-07-29NFSD: No longer record nf_hashval in the trace logChuck Lever
I'm about to replace nfsd_file_hashtbl with an rhashtable. The individual hash values will no longer be visible or relevant, so remove them from the tracepoints. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-07-29NFSD: Never call nfsd_file_gc() in foreground pathsChuck Lever
The checks in nfsd_file_acquire() and nfsd_file_put() that directly invoke filecache garbage collection are intended to keep cache occupancy between a low- and high-watermark. The reason to limit the capacity of the filecache is to keep filecache lookups reasonably fast. However, invoking garbage collection at those points has some undesirable negative impacts. Files that are held open by NFSv4 clients often push the occupancy of the filecache over these watermarks. At that point: - Every call to nfsd_file_acquire() and nfsd_file_put() results in an LRU walk. This has the same effect on lookup latency as long chains in the hash table. - Garbage collection will then run on every nfsd thread, causing a lot of unnecessary lock contention. - Limiting cache capacity pushes out files used only by NFSv3 clients, which are the type of files the filecache is supposed to help. To address those negative impacts, remove the direct calls to the garbage collector. Subsequent patches will address maintaining lookup efficiency as cache capacity increases. Suggested-by: Wang Yugui <wangyugui@e16-tech.com> Suggested-by: Dave Chinner <david@fromorbit.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-07-29NFSD: Fix the filecache LRU shrinkerChuck Lever
Without LRU item rotation, the shrinker visits only a few items on the end of the LRU list, and those would always be long-term OPEN files for NFSv4 workloads. That makes the filecache shrinker completely ineffective. Adopt the same strategy as the inode LRU by using LRU_ROTATE. Suggested-by: Dave Chinner <david@fromorbit.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-07-29NFSD: Leave open files out of the filecache LRUChuck Lever
There have been reports of problems when running fstests generic/531 against Linux NFS servers with NFSv4. The NFS server that hosts the test's SCRATCH_DEV suffers from CPU soft lock-ups during the test. Analysis shows that: fs/nfsd/filecache.c 482 ret = list_lru_walk(&nfsd_file_lru, 483 nfsd_file_lru_cb, 484 &head, LONG_MAX); causes nfsd_file_gc() to walk the entire length of the filecache LRU list every time it is called (which is quite frequently). The walk holds a spinlock the entire time that prevents other nfsd threads from accessing the filecache. What's more, for NFSv4 workloads, none of the items that are visited during this walk may be evicted, since they are all files that are held OPEN by NFS clients. Address this by ensuring that open files are not kept on the LRU list. Reported-by: Frank van der Linden <fllinden@amazon.com> Reported-by: Wang Yugui <wangyugui@e16-tech.com> Link: https://bugzilla.linux-nfs.org/show_bug.cgi?id=386 Suggested-by: Trond Myklebust <trond.myklebust@hammerspace.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-07-29NFSD: Trace filecache LRU activityChuck Lever
Observe the operation of garbage collection and the lifetime of filecache items. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-07-29NFSD: WARN when freeing an item still linked via nf_lruChuck Lever
Add a guardrail to prevent freeing memory that is still on a list. This includes either a dispose list or the LRU list. This is the sign of a bug, but this class of bugs can be detected so that they don't endanger system stability, especially while debugging. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-07-29NFSD: Hook up the filecache stat fileChuck Lever
There has always been the capability of exporting filecache metrics via /proc, but it was never hooked up. Let's surface these metrics to enable better observability of the filecache. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-07-29NFSD: Zero counters when the filecache is re-initializedChuck Lever
If nfsd_file_cache_init() is called after a shutdown, be sure the stat counters are reset. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-07-29NFSD: Record number of flush callsChuck Lever
Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-07-29NFSD: Report the number of items evicted by the LRU walkChuck Lever
Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-07-29NFSD: Refactor nfsd_file_lru_scan()Chuck Lever
Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-07-29NFSD: Refactor nfsd_file_gc()Chuck Lever
Refactor nfsd_file_gc() to use the new list_lru helper. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-07-29NFSD: Add nfsd_file_lru_dispose_list() helperChuck Lever
Refactor the invariant part of nfsd_file_lru_walk_list() into a separate helper function. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-07-29NFSD: Report average age of filecache itemsChuck Lever
This is a measure of how long items stay in the filecache, to help assess how efficient the cache is. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-07-29NFSD: Report count of freed filecache itemsChuck Lever
Surface the count of freed nfsd_file items. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-07-29NFSD: Report count of calls to nfsd_file_acquire()Chuck Lever
Count the number of successful acquisitions that did not create a file (ie, acquisitions that do not result in a compulsory cache miss). This count can be compared directly with the reported hit count to compute a hit ratio. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-07-29NFSD: Report filecache LRU sizeChuck Lever
Surface the NFSD filecache's LRU list length to help field troubleshooters monitor filecache issues. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-07-29NFSD: Demote a WARN to a pr_warn()Chuck Lever
The call trace doesn't add much value, but it sure is noisy. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-07-29SUNRPC: Fix server-side fault injection documentationChuck Lever
Fixes: 37324e6bb120 ("SUNRPC: Cache deferral injection") Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-07-29nfsd: remove redundant assignment to variable lenColin Ian King
Variable len is being assigned a value zero and this is never read, it is being re-assigned later. The assignment is redundant and can be removed. Cleans up clang scan-build warning: fs/nfsd/nfsctl.c:636:2: warning: Value stored to 'len' is never read Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-07-29NFSD: Fix space and spelling mistakeZhang Jiaming
Add a blank space after ','. Change 'succesful' to 'successful'. Signed-off-by: Zhang Jiaming <jiaming@nfschina.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-07-29NFSD: Instrument fh_verify()Chuck Lever
Capture file handles and how they map to local inodes. In particular, NFSv4 PUTFH uses fh_verify() so we can now observe which file handles are the target of OPEN, LOOKUP, RENAME, and so on. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-07-29SUNRPC: Expand the svc_alloc_arg_err tracepointChuck Lever
Record not only the number of pages requested, but the number of pages that were actually allocated, to get a measure of progress (or lack thereof). Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-07-29NLM: Defend against file_lock changes after vfs_test_lock()Benjamin Coddington
Instead of trusting that struct file_lock returns completely unchanged after vfs_test_lock() when there's no conflicting lock, stash away our nlm_lockowner reference so we can properly release it for all cases. This defends against another file_lock implementation overwriting fl_owner when the return type is F_UNLCK. Reported-by: Roberto Bergantinos Corpas <rbergant@redhat.com> Tested-by: Roberto Bergantinos Corpas <rbergant@redhat.com> Signed-off-by: Benjamin Coddington <bcodding@redhat.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-07-29SUNRPC: Fix xdr_encode_bool()Chuck Lever
I discovered that xdr_encode_bool() was returning the same address that was passed in the @p parameter. The documenting comment states that the intent is to return the address of the next buffer location, just like the other "xdr_encode_*" helpers. The result was the encoded results of NFSv3 PATHCONF operations were not formed correctly. Fixes: ded04a587f6c ("NFSD: Update the NFSv3 PATHCONF3res encoder to use struct xdr_stream") Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Jeff Layton <jlayton@kernel.org>
2022-07-29nfsd: eliminate the NFSD_FILE_BREAK_* flagsJeff Layton
We had a report from the spring Bake-a-thon of data corruption in some nfstest_interop tests. Looking at the traces showed the NFS server allowing a v3 WRITE to proceed while a read delegation was still outstanding. Currently, we only set NFSD_FILE_BREAK_* flags if NFSD_MAY_NOT_BREAK_LEASE was set when we call nfsd_file_alloc. NFSD_MAY_NOT_BREAK_LEASE was intended to be set when finding files for COMMIT ops, where we need a writeable filehandle but don't need to break read leases. It doesn't make any sense to consult that flag when allocating a file since the file may be used on subsequent calls where we do want to break the lease (and the usage of it here seems to be reverse from what it should be anyway). Also, after calling nfsd_open_break_lease, we don't want to clear the BREAK_* bits. A lease could end up being set on it later (more than once) and we need to be able to break those leases as well. This means that the NFSD_FILE_BREAK_* flags now just mirror NFSD_MAY_{READ,WRITE} flags, so there's no need for them at all. Just drop those flags and unconditionally call nfsd_open_break_lease every time. Reported-by: Olga Kornieskaia <kolga@netapp.com> Link: https://bugzilla.redhat.com/show_bug.cgi?id=2107360 Fixes: 65294c1f2c5e (nfsd: add a new struct file caching facility to nfsd) Cc: <stable@vger.kernel.org> # 5.4.x : bb283ca18d1e NFSD: Clean up the show_nf_flags() macro Cc: <stable@vger.kernel.org> # 5.4.x Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-07-29clk: fixed-factor: Introduce *clk_hw_register_fixed_factor_parent_hw()Marijn Suijten
Add the devres and non-devres variant of clk_hw_register_fixed_factor_parent_hw() for registering a fixed factor clock with clk_hw parent pointer instead of parent name. Signed-off-by: Marijn Suijten <marijn.suijten@somainline.org> Link: https://lore.kernel.org/r/20220629225331.357308-4-marijn.suijten@somainline.org Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2022-07-29clk: mux: Introduce devm_clk_hw_register_mux_parent_hws()Marijn Suijten
Add the devres variant of clk_hw_register_mux_hws() for registering a mux clock with clk_hw parent pointers instead of parent names. Signed-off-by: Marijn Suijten <marijn.suijten@somainline.org> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Link: https://lore.kernel.org/r/20220629225331.357308-3-marijn.suijten@somainline.org Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2022-07-29clk: divider: Introduce devm_clk_hw_register_divider_parent_hw()Marijn Suijten
Add the devres variant of clk_hw_register_divider_parent_hw() for registering a divider clock with clk_hw parent pointer instead of parent name. Signed-off-by: Marijn Suijten <marijn.suijten@somainline.org> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Link: https://lore.kernel.org/r/20220629225331.357308-2-marijn.suijten@somainline.org Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2022-07-29dt-bindings: eeprom: microchip,93lc46b: move to eeprom directoryKrzysztof Kozlowski
Move the Atmel/Microchip 93xx46 SPI compatible EEPROM family bindings from misc to eeprom directory to properly match subsystem. Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/r/20220727164424.386499-2-krzysztof.kozlowski@linaro.org
2022-07-29dt-bindings: eeprom: at25: use spi-peripheral-props.yamlKrzysztof Kozlowski
Instead of listing directly properties typical for SPI peripherals, reference the spi-peripheral-props.yaml schema. This allows using all properties typical for SPI-connected devices, even these which device bindings author did not tried yet. Remove the spi-* properties which now come via spi-peripheral-props.yaml schema, except for the cases when device schema adds some constraints like maximum frequency. While changing additionalProperties->unevaluatedProperties, put it in typical place, just before example DTS. Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/r/20220727164424.386499-1-krzysztof.kozlowski@linaro.org
2022-07-29dt-bindings: display: use spi-peripheral-props.yamlKrzysztof Kozlowski
Instead of listing directly properties typical for SPI peripherals, reference the spi-peripheral-props.yaml schema. This allows using all properties typical for SPI-connected devices, even these which device bindings author did not tried yet. Remove the spi-* properties which now come via spi-peripheral-props.yaml schema, except for the cases when device schema adds some constraints like maximum frequency. While changing additionalProperties->unevaluatedProperties, put it in typical place, just before example DTS. The sitronix,st7735r references also panel-common.yaml and lists explicitly allowed properties, thus here reference only spi-peripheral-props.yaml for purpose of documenting the SPI slave device and bringing spi-max-frequency type validation. Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Acked-by: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/r/20220727164312.385836-1-krzysztof.kozlowski@linaro.org
2022-07-30random: correct spelling of "overwrites"Jason A. Donenfeld
It was missing an 'r'. Fixes: 186873c549df ("random: use simpler fast key erasure flow on per-cpu keys") Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2022-07-29Merge tag 'block-5.19-2022-07-29' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull block fix from Jens Axboe: "Just a single fix for NVMe, yet another quirk addition" * tag 'block-5.19-2022-07-29' of git://git.kernel.dk/linux-block: nvme-pci: Crucial P2 has bogus namespace ids
2022-07-29bpf: Remove unneeded semicolonYang Li
Eliminate the following coccicheck warning: /kernel/bpf/trampoline.c:101:2-3: Unneeded semicolon Reported-by: Abaci Robot <abaci@linux.alibaba.com> Signed-off-by: Yang Li <yang.lee@linux.alibaba.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220725222733.55613-1-yang.lee@linux.alibaba.com
2022-07-29libbpf: Add bpf_obj_get_opts()Joe Burton
Add an extensible variant of bpf_obj_get() capable of setting the `file_flags` parameter. This parameter is needed to enable unprivileged access to BPF maps. Without a method like this, users must manually make the syscall. Signed-off-by: Joe Burton <jevburton@google.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220729202727.3311806-1-jevburton.kernel@gmail.com
2022-07-29netdevsim: Avoid allocation warnings triggered from user spaceJakub Kicinski
We need to suppress warnings from sily map sizes. Also switch from GFP_USER to GFP_KERNEL_ACCOUNT, I'm pretty sure I misunderstood the flags when writing this code. Fixes: 395cacb5f1a0 ("netdevsim: bpf: support fake map offload") Reported-by: syzbot+ad24705d3fd6463b18c6@syzkaller.appspotmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220726213605.154204-1-kuba@kernel.org