summaryrefslogtreecommitdiff
path: root/include/linux
AgeCommit message (Collapse)Author
2022-01-07fscache: Provide a means to begin an operationDavid Howells
Provide a function to begin a read operation: int fscache_begin_read_operation( struct netfs_cache_resources *cres, struct fscache_cookie *cookie) This is primarily intended to be called by network filesystems on behalf of netfslib, but may also be called to use the I/O access functions directly. It attaches the resources required by the cache to cres struct from the supplied cookie. This holds access to the cache behind the cookie for the duration of the operation and forces cache withdrawal and cookie invalidation to perform synchronisation on the operation. cres->inval_counter is set from the cookie at this point so that it can be compared at the end of the operation. Note that this does not guarantee that the cache state is fully set up and able to perform I/O immediately; looking up and creation may be left in progress in the background. The operations intended to be called by the network filesystem, such as reading and writing, are expected to wait for the cookie to move to the correct state. This will, however, potentially sleep, waiting for a certain minimum state to be set or for operations such as invalidate to advance far enough that I/O can resume. Also provide a function for the cache to call to wait for the cache object to get to a state where it can be used for certain things: bool fscache_wait_for_operation(struct netfs_cache_resources *cres, enum fscache_want_stage stage); This looks at the cache resources provided by the begin function and waits for them to get to an appropriate stage. There's a choice of wanting just some parameters (FSCACHE_WANT_PARAM) or the ability to do I/O (FSCACHE_WANT_READ or FSCACHE_WANT_WRITE). Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> cc: linux-cachefs@redhat.com Link: https://lore.kernel.org/r/163819603692.215744.146724961588817028.stgit@warthog.procyon.org.uk/ # v1 Link: https://lore.kernel.org/r/163906910672.143852.13856103384424986357.stgit@warthog.procyon.org.uk/ # v2 Link: https://lore.kernel.org/r/163967110245.1823006.2239170567540431836.stgit@warthog.procyon.org.uk/ # v3 Link: https://lore.kernel.org/r/164021513617.640689.16627329360866150606.stgit@warthog.procyon.org.uk/ # v4
2022-01-07fscache: Implement cookie invalidationDavid Howells
Add a function to invalidate the cache behind a cookie: void fscache_invalidate(struct fscache_cookie *cookie, const void *aux_data, loff_t size, unsigned int flags) This causes any cached data for the specified cookie to be discarded. If the cookie is marked as being in use, a new cache object will be created if possible and future I/O will use that instead. In-flight I/O should be abandoned (writes) or reconsidered (reads). Each time it is called cookie->inval_counter is incremented and this can be used to detect invalidation at the end of an I/O operation. The coherency data attached to the cookie can be updated and the cookie size should be reset. One flag is available, FSCACHE_INVAL_DIO_WRITE, which should be used to indicate invalidation due to a DIO write on a file. This will temporarily disable caching for this cookie. Changes ======= ver #2: - Should only change to inval state if can get access to cache. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> cc: linux-cachefs@redhat.com Link: https://lore.kernel.org/r/163819602231.215744.11206598147269491575.stgit@warthog.procyon.org.uk/ # v1 Link: https://lore.kernel.org/r/163906909707.143852.18056070560477964891.stgit@warthog.procyon.org.uk/ # v2 Link: https://lore.kernel.org/r/163967107447.1823006.5945029409592119962.stgit@warthog.procyon.org.uk/ # v3 Link: https://lore.kernel.org/r/164021512640.640689.11418616313147754172.stgit@warthog.procyon.org.uk/ # v4
2022-01-07fscache: Implement cookie user counting and resource pinningDavid Howells
Provide a pair of functions to count the number of users of a cookie (open files, writeback, invalidation, resizing, reads, writes), to obtain and pin resources for the cookie and to prevent culling for the whilst there are users. The first function marks a cookie as being in use: void fscache_use_cookie(struct fscache_cookie *cookie, bool will_modify); The caller should indicate the cookie to use and whether or not the caller is in a context that may modify the cookie (e.g. a file open O_RDWR). If the cookie is not already resourced, fscache will ask the cache backend in the background to do whatever it needs to look up, create or otherwise obtain the resources necessary to access data. This is pinned to the cookie and may not be culled, though it may be withdrawn if the cache as a whole is withdrawn. The second function removes the in-use mark from a cookie and, optionally, updates the coherency data: void fscache_unuse_cookie(struct fscache_cookie *cookie, const void *aux_data, const loff_t *object_size); If non-NULL, the aux_data buffer and/or the object_size will be saved into the cookie and will be set on the backing store when the object is committed. If this removes the last usage on a cookie, the cookie is placed onto an LRU list from which it will be removed and closed after a couple of seconds if it doesn't get reused. This prevents resource overload in the cache - in particular it prevents it from holding too many files open. Changes ======= ver #2: - Fix fscache_unuse_cookie() to use atomic_dec_and_lock() to avoid a potential race if the cookie gets reused before it completes the unusement. - Added missing transition to LRU_DISCARDING state. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> cc: linux-cachefs@redhat.com Link: https://lore.kernel.org/r/163819600612.215744.13678350304176542741.stgit@warthog.procyon.org.uk/ # v1 Link: https://lore.kernel.org/r/163906907567.143852.16979631199380722019.stgit@warthog.procyon.org.uk/ # v2 Link: https://lore.kernel.org/r/163967106467.1823006.6790864931048582667.stgit@warthog.procyon.org.uk/ # v3 Link: https://lore.kernel.org/r/164021511674.640689.10084988363699111860.stgit@warthog.procyon.org.uk/ # v4
2022-01-07fscache: Implement simple cookie state machineDavid Howells
Implement a very simple cookie state machine to handle lookup, invalidation, withdrawal, relinquishment and, to be added later, commit on LRU discard. Three cache methods are provided: ->lookup_cookie() to look up and, if necessary, create a data storage object; ->withdraw_cookie() to free the resources associated with that object and potentially delete it; and ->prepare_to_write(), to do prepare for changes to the cached data to be modified locally. Changes ======= ver #3: - Fix a race between LRU discard and relinquishment whereby the former would override the latter and thus the latter would never happen[1]. ver #2: - Don't hold n_accesses elevated whilst cache is bound to a cookie, but rather add a flag that prevents the state machine from being queued when n_accesses reaches 0. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> cc: linux-cachefs@redhat.com Link: https://lore.kernel.org/r/599331.1639410068@warthog.procyon.org.uk/ [1] Link: https://lore.kernel.org/r/163819599657.215744.15799615296912341745.stgit@warthog.procyon.org.uk/ # v1 Link: https://lore.kernel.org/r/163906903925.143852.1805855338154353867.stgit@warthog.procyon.org.uk/ # v2 Link: https://lore.kernel.org/r/163967105456.1823006.14730395299835841776.stgit@warthog.procyon.org.uk/ # v3 Link: https://lore.kernel.org/r/164021510706.640689.7961423370243272583.stgit@warthog.procyon.org.uk/ # v4
2022-01-07fscache: Add a function for a cache backend to note an I/O errorDavid Howells
Add a function to the backend API to note an I/O error in a cache. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> cc: linux-cachefs@redhat.com Link: https://lore.kernel.org/r/163819598741.215744.891281275151382095.stgit@warthog.procyon.org.uk/ # v1 Link: https://lore.kernel.org/r/163906901316.143852.15225412215771586528.stgit@warthog.procyon.org.uk/ # v2 Link: https://lore.kernel.org/r/163967100721.1823006.16435671567428949398.stgit@warthog.procyon.org.uk/ # v3 Link: https://lore.kernel.org/r/164021508840.640689.11902836226570620424.stgit@warthog.procyon.org.uk/ # v4
2022-01-07fscache: Provide and use cache methods to lookup/create/free a volumeDavid Howells
Add cache methods to lookup, create and remove a volume. Looking up or creating the volume requires the cache pinning for access; freeing the volume requires the volume pinning for access. The ->acquire_volume() method is used to ask the cache backend to lookup and, if necessary, create a volume; the ->free_volume() method is used to free the resources for a volume. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> cc: linux-cachefs@redhat.com Link: https://lore.kernel.org/r/163819597821.215744.5225318658134989949.stgit@warthog.procyon.org.uk/ # v1 Link: https://lore.kernel.org/r/163906898645.143852.8537799955945956818.stgit@warthog.procyon.org.uk/ # v2 Link: https://lore.kernel.org/r/163967099771.1823006.1455197910571061835.stgit@warthog.procyon.org.uk/ # v3 Link: https://lore.kernel.org/r/164021507345.640689.4073511598838843040.stgit@warthog.procyon.org.uk/ # v4
2022-01-07fscache: Implement functions add/remove a cacheDavid Howells
Implement functions to allow the cache backend to add or remove a cache: (1) Declare a cache to be live: int fscache_add_cache(struct fscache_cache *cache, const struct fscache_cache_ops *ops, void *cache_priv); Take a previously acquired cache cookie, set the operations table and private data and mark the cache open for access. (2) Withdraw a cache from service: void fscache_withdraw_cache(struct fscache_cache *cache); This marks the cache as withdrawn and thus prevents further cache-level and volume-level accesses. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> cc: linux-cachefs@redhat.com Link: https://lore.kernel.org/r/163819596022.215744.8799712491432238827.stgit@warthog.procyon.org.uk/ # v1 Link: https://lore.kernel.org/r/163906896599.143852.17049208999019262884.stgit@warthog.procyon.org.uk/ # v2 Link: https://lore.kernel.org/r/163967097870.1823006.3470041000971522030.stgit@warthog.procyon.org.uk/ # v3 Link: https://lore.kernel.org/r/164021505541.640689.1819714759326331054.stgit@warthog.procyon.org.uk/ # v4
2022-01-07fscache: Implement cookie-level access helpersDavid Howells
Add a number of helper functions to manage access to a cookie, pinning the cache object in place for the duration to prevent cache withdrawal from removing it: (1) void fscache_init_access_gate(struct fscache_cookie *cookie); This function initialises the access count when a cache binds to a cookie. An extra ref is taken on the access count to prevent wakeups while the cache is active. We're only interested in the wakeup when a cookie is being withdrawn and we're waiting for it to quiesce - at which point the counter will be decremented before the wait. The FSCACHE_COOKIE_NACC_ELEVATED flag is set on the cookie to keep track of the extra ref in order to handle a race between relinquishment and withdrawal both trying to drop the extra ref. (2) bool fscache_begin_cookie_access(struct fscache_cookie *cookie, enum fscache_access_trace why); This function attempts to begin access upon a cookie, pinning it in place if it's cached. If successful, it returns true and leaves a the access count incremented. (3) void fscache_end_cookie_access(struct fscache_cookie *cookie, enum fscache_access_trace why); This function drops the access count obtained by (2), permitting object withdrawal to take place when it reaches zero. A tracepoint is provided to track changes to the access counter on a cookie. Changes ======= ver #2: - Don't hold n_accesses elevated whilst cache is bound to a cookie, but rather add a flag that prevents the state machine from being queued when n_accesses reaches 0. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> cc: linux-cachefs@redhat.com Link: https://lore.kernel.org/r/163819595085.215744.1706073049250505427.stgit@warthog.procyon.org.uk/ # v1 Link: https://lore.kernel.org/r/163906895313.143852.10141619544149102193.stgit@warthog.procyon.org.uk/ # v2 Link: https://lore.kernel.org/r/163967095980.1823006.1133648159424418877.stgit@warthog.procyon.org.uk/ # v3 Link: https://lore.kernel.org/r/164021503063.640689.8870918985269528670.stgit@warthog.procyon.org.uk/ # v4
2022-01-07fscache: Implement volume-level access helpersDavid Howells
Add a pair of helper functions to manage access to a volume, pinning the volume in place for the duration to prevent cache withdrawal from removing it: bool fscache_begin_volume_access(struct fscache_volume *volume, enum fscache_access_trace why); void fscache_end_volume_access(struct fscache_volume *volume, enum fscache_access_trace why); The way the access gate on the volume works/will work is: (1) If the cache tests as not live (state is not FSCACHE_CACHE_IS_ACTIVE), then we return false to indicate access was not permitted. (2) If the cache tests as live, then we increment the volume's n_accesses count and then recheck the cache liveness, ending the access if it ceased to be live. (3) When we end the access, we decrement the volume's n_accesses and wake up the any waiters if it reaches 0. (4) Whilst the cache is caching, the volume's n_accesses is kept artificially incremented to prevent wakeups from happening. (5) When the cache is taken offline, the state is changed to prevent new accesses, the volume's n_accesses is decremented and we wait for it to become 0. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> cc: linux-cachefs@redhat.com Link: https://lore.kernel.org/r/163819594158.215744.8285859817391683254.stgit@warthog.procyon.org.uk/ # v1 Link: https://lore.kernel.org/r/163906894315.143852.5454793807544710479.stgit@warthog.procyon.org.uk/ # v2 Link: https://lore.kernel.org/r/163967095028.1823006.9173132503876627466.stgit@warthog.procyon.org.uk/ # v3 Link: https://lore.kernel.org/r/164021501546.640689.9631510472149608443.stgit@warthog.procyon.org.uk/ # v4
2022-01-07fscache: Implement cookie registrationDavid Howells
Add functions to the fscache API to allow data file cookies to be acquired and relinquished by the network filesystem. It is intended that the filesystem will create such cookies per-inode under a volume. To request a cookie, the filesystem should call: struct fscache_cookie * fscache_acquire_cookie(struct fscache_volume *volume, u8 advice, const void *index_key, size_t index_key_len, const void *aux_data, size_t aux_data_len, loff_t object_size) The filesystem must first have created a volume cookie, which is passed in here. If it passes in NULL then the function will just return a NULL cookie. A binary key should be passed in index_key and is of size index_key_len. This is saved in the cookie and is used to locate the associated data in the cache. A coherency data buffer of size aux_data_len will be allocated and initialised from the buffer pointed to by aux_data. This is used to validate cache objects when they're opened and is stored on disk with them when they're committed. The data is stored in the cookie and will be updateable by various functions in later patches. The object_size must also be given. This is also used to perform a coherency check and to size the backing storage appropriately. This function disallows a cookie from being acquired twice in parallel, though it will cause the second user to wait if the first is busy relinquishing its cookie. When a network filesystem has finished with a cookie, it should call: void fscache_relinquish_cookie(struct fscache_volume *volume, bool retire) If retire is true, any backing data will be discarded immediately. Changes ======= ver #3: - fscache_hash()'s size parameter is now in bytes. Use __le32 as the unit to round up to. - When comparing cookies, simply see if the attributes are the same rather than subtracting them to produce a strcmp-style return[1]. - Add a check to see if the cookie is still hashed at the point of freeing. ver #2: - Don't hold n_accesses elevated whilst cache is bound to a cookie, but rather add a flag that prevents the state machine from being queued when n_accesses reaches 0. - Remove the unused cookie pointer field from the fscache_acquire tracepoint. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> cc: linux-cachefs@redhat.com Link: https://lore.kernel.org/r/CAHk-=whtkzB446+hX0zdLsdcUJsJ=8_-0S1mE_R+YurThfUbLA@mail.gmail.com/ [1] Link: https://lore.kernel.org/r/163819590658.215744.14934902514281054323.stgit@warthog.procyon.org.uk/ # v1 Link: https://lore.kernel.org/r/163906891983.143852.6219772337558577395.stgit@warthog.procyon.org.uk/ # v2 Link: https://lore.kernel.org/r/163967088507.1823006.12659006350221417165.stgit@warthog.procyon.org.uk/ # v3 Link: https://lore.kernel.org/r/164021498432.640689.12743483856927722772.stgit@warthog.procyon.org.uk/ # v4
2022-01-07fscache: Implement volume registrationDavid Howells
Add functions to the fscache API to allow volumes to be acquired and relinquished by the network filesystem. A volume is an index of data storage cache objects. A volume is represented by a volume cookie in the API. A filesystem would typically create a volume for a superblock and then create per-inode cookies within it. To request a volume, the filesystem calls: struct fscache_volume * fscache_acquire_volume(const char *volume_key, const char *cache_name, const void *coherency_data, size_t coherency_len) The volume_key is a printable string used to match the volume in the cache. It should not contain any '/' characters. For AFS, for example, this would be "afs,<cellname>,<volume_id>", e.g. "afs,example.com,523001". The cache_name can be NULL, but if not it should be a string indicating the name of the cache to use if there's more than one available. The coherency data, if given, is an arbitrarily-sized blob that's attached to the volume and is compared when the volume is looked up. If it doesn't match, the old volume is judged to be out of date and it and everything within it is discarded. Acquiring a volume twice concurrently is disallowed, though the function will wait if an old volume cookie is being relinquishing. When a network filesystem has finished with a volume, it should return the volume cookie by calling: void fscache_relinquish_volume(struct fscache_volume *volume, const void *coherency_data, bool invalidate) If invalidate is true, the entire volume will be discarded; if false, the volume will be synced and the coherency data will be updated. Changes ======= ver #4: - Removed an extraneous param from kdoc on fscache_relinquish_volume()[3]. ver #3: - fscache_hash()'s size parameter is now in bytes. Use __le32 as the unit to round up to. - When comparing cookies, simply see if the attributes are the same rather than subtracting them to produce a strcmp-style return[2]. - Make the coherency data an arbitrary blob rather than a u64, but don't store it for the moment. ver #2: - Fix error check[1]. - Make a fscache_acquire_volume() return errors, including EBUSY if a conflicting volume cookie already exists. No error is printed now - that's left to the netfs. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> cc: linux-cachefs@redhat.com Link: https://lore.kernel.org/r/20211203095608.GC2480@kili/ [1] Link: https://lore.kernel.org/r/CAHk-=whtkzB446+hX0zdLsdcUJsJ=8_-0S1mE_R+YurThfUbLA@mail.gmail.com/ [2] Link: https://lore.kernel.org/r/20211220224646.30e8205c@canb.auug.org.au/ [3] Link: https://lore.kernel.org/r/163819588944.215744.1629085755564865996.stgit@warthog.procyon.org.uk/ # v1 Link: https://lore.kernel.org/r/163906890630.143852.13972180614535611154.stgit@warthog.procyon.org.uk/ # v2 Link: https://lore.kernel.org/r/163967086836.1823006.8191672796841981763.stgit@warthog.procyon.org.uk/ # v3 Link: https://lore.kernel.org/r/164021495816.640689.4403156093668590217.stgit@warthog.procyon.org.uk/ # v4
2022-01-07fscache: Implement cache registrationDavid Howells
Implement a register of caches and provide functions to manage it. Two functions are provided for the cache backend to use: (1) Acquire a cache cookie: struct fscache_cache *fscache_acquire_cache(const char *name) This gets the cache cookie for a cache of the specified name and moves it to the preparation state. If a nameless cache cookie exists, that will be given this name and used. (2) Relinquish a cache cookie: void fscache_relinquish_cache(struct fscache_cache *cache); This relinquishes a cache cookie, cleans it and makes it available if it's still referenced by a network filesystem. Note that network filesystems don't deal with cache cookies directly, but rather go straight to the volume registration. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> cc: linux-cachefs@redhat.com Link: https://lore.kernel.org/r/163819587157.215744.13523139317322503286.stgit@warthog.procyon.org.uk/ # v1 Link: https://lore.kernel.org/r/163906889665.143852.10378009165231294456.stgit@warthog.procyon.org.uk/ # v2 Link: https://lore.kernel.org/r/163967085081.1823006.2218944206363626210.stgit@warthog.procyon.org.uk/ # v3 Link: https://lore.kernel.org/r/164021494847.640689.10109692261640524343.stgit@warthog.procyon.org.uk/ # v4
2022-01-07fscache: Introduce new driverDavid Howells
Introduce basic skeleton of the new, rewritten fscache driver. Changes ======= ver #3: - Use remove_proc_subtree(), not remove_proc_entry() to remove a populated dir. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> cc: linux-cachefs@redhat.com Link: https://lore.kernel.org/r/163819584034.215744.4290533472390439030.stgit@warthog.procyon.org.uk/ # v1 Link: https://lore.kernel.org/r/163906887770.143852.3577888294989185666.stgit@warthog.procyon.org.uk/ # v2 Link: https://lore.kernel.org/r/163967080039.1823006.5702921801104057922.stgit@warthog.procyon.org.uk/ # v3 Link: https://lore.kernel.org/r/164021491014.640689.4292699878317589512.stgit@warthog.procyon.org.uk/ # v4
2022-01-07netfs: Pass a flag to ->prepare_write() to say if there's no alloc'd spaceDavid Howells
Pass a flag to ->prepare_write() to indicate if there's definitely no space allocated in the cache yet (for instance if we've already checked as we were asked to do a read). Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> cc: linux-cachefs@redhat.com Link: https://lore.kernel.org/r/163819583123.215744.12783808230464471417.stgit@warthog.procyon.org.uk/ # v1 Link: https://lore.kernel.org/r/163906886835.143852.6689886781122679769.stgit@warthog.procyon.org.uk/ # v2 Link: https://lore.kernel.org/r/163967079100.1823006.12889542712309574359.stgit@warthog.procyon.org.uk/ # v3 Link: https://lore.kernel.org/r/164021489334.640689.3131206613015409076.stgit@warthog.procyon.org.uk/ # v4
2022-01-07fscache: Remove the contents of the fscache driver, pending rewriteDavid Howells
Remove the code that comprises the fscache driver as it's going to be substantially rewritten, with the majority of the code being erased in the rewrite. A small piece of linux/fscache.h is left as that is #included by a bunch of network filesystems. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> cc: linux-cachefs@redhat.com Link: https://lore.kernel.org/r/163819578724.215744.18210619052245724238.stgit@warthog.procyon.org.uk/ # v1 Link: https://lore.kernel.org/r/163906884814.143852.6727245089843862889.stgit@warthog.procyon.org.uk/ # v2 Link: https://lore.kernel.org/r/163967077097.1823006.1377665951499979089.stgit@warthog.procyon.org.uk/ # v3 Link: https://lore.kernel.org/r/164021485548.640689.13876080567388696162.stgit@warthog.procyon.org.uk/ # v4
2022-01-06Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-nextJakub Kicinski
Alexei Starovoitov says: ==================== pull-request: bpf-next 2022-01-06 We've added 41 non-merge commits during the last 2 day(s) which contain a total of 36 files changed, 1214 insertions(+), 368 deletions(-). The main changes are: 1) Various fixes in the verifier, from Kris and Daniel. 2) Fixes in sockmap, from John. 3) bpf_getsockopt fix, from Kuniyuki. 4) INET_POST_BIND fix, from Menglong. 5) arm64 JIT fix for bpf pseudo funcs, from Hou. 6) BPF ISA doc improvements, from Christoph. * https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (41 commits) bpf: selftests: Add bind retry for post_bind{4, 6} bpf: selftests: Use C99 initializers in test_sock.c net: bpf: Handle return value of BPF_CGROUP_RUN_PROG_INET{4,6}_POST_BIND() bpf/selftests: Test bpf_d_path on rdonly_mem. libbpf: Add documentation for bpf_map batch operations selftests/bpf: Don't rely on preserving volatile in PT_REGS macros in loop3 xdp: Add xdp_do_redirect_frame() for pre-computed xdp_frames xdp: Move conversion to xdp_frame out of map functions page_pool: Store the XDP mem id page_pool: Add callback to init pages when they are allocated xdp: Allow registering memory model without rxq reference samples/bpf: xdpsock: Add timestamp for Tx-only operation samples/bpf: xdpsock: Add time-out for cleaning Tx samples/bpf: xdpsock: Add sched policy and priority support samples/bpf: xdpsock: Add cyclic TX operation capability samples/bpf: xdpsock: Add clockid selection support samples/bpf: xdpsock: Add Dest and Src MAC setting for Tx-only operation samples/bpf: xdpsock: Add VLAN support for Tx-only operation libbpf 1.0: Deprecate bpf_object__find_map_by_offset() API libbpf 1.0: Deprecate bpf_map__is_offload_neutral() ... ==================== Link: https://lore.kernel.org/r/20220107013626.53943-1-alexei.starovoitov@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-01-06ACPI: APD: Add a fmw property clk-nameAjit Kumar Pandey
Add a new device property to fetch clk-name from firmware. Signed-off-by: Ajit Kumar Pandey <AjitKumar.Pandey@amd.com> Reviewed-by: Mario Limonciello <Mario.Limonciello@amd.com> Link: https://lore.kernel.org/r/20211212180527.1641362-4-AjitKumar.Pandey@amd.com Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2022-01-06drivers: acpi: acpi_apd: Remove unused device property "is-rv"Ajit Kumar Pandey
Initially "is-rv" device property is added for 48MHz fixed clock support on Raven or RV architecture. It's unused now as we moved to pci device_id based selection to extend such support on other architectures. This change removed unused code from acpi driver. Signed-off-by: Ajit Kumar Pandey <AjitKumar.Pandey@amd.com> Reviewed-by: Mario Limonciello <Mario.Limonciello@amd.com> Link: https://lore.kernel.org/r/20211212180527.1641362-3-AjitKumar.Pandey@amd.com Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2022-01-06net/mlx5: Introduce API for bulk request and release of IRQsShay Drory
Currently IRQs are requested one by one. To balance spreading IRQs among cpus using such scheme requires remembering cpu mask for the cpus used for a given device. This complicates the IRQ allocation scheme in subsequent patch. Hence, prepare the code for bulk IRQs allocation. This enables spreading IRQs among cpus in subsequent patch. Signed-off-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Parav Pandit <parav@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-01-07random: remove unused irq_flags argument from add_interrupt_randomness()Sebastian Andrzej Siewior
Since commit ee3e00e9e7101 ("random: use registers from interrupted code for CPU's w/o a cycle counter") the irq_flags argument is no longer used. Remove unused irq_flags. Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Dexuan Cui <decui@microsoft.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: K. Y. Srinivasan <kys@microsoft.com> Cc: Stephen Hemminger <sthemmin@microsoft.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Wei Liu <wei.liu@kernel.org> Cc: linux-hyperv@vger.kernel.org Cc: x86@kernel.org Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Acked-by: Wei Liu <wei.liu@kernel.org> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2022-01-07Merge tag 'amd-drm-fixes-5.16-2021-12-31' of ↵Dave Airlie
ssh://gitlab.freedesktop.org/agd5f/linux into drm-fixes amd-drm-fixes-5.16-2021-12-31: amdgpu: - Suspend/resume fix - Restore runtime pm behavior with efifb Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20211231143825.11479-1-alexander.deucher@amd.com
2022-01-06mm: Remove slab from struct pageMatthew Wilcox (Oracle)
All members of struct slab can now be removed from struct page. This shrinks the definition of struct page by 30 LOC, making it easier to understand. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
2022-01-06Merge branch 'core' of ↵Vlastimil Babka
git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu into slab-struct_slab-part2-v1 Merge iommu tree for a series that removes usage of struct page 'freelist' field.
2022-01-06lib/raid6: Use strict priority ranking for pq gen() benchmarkingDirk Müller
On x86_64, currently 3 variants of AVX512, 3 variants of AVX2 and 3 variants of SSE2 are benchmarked on initialization, taking between 144-153 jiffies. Testing across a hardware pool of various generations of intel cpus I could not find a single case where SSE2 won over AVX2 or AVX512. There are cases where AVX2 wins over AVX512 however. Change "prefer" into an integer priority field (similar to how recov selection works) to have more than one ranking level available, which is backwards compatible with existing behavior. Give AVX2/512 variants higher priority over SSE2 in order to skip SSE testing when AVX is available. in a AVX2/x86_64/HZ=250 case this saves in the order of 200ms of initialization time. Signed-off-by: Dirk Müller <dmueller@suse.de> Acked-by: Paul Menzel <pmenzel@molgen.mpg.de> Signed-off-by: Song Liu <song@kernel.org>
2022-01-06HID: address kernel-doc warningsLukas Bulwahn
The command ./scripts/kernel-doc -none include/linux/hid.h reports: include/linux/hid.h:818: warning: cannot understand function prototype: 'struct hid_ll_driver ' include/linux/hid.h:1135: warning: expecting prototype for hid_may_wakeup(). Prototype was for hid_hw_may_wakeup() instead Address those kernel-doc warnings. Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2022-01-06gro: add ability to control gro max packet sizeCoco Li
Eric Dumazet suggested to allow users to modify max GRO packet size. We have seen GRO being disabled by users of appliances (such as wifi access points) because of claimed bufferbloat issues, or some work arounds in sch_cake, to split GRO/GSO packets. Instead of disabling GRO completely, one can chose to limit the maximum packet size of GRO packets, depending on their latency constraints. This patch adds a per device gro_max_size attribute that can be changed with ip link command. ip link set dev eth0 gro_max_size 16000 Suggested-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Coco Li <lixiaoyan@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-06net: fix SOF_TIMESTAMPING_BIND_PHC to work with multiple socketsMiroslav Lichvar
When multiple sockets using the SOF_TIMESTAMPING_BIND_PHC flag received a packet with a hardware timestamp (e.g. multiple PTP instances in different PTP domains using the UDPv4/v6 multicast or L2 transport), the timestamps received on some sockets were corrupted due to repeated conversion of the same timestamp (by the same or different vclocks). Fix ptp_convert_timestamp() to not modify the shared skb timestamp and return the converted timestamp as a ktime_t instead. If the conversion fails, return 0 to not confuse the application with timestamps corresponding to an unexpected PHC. Fixes: d7c088265588 ("net: socket: support hardware timestamp conversion to PHC bound") Signed-off-by: Miroslav Lichvar <mlichvar@redhat.com> Cc: Yangbo Lu <yangbo.lu@nxp.com> Cc: Richard Cochran <richardcochran@gmail.com> Acked-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-06bootmem: Use page->index instead of page->freelistMatthew Wilcox (Oracle)
page->freelist is for the use of slab. Using page->index is the same set of bits as page->freelist, and by using an integer instead of a pointer, we can avoid casts. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: <x86@kernel.org> Cc: "H. Peter Anvin" <hpa@zytor.com>
2022-01-06mm/kasan: Convert to struct folio and struct slabMatthew Wilcox (Oracle)
KASAN accesses some slab related struct page fields so we need to convert it to struct slab. Some places are a bit simplified thanks to kasan_addr_to_slab() encapsulating the PageSlab flag check through virt_to_slab(). When resolving object address to either a real slab or a large kmalloc, use struct folio as the intermediate type for testing the slab flag to avoid unnecessary implicit compound_head(). [ vbabka@suse.cz: use struct folio, adjust to differences in previous patches ] Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Reviewed-by: Andrey Konovalov <andreyknvl@gmail.com> Reviewed-by: Roman Gushchin <guro@fb.com> Tested-by: Hyeongogn Yoo <42.hyeyoo@gmail.com> Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com> Cc: Alexander Potapenko <glider@google.com> Cc: Andrey Konovalov <andreyknvl@gmail.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: <kasan-dev@googlegroups.com>
2022-01-06mm/memcg: Convert slab objcgs from struct page to struct slabVlastimil Babka
page->memcg_data is used with MEMCG_DATA_OBJCGS flag only for slab pages so convert all the related infrastructure to struct slab. Also use struct folio instead of struct page when resolving object pointers. This is not just mechanistic changing of types and names. Now in mem_cgroup_from_obj() we use folio_test_slab() to decide if we interpret the folio as a real slab instead of a large kmalloc, instead of relying on MEMCG_DATA_OBJCGS bit that used to be checked in page_objcgs_check(). Similarly in memcg_slab_free_hook() where we can encounter kmalloc_large() pages (here the folio slab flag check is implied by virt_to_slab()). As a result, page_objcgs_check() can be dropped instead of converted. To avoid include cycles, move the inline definition of slab_objcgs() from memcontrol.h to mm/slab.h. Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Reviewed-by: Roman Gushchin <guro@fb.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@kernel.org> Cc: Vladimir Davydov <vdavydov.dev@gmail.com> Cc: <cgroups@vger.kernel.org>
2022-01-06mm: Convert struct page to struct slab in functions used by other subsystemsVlastimil Babka
KASAN, KFENCE and memcg interact with SLAB or SLUB internals through functions nearest_obj(), obj_to_index() and objs_per_slab() that use struct page as parameter. This patch converts it to struct slab including all callers, through a coccinelle semantic patch. // Options: --include-headers --no-includes --smpl-spacing include/linux/slab_def.h include/linux/slub_def.h mm/slab.h mm/kasan/*.c mm/kfence/kfence_test.c mm/memcontrol.c mm/slab.c mm/slub.c // Note: needs coccinelle 1.1.1 to avoid breaking whitespace @@ @@ -objs_per_slab_page( +objs_per_slab( ... ) { ... } @@ @@ -objs_per_slab_page( +objs_per_slab( ... ) @@ identifier fn =~ "obj_to_index|objs_per_slab"; @@ fn(..., - const struct page *page + const struct slab *slab ,...) { <... ( - page_address(page) + slab_address(slab) | - page + slab ) ...> } @@ identifier fn =~ "nearest_obj"; @@ fn(..., - struct page *page + const struct slab *slab ,...) { <... ( - page_address(page) + slab_address(slab) | - page + slab ) ...> } @@ identifier fn =~ "nearest_obj|obj_to_index|objs_per_slab"; expression E; @@ fn(..., ( - slab_page(E) + E | - virt_to_page(E) + virt_to_slab(E) | - virt_to_head_page(E) + virt_to_slab(E) | - page + page_slab(page) ) ,...) Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Reviewed-by: Andrey Konovalov <andreyknvl@gmail.com> Reviewed-by: Roman Gushchin <guro@fb.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Julia Lawall <julia.lawall@inria.fr> Cc: Luis Chamberlain <mcgrof@kernel.org> Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com> Cc: Alexander Potapenko <glider@google.com> Cc: Andrey Konovalov <andreyknvl@gmail.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Marco Elver <elver@google.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@kernel.org> Cc: Vladimir Davydov <vdavydov.dev@gmail.com> Cc: <kasan-dev@googlegroups.com> Cc: <cgroups@vger.kernel.org>
2022-01-06mm/slub: Finish struct page to struct slab conversionVlastimil Babka
Update comments mentioning pages to mention slabs where appropriate. Also some goto labels. Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Reviewed-by: Roman Gushchin <guro@fb.com>
2022-01-06mm/slub: Convert most struct page to struct slab by spatchVlastimil Babka
The majority of conversion from struct page to struct slab in SLUB internals can be delegated to a coccinelle semantic patch. This includes renaming of variables with 'page' in name to 'slab', and similar. Big thanks to Julia Lawall and Luis Chamberlain for help with coccinelle. // Options: --include-headers --no-includes --smpl-spacing include/linux/slub_def.h mm/slub.c // Note: needs coccinelle 1.1.1 to avoid breaking whitespace, and ocaml for the // embedded script // build list of functions to exclude from applying the next rule @initialize:ocaml@ @@ let ok_function p = not (List.mem (List.hd p).current_element ["nearest_obj";"obj_to_index";"objs_per_slab_page";"__slab_lock";"__slab_unlock";"free_nonslab_page";"kmalloc_large_node"]) // convert the type from struct page to struct page in all functions except the // list from previous rule // this also affects struct kmem_cache_cpu, but that's ok @@ position p : script:ocaml() { ok_function p }; @@ - struct page@p + struct slab // in struct kmem_cache_cpu, change the name from page to slab // the type was already converted by the previous rule @@ @@ struct kmem_cache_cpu { ... -struct slab *page; +struct slab *slab; ... } // there are many places that use c->page which is now c->slab after the // previous rule @@ struct kmem_cache_cpu *c; @@ -c->page +c->slab @@ @@ struct kmem_cache { ... - unsigned int cpu_partial_pages; + unsigned int cpu_partial_slabs; ... } @@ struct kmem_cache *s; @@ - s->cpu_partial_pages + s->cpu_partial_slabs @@ @@ static void - setup_page_debug( + setup_slab_debug( ...) {...} @@ @@ - setup_page_debug( + setup_slab_debug( ...); // for all functions (with exceptions), change any "struct slab *page" // parameter to "struct slab *slab" in the signature, and generally all // occurences of "page" to "slab" in the body - with some special cases. @@ identifier fn !~ "free_nonslab_page|obj_to_index|objs_per_slab_page|nearest_obj"; @@ fn(..., - struct slab *page + struct slab *slab ,...) { <... - page + slab ...> } // similar to previous but the param is called partial_page @@ identifier fn; @@ fn(..., - struct slab *partial_page + struct slab *partial_slab ,...) { <... - partial_page + partial_slab ...> } // similar to previous but for functions that take pointer to struct page ptr @@ identifier fn; @@ fn(..., - struct slab **ret_page + struct slab **ret_slab ,...) { <... - ret_page + ret_slab ...> } // functions converted by previous rules that were temporarily called using // slab_page(E) so we want to remove the wrapper now that they accept struct // slab ptr directly @@ identifier fn =~ "slab_free|do_slab_free"; expression E; @@ fn(..., - slab_page(E) + E ,...) // similar to previous but for another pattern @@ identifier fn =~ "slab_pad_check|check_object"; @@ fn(..., - folio_page(folio, 0) + slab ,...) // functions that were returning struct page ptr and now will return struct // slab ptr, including slab_page() wrapper removal @@ identifier fn =~ "allocate_slab|new_slab"; expression E; @@ static -struct slab * +struct slab * fn(...) { <... - slab_page(E) + E ...> } // rename any former struct page * declarations @@ @@ struct slab * ( - page + slab | - partial_page + partial_slab | - oldpage + oldslab ) ; // this has to be separate from previous rule as page and page2 appear at the // same line @@ @@ struct slab * -page2 +slab2 ; // similar but with initial assignment @@ expression E; @@ struct slab * ( - page + slab | - flush_page + flush_slab | - discard_page + slab_to_discard | - page_to_unfreeze + slab_to_unfreeze ) = E; // convert most of struct page to struct slab usage inside functions (with // exceptions), including specific variable renames @@ identifier fn !~ "nearest_obj|obj_to_index|objs_per_slab_page|__slab_(un)*lock|__free_slab|free_nonslab_page|kmalloc_large_node"; expression E; @@ fn(...) { <... ( - int pages; + int slabs; | - int pages = E; + int slabs = E; | - page + slab | - flush_page + flush_slab | - partial_page + partial_slab | - oldpage->pages + oldslab->slabs | - oldpage + oldslab | - unsigned int nr_pages; + unsigned int nr_slabs; | - nr_pages + nr_slabs | - unsigned int partial_pages = E; + unsigned int partial_slabs = E; | - partial_pages + partial_slabs ) ...> } // this has to be split out from the previous rule so that lines containing // multiple matching changes will be fully converted @@ identifier fn !~ "nearest_obj|obj_to_index|objs_per_slab_page|__slab_(un)*lock|__free_slab|free_nonslab_page|kmalloc_large_node"; @@ fn(...) { <... ( - slab->pages + slab->slabs | - pages + slabs | - page2 + slab2 | - discard_page + slab_to_discard | - page_to_unfreeze + slab_to_unfreeze ) ...> } // after we simply changed all occurences of page to slab, some usages need // adjustment for slab-specific functions, or use slab_page() wrapper @@ identifier fn !~ "nearest_obj|obj_to_index|objs_per_slab_page|__slab_(un)*lock|__free_slab|free_nonslab_page|kmalloc_large_node"; @@ fn(...) { <... ( - page_slab(slab) + slab | - kasan_poison_slab(slab) + kasan_poison_slab(slab_page(slab)) | - page_address(slab) + slab_address(slab) | - page_size(slab) + slab_size(slab) | - PageSlab(slab) + folio_test_slab(slab_folio(slab)) | - page_to_nid(slab) + slab_nid(slab) | - compound_order(slab) + slab_order(slab) ) ...> } Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Reviewed-by: Roman Gushchin <guro@fb.com> Reviewed-by: Hyeonggon Yoo <42.hyeyoo@gmail.com> Tested-by: Hyeonggon Yoo <42.hyeyoo@gmail.com> Cc: Julia Lawall <julia.lawall@inria.fr> Cc: Luis Chamberlain <mcgrof@kernel.org>
2022-01-06mm: Convert check_heap_object() to use struct slabMatthew Wilcox (Oracle)
Ensure that we're not seeing a tail page inside __check_heap_object() by converting to a slab instead of a page. Take the opportunity to mark the slab as const since we're not modifying it. Also move the declaration of __check_heap_object() to mm/slab.h so it's not available to the wider kernel. [ vbabka@suse.cz: in check_heap_object() only convert to struct slab for actual PageSlab pages; use folio as intermediate step instead of page ] Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Reviewed-by: Roman Gushchin <guro@fb.com>
2022-01-06mm: Split slab into its own typeMatthew Wilcox (Oracle)
Make struct slab independent of struct page. It still uses the underlying memory in struct page for storing slab-specific data, but slab and slub can now be weaned off using struct page directly. Some of the wrapper functions (slab_address() and slab_order()) still need to cast to struct folio, but this is a significant disentanglement. [ vbabka@suse.cz: Rebase on folios, use folio instead of page where possible. Do not duplicate flags field in struct slab, instead make the related accessors go through slab_folio(). For testing pfmemalloc use the folio_*_active flag accessors directly so the PageSlabPfmemalloc wrappers can be removed later. Make folio_slab() expect only folio_test_slab() == true folios and virt_to_slab() return NULL when folio_test_slab() == false. Move struct slab to mm/slab.h. Don't represent with struct slab pages that are not true slab pages, but just a compound page obtained directly rom page allocator (with large kmalloc() for SLUB and SLOB). ] Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Reviewed-by: Roman Gushchin <guro@fb.com>
2022-01-06mm/slub: Make object_err() staticVlastimil Babka
There are no callers outside of mm/slub.c anymore. Move freelist_corrupted() that calls object_err() to avoid a need for forward declaration. Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Reviewed-by: Roman Gushchin <guro@fb.com>
2022-01-05xdp: Add xdp_do_redirect_frame() for pre-computed xdp_framesToke Høiland-Jørgensen
Add an xdp_do_redirect_frame() variant which supports pre-computed xdp_frame structures. This will be used in bpf_prog_run() to avoid having to write to the xdp_frame structure when the XDP program doesn't modify the frame boundaries. Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20220103150812.87914-6-toke@redhat.com
2022-01-05xdp: Move conversion to xdp_frame out of map functionsToke Høiland-Jørgensen
All map redirect functions except XSK maps convert xdp_buff to xdp_frame before enqueueing it. So move this conversion of out the map functions and into xdp_do_redirect(). This removes a bit of duplicated code, but more importantly it makes it possible to support caller-allocated xdp_frame structures, which will be added in a subsequent commit. Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20220103150812.87914-5-toke@redhat.com
2022-01-05Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
No conflicts. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-01-05Merge tag 'net-5.16-final' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski" "Networking fixes, including fixes from bpf, and WiFi. One last pull request, turns out some of the recent fixes did more harm than good. Current release - regressions: - Revert "xsk: Do not sleep in poll() when need_wakeup set", made the problem worse - Revert "net: phy: fixed_phy: Fix NULL vs IS_ERR() checking in __fixed_phy_register", broke EPROBE_DEFER handling - Revert "net: usb: r8152: Add MAC pass-through support for more Lenovo Docks", broke setups without a Lenovo dock Current release - new code bugs: - selftests: set amt.sh executable Previous releases - regressions: - batman-adv: mcast: don't send link-local multicast to mcast routers Previous releases - always broken: - ipv4/ipv6: check attribute length for RTA_FLOW / RTA_GATEWAY - sctp: hold endpoint before calling cb in sctp_transport_lookup_process - mac80211: mesh: embed mesh_paths and mpp_paths into ieee80211_if_mesh to avoid complicated handling of sub-object allocation failures - seg6: fix traceroute in the presence of SRv6 - tipc: fix a kernel-infoleak in __tipc_sendmsg()" * tag 'net-5.16-final' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (36 commits) selftests: set amt.sh executable Revert "net: usb: r8152: Add MAC passthrough support for more Lenovo Docks" sfc: The RX page_ring is optional iavf: Fix limit of total number of queues to active queues of VF i40e: Fix incorrect netdev's real number of RX/TX queues i40e: Fix for displaying message regarding NVM version i40e: fix use-after-free in i40e_sync_filters_subtask() i40e: Fix to not show opcode msg on unsuccessful VF MAC change ieee802154: atusb: fix uninit value in atusb_set_extended_addr mac80211: mesh: embedd mesh_paths and mpp_paths into ieee80211_if_mesh mac80211: initialize variable have_higher_than_11mbit sch_qfq: prevent shift-out-of-bounds in qfq_init_qdisc netrom: fix copying in user data in nr_setsockopt udp6: Use Segment Routing Header for dest address if present icmp: ICMPV6: Examine invoking packet for Segment Route Headers. seg6: export get_srh() for ICMP handling Revert "net: phy: fixed_phy: Fix NULL vs IS_ERR() checking in __fixed_phy_register" ipv6: Do cleanup if attribute validation fails in multipath route ipv6: Continue processing multipath route even if gateway attribute is invalid net/fsl: Remove leftover definition in xgmac_mdio ...
2022-01-05block: introduce rq_list_moveKeith Busch
When iterating a list, a particular request may need to be moved for special handling. Provide a helper function to achieve that so drivers don't need to reimplement rqlist manipulation. Signed-off-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20220105170518.3181469-4-kbusch@kernel.org Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-01-05block: introduce rq_list_for_each_safe macroKeith Busch
While iterating a list, a particular request may need to be removed for special handling. Provide an iterator that can safely handle that. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org> Link: https://lore.kernel.org/r/20220105170518.3181469-3-kbusch@kernel.org Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-01-05block: move rq_list macros to blk-mq.hKeith Busch
Move the request list macros to the header file that defines that struct they operate on. Signed-off-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20220105170518.3181469-2-kbusch@kernel.org Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-01-05Merge branches 'for-next/misc', 'for-next/cache-ops-dzp', ↵Catalin Marinas
'for-next/stacktrace', 'for-next/xor-neon', 'for-next/kasan', 'for-next/armv8_7-fp', 'for-next/atomics', 'for-next/bti', 'for-next/sve', 'for-next/kselftest' and 'for-next/kcsan', remote-tracking branch 'arm64/for-next/perf' into for-next/core * arm64/for-next/perf: (32 commits) arm64: perf: Don't register user access sysctl handler multiple times drivers: perf: marvell_cn10k: fix an IS_ERR() vs NULL check perf/smmuv3: Fix unused variable warning when CONFIG_OF=n arm64: perf: Support new DT compatibles arm64: perf: Simplify registration boilerplate arm64: perf: Support Denver and Carmel PMUs drivers/perf: hisi: Add driver for HiSilicon PCIe PMU docs: perf: Add description for HiSilicon PCIe PMU driver dt-bindings: perf: Add YAML schemas for Marvell CN10K LLC-TAD pmu bindings drivers: perf: Add LLC-TAD perf counter support perf/smmuv3: Synthesize IIDR from CoreSight ID registers perf/smmuv3: Add devicetree support dt-bindings: Add Arm SMMUv3 PMCG binding perf/arm-cmn: Add debugfs topology info perf/arm-cmn: Add CI-700 Support dt-bindings: perf: arm-cmn: Add CI-700 perf/arm-cmn: Support new IP features perf/arm-cmn: Demarcate CMN-600 specifics perf/arm-cmn: Move group validation data off-stack perf/arm-cmn: Optimise DTC counter accesses ... * for-next/misc: : Miscellaneous patches arm64: Use correct method to calculate nomap region boundaries arm64: Drop outdated links in comments arm64: errata: Fix exec handling in erratum 1418040 workaround arm64: Unhash early pointer print plus improve comment asm-generic: introduce io_stop_wc() and add implementation for ARM64 arm64: remove __dma_*_area() aliases docs/arm64: delete a space from tagged-address-abi arm64/fp: Add comments documenting the usage of state restore functions arm64: mm: Use asid feature macro for cheanup arm64: mm: Rename asid2idx() to ctxid2asid() arm64: kexec: reduce calls to page_address() arm64: extable: remove unused ex_handler_t definition arm64: entry: Use SDEI event constants arm64: Simplify checking for populated DT arm64/kvm: Fix bitrotted comment for SVE handling in handle_exit.c * for-next/cache-ops-dzp: : Avoid DC instructions when DCZID_EL0.DZP == 1 arm64: mte: DC {GVA,GZVA} shouldn't be used when DCZID_EL0.DZP == 1 arm64: clear_page() shouldn't use DC ZVA when DCZID_EL0.DZP == 1 * for-next/stacktrace: : Unify the arm64 unwind code arm64: Make some stacktrace functions private arm64: Make dump_backtrace() use arch_stack_walk() arm64: Make profile_pc() use arch_stack_walk() arm64: Make return_address() use arch_stack_walk() arm64: Make __get_wchan() use arch_stack_walk() arm64: Make perf_callchain_kernel() use arch_stack_walk() arm64: Mark __switch_to() as __sched arm64: Add comment for stack_info::kr_cur arch: Make ARCH_STACKWALK independent of STACKTRACE * for-next/xor-neon: : Use SHA3 instructions to speed up XOR arm64/xor: use EOR3 instructions when available * for-next/kasan: : Log potential KASAN shadow aliases arm64: mm: log potential KASAN shadow alias arm64: mm: use die_kernel_fault() in do_mem_abort() * for-next/armv8_7-fp: : Add HWCAPS for ARMv8.7 FEAT_AFP amd FEAT_RPRES arm64: cpufeature: add HWCAP for FEAT_RPRES arm64: add ID_AA64ISAR2_EL1 sys register arm64: cpufeature: add HWCAP for FEAT_AFP * for-next/atomics: : arm64 atomics clean-ups and codegen improvements arm64: atomics: lse: define RETURN ops in terms of FETCH ops arm64: atomics: lse: improve constraints for simple ops arm64: atomics: lse: define ANDs in terms of ANDNOTs arm64: atomics lse: define SUBs in terms of ADDs arm64: atomics: format whitespace consistently * for-next/bti: : BTI clean-ups arm64: Ensure that the 'bti' macro is defined where linkage.h is included arm64: Use BTI C directly and unconditionally arm64: Unconditionally override SYM_FUNC macros arm64: Add macro version of the BTI instruction arm64: ftrace: add missing BTIs arm64: kexec: use __pa_symbol(empty_zero_page) arm64: update PAC description for kernel * for-next/sve: : SVE code clean-ups and refactoring in prepararation of Scalable Matrix Extensions arm64/sve: Minor clarification of ABI documentation arm64/sve: Generalise vector length configuration prctl() for SME arm64/sve: Make sysctl interface for SVE reusable by SME * for-next/kselftest: : arm64 kselftest additions kselftest/arm64: Add pidbench for floating point syscall cases kselftest/arm64: Add a test program to exercise the syscall ABI kselftest/arm64: Allow signal tests to trigger from a function kselftest/arm64: Parameterise ptrace vector length information * for-next/kcsan: : Enable KCSAN for arm64 arm64: Enable KCSAN
2022-01-05headers/deps: USB: Optimize <linux/usb/ch9.h> dependencies, remove ↵Ingo Molnar
<linux/device.h> The <linux/usb/ch9.h> header is used over 1,400 times in a typical distro build, but few of its users actually need the full <linux/device.h> header. -------------------------------------------------------------------- | Combined, preprocessed C code size of header, without line markers, | with comments stripped: ------------------------- before: | #include <linux/usb/ch9.h> | LOC: 7,078 | headers: 172 after: | #include <linux/usb/ch9.h> | LOC: 812 | headers: 38 Remove it and add it to the places that need it. Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-01-05Merge tag 'linux-can-next-for-5.17-20220105' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next Marc Kleine-Budde says: ==================== pull-request: can-next 2022-01-05 this is a pull request of 15 patches for net-next/master. The first patch is by me and removed an unused variable from the usb_8dev driver. Andy Shevchenko contributes a patch for the mcp251x driver, which removes an unneeded assignment. Jimmy Assarsson's patch for the kvaser_usb makes use of units.h in the assignment of frequencies. Lad Prabhakar provides 2 patches, converting the ti_hecc and the sja1000 driver to make use of platform_get_irq(). The 10 remaining patches are by Vincent Mailhol. First the etas_es58x driver populates the net_device::dev_port. The next 5 patches cleanup the handling of CAN error and CAN RTR messages of all drivers. The remaining 4 patches enhance the CAN controller mode flag handling and export it via netlink to user space. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-05net: mdio: add helpers to extract clause 45 regad and devad fieldsRussell King (Oracle)
Add a couple of helpers and definitions to extract the clause 45 regad and devad fields from the regnum passed into MDIO drivers. Tested-by: Daniel Golle <daniel@makrotopia.org> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: Daniel Golle <daniel@makrotopia.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-05can: dev: reorder struct can_priv members for better packingVincent Mailhol
Save eight bytes of holes on x86-64 architectures by reordering the members of struct can_priv. Before: | $ pahole -C can_priv drivers/net/can/dev/dev.o | struct can_priv { | struct net_device * dev; /* 0 8 */ | struct can_device_stats can_stats; /* 8 24 */ | const struct can_bittiming_const * bittiming_const; /* 32 8 */ | const struct can_bittiming_const * data_bittiming_const; /* 40 8 */ | struct can_bittiming bittiming; /* 48 32 */ | /* --- cacheline 1 boundary (64 bytes) was 16 bytes ago --- */ | struct can_bittiming data_bittiming; /* 80 32 */ | const struct can_tdc_const * tdc_const; /* 112 8 */ | struct can_tdc tdc; /* 120 12 */ | /* --- cacheline 2 boundary (128 bytes) was 4 bytes ago --- */ | unsigned int bitrate_const_cnt; /* 132 4 */ | const u32 * bitrate_const; /* 136 8 */ | const u32 * data_bitrate_const; /* 144 8 */ | unsigned int data_bitrate_const_cnt; /* 152 4 */ | u32 bitrate_max; /* 156 4 */ | struct can_clock clock; /* 160 4 */ | unsigned int termination_const_cnt; /* 164 4 */ | const u16 * termination_const; /* 168 8 */ | u16 termination; /* 176 2 */ | | /* XXX 6 bytes hole, try to pack */ | | struct gpio_desc * termination_gpio; /* 184 8 */ | /* --- cacheline 3 boundary (192 bytes) --- */ | u16 termination_gpio_ohms[2]; /* 192 4 */ | enum can_state state; /* 196 4 */ | u32 ctrlmode; /* 200 4 */ | u32 ctrlmode_supported; /* 204 4 */ | int restart_ms; /* 208 4 */ | | /* XXX 4 bytes hole, try to pack */ | | struct delayed_work restart_work; /* 216 88 */ | | /* XXX last struct has 4 bytes of padding */ | | /* --- cacheline 4 boundary (256 bytes) was 48 bytes ago --- */ | int (*do_set_bittiming)(struct net_device *); /* 304 8 */ | int (*do_set_data_bittiming)(struct net_device *); /* 312 8 */ | /* --- cacheline 5 boundary (320 bytes) --- */ | int (*do_set_mode)(struct net_device *, enum can_mode); /* 320 8 */ | int (*do_set_termination)(struct net_device *, u16); /* 328 8 */ | int (*do_get_state)(const struct net_device *, enum can_state *); /* 336 8 */ | int (*do_get_berr_counter)(const struct net_device *, struct can_berr_counter *); /* 344 8 */ | unsigned int echo_skb_max; /* 352 4 */ | | /* XXX 4 bytes hole, try to pack */ | | struct sk_buff * * echo_skb; /* 360 8 */ | | /* size: 368, cachelines: 6, members: 32 */ | /* sum members: 354, holes: 3, sum holes: 14 */ | /* paddings: 1, sum paddings: 4 */ | /* last cacheline: 48 bytes */ | }; After: | $ pahole -C can_priv drivers/net/can/dev/dev.o | struct can_priv { | struct net_device * dev; /* 0 8 */ | struct can_device_stats can_stats; /* 8 24 */ | const struct can_bittiming_const * bittiming_const; /* 32 8 */ | const struct can_bittiming_const * data_bittiming_const; /* 40 8 */ | struct can_bittiming bittiming; /* 48 32 */ | /* --- cacheline 1 boundary (64 bytes) was 16 bytes ago --- */ | struct can_bittiming data_bittiming; /* 80 32 */ | const struct can_tdc_const * tdc_const; /* 112 8 */ | struct can_tdc tdc; /* 120 12 */ | /* --- cacheline 2 boundary (128 bytes) was 4 bytes ago --- */ | unsigned int bitrate_const_cnt; /* 132 4 */ | const u32 * bitrate_const; /* 136 8 */ | const u32 * data_bitrate_const; /* 144 8 */ | unsigned int data_bitrate_const_cnt; /* 152 4 */ | u32 bitrate_max; /* 156 4 */ | struct can_clock clock; /* 160 4 */ | unsigned int termination_const_cnt; /* 164 4 */ | const u16 * termination_const; /* 168 8 */ | u16 termination; /* 176 2 */ | | /* XXX 6 bytes hole, try to pack */ | | struct gpio_desc * termination_gpio; /* 184 8 */ | /* --- cacheline 3 boundary (192 bytes) --- */ | u16 termination_gpio_ohms[2]; /* 192 4 */ | unsigned int echo_skb_max; /* 196 4 */ | struct sk_buff * * echo_skb; /* 200 8 */ | enum can_state state; /* 208 4 */ | u32 ctrlmode; /* 212 4 */ | u32 ctrlmode_supported; /* 216 4 */ | int restart_ms; /* 220 4 */ | struct delayed_work restart_work; /* 224 88 */ | | /* XXX last struct has 4 bytes of padding */ | | /* --- cacheline 4 boundary (256 bytes) was 56 bytes ago --- */ | int (*do_set_bittiming)(struct net_device *); /* 312 8 */ | /* --- cacheline 5 boundary (320 bytes) --- */ | int (*do_set_data_bittiming)(struct net_device *); /* 320 8 */ | int (*do_set_mode)(struct net_device *, enum can_mode); /* 328 8 */ | int (*do_set_termination)(struct net_device *, u16); /* 336 8 */ | int (*do_get_state)(const struct net_device *, enum can_state *); /* 344 8 */ | int (*do_get_berr_counter)(const struct net_device *, struct can_berr_counter *); /* 352 8 */ | | /* size: 360, cachelines: 6, members: 32 */ | /* sum members: 354, holes: 1, sum holes: 6 */ | /* paddings: 1, sum paddings: 4 */ | /* last cacheline: 40 bytes */ | }; Link: https://lore.kernel.org/all/20211213160226.56219-4-mailhol.vincent@wanadoo.fr Signed-off-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2022-01-05can: dev: add sanity check in can_set_static_ctrlmode()Vincent Mailhol
Previous patch removed can_priv::ctrlmode_static to replace it with can_get_static_ctrlmode(). A condition sine qua non for this to work is that the controller static modes should never be set in can_priv::ctrlmode_supported (c.f. the comment on can_priv::ctrlmode_supported which states that it is for "options that can be *modified* by netlink"). Also, this condition is already correctly fulfilled by all existing drivers which rely on the ctrlmode_static feature. Nonetheless, we added an extra safeguard in can_set_static_ctrlmode() to return an error value and to warn the developer who would be adventurous enough to set to static a given feature that is already set to supported. The drivers which rely on the static controller mode are then updated to check the return value of can_set_static_ctrlmode(). Link: https://lore.kernel.org/all/20211213160226.56219-3-mailhol.vincent@wanadoo.fr Signed-off-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2022-01-05can: dev: replace can_priv::ctrlmode_static by can_get_static_ctrlmode()Vincent Mailhol
The statically enabled features of a CAN controller can be retrieved using below formula: | u32 ctrlmode_static = priv->ctrlmode & ~priv->ctrlmode_supported; As such, there is no need to store this information. This patch remove the field ctrlmode_static of struct can_priv and provides, in replacement, the inline function can_get_static_ctrlmode() which returns the same value. Link: https://lore.kernel.org/all/20211213160226.56219-2-mailhol.vincent@wanadoo.fr Signed-off-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>