summaryrefslogtreecommitdiff
path: root/include
AgeCommit message (Collapse)Author
2024-10-07sched: Add wait/wake interface for variable updated under a lock.NeilBrown
Sometimes we need to wait for a condition to be true which must be testing while holding a lock. Correspondingly the condition is made true while holding the lock and the wake up is sent under the lock. This patch provides wake and wait interfaces which can be used for this situation when the lock is a mutex or a spinlock, or any other lock for which there are foo_lock() and foo_unlock() functions. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20240925053405.3960701-6-neilb@suse.de
2024-10-07sched: Add test_and_clear_wake_up_bit() and atomic_dec_and_wake_up()NeilBrown
There are common patterns in the kernel of using test_and_clear_bit() before wake_up_bit(), and atomic_dec_and_test() before wake_up_var(). These combinations don't need extra barriers but sometimes include them unnecessarily. To help avoid the unnecessary barriers and to help discourage the general use of wake_up_bit/var (which is a fragile interface) introduce two combined functions which implement these patterns. Also add store_release_wake_up() which supports the task of simply setting a non-atomic variable and sending a wakeup. This pattern requires barriers which are often omitted. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20240925053405.3960701-5-neilb@suse.de
2024-10-07sched: Document wait_var_event() family of functions and wake_up_var()NeilBrown
wake_up_var(), wait_var_event() and related interfaces are not documented but have important ordering requirements. This patch adds documentation and makes these requirements explicit. The return values for those wait_var_event_* functions which return a value are documented. Note that these are, perhaps surprisingly, sometimes different from comparable wait_on_bit() functions. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20240925053405.3960701-4-neilb@suse.de
2024-10-07sched: Improve documentation for wake_up_bit/wait_on_bit family of functionsNeilBrown
This patch revises the documention for wake_up_bit(), clear_and_wake_up_bit(), and all the wait_on_bit() family of functions. The new documentation places less emphasis on the pool of waitqueues used (an implementation detail) and focuses instead on details of how the functions behave. The barriers included in the wait functions and clear_and_wake_up_bit() and those required for wake_up_bit() are spelled out more clearly. The error statuses returned are given explicitly. The fact that the wait_on_bit_lock() function sets the bit is made more obvious. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20240925053405.3960701-3-neilb@suse.de
2024-10-07sched: change wake_up_bit() and related function to expect unsigned long *NeilBrown
wake_up_bit() currently allows a "void *". While this isn't strictly a problem as the address is never dereferenced, it is inconsistent with the corresponding wait_on_bit() which requires "unsigned long *" and does dereference the pointer. Any code that needs to wait for a change in something other than an unsigned long would be better served by wake_up_var()/wait_var_event(). This patch changes all related "void *" to "unsigned long *". Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20240925053405.3960701-2-neilb@suse.de
2024-10-07RDMA/core: Provide rdma_user_mmap_disassociate() to disassociate mmap pagesChengchang Tang
Provide a new api rdma_user_mmap_disassociate() for drivers to disassociate mmap pages for a device. Since drivers can now disassociate mmaps by calling this api, introduce a new disassociation_lock to specifically prevent races between this disassociation process and new mmaps. And thus the old hw_destroy_rwsem is not needed in this api. Signed-off-by: Chengchang Tang <tangchengchang@huawei.com> Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com> Link: https://patch.msgid.link/20240927103323.1897094-2-huangjunxian6@hisilicon.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-10-06Merge branch 'timers/vfs' into timers/coreThomas Gleixner
Pick up the VFS specific interfaces so further timekeeping changes can be based on them.
2024-10-06timekeeping: Add percpu counter for tracking floor swap eventsJeff Layton
The mgtime_floor value is a global variable for tracking the latest fine-grained timestamp handed out. Because it's a global, track the number of times that a new floor value is assigned. Add a new percpu counter to the timekeeping code to track the number of floor swap events that have occurred. A later patch will add a debugfs file to display this counter alongside other stats involving multigrain timestamps. Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Randy Dunlap <rdunlap@infradead.org> # documentation bits Link: https://lore.kernel.org/all/20241002-mgtime-v10-2-d1c4717f5284@kernel.org
2024-10-06timekeeping: Add interfaces for handling timestamps with a floor valueJeff Layton
Multigrain timestamps allow the kernel to use fine-grained timestamps when an inode's attributes is being actively observed via ->getattr(). With this support, it's possible for a file to get a fine-grained timestamp, and another modified after it to get a coarse-grained stamp that is earlier than the fine-grained time. If this happens then the files can appear to have been modified in reverse order, which breaks VFS ordering guarantees [1]. To prevent this, maintain a floor value for multigrain timestamps. Whenever a fine-grained timestamp is handed out, record it, and when later coarse-grained stamps are handed out, ensure they are not earlier than that value. If the coarse-grained timestamp is earlier than the fine-grained floor, return the floor value instead. Add a static singleton atomic64_t into timekeeper.c that is used to keep track of the latest fine-grained time ever handed out. This is tracked as a monotonic ktime_t value to ensure that it isn't affected by clock jumps. Because it is updated at different times than the rest of the timekeeper object, the floor value is managed independently of the timekeeper via a cmpxchg() operation, and sits on its own cacheline. Add two new public interfaces: - ktime_get_coarse_real_ts64_mg() fills a timespec64 with the later of the coarse-grained clock and the floor time - ktime_get_real_ts64_mg() gets the fine-grained clock value, and tries to swap it into the floor. A timespec64 is filled with the result. The floor value is global and updated via a single try_cmpxchg(). If that fails then the operation raced with a concurrent update. Any concurrent update must be later than the existing floor value, so any racing tasks can accept any resulting floor value without retrying. [1]: POSIX requires that files be stamped with realtime clock values, and makes no provision for dealing with backward clock jumps. If a backward realtime clock jump occurs, then files can appear to have been modified in reverse order. Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Randy Dunlap <rdunlap@infradead.org> # documentation bits Acked-by: John Stultz <jstultz@google.com> Link: https://lore.kernel.org/all/20241002-mgtime-v10-1-d1c4717f5284@kernel.org
2024-10-06dt-bindings: iio: adc: Add the GE HealthCare PMC ADCHerve Codina
The GE HealthCare PMC Analog to Digital Converter (ADC) is a 16-Channel (voltage and current), 16-Bit ADC with an I2C Interface. Signed-off-by: Herve Codina <herve.codina@bootlin.com> Tested-by: Ian Ray <ian.ray@gehealthcare.com> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Link: https://patch.msgid.link/20241003114641.672086-3-herve.codina@bootlin.com Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2024-10-05EDAC/qcom: Make irq configuration optionalRajendra Nayak
On most modern qualcomm SoCs, the configuration necessary to enable the Tag/Data RAM related irqs being propagated to the SoC irq controller is already done in firmware (in DSF or 'DDR System Firmware') On some like the x1e80100, these registers aren't even accesible to the kernel causing a crash when edac device is probed. Hence, make the irq configuration optional in the driver and mark x1e80100 as the SoC on which this should be avoided. Fixes: af16b00578a7 ("arm64: dts: qcom: Add base X1E80100 dtsi and the QCP dts") Reported-by: Bjorn Andersson <andersson@kernel.org> Signed-off-by: Rajendra Nayak <quic_rjendra@quicinc.com> Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Reviewed-by: Abel Vesa <abel.vesa@linaro.org> Link: https://lore.kernel.org/r/20240903101510.3452734-1-quic_rjendra@quicinc.com Signed-off-by: Bjorn Andersson <andersson@kernel.org>
2024-10-05dt-bindings: clock: qcom,gcc-sm8450: Add SM8475 GCC bindingsDanila Tikhonov
Add new entry to the SM8450 dt-bindings and add SM8475-specific clocks to SM8450 GCC header file. Signed-off-by: Danila Tikhonov <danila@jiaxyga.com> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Link: https://lore.kernel.org/r/20240818204348.197788-2-danila@jiaxyga.com Signed-off-by: Bjorn Andersson <andersson@kernel.org>
2024-10-05batman-adv: Add flex array to struct batadv_tvlv_tt_dataErick Archer
The "struct batadv_tvlv_tt_data" uses a dynamically sized set of trailing elements. Specifically, it uses an array of structures of type "batadv_tvlv_tt_vlan_data". So, use the preferred way in the kernel declaring a flexible array [1]. At the same time, prepare for the coming implementation by GCC and Clang of the __counted_by attribute. Flexible array members annotated with __counted_by can have their accesses bounds-checked at run-time via CONFIG_UBSAN_BOUNDS (for array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family functions). In this case, it is important to note that the attribute used is specifically __counted_by_be since variable "num_vlan" is of type __be16. The following change to the "batadv_tt_tvlv_ogm_handler_v1" function: - tt_vlan = (struct batadv_tvlv_tt_vlan_data *)(tt_data + 1); - tt_change = (struct batadv_tvlv_tt_change *)(tt_vlan + num_vlan); + tt_change = (struct batadv_tvlv_tt_change *)((void *)tt_data + + flex_size); is intended to prevent the compiler from generating an "out-of-bounds" notification due to the __counted_by attribute. The compiler can do a pointer calculation using the vlan_data flexible array memory, or in other words, this may be calculated as an array offset, since it is the same as: &tt_data->vlan_data[num_vlan] Therefore, we go past the end of the array. In other "multiple trailing flexible array" situations, this has been solved by addressing from the base pointer, since the compiler either knows the full allocation size or it knows nothing about it (this case, since it came from a "void *" function argument). The order in which the structure batadv_tvlv_tt_data and the structure batadv_tvlv_tt_vlan_data are defined must be swap to avoid an incomplete type error. Also, avoid the open-coded arithmetic in memory allocator functions [2] using the "struct_size" macro and use the "flex_array_size" helper to clarify some calculations, when possible. Moreover, the new structure member also allow us to avoid the open-coded arithmetic on pointers in some situations. Take advantage of this. This code was detected with the help of Coccinelle, and audited and modified manually. Link: https://www.kernel.org/doc/html/next/process/deprecated.html#zero-length-and-one-element-arrays [1] Link: https://www.kernel.org/doc/html/next/process/deprecated.html#open-coded-arithmetic-in-allocator-arguments [2] Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Erick Archer <erick.archer@outlook.com> Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
2024-10-05function_graph: Support recording and printing the function return addressDonglin Peng
When using function_graph tracer to analyze the flow of kernel function execution, it is often necessary to quickly locate the exact line of code where the call occurs. While this may be easy at times, it can be more time-consuming when some functions are inlined or the flow is too long. This feature aims to simplify the process by recording the return address of traced funcions and printing it when outputing trace logs. To enhance human readability, the prefix 'ret=' is used for the kernel return value, while '<-' serves as the prefix for the return address in trace logs to make it look more like the function tracer. A new trace option named 'funcgraph-retaddr' has been introduced, and the existing option 'sym-addr' can be used to control the format of the return address. See below logs with both funcgraph-retval and funcgraph-retaddr enabled. 0) | load_elf_binary() { /* <-bprm_execve+0x249/0x600 */ 0) | load_elf_phdrs() { /* <-load_elf_binary+0x84/0x1730 */ 0) | __kmalloc_noprof() { /* <-load_elf_phdrs+0x4a/0xb0 */ 0) 3.657 us | __cond_resched(); /* <-__kmalloc_noprof+0x28c/0x390 ret=0x0 */ 0) + 24.335 us | } /* __kmalloc_noprof ret=0xffff8882007f3000 */ 0) | kernel_read() { /* <-load_elf_phdrs+0x6c/0xb0 */ 0) | rw_verify_area() { /* <-kernel_read+0x2b/0x50 */ 0) | security_file_permission() { /* <-kernel_read+0x2b/0x50 */ 0) | selinux_file_permission() { /* <-security_file_permission+0x26/0x40 */ 0) | __inode_security_revalidate() { /* <-selinux_file_permission+0x6d/0x140 */ 0) 2.034 us | __cond_resched(); /* <-__inode_security_revalidate+0x5f/0x80 ret=0x0 */ 0) 6.602 us | } /* __inode_security_revalidate ret=0x0 */ 0) 2.214 us | avc_policy_seqno(); /* <-selinux_file_permission+0x107/0x140 ret=0x0 */ 0) + 16.670 us | } /* selinux_file_permission ret=0x0 */ 0) + 20.809 us | } /* security_file_permission ret=0x0 */ 0) + 25.217 us | } /* rw_verify_area ret=0x0 */ 0) | __kernel_read() { /* <-load_elf_phdrs+0x6c/0xb0 */ 0) | ext4_file_read_iter() { /* <-__kernel_read+0x160/0x2e0 */ Then, we can use the faddr2line to locate the source code, for example: $ ./scripts/faddr2line ./vmlinux load_elf_phdrs+0x6c/0xb0 load_elf_phdrs+0x6c/0xb0: elf_read at fs/binfmt_elf.c:471 (inlined by) load_elf_phdrs at fs/binfmt_elf.c:531 Link: https://lore.kernel.org/20240915032912.1118397-1-dolinux.peng@gmail.com Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202409150605.HgUmU8ea-lkp@intel.com/ Signed-off-by: Donglin Peng <dolinux.peng@gmail.com> [ Rebased to handle text_delta offsets ] Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2024-10-05crypto: hisilicon/qm - fix the coding specifications issueChenghai Huang
Ensure that the inline function contains no more than 10 lines. move q_num_set() from hisi_acc_qm.h to qm.c. Signed-off-by: Chenghai Huang <huangchenghai2@huawei.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-10-05crypto: ecdsa - Support P1363 signature decodingLukas Wunner
Alternatively to the X9.62 encoding of ecdsa signatures, which uses ASN.1 and is already supported by the kernel, there's another common encoding called P1363. It stores r and s as the concatenation of two big endian, unsigned integers. The name originates from IEEE P1363. Add a P1363 template in support of the forthcoming SPDM library (Security Protocol and Data Model) for PCI device authentication. P1363 is prescribed by SPDM 1.2.1 margin no 44: "For ECDSA signatures, excluding SM2, in SPDM, the signature shall be the concatenation of r and s. The size of r shall be the size of the selected curve. Likewise, the size of s shall be the size of the selected curve. See BaseAsymAlgo in NEGOTIATE_ALGORITHMS for the size of r and s. The byte order for r and s shall be in big endian order. When placing ECDSA signatures into an SPDM signature field, r shall come first followed by s." Link: https://www.dmtf.org/sites/default/files/standards/documents/DSP0274_1.2.1.pdf Signed-off-by: Lukas Wunner <lukas@wunner.de> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Stefan Berger <stefanb@linux.ibm.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-10-05crypto: ecdsa - Move X9.62 signature size calculation into templateLukas Wunner
software_key_query() returns the maximum signature and digest size for a given key to user space. When it only supported RSA keys, calculating those sizes was trivial as they were always equivalent to the key size. However when ECDSA was added, the function grew somewhat complicated calculations which take the ASN.1 encoding and curve into account. This doesn't scale well and adjusting the calculations is easily forgotten when adding support for new encodings or curves. In fact, when NIST P521 support was recently added, the function was initially not amended: https://lore.kernel.org/all/b749d5ee-c3b8-4cbd-b252-7773e4536e07@linux.ibm.com/ Introduce a ->max_size() callback to struct sig_alg and take advantage of it to move the signature size calculations to ecdsa-x962.c. Introduce a ->digest_size() callback to struct sig_alg and move the maximum ECDSA digest size to ecdsa.c. It is common across ecdsa-x962.c and the upcoming ecdsa-p1363.c and thus inherited by both of them. For all other algorithms, continue using the key size as maximum signature and digest size. Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-10-05crypto: sig - Rename crypto_sig_maxsize() to crypto_sig_keysize()Lukas Wunner
crypto_sig_maxsize() is a bit of a misnomer as it doesn't return the maximum signature size, but rather the key size. Rename it as well as all implementations of the ->max_size callback. A subsequent commit introduces a crypto_sig_maxsize() function which returns the actual maximum signature size. While at it, change the return type of crypto_sig_keysize() from int to unsigned int for consistency with crypto_akcipher_maxsize(). None of the callers checks for a negative return value and an error condition can always be indicated by returning zero. Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-10-05crypto: ecdsa - Move X9.62 signature decoding into templateLukas Wunner
Unlike the rsa driver, which separates signature decoding and signature verification into two steps, the ecdsa driver does both in one. This restricts users to the one signature format currently supported (X9.62) and prevents addition of others such as P1363, which is needed by the forthcoming SPDM library (Security Protocol and Data Model) for PCI device authentication. Per Herbert's suggestion, change ecdsa to use a "raw" signature encoding and then implement X9.62 and P1363 as templates which convert their respective encodings to the raw one. One may then specify "x962(ecdsa-nist-XXX)" or "p1363(ecdsa-nist-XXX)" to pick the encoding. The present commit moves X9.62 decoding to a template. A separate commit is going to introduce another template for P1363 decoding. The ecdsa driver internally represents a signature as two u64 arrays of size ECC_MAX_BYTES. This appears to be the most natural choice for the raw format as it can directly be used for verification without having to further decode signature data or copy it around. Repurpose all the existing test vectors for "x962(ecdsa-nist-XXX)" and create a duplicate of them to test the raw encoding. Link: https://lore.kernel.org/all/ZoHXyGwRzVvYkcTP@gondor.apana.org.au/ Signed-off-by: Lukas Wunner <lukas@wunner.de> Tested-by: Stefan Berger <stefanb@linux.ibm.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-10-05ASN.1: Clean up include statements in public headersLukas Wunner
If <linux/asn1_decoder.h> is the first header included from a .c file (due to headers being sorted alphabetically), the compiler complains: include/linux/asn1_decoder.h:18:29: error: unknown type name 'size_t' Avoid by including <linux/types.h>. Jonathan notes that the counterpart <linux/asn1_encoder.h> already includes <linux/types.h>, but additionally includes the unnecessary <linux/bug.h>. Drop it. Signed-off-by: Lukas Wunner <lukas@wunner.de> Reviewed-by: Stefan Berger <stefanb@linux.ibm.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-10-05crypto: sig - Move crypto_sig_*() API calls to include fileLukas Wunner
The crypto_sig_*() API calls lived in sig.c so far because they needed access to struct crypto_sig_type: This was necessary to differentiate between signature algorithms that had already been migrated from crypto_akcipher to crypto_sig and those that hadn't yet. Now that all algorithms have been migrated, the API calls can become static inlines in <crypto/sig.h> to mimic what <crypto/akcipher.h> is doing. Signed-off-by: Lukas Wunner <lukas@wunner.de> Reviewed-by: Stefan Berger <stefanb@linux.ibm.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-10-05crypto: akcipher - Drop sign/verify operationsLukas Wunner
A sig_alg backend has just been introduced and all asymmetric sign/verify algorithms have been migrated to it. The sign/verify operations can thus be dropped from akcipher_alg. It is now purely for asymmetric encrypt/decrypt. Move struct crypto_akcipher_sync_data from internal.h to akcipher.c and unexport crypto_akcipher_sync_{prep,post}(): They're no longer used by sig.c but only locally in akcipher.c. In crypto_akcipher_sync_{prep,post}(), drop various NULL pointer checks for data->dst as they were only necessary for the verify operation. In the crypto_sig_*() API calls, remove the forks that were necessary while algorithms were converted from crypto_akcipher to crypto_sig one by one. In struct akcipher_testvec, remove the "params", "param_len" and "algo" elements as they were only needed for the ecrdsa verify operation. Remove corresponding dead code from test_akcipher_one() as well. Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-10-05crypto: virtio - Drop sign/verify operationsLukas Wunner
The virtio crypto driver exposes akcipher sign/verify operations in a user space ABI. This blocks removal of sign/verify from akcipher_alg. Herbert opines: "I would say that this is something that we can break. Breaking it is no different to running virtio on a host that does not support these algorithms. After all, a software implementation must always be present. I deliberately left akcipher out of crypto_user because the API is still in flux. We should not let virtio constrain ourselves." https://lore.kernel.org/all/ZtqoNAgcnXnrYhZZ@gondor.apana.org.au/ "I would remove virtio akcipher support in its entirety. This API was never meant to be exposed outside of the kernel." https://lore.kernel.org/all/Ztqql_gqgZiMW8zz@gondor.apana.org.au/ Drop sign/verify support from virtio crypto. There's no strong reason to also remove encrypt/decrypt support, so keep it. A key selling point of virtio crypto is to allow guest access to crypto accelerators on the host. So far the only akcipher algorithm supported by virtio crypto is RSA. Dropping sign/verify merely means that the PKCS#1 padding is now always generated or verified inside the guest, but the actual signature generation/verification (which is an RSA decrypt/encrypt operation) may still use an accelerator on the host. Generating or verifying the PKCS#1 padding is cheap, so a hardware accelerator won't be of much help there. Which begs the question whether virtio crypto support for sign/verify makes sense at all. It would make sense for the sign operation if the host has a security chip to store asymmetric private keys. But the kernel doesn't even have an asymmetric_key_subtype yet for hardware-based private keys. There's at least one rudimentary driver for such chips (atmel-ecc.c for ATECC508A), but it doesn't implement the sign operation. The kernel would first have to grow support for a hardware asymmetric_key_subtype and at least one driver implementing the sign operation before exposure to guests via virtio makes sense. Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-10-05crypto: rsassa-pkcs1 - Migrate to sig_alg backendLukas Wunner
A sig_alg backend has just been introduced with the intent of moving all asymmetric sign/verify algorithms to it one by one. Migrate the sign/verify operations from rsa-pkcs1pad.c to a separate rsassa-pkcs1.c which uses the new backend. Consequently there are now two templates which build on the "rsa" akcipher_alg: * The existing "pkcs1pad" template, which is instantiated as an akcipher_instance and retains the encrypt/decrypt operations of RSAES-PKCS1-v1_5 (RFC 8017 sec 7.2). * The new "pkcs1" template, which is instantiated as a sig_instance and contains the sign/verify operations of RSASSA-PKCS1-v1_5 (RFC 8017 sec 8.2). In a separate step, rsa-pkcs1pad.c could optionally be renamed to rsaes-pkcs1.c for clarity. Additional "oaep" and "pss" templates could be added for RSAES-OAEP and RSASSA-PSS. Note that it's currently allowed to allocate a "pkcs1pad(rsa)" transform without specifying a hash algorithm. That makes sense if the transform is only used for encrypt/decrypt and continues to be supported. But for sign/verify, such transforms previously did not insert the Full Hash Prefix into the padding. The resulting message encoding was incompliant with EMSA-PKCS1-v1_5 (RFC 8017 sec 9.2) and therefore nonsensical. From here on in, it is no longer allowed to allocate a transform without specifying a hash algorithm if the transform is used for sign/verify operations. This simplifies the code because the insertion of the Full Hash Prefix is no longer optional, so various "if (digest_info)" clauses can be removed. There has been a previous attempt to forbid transform allocation without specifying a hash algorithm, namely by commit c0d20d22e0ad ("crypto: rsa-pkcs1pad - Require hash to be present"). It had to be rolled back with commit b3a8c8a5ebb5 ("crypto: rsa-pkcs1pad: Allow hash to be optional [ver #2]"), presumably because it broke allocation of a transform which was solely used for encrypt/decrypt, not sign/verify. Avoid such breakage by allowing transform allocation for encrypt/decrypt with and without specifying a hash algorithm (and simply ignoring the hash algorithm in the former case). So again, specifying a hash algorithm is now mandatory for sign/verify, but optional and ignored for encrypt/decrypt. The new sig_alg API uses kernel buffers instead of sglists, which avoids the overhead of copying signature and digest from sglists back into kernel buffers. rsassa-pkcs1.c is thus simplified quite a bit. sig_alg is always synchronous, whereas the underlying "rsa" akcipher_alg may be asynchronous. So await the result of the akcipher_alg, similar to crypto_akcipher_sync_{en,de}crypt(). As part of the migration, rename "rsa_digest_info" to "hash_prefix" to adhere to the spec language in RFC 9580. Otherwise keep the code unmodified wherever possible to ease reviewing and bisecting. Leave several simplification and hardening opportunities to separate commits. rsassa-pkcs1.c uses modern __free() syntax for allocation of buffers which need to be freed by kfree_sensitive(), hence a DEFINE_FREE() clause for kfree_sensitive() is introduced herein as a byproduct. Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-10-05crypto: rsa-pkcs1pad - Deduplicate set_{pub,priv}_key callbacksLukas Wunner
pkcs1pad_set_pub_key() and pkcs1pad_set_priv_key() are almost identical. The upcoming migration of sign/verify operations from rsa-pkcs1pad.c into a separate crypto_template will require another copy of the exact same functions. When RSASSA-PSS and RSAES-OAEP are introduced, each will need yet another copy. Deduplicate the functions into a single one which lives in a common header file for reuse by RSASSA-PKCS1-v1_5, RSASSA-PSS and RSAES-OAEP. Signed-off-by: Lukas Wunner <lukas@wunner.de> Reviewed-by: Stefan Berger <stefanb@linux.ibm.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-10-05crypto: sig - Introduce sig_alg backendLukas Wunner
Commit 6cb8815f41a9 ("crypto: sig - Add interface for sign/verify") began a transition of asymmetric sign/verify operations from crypto_akcipher to a new crypto_sig frontend. Internally, the crypto_sig frontend still uses akcipher_alg as backend, however: "The link between sig and akcipher is meant to be temporary. The plan is to create a new low-level API for sig and then migrate the signature code over to that from akcipher." https://lore.kernel.org/r/ZrG6w9wsb-iiLZIF@gondor.apana.org.au/ "having a separate alg for sig is definitely where we want to be since there is very little that the two types actually share." https://lore.kernel.org/r/ZrHlpz4qnre0zWJO@gondor.apana.org.au/ Take the next step of that migration and augment the crypto_sig frontend with a sig_alg backend to which all algorithms can be moved. During the migration, there will briefly be signature algorithms that are still based on crypto_akcipher, whilst others are already based on crypto_sig. Allow for that by building a fork into crypto_sig_*() API calls (i.e. crypto_sig_maxsize() and friends) such that one of the two backends is selected based on the transform's cra_type. Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-10-04net_sched: sch_fq: add the ability to offload pacingJeffrey Ji
Some network devices have the ability to offload EDT (Earliest Departure Time) which is the model used for TCP pacing and FQ packet scheduler. Some of them implement the timing wheel mechanism described in https://saeed.github.io/files/carousel-sigcomm17.pdf with an associated 'timing wheel horizon'. This patchs adds to FQ packet scheduler TCA_FQ_OFFLOAD_HORIZON attribute. Its value is capped by the device max_pacing_offload_horizon, added in the prior patch. It allows FQ to let packets within pacing offload horizon to be delivered to the device, which will handle the needed delay without host involvement. Signed-off-by: Jeffrey Ji <jeffreyji@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20241003121219.2396589-3-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-04net: add IFLA_MAX_PACING_OFFLOAD_HORIZON device attributeEric Dumazet
Some network devices have the ability to offload EDT (Earliest Departure Time) which is the model used for TCP pacing and FQ packet scheduler. Some of them implement the timing wheel mechanism described in https://saeed.github.io/files/carousel-sigcomm17.pdf with an associated 'timing wheel horizon'. This patch adds dev->max_pacing_offload_horizon expressing this timing wheel horizon in nsec units. This is a read-only attribute. Unless a driver sets it, dev->max_pacing_offload_horizon is zero. v2: addressed Jakub feedback ( https://lore.kernel.org/netdev/20240930152304.472767-2-edumazet@google.com/T/#mf6294d714c41cc459962154cc2580ce3c9693663 ) v3: added yaml doc (also per Jakub feedback) Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20241003121219.2396589-2-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-04net: Fix an unsafe loop on the listAnastasia Kovaleva
The kernel may crash when deleting a genetlink family if there are still listeners for that family: Oops: Kernel access of bad area, sig: 11 [#1] ... NIP [c000000000c080bc] netlink_update_socket_mc+0x3c/0xc0 LR [c000000000c0f764] __netlink_clear_multicast_users+0x74/0xc0 Call Trace: __netlink_clear_multicast_users+0x74/0xc0 genl_unregister_family+0xd4/0x2d0 Change the unsafe loop on the list to a safe one, because inside the loop there is an element removal from this list. Fixes: b8273570f802 ("genetlink: fix netns vs. netlink table locking (2)") Cc: stable@vger.kernel.org Signed-off-by: Anastasia Kovaleva <a.kovaleva@yadro.com> Reviewed-by: Dmitry Bogdanov <d.bogdanov@yadro.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://patch.msgid.link/20241003104431.12391-1-a.kovaleva@yadro.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-04tcp: add a fast path in tcp_delack_timer()Eric Dumazet
delack timer is not stopped from inet_csk_clear_xmit_timer() because we do not define INET_CSK_CLEAR_TIMERS. This is a conscious choice : inet_csk_clear_xmit_timer() is often called from another cpu. Calling del_timer() would cause false sharing and lock contention. This means that very often, tcp_delack_timer() is called at the timer expiration, while there is no ACK to transmit. This can be detected very early, avoiding the socket spinlock. Notes: - test about tp->compressed_ack is racy, but in the unlikely case there is a race, the dedicated compressed_ack_timer hrtimer would close it. - Even if the fast path is not taken, reading icsk->icsk_ack.pending and tp->compressed_ack before acquiring the socket spinlock reduces acquisition time and chances of contention. Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20241002173042.917928-4-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-04tcp: annotate data-races around icsk->icsk_pendingEric Dumazet
icsk->icsk_pending can be read locklessly already. Following patch in the series will add another lockless read. Add smp_load_acquire() and smp_store_release() annotations because following patch will add a test in tcp_write_timer(), and READ_ONCE()/WRITE_ONCE() alone would possibly lead to races. Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20241002173042.917928-2-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-04Merge tag 'pm-6.12-rc2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management fixes from Rafael Wysocki: "These fix two cpufreq issues, one in the core and one in the intel_pstate driver: - Fix CPU device node reference counting in the cpufreq core (Miquel Sabaté Solà) - Turn the spinlock used by the intel_pstate driver in hard IRQ context into a raw one to prevent the driver from crashing when PREEMPT_RT is enabled (Uwe Kleine-König)" * tag 'pm-6.12-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: cpufreq: Avoid a bad reference count on CPU node cpufreq: intel_pstate: Make hwp_notify_lock a raw spinlock
2024-10-04net_tstamp: add SCM_TS_OPT_ID for RAW socketsVadim Fedorenko
The last type of sockets which supports SOF_TIMESTAMPING_OPT_ID is RAW sockets. To add new option this patch converts all callers (direct and indirect) of _sock_tx_timestamp to provide sockcm_cookie instead of tsflags. And while here fix __sock_tx_timestamp to receive tsflags as __u32 instead of __u16. Reviewed-by: Willem de Bruijn <willemb@google.com> Reviewed-by: Jason Xing <kerneljasonxing@gmail.com> Signed-off-by: Vadim Fedorenko <vadfed@meta.com> Link: https://patch.msgid.link/20241001125716.2832769-3-vadfed@meta.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-04net_tstamp: add SCM_TS_OPT_ID to provide OPT_ID in control messageVadim Fedorenko
SOF_TIMESTAMPING_OPT_ID socket option flag gives a way to correlate TX timestamps and packets sent via socket. Unfortunately, there is no way to reliably predict socket timestamp ID value in case of error returned by sendmsg. For UDP sockets it's impossible because of lockless nature of UDP transmit, several threads may send packets in parallel. In case of RAW sockets MSG_MORE option makes things complicated. More details are in the conversation [1]. This patch adds new control message type to give user-space software an opportunity to control the mapping between packets and values by providing ID with each sendmsg for UDP sockets. The documentation is also added in this patch. [1] https://lore.kernel.org/netdev/CALCETrU0jB+kg0mhV6A8mrHfTE1D1pr1SD_B9Eaa9aDPfgHdtA@mail.gmail.com/ Reviewed-by: Willem de Bruijn <willemb@google.com> Reviewed-by: Jason Xing <kerneljasonxing@gmail.com> Signed-off-by: Vadim Fedorenko <vadfed@meta.com> Link: https://patch.msgid.link/20241001125716.2832769-2-vadfed@meta.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-04net/mlx5: hw counters: Remove mlx5_fc_create_exCosmin Ratiu
It no longer serves any purpose and is identical to mlx5_fc_create upon which it was originally based of. Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20241001103709.58127-7-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-04net/mlx5: hw counters: Make fc_stats & fc_pool privateCosmin Ratiu
The mlx5_fc_stats and mlx5_fc_pool structs are only used from fs_counters.c. As such, make them private there. mlx5_fc_pool is not used or referenced at all outside fs_counters. mlx5_fc_stats is referenced from mlx5_core_dev, so instead of having it as a direct member (which requires exporting it from fs_counters), store a pointer to it, allocate it on init and clear it on destroy. One caveat is that a simple container_of to get from a 'work' struct to the outermost mlx5_core_dev struct directly no longer works, so an extra pointer had to be added to mlx5_fc_stats back to the parent dev. Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20241001103709.58127-2-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-04Merge tag 'sound-6.12-rc2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound Pull sound fixes from Takashi Iwai: "Slightly high amount of changes in this round, partly because of my vacation in the last weeks. But all changes are small and nothing looks worrisome. The biggest LOCs is MAINTAINERS updates, and there is a core change for card-ID string creation for non-ASCII inputs. Others are rather device-specific, such as new quirks and device IDs for ASoC, usual HD-audio and USB-audio quirks and fixes, as well as regression fixes in HD-audio HDMI audio and Conexant codec" * tag 'sound-6.12-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (39 commits) ALSA: hda/conexant: Fix conflicting quirk for System76 Pangolin ALSA: line6: add hw monitor volume control to POD HD500X ALSA: gus: Fix some error handling paths related to get_bpos() usage ALSA: hda: Add missing parameter description for snd_hdac_stream_timecounter_init() ALSA: usb-audio: Add native DSD support for Luxman D-08u ALSA: core: add isascii() check to card ID generator MAINTAINERS: ALSA: use linux-sound@vger.kernel.org list Revert "ALSA: hda: Conditionally use snooping for AMD HDMI" ASoC: intel: sof_sdw: Add check devm_kasprintf() returned value ASoC: imx-card: Set card.owner to avoid a warning calltrace if SND=m ASoC: dt-bindings: davinci-mcasp: Fix interrupts property ASoC: qcom: sm8250: add qrb4210-rb2-sndcard compatible string ASoC: dt-bindings: qcom,sm8250: add qrb4210-rb2-sndcard ALSA: hda: fix trigger_tstamp_latched ALSA: hda/realtek: Add a quirk for HP Pavilion 15z-ec200 ALSA: hda/generic: Drop obsoleted obey_preferred_dacs flag ALSA: hda/generic: Unconditionally prefer preferred_dacs pairs ALSA: silence integer wrapping warning ASoC: Intel: soc-acpi: arl: Fix some missing empty terminators ASoC: Intel: soc-acpi-intel-rpl-match: add missing empty item ...
2024-10-04Merge tag 'drm-fixes-2024-10-04' of https://gitlab.freedesktop.org/drm/kernelLinus Torvalds
Pull drm fixes from Dave Airlie: "Weekly fixes, xe and amdgpu lead the way, with panthor, and few core components getting various fixes. Nothing seems too out of the ordinary. atomic: - Use correct type when reading damage rectangles display: - Fix kernel docs dp-mst: - Fix DSC decompression detection hdmi: - Fix infoframe size sched: - Update maintainers - Fix race condition whne queueing up jobs - Fix locking in drm_sched_entity_modify_sched() - Fix pointer deref if entity queue changes sysfb: - Disable sysfb if framebuffer parent device is unknown amdgpu: - DML2 fix - DSC fix - Dispclk fix - eDP HDR fix - IPS fix - TBT fix i915: - One fix for bitwise and logical "and" mixup in PM code xe: - Restore pci state on resume - Fix locking on submission, queue and vm - Fix UAF on queue destruction - Fix resource release on freq init error path - Use rw_semaphore to reduce contention on ASID->VM lookup - Fix steering for media on Xe2_HPM - Tuning updates to Xe2 - Resume TDR after GT reset to prevent jobs running forever - Move id allocation to avoid userspace using a guessed number to trigger UAF - Fix OA stream close preventing pbatch buffers to complete - Fix NPD when migrating memory on LNL - Fix memory leak when aborting binds panthor: - Fix locking - Set FOP_UNSIGNED_OFFSET in fops instance - Acquire lock in panthor_vm_prepare_map_op_ctx() - Avoid uninitialized variable in tick_ctx_cleanup() - Do not block scheduler queue if work is pending - Do not add write fences to the shared BOs vbox: - Fix VLA handling" * tag 'drm-fixes-2024-10-04' of https://gitlab.freedesktop.org/drm/kernel: (41 commits) drm/xe: Fix memory leak when aborting binds drm/xe: Prevent null pointer access in xe_migrate_copy drm/xe/oa: Don't reset OAC_CONTEXT_ENABLE on OA stream close drm/xe/queue: move xa_alloc to prevent UAF drm/xe/vm: move xa_alloc to prevent UAF drm/xe: Clean up VM / exec queue file lock usage. drm/xe: Resume TDR after GT reset drm/xe/xe2: Add performance tuning for L3 cache flushing drm/xe/xe2: Extend performance tuning to media GT drm/xe/mcr: Use Xe2_LPM steering tables for Xe2_HPM drm/xe: Use helper for ASID -> VM in GPU faults and access counters drm/xe: Convert to USM lock to rwsem drm/xe: use devm_add_action_or_reset() helper drm/xe: fix UAF around queue destruction drm/xe/guc_submit: add missing locking in wedged_fini drm/xe: Restore pci state upon resume drm/amd/display: Fix system hang while resume with TBT monitor drm/amd/display: Enable idle workqueue for more IPS modes drm/amd/display: Add HDR workaround for specific eDP drm/amd/display: avoid set dispclk to 0 ...
2024-10-04Merge tag 'fsnotify_for_v6.12-rc2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs Pull fsnotify fixes from Jan Kara: "Fixes for an inotify deadlock and a data race in fsnotify" * tag 'fsnotify_for_v6.12-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs: inotify: Fix possible deadlock in fsnotify_destroy_mark fsnotify: Avoid data race between fsnotify_recalc_mask() and fsnotify_object_watched()
2024-10-04Merge tag 'for-6.12-rc1-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs fixes from David Sterba: - in incremental send, fix invalid clone operation for file that got its size decreased - fix __counted_by() annotation of send path cache entries, we do not store the terminating NUL - fix a longstanding bug in relocation (and quite hard to hit by chance), drop back reference cache that can get out of sync after transaction commit - wait for fixup worker kthread before finishing umount - add missing raid-stripe-tree extent for NOCOW files, zoned mode cannot have NOCOW files but RST is meant to be a standalone feature - handle transaction start error during relocation, avoid potential NULL pointer dereference of relocation control structure (reported by syzbot) - disable module-wide rate limiting of debug level messages - minor fix to tracepoint definition (reported by checkpatch.pl) * tag 'for-6.12-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: btrfs: disable rate limiting when debug enabled btrfs: wait for fixup workers before stopping cleaner kthread during umount btrfs: fix a NULL pointer dereference when failed to start a new trasacntion btrfs: send: fix invalid clone operation for file that got its size decreased btrfs: tracepoints: end assignment with semicolon at btrfs_qgroup_extent event class btrfs: drop the backref cache during relocation if we commit btrfs: also add stripe entries for NOCOW writes btrfs: send: fix buffer overflow detection when copying path to cache entry
2024-10-04Merge tag 'pull-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfsLinus Torvalds
Pull close_range() fix from Al Viro: "Fix the logic in descriptor table trimming" * tag 'pull-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: close_range(): fix the logics in descriptor table trimming
2024-10-04HID: add per device quirk to force bind to hid-genericBenjamin Tissoires
We already have the possibility to force not binding to hid-generic and rely on a dedicated driver, but we couldn't do the other way around. This is useful for BPF programs where we are fixing the report descriptor and the events, but want to avoid a specialized driver to come after BPF which would unwind everything that is done there. Reviewed-by: Peter Hutterer <peter.hutterer@who-t.net> Link: https://patch.msgid.link/20241001-hid-bpf-hid-generic-v3-8-2ef1019468df@kernel.org Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>
2024-10-04HID: bpf: move HID-BPF report descriptor fixup earlierBenjamin Tissoires
Currently, hid_bpf_rdesc_fixup() is called once the match between the HID device and the driver is done. This can be problematic in case the driver selected by the kernel would change the report descriptor after the fact. To give a chance for hid_bpf_rdesc_fixup() to provide hints on to how to select a dedicated driver or not, move the call to that BPF hook earlier in the .probe() process, when we get the first match. However, this means that we might get called more than once (typically once for hid-generic, and once for hid-vendor-specific). So we store the result of HID-BPF fixup in struct hid_device. Basically, this means that ->bpf_rdesc can replace ->dev_rdesc when it was used in the code. In order to not grow struct hid_device, some fields are re-ordered. This was the output of pahole for the first 128 bytes: struct hid_device { __u8 * dev_rdesc; /* 0 8 */ unsigned int dev_rsize; /* 8 4 */ /* XXX 4 bytes hole, try to pack */ __u8 * rdesc; /* 16 8 */ unsigned int rsize; /* 24 4 */ /* XXX 4 bytes hole, try to pack */ struct hid_collection * collection; /* 32 8 */ unsigned int collection_size; /* 40 4 */ unsigned int maxcollection; /* 44 4 */ unsigned int maxapplication; /* 48 4 */ __u16 bus; /* 52 2 */ __u16 group; /* 54 2 */ __u32 vendor; /* 56 4 */ __u32 product; /* 60 4 */ /* --- cacheline 1 boundary (64 bytes) --- */ __u32 version; /* 64 4 */ enum hid_type type; /* 68 4 */ unsigned int country; /* 72 4 */ /* XXX 4 bytes hole, try to pack */ struct hid_report_enum report_enum[3]; /* 80 6216 */ Basically, we got three holes of 4 bytes. We can reorder things a little and makes those 3 holes a continuous 12 bytes hole, which can be replaced by the new pointer and the new unsigned int we need. In terms of code allocation, when not using HID-BPF, we are back to kernel v6.2 in hid_open_report(). These multiple kmemdup() calls will be fixed in a later commit. Link: https://patch.msgid.link/20241001-hid-bpf-hid-generic-v3-1-2ef1019468df@kernel.org Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>
2024-10-04usb: chipidea: add CI_HDRC_HAS_SHORT_PKT_LIMIT flagXu Yang
Currently, the imx deivice controller has below limitations: 1. can't generate short packet interrupt if IOC not set in dTD. So if one request span more than one dTDs and only the last dTD set IOC, the usb request will pending there if no more data comes. 2. the controller can't accurately deliver data to differtent usb requests in some cases due to short packet. For example: one usb request span 3 dTDs, then if the controller received a short packet the next packet will go to 2nd dTD of current request rather than the first dTD of next request. 3. can't build a bus packet use multiple dTDs. For example: controller needs to send one packet of 512 bytes use dTD1 (200 bytes) + dTD2 (312 bytes), actually the host side will see 200 bytes short packet. Based on these limits, add CI_HDRC_HAS_SHORT_PKT_LIMIT flag and use it on imx platforms. Signed-off-by: Xu Yang <xu.yang_2@nxp.com> Acked-by: Peter Chen <peter.chen@kernel.org> Link: https://lore.kernel.org/r/20240923081203.2851768-1-xu.yang_2@nxp.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-10-04appletalk: Remove deadcodeDr. David Alan Gilbert
alloc_ltalkdev in net/appletalk/dev.c is dead since commit 00f3696f7555 ("net: appletalk: remove cops support") Removing it (and it's helper) leaves dev.c and if_ltalk.h empty; remove them and the Makefile entry. tun.c was including that if_ltalk.h but actually wanted the uapi version for LTALK_ALEN, fix up the path. Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org> Reviewed-by: Willem de Bruijn <willemb@google.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-10-04arm64/ptrace: Expose GCS via ptrace and core filesMark Brown
Provide a new register type NT_ARM_GCS reporting the current GCS mode and pointer for EL0. Due to the interactions with allocation and deallocation of Guarded Control Stacks we do not permit any changes to the GCS mode via ptrace, only GCSPR_EL0 may be changed. Reviewed-by: Thiago Jung Bauermann <thiago.bauermann@linaro.org> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20241001-arm64-gcs-v13-27-222b78d87eee@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2024-10-04mm: Define VM_SHADOW_STACK for arm64 when we support GCSMark Brown
Use VM_HIGH_ARCH_5 for guarded control stack pages. Reviewed-by: Thiago Jung Bauermann <thiago.bauermann@linaro.org> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20241001-arm64-gcs-v13-14-222b78d87eee@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2024-10-04mman: Add map_shadow_stack() flagsMark Brown
In preparation for adding arm64 GCS support make the map_shadow_stack() SHADOW_STACK_SET_TOKEN flag generic and add _SET_MARKER. The existing flag indicates that a token usable for stack switch should be added to the top of the newly mapped GCS region while the new flag indicates that a top of stack marker suitable for use by unwinders should be added above that. For arm64 the top of stack marker is all bits 0. Reviewed-by: Thiago Jung Bauermann <thiago.bauermann@linaro.org> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Yury Khrustalev <yury.khrustalev@arm.com> Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20241001-arm64-gcs-v13-5-222b78d87eee@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2024-10-04prctl: arch-agnostic prctl for shadow stackMark Brown
Three architectures (x86, aarch64, riscv) have announced support for shadow stacks with fairly similar functionality. While x86 is using arch_prctl() to control the functionality neither arm64 nor riscv uses that interface so this patch adds arch-agnostic prctl() support to get and set status of shadow stacks and lock the current configuation to prevent further changes, with support for turning on and off individual subfeatures so applications can limit their exposure to features that they do not need. The features are: - PR_SHADOW_STACK_ENABLE: Tracking and enforcement of shadow stacks, including allocation of a shadow stack if one is not already allocated. - PR_SHADOW_STACK_WRITE: Writes to specific addresses in the shadow stack. - PR_SHADOW_STACK_PUSH: Push additional values onto the shadow stack. These features are expected to be inherited by new threads and cleared on exec(), unknown features should be rejected for enable but accepted for locking (in order to allow for future proofing). This is based on a patch originally written by Deepak Gupta but modified fairly heavily, support for indirect landing pads is removed, additional modes added and the locking interface reworked. The set status prctl() is also reworked to just set flags, if setting/reading the shadow stack pointer is required this could be a separate prctl. Reviewed-by: Thiago Jung Bauermann <thiago.bauermann@linaro.org> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Yury Khrustalev <yury.khrustalev@arm.com> Signed-off-by: Mark Brown <broonie@kernel.org> Reviewed-by: Deepak Gupta <debug@rivosinc.com> Link: https://lore.kernel.org/r/20241001-arm64-gcs-v13-4-222b78d87eee@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2024-10-04mm: Define VM_HIGH_ARCH_6Mark Brown
The addition of protection keys means that on arm64 we now use all of the currently defined VM_HIGH_ARCH_x bits. In order to allow us to allocate a new flag for GCS pages define VM_HIGH_ARCH_6. Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20241001-arm64-gcs-v13-2-222b78d87eee@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>