summaryrefslogtreecommitdiff
path: root/include
AgeCommit message (Collapse)Author
2020-06-16Merge tag 'flex-array-conversions-5.8-rc2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux Pull flexible-array member conversions from Gustavo A. R. Silva: "Replace zero-length arrays with flexible-array members. Notice that all of these patches have been baking in linux-next for two development cycles now. There is a regular need in the kernel to provide a way to declare having a dynamically sized set of trailing elements in a structure. Kernel code should always use “flexible array members”[1] for these cases. The older style of one-element or zero-length arrays should no longer be used[2]. C99 introduced “flexible array members”, which lacks a numeric size for the array declaration entirely: struct something { size_t count; struct foo items[]; }; This is the way the kernel expects dynamically sized trailing elements to be declared. It allows the compiler to generate errors when the flexible array does not occur last in the structure, which helps to prevent some kind of undefined behavior[3] bugs from being inadvertently introduced to the codebase. It also allows the compiler to correctly analyze array sizes (via sizeof(), CONFIG_FORTIFY_SOURCE, and CONFIG_UBSAN_BOUNDS). For instance, there is no mechanism that warns us that the following application of the sizeof() operator to a zero-length array always results in zero: struct something { size_t count; struct foo items[0]; }; struct something *instance; instance = kmalloc(struct_size(instance, items, count), GFP_KERNEL); instance->count = count; size = sizeof(instance->items) * instance->count; memcpy(instance->items, source, size); At the last line of code above, size turns out to be zero, when one might have thought it represents the total size in bytes of the dynamic memory recently allocated for the trailing array items. Here are a couple examples of this issue[4][5]. Instead, flexible array members have incomplete type, and so the sizeof() operator may not be applied[6], so any misuse of such operators will be immediately noticed at build time. The cleanest and least error-prone way to implement this is through the use of a flexible array member: struct something { size_t count; struct foo items[]; }; struct something *instance; instance = kmalloc(struct_size(instance, items, count), GFP_KERNEL); instance->count = count; size = sizeof(instance->items[0]) * instance->count; memcpy(instance->items, source, size); instead" [1] https://en.wikipedia.org/wiki/Flexible_array_member [2] https://github.com/KSPP/linux/issues/21 [3] commit 76497732932f ("cxgb3/l2t: Fix undefined behaviour") [4] commit f2cd32a443da ("rndis_wlan: Remove logically dead code") [5] commit ab91c2a89f86 ("tpm: eventlog: Replace zero-length array with flexible-array member") [6] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html * tag 'flex-array-conversions-5.8-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux: (41 commits) w1: Replace zero-length array with flexible-array tracing/probe: Replace zero-length array with flexible-array soc: ti: Replace zero-length array with flexible-array tifm: Replace zero-length array with flexible-array dmaengine: tegra-apb: Replace zero-length array with flexible-array stm class: Replace zero-length array with flexible-array Squashfs: Replace zero-length array with flexible-array ASoC: SOF: Replace zero-length array with flexible-array ima: Replace zero-length array with flexible-array sctp: Replace zero-length array with flexible-array phy: samsung: Replace zero-length array with flexible-array RxRPC: Replace zero-length array with flexible-array rapidio: Replace zero-length array with flexible-array media: pwc: Replace zero-length array with flexible-array firmware: pcdp: Replace zero-length array with flexible-array oprofile: Replace zero-length array with flexible-array block: Replace zero-length array with flexible-array tools/testing/nvdimm: Replace zero-length array with flexible-array libata: Replace zero-length array with flexible-array kprobes: Replace zero-length array with flexible-array ...
2020-06-17close_range: add CLOSE_RANGE_UNSHAREChristian Brauner
One of the use-cases of close_range() is to drop file descriptors just before execve(). This would usually be expressed in the sequence: unshare(CLONE_FILES); close_range(3, ~0U); as pointed out by Linus it might be desirable to have this be a part of close_range() itself under a new flag CLOSE_RANGE_UNSHARE. This expands {dup,unshare)_fd() to take a max_fds argument that indicates the maximum number of file descriptors to copy from the old struct files. When the user requests that all file descriptors are supposed to be closed via close_range(min, max) then we can cap via unshare_fd(min) and hence don't need to do any of the heavy fput() work for everything above min. The patch makes it so that if CLOSE_RANGE_UNSHARE is requested and we do in fact currently share our file descriptor table we create a new private copy. We then close all fds in the requested range and finally after we're done we install the new fd table. Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
2020-06-17arch: wire-up close_range()Christian Brauner
This wires up the close_range() syscall into all arches at once. Suggested-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com> Reviewed-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc) Cc: Jann Horn <jannh@google.com> Cc: David Howells <dhowells@redhat.com> Cc: Dmitry V. Levin <ldv@altlinux.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Florian Weimer <fweimer@redhat.com> Cc: linux-api@vger.kernel.org Cc: linux-alpha@vger.kernel.org Cc: linux-arm-kernel@lists.infradead.org Cc: linux-ia64@vger.kernel.org Cc: linux-m68k@lists.linux-m68k.org Cc: linux-mips@vger.kernel.org Cc: linux-parisc@vger.kernel.org Cc: linuxppc-dev@lists.ozlabs.org Cc: linux-s390@vger.kernel.org Cc: linux-sh@vger.kernel.org Cc: sparclinux@vger.kernel.org Cc: linux-xtensa@linux-xtensa.org Cc: linux-arch@vger.kernel.org Cc: x86@kernel.org
2020-06-17open: add close_range()Christian Brauner
This adds the close_range() syscall. It allows to efficiently close a range of file descriptors up to all file descriptors of a calling task. I was contacted by FreeBSD as they wanted to have the same close_range() syscall as we proposed here. We've coordinated this and in the meantime, Kyle was fast enough to merge close_range() into FreeBSD already in April: https://reviews.freebsd.org/D21627 https://svnweb.freebsd.org/base?view=revision&revision=359836 and the current plan is to backport close_range() to FreeBSD 12.2 (cf. [2]) once its merged in Linux too. Python is in the process of switching to close_range() on FreeBSD and they are waiting on us to merge this to switch on Linux as well: https://bugs.python.org/issue38061 The syscall came up in a recent discussion around the new mount API and making new file descriptor types cloexec by default. During this discussion, Al suggested the close_range() syscall (cf. [1]). Note, a syscall in this manner has been requested by various people over time. First, it helps to close all file descriptors of an exec()ing task. This can be done safely via (quoting Al's example from [1] verbatim): /* that exec is sensitive */ unshare(CLONE_FILES); /* we don't want anything past stderr here */ close_range(3, ~0U); execve(....); The code snippet above is one way of working around the problem that file descriptors are not cloexec by default. This is aggravated by the fact that we can't just switch them over without massively regressing userspace. For a whole class of programs having an in-kernel method of closing all file descriptors is very helpful (e.g. demons, service managers, programming language standard libraries, container managers etc.). (Please note, unshare(CLONE_FILES) should only be needed if the calling task is multi-threaded and shares the file descriptor table with another thread in which case two threads could race with one thread allocating file descriptors and the other one closing them via close_range(). For the general case close_range() before the execve() is sufficient.) Second, it allows userspace to avoid implementing closing all file descriptors by parsing through /proc/<pid>/fd/* and calling close() on each file descriptor. From looking at various large(ish) userspace code bases this or similar patterns are very common in: - service managers (cf. [4]) - libcs (cf. [6]) - container runtimes (cf. [5]) - programming language runtimes/standard libraries - Python (cf. [2]) - Rust (cf. [7], [8]) As Dmitry pointed out there's even a long-standing glibc bug about missing kernel support for this task (cf. [3]). In addition, the syscall will also work for tasks that do not have procfs mounted and on kernels that do not have procfs support compiled in. In such situations the only way to make sure that all file descriptors are closed is to call close() on each file descriptor up to UINT_MAX or RLIMIT_NOFILE, OPEN_MAX trickery (cf. comment [8] on Rust). The performance is striking. For good measure, comparing the following simple close_all_fds() userspace implementation that is essentially just glibc's version in [6]: static int close_all_fds(void) { int dir_fd; DIR *dir; struct dirent *direntp; dir = opendir("/proc/self/fd"); if (!dir) return -1; dir_fd = dirfd(dir); while ((direntp = readdir(dir))) { int fd; if (strcmp(direntp->d_name, ".") == 0) continue; if (strcmp(direntp->d_name, "..") == 0) continue; fd = atoi(direntp->d_name); if (fd == dir_fd || fd == 0 || fd == 1 || fd == 2) continue; close(fd); } closedir(dir); return 0; } to close_range() yields: 1. closing 4 open files: - close_all_fds(): ~280 us - close_range(): ~24 us 2. closing 1000 open files: - close_all_fds(): ~5000 us - close_range(): ~800 us close_range() is designed to allow for some flexibility. Specifically, it does not simply always close all open file descriptors of a task. Instead, callers can specify an upper bound. This is e.g. useful for scenarios where specific file descriptors are created with well-known numbers that are supposed to be excluded from getting closed. For extra paranoia close_range() comes with a flags argument. This can e.g. be used to implement extension. Once can imagine userspace wanting to stop at the first error instead of ignoring errors under certain circumstances. There might be other valid ideas in the future. In any case, a flag argument doesn't hurt and keeps us on the safe side. From an implementation side this is kept rather dumb. It saw some input from David and Jann but all nonsense is obviously my own! - Errors to close file descriptors are currently ignored. (Could be changed by setting a flag in the future if needed.) - __close_range() is a rather simplistic wrapper around __close_fd(). My reasoning behind this is based on the nature of how __close_fd() needs to release an fd. But maybe I misunderstood specifics: We take the files_lock and rcu-dereference the fdtable of the calling task, we find the entry in the fdtable, get the file and need to release files_lock before calling filp_close(). In the meantime the fdtable might have been altered so we can't just retake the spinlock and keep the old rcu-reference of the fdtable around. Instead we need to grab a fresh reference to the fdtable. If my reasoning is correct then there's really no point in fancyfying __close_range(): We just need to rcu-dereference the fdtable of the calling task once to cap the max_fd value correctly and then go on calling __close_fd() in a loop. /* References */ [1]: https://lore.kernel.org/lkml/20190516165021.GD17978@ZenIV.linux.org.uk/ [2]: https://github.com/python/cpython/blob/9e4f2f3a6b8ee995c365e86d976937c141d867f8/Modules/_posixsubprocess.c#L220 [3]: https://sourceware.org/bugzilla/show_bug.cgi?id=10353#c7 [4]: https://github.com/systemd/systemd/blob/5238e9575906297608ff802a27e2ff9effa3b338/src/basic/fd-util.c#L217 [5]: https://github.com/lxc/lxc/blob/ddf4b77e11a4d08f09b7b9cd13e593f8c047edc5/src/lxc/start.c#L236 [6]: https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/unix/sysv/linux/grantpt.c;h=2030e07fa6e652aac32c775b8c6e005844c3c4eb;hb=HEAD#l17 Note that this is an internal implementation that is not exported. Currently, libc seems to not provide an exported version of this because of missing kernel support to do this. Note, in a recent patch series Florian made grantpt() a nop thereby removing the code referenced here. [7]: https://github.com/rust-lang/rust/issues/12148 [8]: https://github.com/rust-lang/rust/blob/5f47c0613ed4eb46fca3633c1297364c09e5e451/src/libstd/sys/unix/process2.rs#L303-L308 Rust's solution is slightly different but is equally unperformant. Rust calls getdtablesize() which is a glibc library function that simply returns the current RLIMIT_NOFILE or OPEN_MAX values. Rust then goes on to call close() on each fd. That's obviously overkill for most tasks. Rarely, tasks - especially non-demons - hit RLIMIT_NOFILE or OPEN_MAX. Let's be nice and assume an unprivileged user with RLIMIT_NOFILE set to 1024. Even in this case, there's a very high chance that in the common case Rust is calling the close() syscall 1021 times pointlessly if the task just has 0, 1, and 2 open. Suggested-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Kyle Evans <self@kyle-evans.net> Cc: Jann Horn <jannh@google.com> Cc: David Howells <dhowells@redhat.com> Cc: Dmitry V. Levin <ldv@altlinux.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Florian Weimer <fweimer@redhat.com> Cc: linux-api@vger.kernel.org
2020-06-16EDAC: Remove edac_get_dimm_by_index()Borislav Petkov
It is unused now. Signed-off-by: Borislav Petkov <bp@suse.de>
2020-06-16gpu: host1x: Correct trivial kernel-doc inconsistenciesColton Lewis
Silence documentation build warnings by adding kernel-doc fields. ./include/linux/host1x.h:69: warning: Function parameter or member 'parent' not described in 'host1x_client' ./include/linux/host1x.h:69: warning: Function parameter or member 'usecount' not described in 'host1x_client' ./include/linux/host1x.h:69: warning: Function parameter or member 'lock' not described in 'host1x_client' Signed-off-by: Colton Lewis <colton.w.lewis@protonmail.com> Signed-off-by: Thierry Reding <treding@nvidia.com>
2020-06-16libceph: move away from global osd_req_flagsIlya Dryomov
osd_req_flags is overly general and doesn't suit its only user (read_from_replica option) well: - applying osd_req_flags in account_request() affects all OSD requests, including linger (i.e. watch and notify). However, linger requests should always go to the primary even though some of them are reads (e.g. notify has side effects but it is a read because it doesn't result in mutation on the OSDs). - calls to class methods that are reads are allowed to go to the replica, but most such calls issued for "rbd map" and/or exclusive lock transitions are requested to be resent to the primary via EAGAIN, doubling the latency. Get rid of global osd_req_flags and set read_from_replica flag only on specific OSD requests instead. Fixes: 8ad44d5e0d1e ("libceph: read_from_replica option") Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Jeff Layton <jlayton@kernel.org>
2020-06-16compiler_attributes.h: Support no_sanitize_undefined check with GCC 4Marco Elver
UBSAN is supported since GCC 4.9, which unfortunately did not yet have __has_attribute(). To work around, the __GCC4_has_attribute workaround requires defining which compiler version supports the given attribute. In the case of no_sanitize_undefined, it is the first version that supports UBSAN, which is GCC 4.9. Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Marco Elver <elver@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com> Link: https://lkml.kernel.org/r/20200615231529.GA119644@google.com
2020-06-16reset: simple: Add reset callbackMaxime Ripard
The reset-simple code lacks a reset callback that is still pretty easy to implement. The only real thing to consider is the delay needed for a device to be reset, so let's expose that as part of the reset-simple driver data. Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de> Signed-off-by: Maxime Ripard <maxime@cerno.tech> Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
2020-06-16reset: Move reset-simple header out of drivers/resetMaxime Ripard
The reset-simple code can be useful for drivers outside of drivers/reset that have a few reset controls as part of their features. Let's move it to include/linux/reset. Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de> Signed-off-by: Maxime Ripard <maxime@cerno.tech> Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
2020-06-15tifm: Replace zero-length array with flexible-arrayGustavo A. R. Silva
There is a regular need in the kernel to provide a way to declare having a dynamically sized set of trailing elements in a structure. Kernel code should always use “flexible array members”[1] for these cases. The older style of one-element or zero-length arrays should no longer be used[2]. [1] https://en.wikipedia.org/wiki/Flexible_array_member [2] https://github.com/KSPP/linux/issues/21 Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
2020-06-15sctp: Replace zero-length array with flexible-arrayGustavo A. R. Silva
There is a regular need in the kernel to provide a way to declare having a dynamically sized set of trailing elements in a structure. Kernel code should always use “flexible array members”[1] for these cases. The older style of one-element or zero-length arrays should no longer be used[2]. [1] https://en.wikipedia.org/wiki/Flexible_array_member [2] https://github.com/KSPP/linux/issues/21 Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
2020-06-15RxRPC: Replace zero-length array with flexible-arrayGustavo A. R. Silva
There is a regular need in the kernel to provide a way to declare having a dynamically sized set of trailing elements in a structure. Kernel code should always use “flexible array members”[1] for these cases. The older style of one-element or zero-length arrays should no longer be used[2]. [1] https://en.wikipedia.org/wiki/Flexible_array_member [2] https://github.com/KSPP/linux/issues/21 Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
2020-06-15libata: Replace zero-length array with flexible-arrayGustavo A. R. Silva
There is a regular need in the kernel to provide a way to declare having a dynamically sized set of trailing elements in a structure. Kernel code should always use “flexible array members”[1] for these cases. The older style of one-element or zero-length arrays should no longer be used[2]. [1] https://en.wikipedia.org/wiki/Flexible_array_member [2] https://github.com/KSPP/linux/issues/21 Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
2020-06-15kprobes: Replace zero-length array with flexible-arrayGustavo A. R. Silva
There is a regular need in the kernel to provide a way to declare having a dynamically sized set of trailing elements in a structure. Kernel code should always use “flexible array members”[1] for these cases. The older style of one-element or zero-length arrays should no longer be used[2]. [1] https://en.wikipedia.org/wiki/Flexible_array_member [2] https://github.com/KSPP/linux/issues/21 Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
2020-06-15keys: encrypted-type: Replace zero-length array with flexible-arrayGustavo A. R. Silva
There is a regular need in the kernel to provide a way to declare having a dynamically sized set of trailing elements in a structure. Kernel code should always use “flexible array members”[1] for these cases. The older style of one-element or zero-length arrays should no longer be used[2]. [1] https://en.wikipedia.org/wiki/Flexible_array_member [2] https://github.com/KSPP/linux/issues/21 Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
2020-06-15kexec: Replace zero-length array with flexible-arrayGustavo A. R. Silva
There is a regular need in the kernel to provide a way to declare having a dynamically sized set of trailing elements in a structure. Kernel code should always use “flexible array members”[1] for these cases. The older style of one-element or zero-length arrays should no longer be used[2]. [1] https://en.wikipedia.org/wiki/Flexible_array_member [2] https://github.com/KSPP/linux/issues/21 Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
2020-06-15KVM: Replace zero-length array with flexible-arrayGustavo A. R. Silva
There is a regular need in the kernel to provide a way to declare having a dynamically sized set of trailing elements in a structure. Kernel code should always use “flexible array members”[1] for these cases. The older style of one-element or zero-length arrays should no longer be used[2]. [1] https://en.wikipedia.org/wiki/Flexible_array_member [2] https://github.com/KSPP/linux/issues/21 Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
2020-06-15FS-Cache: Replace zero-length array with flexible-arrayGustavo A. R. Silva
There is a regular need in the kernel to provide a way to declare having a dynamically sized set of trailing elements in a structure. Kernel code should always use “flexible array members”[1] for these cases. The older style of one-element or zero-length arrays should no longer be used[2]. [1] https://en.wikipedia.org/wiki/Flexible_array_member [2] https://github.com/KSPP/linux/issues/21 Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
2020-06-15cb710: Replace zero-length array with flexible-arrayGustavo A. R. Silva
There is a regular need in the kernel to provide a way to declare having a dynamically sized set of trailing elements in a structure. Kernel code should always use “flexible array members”[1] for these cases. The older style of one-element or zero-length arrays should no longer be used[2]. [1] https://en.wikipedia.org/wiki/Flexible_array_member [2] https://github.com/KSPP/linux/issues/21 Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
2020-06-15drm/edid: Replace zero-length array with flexible-arrayGustavo A. R. Silva
There is a regular need in the kernel to provide a way to declare having a dynamically sized set of trailing elements in a structure. Kernel code should always use “flexible array members”[1] for these cases. The older style of one-element or zero-length arrays should no longer be used[2]. [1] https://en.wikipedia.org/wiki/Flexible_array_member [2] https://github.com/KSPP/linux/issues/21 Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
2020-06-15can: Replace zero-length array with flexible-arrayGustavo A. R. Silva
There is a regular need in the kernel to provide a way to declare having a dynamically sized set of trailing elements in a structure. Kernel code should always use “flexible array members”[1] for these cases. The older style of one-element or zero-length arrays should no longer be used[2]. [1] https://en.wikipedia.org/wiki/Flexible_array_member [2] https://github.com/KSPP/linux/issues/21 Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
2020-06-15dmaengine: Replace zero-length array with flexible-arrayGustavo A. R. Silva
There is a regular need in the kernel to provide a way to declare having a dynamically sized set of trailing elements in a structure. Kernel code should always use “flexible array members”[1] for these cases. The older style of one-element or zero-length arrays should no longer be used[2]. [1] https://en.wikipedia.org/wiki/Flexible_array_member [2] https://github.com/KSPP/linux/issues/21 Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
2020-06-15scsi: libata: Provide an ata_scsi_dma_need_drain stub for !CONFIG_ATAChristoph Hellwig
SAS drivers can be compiled with ata support disabled. Provide a stub so that the drivers don't have to ifdef around wiring up ata_scsi_dma_need_drain. Link: https://lore.kernel.org/r/20200615064624.37317-2-hch@lst.de Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-06-15ndctl/papr_scm,uapi: Add support for PAPR nvdimm specific methodsVaibhav Jain
Introduce support for PAPR NVDIMM Specific Methods (PDSM) in papr_scm module and add the command family NVDIMM_FAMILY_PAPR to the white list of NVDIMM command sets. Also advertise support for ND_CMD_CALL for the nvdimm command mask and implement necessary scaffolding in the module to handle ND_CMD_CALL ioctl and PDSM requests that we receive. The layout of the PDSM request as we expect from libnvdimm/libndctl is described in newly introduced uapi header 'papr_pdsm.h' which defines a 'struct nd_pkg_pdsm' and a maximal union named 'nd_pdsm_payload'. These new structs together with 'struct nd_cmd_pkg' for a pdsm envelop thats sent by libndctl to libnvdimm and serviced by papr_scm in 'papr_scm_service_pdsm()'. The PDSM request is communicated by member 'struct nd_cmd_pkg.nd_command' together with other information on the pdsm payload (size-in, size-out). The patch also introduces 'struct pdsm_cmd_desc' instances of which are stored in an array __pdsm_cmd_descriptors[] indexed with PDSM cmd and corresponding access function pdsm_cmd_desc() is introduced. 'struct pdsm_cdm_desc' holds the service function for a given PDSM and corresponding payload in/out sizes. A new function papr_scm_service_pdsm() is introduced and is called from papr_scm_ndctl() in case of a PDSM request is received via ND_CMD_CALL command from libnvdimm. The function performs validation on the PDSM payload based on info present in corresponding PDSM descriptor and if valid calls the 'struct pdcm_cmd_desc.service' function to service the PDSM. Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com> Cc: "Aneesh Kumar K . V" <aneesh.kumar@linux.ibm.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Ira Weiny <ira.weiny@intel.com> Link: https://lore.kernel.org/r/20200615124407.32596-6-vaibhav@linux.ibm.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2020-06-15netfilter: flowtable: Make nf_flow_table_offload_add/del_cb inlineAlaa Hleihel
Currently, nf_flow_table_offload_add/del_cb are exported by nf_flow_table module, therefore modules using them will have hard-dependency on nf_flow_table and will require loading it all the time. This can lead to an unnecessary overhead on systems that do not use this API. To relax the hard-dependency between the modules, we unexport these functions and make them static inline. Fixes: 978703f42549 ("netfilter: flowtable: Add API for registering to flow table events") Signed-off-by: Alaa Hleihel <alaa@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-15net/sched: act_ct: Make tcf_ct_flow_table_restore_skb inlineAlaa Hleihel
Currently, tcf_ct_flow_table_restore_skb is exported by act_ct module, therefore modules using it will have hard-dependency on act_ct and will require loading it all the time. This can lead to an unnecessary overhead on systems that do not use hardware connection tracking action (ct_metadata action) in the first place. To relax the hard-dependency between the modules, we unexport this function and make it a static inline one. Fixes: 30b0cf90c6dd ("net/sched: act_ct: Support restoring conntrack info on skbs") Signed-off-by: Alaa Hleihel <alaa@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-16bpf: Fix definition of bpf_ringbuf_output() helper in UAPI commentsAndrii Nakryiko
Fix definition of bpf_ringbuf_output() in UAPI header comments, which is used to generate libbpf's bpf_helper_defs.h header. Return value is a number (error code), not a pointer. Fixes: 457f44363a88 ("bpf: Implement BPF ring buffer and verifier support for it") Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20200615214926.3638836-1-andriin@fb.com
2020-06-15trace/events/block.h: drop kernel-doc for dropped function parameterRandy Dunlap
Fix kernel-doc warning: the parameter was removed, so also remove the kernel-doc notation for it. ../include/trace/events/block.h:278: warning: Excess function parameter 'error' description in 'trace_block_bio_complete' Fixes: d24de76af836 ("block: remove the error argument to the block_bio_complete tracepoint") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Christoph Hellwig <hch@lst.de> Cc: Jens Axboe <axboe@kernel.dk> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-06-15spi: altera: add platform data for slave information.Xu Yilun
This patch introduces platform data for slave information, it allows spi-altera to add new spi devices once master registration is done. Signed-off-by: Wu Hao <hao.wu@intel.com> Signed-off-by: Xu Yilun <yilun.xu@intel.com> Signed-off-by: Matthew Gerlach <matthew.gerlach@linux.intel.com> Signed-off-by: Russ Weight <russell.h.weight@intel.com> Reviewed-by: Tom Rix <trix@redhat.com> Link: https://lore.kernel.org/r/1591845911-10197-4-git-send-email-yilun.xu@intel.com Signed-off-by: Mark Brown <broonie@kernel.org>
2020-06-15spi: altera: add SPI core parameters support via platform data.Xu Yilun
This patch introduced SPI core parameters in platform data, it allows passing these SPI core parameters via platform data. Signed-off-by: Wu Hao <hao.wu@intel.com> Signed-off-by: Xu Yilun <yilun.xu@intel.com> Signed-off-by: Matthew Gerlach <matthew.gerlach@linux.intel.com> Signed-off-by: Russ Weight <russell.h.weight@intel.com> Reviewed-by: Tom Rix <trix@redhat.com> Link: https://lore.kernel.org/r/1591845911-10197-3-git-send-email-yilun.xu@intel.com Signed-off-by: Mark Brown <broonie@kernel.org>
2020-06-15Merge series "Add support for voltage regulator on ChromeOS EC." from ↵Mark Brown
Pi-Hsun Shih <pihsun@chromium.org>: Add support for controlling voltage regulator that is connected and controlled by ChromeOS EC. Kernel controls these regulators through newly added EC host commands. Changes from v5: * Move new host command to a separate patch. * Use devm_regulator_register. * Address review comments. Changes from v4: * Change compatible name from regulator-cros-ec to cros-ec-regulator. Changes from v3: * Fix dt bindings file name. * Remove check around CONFIG_OF in driver. * Add new host commands to cros_ec_trace. * Address review comments. Changes from v2: * Add 'depends on OF' to Kconfig. * Add Kconfig description about compiling as module. Changes from v1: * Change compatible string to google,regulator-cros-ec. * Use reg property in device tree. * Change license for dt binding according to checkpatch.pl. * Address comments on code styles. Pi-Hsun Shih (3): dt-bindings: regulator: Add DT binding for cros-ec-regulator platform/chrome: cros_ec: Add command for regulator control. regulator: Add driver for cros-ec-regulator .../regulator/google,cros-ec-regulator.yaml | 51 ++++ drivers/platform/chrome/cros_ec_trace.c | 5 + drivers/regulator/Kconfig | 10 + drivers/regulator/Makefile | 1 + drivers/regulator/cros-ec-regulator.c | 257 ++++++++++++++++++ .../linux/platform_data/cros_ec_commands.h | 82 ++++++ 6 files changed, 406 insertions(+) create mode 100644 Documentation/devicetree/bindings/regulator/google,cros-ec-regulator.yaml create mode 100644 drivers/regulator/cros-ec-regulator.c base-commit: b791d1bdf9212d944d749a5c7ff6febdba241771 -- 2.27.0.290.gba653c62da-goog
2020-06-15platform/chrome: cros_ec: Add command for regulator control.Pi-Hsun Shih
Add host commands for voltage regulator control through ChromeOS EC. Signed-off-by: Pi-Hsun Shih <pihsun@chromium.org> Reviewed-by: Enric Balletbo i Serra <enric.balletbo@collabora.com> Link: https://lore.kernel.org/r/20200612040526.192878-3-pihsun@chromium.org Signed-off-by: Mark Brown <broonie@kernel.org>
2020-06-15regmap: convert all regmap_update_bits() and co. macros to static inlinesBartosz Golaszewski
There's no reason to have these as macros. Let's convert them all to static inlines for better readability and stronger typing. Suggested-by: Mark Brown <broonie@kernel.org> Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com> Link: https://lore.kernel.org/r/20200615072313.11106-1-brgl@bgdev.pl Signed-off-by: Mark Brown <broonie@kernel.org>
2020-06-15Merge tag 'ext4-for-linus-5.8-rc1-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Pull more ext4 updates from Ted Ts'o: "This is the second round of ext4 commits for 5.8 merge window [1]. It includes the per-inode DAX support, which was dependant on the DAX infrastructure which came in via the XFS tree, and a number of regression and bug fixes; most notably the "BUG: using smp_processor_id() in preemptible code in ext4_mb_new_blocks" reported by syzkaller" [1] The pull request actually came in 15 minutes after I had tagged the rc1 release. Tssk, tssk, late.. - Linus * tag 'ext4-for-linus-5.8-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: ext4, jbd2: ensure panic by fix a race between jbd2 abort and ext4 error handlers ext4: support xattr gnu.* namespace for the Hurd ext4: mballoc: Use this_cpu_read instead of this_cpu_ptr ext4: avoid utf8_strncasecmp() with unstable name ext4: stop overwrite the errcode in ext4_setup_super ext4: fix partial cluster initialization when splitting extent ext4: avoid race conditions when remounting with options that change dax Documentation/dax: Update DAX enablement for ext4 fs/ext4: Introduce DAX inode flag fs/ext4: Remove jflag variable fs/ext4: Make DAX mount option a tri-state fs/ext4: Only change S_DAX on inode load fs/ext4: Update ext4_should_use_dax() fs/ext4: Change EXT4_MOUNT_DAX to EXT4_MOUNT_DAX_ALWAYS fs/ext4: Disallow verity if inode is DAX fs/ext4: Narrow scope of DAX check in setflags
2020-06-15Merge existing fixes from spi/for-5.8Mark Brown
2020-06-15spi: uapi: spidev: Use TABs for alignmentGeert Uytterhoeven
The UAPI <linux/spi/spidev.h> uses TABs for alignment. Convert the recently introduced spaces to TABs to restore consistency. Fixes: 7bb64402a092136 ("spi: tools: Add macro definitions to fix build errors") Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Link: https://lore.kernel.org/r/20200613073755.15906-1-geert+renesas@glider.be Signed-off-by: Mark Brown <broonie@kernel.org>
2020-06-15ASoC: soc-devres: add devm_snd_soc_register_dai()Pierre-Louis Bossart
The registration of DAIs may be done at two distinct times, once during a component registration and later when loading a topology. Since devm_ managed resources are freed in the reverse order they were allocated, when a component starts unregistering DAIs by walking through the DAI list, the memory allocated for the topology-registered DAIs was freed already, which leads to 100% reproducible KASAN use-after-free reports. This patch suggests a new devm_ function to force the DAI list to be updated prior to freeing the memory chunks referenced by the list pointers. Suggested-by: Bard Liao <yung-chuan.liao@linux.intel.com> Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Reviewed-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com> Reviewed-by: Bard Liao <yung-chuan.liao@linux.intel.com> Reviewed-by: Kai Vehmanen <kai.vehmanen@linux.intel.com> BugLink: https://github.com/thesofproject/linux/issues/2186 Link: https://lore.kernel.org/r/20200612205938.26415-2-pierre-louis.bossart@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org>
2020-06-15efi: Replace zero-length array and use struct_size() helperGustavo A. R. Silva
The current codebase makes use of the zero-length array language extension to the C90 standard, but the preferred mechanism to declare variable-length types such as these ones is a flexible array member[1][2], introduced in C99: struct foo { int stuff; struct boo array[]; }; By making use of the mechanism above, we will get a compiler warning in case the flexible array does not occur last in the structure, which will help us prevent some kind of undefined behavior bugs from being inadvertently introduced[3] to the codebase from now on. Also, notice that, dynamic memory allocations won't be affected by this change: "Flexible array members have incomplete type, and so the sizeof operator may not be applied. As a quirk of the original implementation of zero-length arrays, sizeof evaluates to zero."[1] sizeof(flexible-array-member) triggers a warning because flexible array members have incomplete type[1]. There are some instances of code in which the sizeof operator is being incorrectly/erroneously applied to zero-length arrays and the result is zero. Such instances may be hiding some bugs. So, this work (flexible-array member conversions) will also help to get completely rid of those sorts of issues. Lastly, make use of the sizeof_field() helper instead of an open-coded version. This issue was found with the help of Coccinelle and audited _manually_. [1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html [2] https://github.com/KSPP/linux/issues/21 [3] commit 76497732932f ("cxgb3/l2t: Fix undefined behaviour") Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20200527171425.GA4053@embeddedor Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2020-06-15efi/tpm: Verify event log header before parsingFabian Vogt
It is possible that the first event in the event log is not actually a log header at all, but rather a normal event. This leads to the cast in __calc_tpm2_event_size being an invalid conversion, which means that the values read are effectively garbage. Depending on the first event's contents, this leads either to apparently normal behaviour, a crash or a freeze. While this behaviour of the firmware is not in accordance with the TCG Client EFI Specification, this happens on a Dell Precision 5510 with the TPM enabled but hidden from the OS ("TPM On" disabled, state otherwise untouched). The EFI firmware claims that the TPM is present and active and that it supports the TCG 2.0 event log format. Fortunately, this can be worked around by simply checking the header of the first event and the event log header signature itself. Commit b4f1874c6216 ("tpm: check event log version before reading final events") addressed a similar issue also found on Dell models. Fixes: 6b0326190205 ("efi: Attempt to get the TCG2 event log in the boot stub") Signed-off-by: Fabian Vogt <fvogt@suse.de> Link: https://lore.kernel.org/r/1927248.evlx2EsYKh@linux-e202.suse.de Bugzilla: https://bugzilla.suse.com/show_bug.cgi?id=1165773 Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2020-06-15x86/entry, ubsan, objtool: Whitelist __ubsan_handle_*()Peter Zijlstra
The UBSAN instrumentation only inserts external CALLs when things go 'BAD', much like WARN(). So treat them similar to WARN()s for noinstr, that is: allow them, at the risk of taking the machine down, to get their message out. Suggested-by: Marco Elver <elver@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Marco Elver <elver@google.com>
2020-06-15compiler_types.h: Add __no_sanitize_{address,undefined} to noinstrMarco Elver
Adds the portable definitions for __no_sanitize_address, and __no_sanitize_undefined, and subsequently changes noinstr to use the attributes to disable instrumentation via KASAN or UBSAN. Reported-by: syzbot+dc1fa714cb070b184db5@syzkaller.appspotmail.com Signed-off-by: Marco Elver <elver@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com> Link: https://lore.kernel.org/lkml/000000000000d2474c05a6c938fe@google.com/
2020-06-15x86, kcsan: Add __no_kcsan to noinstrPeter Zijlstra
The 'noinstr' function attribute means no-instrumentation, this should very much include *SAN. Because lots of that is broken at present, only include KCSAN for now, as that is limited to clang11, which has sane function attribute behaviour. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
2020-06-15kcsan: Remove __no_kcsan_or_inlinePeter Zijlstra
There are no more user of this function attribute, also, with us now actively supporting '__no_kcsan inline' it doesn't make sense to have in any case. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
2020-06-15sched/deadline: Impose global limits on sched_attr::sched_periodPeter Zijlstra
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20190726161357.397880775@infradead.org
2020-06-15isolcpus: Affine unbound kernel threads to housekeeping cpusMarcelo Tosatti
This is a kernel enhancement that configures the cpu affinity of kernel threads via kernel boot option nohz_full=. When this option is specified, the cpumask is immediately applied upon kthread launch. This does not affect kernel threads that specify cpu and node. This allows CPU isolation (that is not allowing certain threads to execute on certain CPUs) without using the isolcpus=domain parameter, making it possible to enable load balancing on such CPUs during runtime (see kernel-parameters.txt). Note-1: this is based off on Wind River's patch at https://github.com/starlingx-staging/stx-integ/blob/master/kernel/kernel-std/centos/patches/affine-compute-kernel-threads.patch Difference being that this patch is limited to modifying kernel thread cpumask. Behaviour of other threads can be controlled via cgroups or sched_setaffinity. Note-2: Wind River's patch was based off Christoph Lameter's patch at https://lwn.net/Articles/565932/ with the only difference being the kernel parameter changed from kthread to kthread_cpus. Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20200527142909.23372-3-frederic@kernel.org
2020-06-15psi: eliminate kthread_worker from psi trigger scheduling mechanismSuren Baghdasaryan
Each psi group requires a dedicated kthread_delayed_work and kthread_worker. Since no other work can be performed using psi_group's kthread_worker, the same result can be obtained using a task_struct and a timer directly. This makes psi triggering simpler by removing lists and locks involved with kthread_worker usage and eliminates the need for poll_scheduled atomic use in the hot path. Signed-off-by: Suren Baghdasaryan <surenb@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20200528195442.190116-1-surenb@google.com
2020-06-15sched/debug: Add new tracepoints to track util_estVincent Donnefort
The util_est signals are key elements for EAS task placement and frequency selection. Having tracepoints to track these signals enables load-tracking and schedutil testing and/or debugging by a toolkit. Signed-off-by: Vincent Donnefort <vincent.donnefort@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Valentin Schneider <valentin.schneider@arm.com> Link: https://lkml.kernel.org/r/1590597554-370150-1-git-send-email-vincent.donnefort@arm.com
2020-06-15sched/cputime: Improve cputime_adjust()Oleg Nesterov
People report that utime and stime from /proc/<pid>/stat become very wrong when the numbers are big enough, especially if you watch these counters incrementally. Specifically, the current implementation of: stime*rtime/total, results in a saw-tooth function on top of the desired line, where the teeth grow in size the larger the values become. IOW, it has a relative error. The result is that, when watching incrementally as time progresses (for large values), we'll see periods of pure stime or utime increase, irrespective of the actual ratio we're striving for. Replace scale_stime() with a math64.h helper: mul_u64_u64_div_u64() that is far more accurate. This also allows architectures to override the implementation -- for instance they can opt for the old algorithm if this new one turns out to be too expensive for them. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20200519172506.GA317395@hirez.programming.kicks-ass.net
2020-06-15ftrace: Add perf ksymbol events for ftrace trampolinesAdrian Hunter
Symbols are needed for tools to describe instruction addresses. Pages allocated for ftrace's purposes need symbols to be created for them. Add such symbols to be visible via perf ksymbol events. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20200512121922.8997-8-adrian.hunter@intel.com