summaryrefslogtreecommitdiff
path: root/drivers/base
AgeCommit message (Collapse)Author
2020-10-05firmware: Add request_partial_firmware_into_buf()Scott Branden
Add request_partial_firmware_into_buf() to allow for portions of a firmware file to be read into a buffer. This is needed when large firmware must be loaded in portions from a file on memory constrained systems. Signed-off-by: Scott Branden <scott.branden@broadcom.com> Co-developed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20201002173828.2099543-16-keescook@chromium.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-10-05firmware: Store opt_flags in fw_privKees Cook
Instead of passing opt_flags around so much, store it in the private structure so it can be examined by internals without needing to add more arguments to functions. Co-developed-by: Scott Branden <scott.branden@broadcom.com> Signed-off-by: Scott Branden <scott.branden@broadcom.com> Signed-off-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20201002173828.2099543-15-keescook@chromium.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-10-05fs/kernel_file_read: Add "offset" arg for partial readsKees Cook
To perform partial reads, callers of kernel_read_file*() must have a non-NULL file_size argument and a preallocated buffer. The new "offset" argument can then be used to seek to specific locations in the file to fill the buffer to, at most, "buf_size" per call. Where possible, the LSM hooks can report whether a full file has been read or not so that the contents can be reasoned about. Signed-off-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20201002173828.2099543-14-keescook@chromium.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-10-05firmware_loader: Use security_post_load_data()Kees Cook
Now that security_post_load_data() is wired up, use it instead of the NULL file argument style of security_post_read_file(), and update the security_kernel_load_data() call to indicate that a security_kernel_post_load_data() call is expected. Wire up the IMA check to match earlier logic. Perhaps a generalized change to ima_post_load_data() might look something like this: return process_buffer_measurement(buf, size, kernel_load_data_id_str(load_id), read_idmap[load_id] ?: FILE_CHECK, 0, NULL); Signed-off-by: Kees Cook <keescook@chromium.org> Reviewed-by: Mimi Zohar <zohar@linux.ibm.com> Link: https://lore.kernel.org/r/20201002173828.2099543-10-keescook@chromium.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-10-05LSM: Introduce kernel_post_load_data() hookKees Cook
There are a few places in the kernel where LSMs would like to have visibility into the contents of a kernel buffer that has been loaded or read. While security_kernel_post_read_file() (which includes the buffer) exists as a pairing for security_kernel_read_file(), no such hook exists to pair with security_kernel_load_data(). Earlier proposals for just using security_kernel_post_read_file() with a NULL file argument were rejected (i.e. "file" should always be valid for the security_..._file hooks, but it appears at least one case was left in the kernel during earlier refactoring. (This will be fixed in a subsequent patch.) Since not all cases of security_kernel_load_data() can have a single contiguous buffer made available to the LSM hook (e.g. kexec image segments are separately loaded), there needs to be a way for the LSM to reason about its expectations of the hook coverage. In order to handle this, add a "contents" argument to the "kernel_load_data" hook that indicates if the newly added "kernel_post_load_data" hook will be called with the full contents once loaded. That way, LSMs requiring full contents can choose to unilaterally reject "kernel_load_data" with contents=false (which is effectively the existing hook coverage), but when contents=true they can allow it and later evaluate the "kernel_post_load_data" hook once the buffer is loaded. With this change, LSMs can gain coverage over non-file-backed data loads (e.g. init_module(2) and firmware userspace helper), which will happen in subsequent patches. Additionally prepare IMA to start processing these cases. Signed-off-by: Kees Cook <keescook@chromium.org> Reviewed-by: KP Singh <kpsingh@google.com> Link: https://lore.kernel.org/r/20201002173828.2099543-9-keescook@chromium.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-10-05fs/kernel_read_file: Add file_size output argumentKees Cook
In preparation for adding partial read support, add an optional output argument to kernel_read_file*() that reports the file size so callers can reason more easily about their reading progress. Signed-off-by: Kees Cook <keescook@chromium.org> Reviewed-by: Mimi Zohar <zohar@linux.ibm.com> Reviewed-by: Luis Chamberlain <mcgrof@kernel.org> Reviewed-by: James Morris <jamorris@linux.microsoft.com> Acked-by: Scott Branden <scott.branden@broadcom.com> Link: https://lore.kernel.org/r/20201002173828.2099543-8-keescook@chromium.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-10-05fs/kernel_read_file: Remove redundant size argumentKees Cook
In preparation for refactoring kernel_read_file*(), remove the redundant "size" argument which is not needed: it can be included in the return code, with callers adjusted. (VFS reads already cannot be larger than INT_MAX.) Signed-off-by: Kees Cook <keescook@chromium.org> Reviewed-by: Mimi Zohar <zohar@linux.ibm.com> Reviewed-by: Luis Chamberlain <mcgrof@kernel.org> Reviewed-by: James Morris <jamorris@linux.microsoft.com> Acked-by: Scott Branden <scott.branden@broadcom.com> Link: https://lore.kernel.org/r/20201002173828.2099543-6-keescook@chromium.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-10-05fs/kernel_read_file: Split into separate include fileScott Branden
Move kernel_read_file* out of linux/fs.h to its own linux/kernel_read_file.h include file. That header gets pulled in just about everywhere and doesn't really need functions not related to the general fs interface. Suggested-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Scott Branden <scott.branden@broadcom.com> Signed-off-by: Kees Cook <keescook@chromium.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Mimi Zohar <zohar@linux.ibm.com> Reviewed-by: Luis Chamberlain <mcgrof@kernel.org> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Acked-by: James Morris <jamorris@linux.microsoft.com> Link: https://lore.kernel.org/r/20200706232309.12010-2-scott.branden@broadcom.com Link: https://lore.kernel.org/r/20201002173828.2099543-4-keescook@chromium.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-10-05fs/kernel_read_file: Remove FIRMWARE_EFI_EMBEDDED enumKees Cook
The "FIRMWARE_EFI_EMBEDDED" enum is a "where", not a "what". It should not be distinguished separately from just "FIRMWARE", as this confuses the LSMs about what is being loaded. Additionally, there was no actual validation of the firmware contents happening. Fixes: e4c2c0ff00ec ("firmware: Add new platform fallback mechanism and firmware_request_platform()") Signed-off-by: Kees Cook <keescook@chromium.org> Reviewed-by: Luis Chamberlain <mcgrof@kernel.org> Acked-by: Scott Branden <scott.branden@broadcom.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20201002173828.2099543-3-keescook@chromium.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-10-05fs/kernel_read_file: Remove FIRMWARE_PREALLOC_BUFFER enumKees Cook
FIRMWARE_PREALLOC_BUFFER is a "how", not a "what", and confuses the LSMs that are interested in filtering between types of things. The "how" should be an internal detail made uninteresting to the LSMs. Fixes: a098ecd2fa7d ("firmware: support loading into a pre-allocated buffer") Fixes: fd90bc559bfb ("ima: based on policy verify firmware signatures (pre-allocated buffer)") Fixes: 4f0496d8ffa3 ("ima: based on policy warn about loading firmware (pre-allocated buffer)") Signed-off-by: Kees Cook <keescook@chromium.org> Reviewed-by: Mimi Zohar <zohar@linux.ibm.com> Reviewed-by: Luis Chamberlain <mcgrof@kernel.org> Acked-by: Scott Branden <scott.branden@broadcom.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20201002173828.2099543-2-keescook@chromium.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-10-05Merge back cpufreq material for 5.10.Rafael J. Wysocki
2020-10-02PM: domains: Allow to abort power off when no ->power_off() callbackUlf Hansson
In genpd_power_off() we may decide to abort the power off of the PM domain, even beyond the point when the governor would accept it. The abort is done if it turns out that a child domain has been requested to be powered on, which means it's waiting for the lock of the parent to be released. However, the abort is currently only considered if the genpd in question has a ->power_off() callback assigned. This is unnecessary limiting, especially if the genpd would have a parent of its own. Let's remove the limitation and make the behaviour consistent. Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> [ rjw: Subject edit ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-10-02PM: domains: Rename power state enums for genpdUlf Hansson
To clarify the code a bit, let's rename GPD_STATE_ACTIVE into GENPD_STATE_ON and GPD_STATE_POWER_OFF to GENPD_STATE_OFF. Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> [ rjw: Subject edit ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-10-02ACPI: Support Generic Initiator only domainsJonathan Cameron
Generic Initiators are a new ACPI concept that allows for the description of proximity domains that contain a device which performs memory access (such as a network card) but neither host CPU nor Memory. This patch has the parsing code and provides the infrastructure for an architecture to associate these new domains with their nearest memory processing node. Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-10-02regmap: debugfs: use semicolons rather than commas to separate statementsJulia Lawall
Replace commas with semicolons. What is done is essentially described by the following Coccinelle semantic patch (http://coccinelle.lip6.fr/): // <smpl> @@ expression e1,e2; @@ e1 -, +; e2 ... when any // </smpl> Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr> Link: https://lore.kernel.org/r/1601233948-11629-15-git-send-email-Julia.Lawall@inria.fr Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-10-02drivers core: node: Use a more typical macro definition style for ACCESS_ATTRJoe Perches
Remove the trailing semicolon from the macro and add it to its uses. Signed-off-by: Joe Perches <joe@perches.com> Link: https://lore.kernel.org/r/faf51a671160cf884efa68fb458d3e8a44b1a7a7.1600285923.git.joe@perches.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-10-02drivers core: Use sysfs_emit for shared_cpu_map_show and shared_cpu_list_showJoe Perches
Do not indirect the bitmap printing of these shared_cpu show functions by using cpumap_print_to_pagebuf/bitmap_print_to_pagebuf. Use the more typical style with the vsnprintf %*pb and %*pbl extensions directly so there is no possible mixup about the use of offset_in_page(buf) by bitmap_print_to_pagebuf. Signed-off-by: Joe Perches <joe@perches.com> Link: https://lore.kernel.org/r/80457b467ab6cde13a173cfd8a4f49cd8467a7fd.1600285923.git.joe@perches.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-10-02mm: and drivers core: Convert hugetlb_report_node_meminfo to sysfs_emitJoe Perches
Convert the unbound sprintf in hugetlb_report_node_meminfo to use sysfs_emit_at so that no possible overrun of a PAGE_SIZE buf can occur. Signed-off-by: Joe Perches <joe@perches.com> Acked-by: Mike Kravetz <mike.kravetz@oracle.com> Link: https://lore.kernel.org/r/894b351b82da6013cde7f36ff4b5493cd0ec30d0.1600285923.git.joe@perches.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-10-02drivers core: Miscellaneous changes for sysfs_emitJoe Perches
Change additional instances that could use sysfs_emit and sysfs_emit_at that the coccinelle script could not convert. o macros creating show functions with ## concatenation o unbound sprintf uses with buf+len for start of output to sysfs_emit_at o returns with ?: tests and sprintf to sysfs_emit o sysfs output with struct class * not struct device * arguments Miscellanea: o remove unnecessary initializations around these changes o consistently use int len for return length of show functions o use octal permissions and not S_<FOO> o rename a few show function names so DEVICE_ATTR_<FOO> can be used o use DEVICE_ATTR_ADMIN_RO where appropriate o consistently use const char *output for strings o checkpatch/style neatening Signed-off-by: Joe Perches <joe@perches.com> Link: https://lore.kernel.org/r/8bc24444fe2049a9b2de6127389b57edfdfe324d.1600285923.git.joe@perches.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-10-02drivers core: Reindent a couple uses around sysfs_emitJoe Perches
Just a couple of whitespace realignment to open parenthesis for multi-line statements. Signed-off-by: Joe Perches <joe@perches.com> Link: https://lore.kernel.org/r/33224191421dbb56015eded428edfddcba997d63.1600285923.git.joe@perches.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-10-02drivers core: Remove strcat uses around sysfs_emit and neatenJoe Perches
strcat is no longer necessary for sysfs_emit and sysfs_emit_at uses. Convert the strcat uses to sysfs_emit calls and neaten other block uses of direct returns to use an intermediate const char *. Signed-off-by: Joe Perches <joe@perches.com> Link: https://lore.kernel.org/r/5d606519698ce4c8f1203a2b35797d8254c6050a.1600285923.git.joe@perches.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-10-02drivers core: Use sysfs_emit and sysfs_emit_at for show(device *...) functionsJoe Perches
Convert the various sprintf fmaily calls in sysfs device show functions to sysfs_emit and sysfs_emit_at for PAGE_SIZE buffer safety. Done with: $ spatch -sp-file sysfs_emit_dev.cocci --in-place --max-width=80 . And cocci script: $ cat sysfs_emit_dev.cocci @@ identifier d_show; identifier dev, attr, buf; @@ ssize_t d_show(struct device *dev, struct device_attribute *attr, char *buf) { <... return - sprintf(buf, + sysfs_emit(buf, ...); ...> } @@ identifier d_show; identifier dev, attr, buf; @@ ssize_t d_show(struct device *dev, struct device_attribute *attr, char *buf) { <... return - snprintf(buf, PAGE_SIZE, + sysfs_emit(buf, ...); ...> } @@ identifier d_show; identifier dev, attr, buf; @@ ssize_t d_show(struct device *dev, struct device_attribute *attr, char *buf) { <... return - scnprintf(buf, PAGE_SIZE, + sysfs_emit(buf, ...); ...> } @@ identifier d_show; identifier dev, attr, buf; expression chr; @@ ssize_t d_show(struct device *dev, struct device_attribute *attr, char *buf) { <... return - strcpy(buf, chr); + sysfs_emit(buf, chr); ...> } @@ identifier d_show; identifier dev, attr, buf; identifier len; @@ ssize_t d_show(struct device *dev, struct device_attribute *attr, char *buf) { <... len = - sprintf(buf, + sysfs_emit(buf, ...); ...> return len; } @@ identifier d_show; identifier dev, attr, buf; identifier len; @@ ssize_t d_show(struct device *dev, struct device_attribute *attr, char *buf) { <... len = - snprintf(buf, PAGE_SIZE, + sysfs_emit(buf, ...); ...> return len; } @@ identifier d_show; identifier dev, attr, buf; identifier len; @@ ssize_t d_show(struct device *dev, struct device_attribute *attr, char *buf) { <... len = - scnprintf(buf, PAGE_SIZE, + sysfs_emit(buf, ...); ...> return len; } @@ identifier d_show; identifier dev, attr, buf; identifier len; @@ ssize_t d_show(struct device *dev, struct device_attribute *attr, char *buf) { <... - len += scnprintf(buf + len, PAGE_SIZE - len, + len += sysfs_emit_at(buf, len, ...); ...> return len; } @@ identifier d_show; identifier dev, attr, buf; expression chr; @@ ssize_t d_show(struct device *dev, struct device_attribute *attr, char *buf) { ... - strcpy(buf, chr); - return strlen(buf); + return sysfs_emit(buf, chr); } Signed-off-by: Joe Perches <joe@perches.com> Link: https://lore.kernel.org/r/3d033c33056d88bbe34d4ddb62afd05ee166ab9a.1600285923.git.joe@perches.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-09-28Merge tag 'regmap-field-bulk-api' into regmap-5.10Mark Brown
regmap: Add a bulk field API Useful for devices with many fields.
2020-09-28regmap: add support to regmap_field_bulk_alloc/free apisSrinivas Kandagatla
Usage of regmap_field_alloc becomes much overhead when number of fields exceed more than 3. QCOM LPASS driver has extensively converted to use regmap_fields. Using new bulk api to allocate fields makes it much more cleaner code to read! Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org> Tested-by: Srinivasa Rao Mandadapu <srivasam@codeaurora.org> Link: https://lore.kernel.org/r/20200925164856.10315-2-srinivas.kandagatla@linaro.org Signed-off-by: Mark Brown <broonie@kernel.org>
2020-09-28Merge series "use semicolons rather than commas to separate statements" from ↵Mark Brown
Julia Lawall <Julia.Lawall@inria.fr>: These patches replace commas by semicolons. This was done using the Coccinelle semantic patch (http://coccinelle.lip6.fr/) shown below. This semantic patch ensures that commas inside for loop headers will not be transformed. It also doesn't touch macro definitions. Coccinelle ensures that braces are added as needed when a single-statement branch turns into a multi-statement one. This semantic patch has a few false positives, for variable delcarations such as: LIST_HEAD(x), *y; The semantic patch could be improved to avoid these, but for the moment they have been removed manually (2 occurrences). // <smpl> @initialize:ocaml@ @@ let infunction p = (* avoid macros *) (List.hd p).current_element <> "something_else" let combined p1 p2 = (List.hd p1).line_end = (List.hd p2).line || (((List.hd p1).line_end < (List.hd p2).line) && ((List.hd p1).col < (List.hd p2).col)) @bad@ statement S; declaration d; position p; @@ S@p d // special cases where newlines are needed (hope for no more than 5) @@ expression e1,e2; statement S; position p != bad.p; position p1; position p2 : script:ocaml(p1) { infunction p1 && combined p1 p2 }; @@ - e1@p1,@S@p e2@p2; + e1; e2; @@ expression e1,e2; statement S; position p != bad.p; position p1; position p2 : script:ocaml(p1) { infunction p1 && combined p1 p2 }; @@ - e1@p1,@S@p e2@p2; + e1; e2; @@ expression e1,e2; statement S; position p != bad.p; position p1; position p2 : script:ocaml(p1) { infunction p1 && combined p1 p2 }; @@ - e1@p1,@S@p e2@p2; + e1; e2; @@ expression e1,e2; statement S; position p != bad.p; position p1; position p2 : script:ocaml(p1) { infunction p1 && combined p1 p2 }; @@ - e1@p1,@S@p e2@p2; + e1; e2; @@ expression e1,e2; statement S; position p != bad.p; position p1; position p2 : script:ocaml(p1) { infunction p1 && combined p1 p2 }; @@ - e1@p1,@S@p e2@p2; + e1; e2; @r@ expression e1,e2; statement S; position p != bad.p; @@ e1 ,@S@p e2; @@ expression e1,e2; position p1; position p2 : script:ocaml(p1) { infunction p1 && not(combined p1 p2) }; statement S; position r.p; @@ e1@p1 -,@S@p +; e2@p2 ... when any // </smpl> --- drivers/acpi/processor_idle.c | 4 +++- drivers/ata/pata_icside.c | 21 +++++++++++++-------- drivers/base/regmap/regmap-debugfs.c | 2 +- drivers/bcma/driver_pci_host.c | 4 ++-- drivers/block/drbd/drbd_receiver.c | 6 ++++-- drivers/char/agp/amd-k7-agp.c | 2 +- drivers/char/agp/nvidia-agp.c | 2 +- drivers/char/agp/sworks-agp.c | 2 +- drivers/char/hw_random/iproc-rng200.c | 8 ++++---- drivers/char/hw_random/mxc-rnga.c | 6 +++--- drivers/char/hw_random/stm32-rng.c | 8 ++++---- drivers/char/ipmi/bt-bmc.c | 6 +++--- drivers/clk/meson/meson-aoclk.c | 2 +- drivers/clk/mvebu/ap-cpu-clk.c | 2 +- drivers/clk/uniphier/clk-uniphier-cpugear.c | 2 +- drivers/clk/uniphier/clk-uniphier-mux.c | 2 +- drivers/clocksource/mps2-timer.c | 6 +++--- drivers/clocksource/timer-armada-370-xp.c | 8 ++++---- drivers/counter/ti-eqep.c | 2 +- drivers/crypto/amcc/crypto4xx_alg.c | 2 +- drivers/crypto/atmel-tdes.c | 2 +- drivers/crypto/hifn_795x.c | 4 ++-- drivers/crypto/talitos.c | 8 ++++---- 23 files changed, 60 insertions(+), 51 deletions(-)
2020-09-28regmap: destroy mutex (if used) in regmap_exit()Bartosz Golaszewski
While not destroying mutexes doesn't lead to memory leaks, it's still the correct thing to do for mutex debugging accounting. Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com> Link: https://lore.kernel.org/r/20200928120614.23172-1-brgl@bgdev.pl Signed-off-by: Mark Brown <broonie@kernel.org>
2020-09-28regmap: debugfs: use semicolons rather than commas to separate statementsJulia Lawall
Replace commas with semicolons. What is done is essentially described by the following Coccinelle semantic patch (http://coccinelle.lip6.fr/): // <smpl> @@ expression e1,e2; @@ e1 -, +; e2 ... when any // </smpl> Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr> Link: https://lore.kernel.org/r/1601233948-11629-15-git-send-email-Julia.Lawall@inria.fr Signed-off-by: Mark Brown <broonie@kernel.org>
2020-09-26mm: don't rely on system state to detect hot-plug operationsLaurent Dufour
In register_mem_sect_under_node() the system_state's value is checked to detect whether the call is made during boot time or during an hot-plug operation. Unfortunately, that check against SYSTEM_BOOTING is wrong because regular memory is registered at SYSTEM_SCHEDULING state. In addition, memory hot-plug operation can be triggered at this system state by the ACPI [1]. So checking against the system state is not enough. The consequence is that on system with interleaved node's ranges like this: Early memory node ranges node 1: [mem 0x0000000000000000-0x000000011fffffff] node 2: [mem 0x0000000120000000-0x000000014fffffff] node 1: [mem 0x0000000150000000-0x00000001ffffffff] node 0: [mem 0x0000000200000000-0x000000048fffffff] node 2: [mem 0x0000000490000000-0x00000007ffffffff] This can be seen on PowerPC LPAR after multiple memory hot-plug and hot-unplug operations are done. At the next reboot the node's memory ranges can be interleaved and since the call to link_mem_sections() is made in topology_init() while the system is in the SYSTEM_SCHEDULING state, the node's id is not checked, and the sections registered to multiple nodes: $ ls -l /sys/devices/system/memory/memory21/node* total 0 lrwxrwxrwx 1 root root 0 Aug 24 05:27 node1 -> ../../node/node1 lrwxrwxrwx 1 root root 0 Aug 24 05:27 node2 -> ../../node/node2 In that case, the system is able to boot but if later one of theses memory blocks is hot-unplugged and then hot-plugged, the sysfs inconsistency is detected and this is triggering a BUG_ON(): kernel BUG at /Users/laurent/src/linux-ppc/mm/memory_hotplug.c:1084! Oops: Exception in kernel mode, sig: 5 [#1] LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries Modules linked in: rpadlpar_io rpaphp pseries_rng rng_core vmx_crypto gf128mul binfmt_misc ip_tables x_tables xfs libcrc32c crc32c_vpmsum autofs4 CPU: 8 PID: 10256 Comm: drmgr Not tainted 5.9.0-rc1+ #25 Call Trace: add_memory_resource+0x23c/0x340 (unreliable) __add_memory+0x5c/0xf0 dlpar_add_lmb+0x1b4/0x500 dlpar_memory+0x1f8/0xb80 handle_dlpar_errorlog+0xc0/0x190 dlpar_store+0x198/0x4a0 kobj_attr_store+0x30/0x50 sysfs_kf_write+0x64/0x90 kernfs_fop_write+0x1b0/0x290 vfs_write+0xe8/0x290 ksys_write+0xdc/0x130 system_call_exception+0x160/0x270 system_call_common+0xf0/0x27c This patch addresses the root cause by not relying on the system_state value to detect whether the call is due to a hot-plug operation. An extra parameter is added to link_mem_sections() detailing whether the operation is due to a hot-plug operation. [1] According to Oscar Salvador, using this qemu command line, ACPI memory hotplug operations are raised at SYSTEM_SCHEDULING state: $QEMU -enable-kvm -machine pc -smp 4,sockets=4,cores=1,threads=1 -cpu host -monitor pty \ -m size=$MEM,slots=255,maxmem=4294967296k \ -numa node,nodeid=0,cpus=0-3,mem=512 -numa node,nodeid=1,mem=512 \ -object memory-backend-ram,id=memdimm0,size=134217728 -device pc-dimm,node=0,memdev=memdimm0,id=dimm0,slot=0 \ -object memory-backend-ram,id=memdimm1,size=134217728 -device pc-dimm,node=0,memdev=memdimm1,id=dimm1,slot=1 \ -object memory-backend-ram,id=memdimm2,size=134217728 -device pc-dimm,node=0,memdev=memdimm2,id=dimm2,slot=2 \ -object memory-backend-ram,id=memdimm3,size=134217728 -device pc-dimm,node=0,memdev=memdimm3,id=dimm3,slot=3 \ -object memory-backend-ram,id=memdimm4,size=134217728 -device pc-dimm,node=1,memdev=memdimm4,id=dimm4,slot=4 \ -object memory-backend-ram,id=memdimm5,size=134217728 -device pc-dimm,node=1,memdev=memdimm5,id=dimm5,slot=5 \ -object memory-backend-ram,id=memdimm6,size=134217728 -device pc-dimm,node=1,memdev=memdimm6,id=dimm6,slot=6 \ Fixes: 4fbce633910e ("mm/memory_hotplug.c: make register_mem_sect_under_node() a callback of walk_memory_range()") Signed-off-by: Laurent Dufour <ldufour@linux.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Oscar Salvador <osalvador@suse.de> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: "Rafael J. Wysocki" <rafael@kernel.org> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Nathan Lynch <nathanl@linux.ibm.com> Cc: Scott Cheloha <cheloha@linux.ibm.com> Cc: Tony Luck <tony.luck@intel.com> Cc: <stable@vger.kernel.org> Link: https://lkml.kernel.org/r/20200915094143.79181-3-ldufour@linux.ibm.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-09-25Merge tag 'regmap-fix-v5.9-rc6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap Pull regmap fixes from Mark Brown: "Two issues here - one is a fix for use after free issues in the case where a regmap overrides its name using something dynamically generated, the other is that we weren't handling access checks non-incrementing I/O on registers within paged register regions correctly resulting in spurious errors. Both of these are quite rare but serious if they occur" * tag 'regmap-fix-v5.9-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap: regmap: fix page selection for noinc writes regmap: fix page selection for noinc reads regmap: debugfs: Add back in erroneously removed initialisation of ret regmap: debugfs: Fix handling of name string for debugfs init delays
2020-09-25PM: runtime: Remove link state checks in rpm_get/put_supplier()Xiang Chen
To support runtime PM for hisi SAS driver (the driver is in directory drivers/scsi/hisi_sas), we add device link between scsi_device->sdev_gendev (consumer device) and hisi_hba->dev(supplier device) with flags DL_FLAG_PM_RUNTIME | DL_FLAG_RPM_ACTIVE. After runtime suspended consumers and supplier, unload the dirver which causes a hung. We found that it called function device_release_driver_internal() to release the supplier device (hisi_hba->dev), as the device link was busy, it set the device link state to DL_STATE_SUPPLIER_UNBIND, and then it called device_release_driver_internal() to release the consumer device (scsi_device->sdev_gendev). Then it would try to call pm_runtime_get_sync() to resume the consumer device, but because consumer-supplier relation existed, it would try to resume the supplier first, but as the link state was already DL_STATE_SUPPLIER_UNBIND, so it skipped resuming the supplier and only resumed the consumer which hanged (it sends IOs to resume scsi_device while the SAS controller is suspended). Simple flow is as follows: device_release_driver_internal -> (supplier device) if device_links_busy -> device_links_unbind_consumers -> ... WRITE_ONCE(link->status, DL_STATE_SUPPLIER_UNBIND) device_release_driver_internal (consumer device) pm_runtime_get_sync -> (consumer device) ... __rpm_callback -> rpm_get_suppliers -> if link->state == DL_STATE_SUPPLIER_UNBIND -> skip the action of resuming the supplier ... pm_runtime_clean_up_links ... Correct suspend/resume ordering between a supplier device and its consumer devices (resume the supplier device before resuming consumer devices, and suspend consumer devices before suspending the supplier device) should be guaranteed by runtime PM, but the state checks in rpm_get_supplier() and rpm_put_supplier() break this rule, so remove them. Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> [ rjw: Subject and changelog edits ] Cc: All applicable <stable@vger.kernel.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-09-25Merge branch 'master' of ↵Christoph Hellwig
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux into dma-mapping-for-next Pull in the latest 5.9 tree for the commit to revert the V4L2_FLAG_MEMORY_NON_CONSISTENT uapi addition.
2020-09-22printk: move dictionary keys to dev_printk_infoJohn Ogness
Dictionaries are only used for SUBSYSTEM and DEVICE properties. The current implementation stores the property names each time they are used. This requires more space than otherwise necessary. Also, because the dictionary entries are currently considered optional, it cannot be relied upon that they are always available, even if the writer wanted to store them. These issues will increase should new dictionary properties be introduced. Rather than storing the subsystem and device properties in the dict ring, introduce a struct dev_printk_info with separate fields to store only the property values. Embed this struct within the struct printk_info to provide guaranteed availability. Signed-off-by: John Ogness <john.ogness@linutronix.de> Reviewed-by: Petr Mladek <pmladek@suse.com> Signed-off-by: Petr Mladek <pmladek@suse.com> Link: https://lore.kernel.org/r/87mu1jl6ne.fsf@jogness.linutronix.de
2020-09-22regmap: debugfs: Fix more error path regressionsCharles Keepax
Many error paths in __regmap_init rely on ret being pre-initialised to -EINVAL, add an extra initialisation in after the new call to regmap_set_name. Fixes: 94cc89eb8fa5 ("regmap: debugfs: Fix handling of name string for debugfs init delays") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Charles Keepax <ckeepax@opensource.cirrus.com> Link: https://lore.kernel.org/r/20200918152212.22200-1-ckeepax@opensource.cirrus.com Signed-off-by: Mark Brown <broonie@kernel.org>
2020-09-21regmap: fix page selection for noinc writesDmitry Baryshkov
Non-incrementing writes can fail if register + length crosses page border. However for non-incrementing writes we should not check for page border crossing. Fix this by passing additional flag to _regmap_raw_write and passing length to _regmap_select_page basing on the flag. Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Fixes: cdf6b11daa77 ("regmap: Add regmap_noinc_write API") Link: https://lore.kernel.org/r/20200917153405.3139200-2-dmitry.baryshkov@linaro.org Signed-off-by: Mark Brown <broonie@kernel.org>
2020-09-21regmap: fix page selection for noinc readsDmitry Baryshkov
Non-incrementing reads can fail if register + length crosses page border. However for non-incrementing reads we should not check for page border crossing. Fix this by passing additional flag to _regmap_raw_read and passing length to _regmap_select_page basing on the flag. Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Fixes: 74fe7b551f33 ("regmap: Add regmap_noinc_read API") Link: https://lore.kernel.org/r/20200917153405.3139200-1-dmitry.baryshkov@linaro.org Signed-off-by: Mark Brown <broonie@kernel.org>
2020-09-18arch_topology, arm, arm64: define arch_scale_freq_invariant()Valentin Schneider
arch_scale_freq_invariant() is used by schedutil to determine whether the scheduler's load-tracking signals are frequency invariant. Its definition is overridable, though by default it is hardcoded to 'true' if arch_scale_freq_capacity() is defined ('false' otherwise). This behaviour is not overridden on arm, arm64 and other users of the generic arch topology driver, which is somewhat precarious: arch_scale_freq_capacity() will always be defined, yet not all cpufreq drivers are guaranteed to drive the frequency invariance scale factor setting. In other words, the load-tracking signals may very well *not* be frequency invariant. Now that cpufreq can be queried on whether the current driver is driving the Frequency Invariance (FI) scale setting, the current situation can be improved. This combines the query of whether cpufreq supports the setting of the frequency scale factor, with whether all online CPUs are counter-based FI enabled. While cpufreq FI enablement applies at system level, for all CPUs, counter-based FI support could also be used for only a subset of CPUs to set the invariance scale factor. Therefore, if cpufreq-based FI support is present, we consider the system to be invariant. If missing, we require all online CPUs to be counter-based FI enabled in order for the full system to be considered invariant. If the system ends up not being invariant, a new condition is needed in the counter initialization code that disables all scale factor setting based on counters. Precedence of counters over cpufreq use is not important here. The invariant status is only given to the system if all CPUs have at least one method of setting the frequency scale factor. Signed-off-by: Valentin Schneider <valentin.schneider@arm.com> Signed-off-by: Ionela Voinescu <ionela.voinescu@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Reviewed-by: Sudeep Holla <sudeep.holla@arm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-09-18arch_topology, cpufreq: constify arch_* cpumasksValentin Schneider
The passed cpumask arguments to arch_set_freq_scale() and arch_freq_counters_available() are only iterated over, so reflect this in the prototype. This also allows to pass system cpumasks like cpu_online_mask without getting a warning. Signed-off-by: Valentin Schneider <valentin.schneider@arm.com> Signed-off-by: Ionela Voinescu <ionela.voinescu@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Reviewed-by: Sudeep Holla <sudeep.holla@arm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-09-18arch_topology: validate input frequencies to arch_set_freq_scale()Ionela Voinescu
The current frequency passed to arch_set_freq_scale() could end up being 0, signaling an error in setting a new frequency. Also, if the maximum frequency in 0, this will result in a division by 0 error. Therefore, validate these input values before using them for the setting of the frequency scale factor. Signed-off-by: Ionela Voinescu <ionela.voinescu@arm.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Reviewed-by: Sudeep Holla <sudeep.holla@arm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-09-18regmap: debugfs: Add back in erroneously removed initialisation of retCharles Keepax
Fixes: 94cc89eb8fa5 ("regmap: debugfs: Fix handling of name string for debugfs init delays") Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Charles Keepax <ckeepax@opensource.cirrus.com> Link: https://lore.kernel.org/r/20200918112002.15216-1-ckeepax@opensource.cirrus.com Signed-off-by: Mark Brown <broonie@kernel.org>
2020-09-17regmap: debugfs: Fix handling of name string for debugfs init delaysCharles Keepax
In regmap_debugfs_init the initialisation of the debugfs is delayed if the root node isn't ready yet. Most callers of regmap_debugfs_init pass the name from the regmap_config, which is considered temporary ie. may be unallocated after the regmap_init call returns. This leads to a potential use after free, where config->name has been freed by the time it is used in regmap_debugfs_initcall. This situation can be seen on Zynq, where the architecture init_irq callback registers a syscon device, using a local variable for the regmap_config. As init_irq is very early in the platform bring up the regmap debugfs root isn't ready yet. Although this doesn't crash it does result in the debugfs entry not having the correct name. Regmap already sets map->name from config->name on the regmap_init path and the fact that a separate field is used to pass the name to regmap_debugfs_init appears to be an artifact of the debugfs name being added before the map name. As such this patch updates regmap_debugfs_init to use map->name, which is already duplicated from the config avoiding the issue. This does however leave two lose ends, both regmap_attach_dev and regmap_reinit_cache can be called after a regmap is registered and would have had the effect of applying a new name to the debugfs entries. In both of these cases it was chosen to update the map name. In the case of regmap_attach_dev there are 3 users that currently use this function to update the name, thus doing so avoids changes for those users and it seems reasonable that attaching a device would want to set the name of the map. In the case of regmap_reinit_cache the primary use-case appears to be devices that need some register access to identify the device (for example devices in the same family) and then update the cache to match the exact hardware. Whilst no users do currently update the name here, given the use-case it seemed reasonable the name might want to be updated once the device is better identified. Signed-off-by: Charles Keepax <ckeepax@opensource.cirrus.com> Link: https://lore.kernel.org/r/20200917120828.12987-1-ckeepax@opensource.cirrus.com Signed-off-by: Mark Brown <broonie@kernel.org>
2020-09-17regmap: Add support for 12/20 register formattingRicardo Ribalda
Devices such as the AD5628 require 32 bits of data divided in 12 bits for dummy, command and address, and 20 for data and dummy. Eg: XXXXCCCCAAAADDDDDDDDDDDDDDDDXXXX Where X is dont care, C is command, A is address and D is data bits. Which would requierd the following regmap_config: static const struct regmap_config config_dac = { .reg_bits = 12, .val_bits = 20, .max_register = 0xff, }; Signed-off-by: Ricardo Ribalda <ribalda@kernel.org> Link: https://lore.kernel.org/r/20200917114727.1120373-1-ribalda@kernel.org Signed-off-by: Mark Brown <broonie@kernel.org>
2020-09-17dma-mapping: introduce DMA range map, supplanting dma_pfn_offsetJim Quinlan
The new field 'dma_range_map' in struct device is used to facilitate the use of single or multiple offsets between mapping regions of cpu addrs and dma addrs. It subsumes the role of "dev->dma_pfn_offset" which was only capable of holding a single uniform offset and had no region bounds checking. The function of_dma_get_range() has been modified so that it takes a single argument -- the device node -- and returns a map, NULL, or an error code. The map is an array that holds the information regarding the DMA regions. Each range entry contains the address offset, the cpu_start address, the dma_start address, and the size of the region. of_dma_configure() is the typical manner to set range offsets but there are a number of ad hoc assignments to "dev->dma_pfn_offset" in the kernel driver code. These cases now invoke the function dma_direct_set_offset(dev, cpu_addr, dma_addr, size). Signed-off-by: Jim Quinlan <james.quinlan@broadcom.com> [hch: various interface cleanups] Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org> Tested-by: Mathieu Poirier <mathieu.poirier@linaro.org> Tested-by: Nathan Chancellor <natechancellor@gmail.com>
2020-09-17driver core: force NOIO allocations during unplugOliver Neukum
There is one overlooked situation under which a driver must not do IO to allocate memory. You cannot do that while disconnecting a device. A device being disconnected is no longer functional in most cases, yet IO may fail only when the handler runs. Signed-off-by: Oliver Neukum <oneukum@suse.com> Link: https://lore.kernel.org/r/20200916191544.5104-1-oneukum@suse.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-09-14Merge 5.9-rc5 into driver-core-nextGreg Kroah-Hartman
We need the driver core changes in here as well. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-09-13Merge tag 'driver-core-5.9-rc5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core fixes from Greg KH: "Here are some small driver core and debugfs fixes for 5.9-rc5 Included in here are: - firmware loader memory leak fix - firmware loader testing fixes for non-EFI systems - device link locking fixes found by lockdep - kobject_del() bugfix that has been affecting some callers - debugfs minor fix All of these have been in linux-next for a while with no reported issues" * tag 'driver-core-5.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: test_firmware: Test platform fw loading on non-EFI systems PM: <linux/device.h>: fix @em_pd kernel-doc warning kobject: Drop unneeded conditional in __kobject_del() driver core: Fix device_pm_lock() locking for device links MAINTAINERS: Add the security document to SECURITY CONTACT driver code: print symbolic error code debugfs: Fix module state check condition kobject: Restore old behaviour of kobject_del(NULL) firmware_loader: fix memory leak for paged buffer
2020-09-10platform_device: switch to simpler IDA interfaceBartosz Golaszewski
We don't need to specify any ranges when allocating IDs so we can switch to ida_alloc() and ida_free() instead of the ida_simple_ counterparts. ida_simple_get(ida, 0, 0, gfp) is equivalent to ida_alloc_range(ida, 0, UINT_MAX, gfp) which is equivalent to ida_alloc(ida, gfp). Note: IDR will never actually allocate an ID larger than INT_MAX. Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com> Link: https://lore.kernel.org/r/20200909180248.10093-1-brgl@bgdev.pl Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-09-10driver core: platform: Document return type of more functionsStephen Boyd
I can't always remember the return values of these functions, and so I usually jump to the function to read the kernel-doc and see that it doesn't tell me. Then I have to spend more time reading the code to jump to the function that actually tells me the return values. Let's document it here so that we don't all have to spend time digging through the code to understand the return values. Cc: <linux-doc@vger.kernel.org> Signed-off-by: Stephen Boyd <swboyd@chromium.org> Link: https://lore.kernel.org/r/20200910060440.2302925-1-swboyd@chromium.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-09-08devres: provide devm_krealloc()Bartosz Golaszewski
Implement the managed variant of krealloc(). This function works with all memory allocated by devm_kmalloc() (or devres functions using it implicitly like devm_kmemdup(), devm_kstrdup() etc.). Managed realloc'ed chunks can be manually released with devm_kfree(). Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Link: https://lore.kernel.org/r/20200824173859.4910-2-brgl@bgdev.pl Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-09-08syscore: Use pm_pr_dbg() for syscore_{suspend,resume}()Stephen Boyd
The debug messages about what syscore suspend/resume hooks are called are only present if you have initcall debugging enabled. Let's move these messages to pm_pr_dbg() so that the syscore PM messages are included along with all the other PM debugging info that can be seen during suspend/resume debugging. Cc: "Rafael J. Wysocki" <rafael@kernel.org> Signed-off-by: Stephen Boyd <swboyd@chromium.org> Link: https://lore.kernel.org/r/20200806214633.204472-1-swboyd@chromium.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-09-08driver core: Use the ktime_us_delta() helperZenghui Yu
Use the ktime_us_delta() helper to measure the driver probe time. Given the helpers already returns an s64 value, let's drop the unnecessary casting to s64 as well. There is no functional change. Signed-off-by: Zenghui Yu <yuzenghui@huawei.com> Link: https://lore.kernel.org/r/20200803033343.1178-1-yuzenghui@huawei.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>