Age  Commit message  Author
2024-01-04  selftests/bpf: Double the size of test_loader log  Ilya Leoshkevich
Testing long jumps requires having >32k instructions. That many instructions require the verifier log buffer of 2 megabytes. The regular test_progs run doesn't need an increased buffer, since gotol test with 40k instructions doesn't request a log, but test_progs -v will set the verifier log level. Hence to avoid breaking gotol test with -v increase the buffer size. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Acked-by: Yonghong Song <yonghong.song@linux.dev> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/r/20240102193531.3169422-3-iii@linux.ibm.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-01-04  um: Make errors to stop ptraced child fatal during startup  Benjamin Berg
For the detection code to check whether SYSEMU_SINGLESTEP works correctly we needed some error cases while stopping to be non-fatal. However, at this point stop_ptraced_child must always succeed, and we can therefore simplify it slightly to exit immediately on error. Signed-off-by: Benjamin Berg <benjamin@sipsolutions.net> Signed-off-by: Richard Weinberger <richard@nod.at>
2024-01-04  um: Drop NULL check from start_userspace  Benjamin Berg
start_userspace is only called from exactly one location, and the passed pointer for the userspace process stack cannot be NULL. Remove the check, without changing the control flow. Signed-off-by: Benjamin Berg <benjamin@sipsolutions.net> Signed-off-by: Richard Weinberger <richard@nod.at>
2024-01-04  um: Drop support for hosts without SYSEMU_SINGLESTEP support  Benjamin Berg
These features have existed since Linux 2.6.14 and can be considered widely available at this point. Also drop the backward compatibility code for PTRACE_SETOPTIONS. Signed-off-by: Benjamin Berg <benjamin@sipsolutions.net> ---- v2: * Continue to define PTRACE_SYSEMU_SINGLESTEP as glibc only added it in version 2.27. Signed-off-by: Richard Weinberger <richard@nod.at>
2024-01-04  Merge tag 'ieee802154-for-net-next-2023-12-20' of gitolite.kernel.org:pub/scm/linux/kernel/git/wpan/wpan-next  Jakub Kicinski
Miquel Raynal says:

====================
This pull request mainly brings support for dynamic associations in the WPAN world. Thanks to the recent improvements it was possible to discover nearby devices; it is now also possible to associate with them to form a sub-network using a specific PAN ID. The support includes several functions, such as:
* Requesting an association to a coordinator, waiting for the response
* Sending a disassociation notification to a coordinator
* Receiving an association request when we are coordinator, answering the request (for now all devices are accepted up to a limit, to be refined)
* Sending a disassociation notification to a child
* Users may request the list of associated devices (the parent and the children).

Here are a few examples of userspace calls that can be made:
# iwpan dev <dev> associate pan_id 2 coord $COORD
# iwpan dev <dev> list_associations
# iwpan dev <dev> disassociate ext_addr $COORD

There are as well two patches from Uwe turning remove callbacks into void functions.

* tag 'ieee802154-for-net-next-2023-12-20' of gitolite.kernel.org:pub/scm/linux/kernel/git/wpan/wpan-next:
  mac802154: Avoid new associations while disassociating
  ieee802154: Avoid confusing changes after associating
  mac802154: Only allow PAN controllers to process association requests
  mac802154: Use the PAN coordinator parameter when stamping packets
  mac80254: Provide real PAN coordinator info in beacons
  ieee802154: Give the user the association list
  mac802154: Handle disassociation notifications from peers
  mac802154: Follow the number of associated devices
  ieee802154: Add support for limiting the number of associated devices
  mac802154: Handle association requests from peers
  mac802154: Handle disassociations
  ieee802154: Add support for user disassociation requests
  mac802154: Handle associating
  ieee802154: Add support for user association requests
  ieee802154: Internal PAN management
  ieee802154: Let PAN IDs be reset
  ieee802154: hwsim: Convert to platform remove callback returning void
  ieee802154: fakelb: Convert to platform remove callback returning void
====================

Link: https://lore.kernel.org/r/20231220095556.4d9cef91@xps-13
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-01-04  eventfs: Shortcut eventfs_iterate() by skipping entries already read  Steven Rostedt (Google)
As the ei->entries array is fixed for the duration of the eventfs_inode, it can be used to skip over already read entries in eventfs_iterate(). That is, if ctx->pos is greater than zero, there's no reason in doing the loop across the ei->entries array for the entries less than ctx->pos. Instead, start the lookup of the entries at the current ctx->pos. Link: https://lore.kernel.org/all/CAHk-=wiKwDUDv3+jCsv-uacDcHDVTYsXtBR9=6sGM5mqX+DhOg@mail.gmail.com/ Link: https://lore.kernel.org/linux-trace-kernel/20240104220048.494956957@goodmis.org Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Christian Brauner <brauner@kernel.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
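The shortcut described in this commit can be illustrated with a small userspace sketch. It is not the kernel code: `struct entry_list` and `iterate_from_pos()` are hypothetical names, and the real ctx->pos also accounts for other directory entries that this sketch ignores. The point is only that a fixed array lets the loop begin at the saved position instead of counting past entries it has already returned.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for the fixed ei->entries array. */
struct entry_list {
	const char *const *names;
	size_t nr_entries;
};

/*
 * Copy up to max_out entry names into out[], starting at index *pos.
 * *pos is advanced past every slot that was consumed, so the next call
 * resumes where this one stopped. Returns the number of names copied.
 */
size_t iterate_from_pos(const struct entry_list *list, size_t *pos,
			const char **out, size_t max_out)
{
	size_t emitted = 0;
	size_t i;

	/* Start directly at *pos: no loop over already-read entries. */
	for (i = *pos; i < list->nr_entries && emitted < max_out; i++) {
		out[emitted++] = list->names[i];
		*pos = i + 1;
	}
	return emitted;
}
```

Calling `iterate_from_pos()` repeatedly with the same `pos` variable returns each name once; a call made after the last entry has been read returns 0.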
2024-01-04  eventfs: Read ei->entries before ei->children in eventfs_iterate()  Steven Rostedt (Google)
In order to apply a shortcut to skip over the current ctx->pos immediately, by using the ei->entries array, the reading of that array should be first. Moving the array reading before the linked list reading will make the shortcut change diff nicer to read. Link: https://lore.kernel.org/all/CAHk-=wiKwDUDv3+jCsv-uacDcHDVTYsXtBR9=6sGM5mqX+DhOg@mail.gmail.com/ Link: https://lore.kernel.org/linux-trace-kernel/20240104220048.333115095@goodmis.org Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Christian Brauner <brauner@kernel.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2024-01-04  eventfs: Do ctx->pos update for all iterations in eventfs_iterate()  Steven Rostedt (Google)
The ctx->pos was only updated when it added an entry, but the "skip to current pos" check (c--) happened for every loop regardless of if the entry was added or not. This inconsistency caused readdir to be incorrect.

It was due to:

    for (i = 0; i < ei->nr_entries; i++) {

        if (c > 0) {
            c--;
            continue;
        }

        mutex_lock(&eventfs_mutex);
        /* If ei->is_freed then just bail here, nothing more to do */
        if (ei->is_freed) {
            mutex_unlock(&eventfs_mutex);
            goto out;
        }
        r = entry->callback(name, &mode, &cdata, &fops);
        mutex_unlock(&eventfs_mutex);

        [..]
        ctx->pos++;
    }

But this can cause the iterator to return a file that was already read. That's because of the way the callback() works. Some events may not have all files, and the callback can return 0 to tell eventfs to skip the file for this directory.

For instance, we have:

    # ls /sys/kernel/tracing/events/ftrace/function
    format  hist  hist_debug  id  inject

and

    # ls /sys/kernel/tracing/events/sched/sched_switch/
    enable  filter  format  hist  hist_debug  id  inject  trigger

Where the function directory is missing "enable", "filter" and "trigger". That's because the callback() for events has:

    static int event_callback(const char *name, umode_t *mode, void **data,
                              const struct file_operations **fops)
    {
        struct trace_event_file *file = *data;
        struct trace_event_call *call = file->event_call;
        [..]
        /*
         * Only event directories that can be enabled should have
         * triggers or filters, with the exception of the "print"
         * event that can have a "trigger" file.
         */
        if (!(call->flags & TRACE_EVENT_FL_IGNORE_ENABLE)) {
            if (call->class->reg && strcmp(name, "enable") == 0) {
                *mode = TRACE_MODE_WRITE;
                *fops = &ftrace_enable_fops;
                return 1;
            }

            if (strcmp(name, "filter") == 0) {
                *mode = TRACE_MODE_WRITE;
                *fops = &ftrace_event_filter_fops;
                return 1;
            }
        }

        if (!(call->flags & TRACE_EVENT_FL_IGNORE_ENABLE) ||
            strcmp(trace_event_name(call), "print") == 0) {
            if (strcmp(name, "trigger") == 0) {
                *mode = TRACE_MODE_WRITE;
                *fops = &event_trigger_fops;
                return 1;
            }
        }
        [..]
        return 0;
    }

Where the function event has the TRACE_EVENT_FL_IGNORE_ENABLE set.

This means that the entries array elements for "enable", "filter" and "trigger" when called on the function event will have the callback return 0 and not 1, to tell eventfs to skip these files for it.

Because the "skip to current ctx->pos" check happened for all entries, but the ctx->pos++ only happened to entries that exist, it would confuse the reading of a directory. Which would cause:

    # ls /sys/kernel/tracing/events/ftrace/function/
    format  hist  hist  hist_debug  hist_debug  id  inject  inject

The missing "enable", "filter" and "trigger" caused ls to show "hist", "hist_debug" and "inject" twice.

Update the ctx->pos for every iteration to keep its update and the "skip" update consistent. This also means that on error, the ctx->pos needs to be decremented if it was incremented without adding something.

Link: https://lore.kernel.org/all/20240104150500.38b15a62@gandalf.local.home/
Link: https://lore.kernel.org/linux-trace-kernel/20240104220048.172295263@goodmis.org
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Fixes: 493ec81a8fb8e ("eventfs: Stop using dcache_readdir() for getdents()")
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
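The duplicated names can be reproduced outside the kernel with a small model of the loop. This is a sketch under stated assumptions: the order of `entry_names[]` is hypothetical (the commit message does not give the real array order), `entry_exists[]` stands in for the callback returning 0 or 1 on the function event, and `one_pass()` and `list_directory()` are illustrative names. Locking and the other kinds of directory entries are left out.

```c
#include <stddef.h>

#define NR_ENTRIES 8

/* Hypothetical array order; 1 means the callback would create the file. */
static const char *const entry_names[NR_ENTRIES] = {
	"enable", "filter", "trigger", "format",
	"id", "hist", "hist_debug", "inject",
};
static const int entry_exists[NR_ENTRIES] = { 0, 0, 0, 1, 1, 1, 1, 1 };

/* One getdents()-like pass: skip *pos slots, emit the rest into out[]. */
static size_t one_pass(size_t *pos, int update_always,
		       const char **out, size_t cap)
{
	size_t c = *pos;	/* slots to skip, as in the "c--" check */
	size_t n = 0;
	size_t i;

	for (i = 0; i < NR_ENTRIES && n < cap; i++) {
		if (c > 0) {
			c--;
			continue;
		}
		if (update_always)
			(*pos)++;	/* fixed: count every slot visited */
		if (!entry_exists[i])
			continue;	/* callback returned 0: skip the file */
		if (!update_always)
			(*pos)++;	/* buggy: count only emitted files */
		out[n++] = entry_names[i];
	}
	return n;
}

/* Repeat passes until one emits nothing, like a readdir() loop would. */
size_t list_directory(int update_always, const char **out, size_t cap)
{
	size_t pos = 0, total = 0, n;

	while ((n = one_pass(&pos, update_always, out + total, cap - total)) > 0)
		total += n;
	return total;
}
```

With `update_always` set to 0 the model returns eight names, with "hist", "hist_debug" and "inject" appearing twice; with it set to 1 it returns the five files once each.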
2024-01-04  eventfs: Have eventfs_iterate() stop immediately if ei->is_freed is set  Steven Rostedt (Google)
If ei->is_freed is set in eventfs_iterate(), it means that the directory that is being iterated on is in the process of being freed. Just exit the loop immediately when that is ever detected, and separate out the return of the entry->callback() from ei->is_freed. Link: https://lore.kernel.org/linux-trace-kernel/20240104220048.016261289@goodmis.org Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Christian Brauner <brauner@kernel.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2024-01-04  NFSv4.1: Use the nfs_client's rpc timeouts for backchannel  Benjamin Coddington
For backchannel requests that lookup the appropriate nfs_client, use the state-management rpc_clnt's rpc_timeout parameters for the backchannel's response. When the nfs_client cannot be found, fall back to using the xprt's default timeout parameters. Signed-off-by: Benjamin Coddington <bcodding@redhat.com> Tested-by: Chuck Lever <chuck.lever@oracle.com> Tested-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2024-01-04  SUNRPC: Fixup v4.1 backchannel request timeouts  Benjamin Coddington
After commit 59464b262ff5 ("SUNRPC: SOFTCONN tasks should time out when on the sending list"), any 4.1 backchannel tasks placed on the sending queue would immediately return with -ETIMEDOUT since their req timers are zero. Initialize the backchannel's rpc_rqst timeout parameters from the xprt's default timeout settings. Fixes: 59464b262ff5 ("SUNRPC: SOFTCONN tasks should time out when on the sending list") Signed-off-by: Benjamin Coddington <bcodding@redhat.com> Tested-by: Chuck Lever <chuck.lever@oracle.com> Tested-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2024-01-04  cxl/port: Fix missing target list lock  Dan Williams
cxl_port_setup_targets() modifies the ->targets[] array of a switch decoder. target_list_show() expects to be able to emit a coherent snapshot of that array by "holding" ->target_lock for read. The target_lock is held for write during initialization of the ->targets[] array, but it is not held for write during cxl_port_setup_targets().

The ->target_lock() predates the introduction of @cxl_region_rwsem. That semaphore protects changes to host-physical-address (HPA) decode which is precisely what writes to a switch decoder's target list affects.

Replace ->target_lock with @cxl_region_rwsem.

Now the side-effect of snapshotting an unstable view of a decoder's target list is likely benign so the Fixes: tag is presumptive.

Fixes: 27b3f8d13830 ("cxl/region: Program target lists")
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2024-01-04  cxl/port: Fix decoder initialization when nr_targets > interleave_ways  Huang Ying
The decoder_populate_targets() helper walks all of the targets in a port and makes sure they can be looked up in @target_map. Where @target_map is a lookup table from target position to target id (corresponding to a cxl_dport instance). However @target_map is only responsible for conveying the active dport instances as indicated by interleave_ways. When nr_targets > interleave_ways it results in decoder_populate_targets() walking off the end of the valid entries in @target_map. Given target_map is initialized to 0 it results in the dport lookup failing if position 0 is not mapped to a dport with an id of 0: cxl_port port3: Failed to populate active decoder targets cxl_port port3: Failed to add decoder cxl_port port3: Failed to add decoder3.0 cxl_bus_probe: cxl_port port3: probe: -6 This bug also highlights that when the decoder's ->targets[] array is written in cxl_port_setup_targets() it is missing a hold of the targets_lock to synchronize against sysfs readers of the target list. A fix for that is saved for a later patch. Fixes: a5c258021689 ("cxl/bus: Populate the target list at decoder create") Cc: <stable@vger.kernel.org> Signed-off-by: Huang, Ying <ying.huang@intel.com> [djbw: rewrite the changelog, find the Fixes: tag] Co-developed-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
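The failure mode above can be modelled in a few lines. This is a sketch, not the driver code: `dport_lookup()` and `populate_targets()` are hypothetical helpers, plain integer ids stand in for cxl_dport instances, and it assumes the fix amounts to bounding the walk by interleave_ways rather than nr_targets, which is what the description implies.

```c
#include <errno.h>
#include <stddef.h>

/* Return 1 if some dport in dport_ids[] has the given id, else 0. */
static int dport_lookup(const int *dport_ids, size_t nr_dports, int id)
{
	size_t i;

	for (i = 0; i < nr_dports; i++)
		if (dport_ids[i] == id)
			return 1;
	return 0;
}

/*
 * Check that the first nr_to_check positions of target_map[] resolve to
 * a dport. Returns 0 on success or -ENXIO on the first failed lookup.
 */
int populate_targets(const int *target_map, size_t nr_to_check,
		     const int *dport_ids, size_t nr_dports)
{
	size_t i;

	for (i = 0; i < nr_to_check; i++)
		if (!dport_lookup(dport_ids, nr_dports, target_map[i]))
			return -ENXIO;
	return 0;
}
```

For a zero-initialized map `{ 4, 5, 0, 0 }` with dport ids 4 and 5, checking all four positions (nr_targets) fails on the unused zero entries, while checking only the first two (interleave_ways) succeeds.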
2024-01-04  selinux: Fix error priority for bind with AF_UNSPEC on PF_INET6 socket  Mickaël Salaün
The IPv6 network stack first checks the sockaddr length (-EINVAL error) before checking the family (-EAFNOSUPPORT error). This was discovered thanks to commit a549d055a22e ("selftests/landlock: Add network tests"). Cc: Eric Paris <eparis@parisplace.org> Cc: Konstantin Meskhidze <konstantin.meskhidze@huawei.com> Cc: Paul Moore <paul@paul-moore.com> Cc: Stephen Smalley <stephen.smalley.work@gmail.com> Reported-by: Muhammad Usama Anjum <usama.anjum@collabora.com> Closes: https://lore.kernel.org/r/0584f91c-537c-4188-9e4f-04f192565667@collabora.com Fixes: 0f8db8cc73df ("selinux: add AF_UNSPEC and INADDR_ANY checks to selinux_socket_bind()") Signed-off-by: Mickaël Salaün <mic@digikod.net> Tested-by: Muhammad Usama Anjum <usama.anjum@collabora.com> Signed-off-by: Paul Moore <paul@paul-moore.com>
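The ordering of the two checks can be shown with a minimal userspace sketch. The function name `check_inet6_bind_addr()` and the constant `MIN_SIN6_LEN` are assumptions made for illustration (the constant mirrors the 24-byte minimum IPv6 sockaddr length used by the kernel); this is not the SELinux hook itself.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>

/* Assumed minimum IPv6 sockaddr length, mirroring the kernel's 24 bytes. */
#define MIN_SIN6_LEN 24

/*
 * Hypothetical pre-check for bind() on a PF_INET6 socket: the length is
 * validated before the address family, so a short AF_UNSPEC address
 * reports -EINVAL rather than -EAFNOSUPPORT.
 */
int check_inet6_bind_addr(int addr_family, size_t addrlen)
{
	if (addrlen < MIN_SIN6_LEN)
		return -EINVAL;		/* too short: reported first */
	if (addr_family != AF_INET6)
		return -EAFNOSUPPORT;	/* wrong family: reported second */
	return 0;
}
```

A short AF_UNSPEC address therefore yields -EINVAL, and only an address that is long enough but of the wrong family yields -EAFNOSUPPORT.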
2024-01-04  perf test: test case 'Setup struct perf_event_attr' fails on s390 on z/vm  Thomas Richter
perf test 17 'Setup struct perf_event_attr' fails on s390 z/VM guest, using linux-next kernel. Root cause is the fall-back from hardware counter cycles

    perf_event_attr:
      type                             0 (PERF_TYPE_HARDWARE)
      size                             136
      config                           0 (PERF_COUNT_HW_CPU_CYCLES)
      { sample_period, sample_freq }   4000
      sample_type                      IP|TID|TIME|ADDR|PERIOD|DATA_SRC
      read_format                      ID|LOST

which returns -ENOENT on s390 z/VM guest. This causes the code to fall back to software counter task-clock, as can be seen in the debug output:

    ------------------------------------------------------------
    perf_event_attr:
      type                             1 (PERF_TYPE_SOFTWARE)
      size                             136
      config                           0x1 (PERF_COUNT_SW_TASK_CLOCK) <-here
      { sample_period, sample_freq }   4000
      sample_type                      IP|TID|TIME|ADDR|PERIOD|DATA_SRC
      read_format                      ID|LOST

This succeeds on s390 z/VM guest. This successful installation of the counter task-clock is not listed in the expected results and the test case fails.

This is caused by commit eb2eac0c7b618033 ("perf evsel: Fallback to "task-clock" when not system wide") which introduced fall back from event 'cycles' to event 'task-clock'.

To fix this on s390 allow event number 0 (cycles) and event number 1 (task-clock) as expected result.

Output before:

    # ./perf test -Fv 17
    17: Setup struct perf_event_attr :
    --- start ---
    running './tests/attr/test-stat-group1'
    unsupp './tests/attr/test-stat-group1'
    running './tests/attr/test-record-graph-default'
    test limitation '!aarch64'
    excluded architecture list ['aarch64']
    expected config=0, got 1
    FAILED './tests/attr/test-record-graph-default' - match failure
    ---- end ----
    Setup struct perf_event_attr: FAILED!
    #

Output after:

    # ./perf test -F 17
    17: Setup struct perf_event_attr : Ok
    #

Fixes: eb2eac0c7b618033 ("perf evsel: Fallback to "task-clock" when not system wide")
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Sumanth Korikkar <sumanthk@linux.ibm.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Link: https://lore.kernel.org/r/20231219143235.1075522-1-tmricht@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-01-04  perf db-export: Fix missing reference count get in call_path_from_sample()  Ben Gainey
The addr_location map and maps fields in the inner loop were missing calls to map__get()/maps__get(). The subsequent addr_location__exit() call in each loop puts the map/maps fields causing use-after-free aborts. This issue reproduces on at least arm64 and x86_64 with something simple like `perf record -g ls` followed by `perf script -s script.py` with the following script: perf_db_export_mode = True perf_db_export_calls = False perf_db_export_callchains = True def sample_table(*args): print(f'sample_table({args})') def call_path_table(*args): print(f'call_path_table({args}') Committer testing: This test, just introduced by Ian Rogers, now passes, not segfaulting anymore: # perf test "perf script tests" 95: perf script tests : Ok # Fixes: 0dd5041c9a0eaf8c ("perf addr_location: Add init/exit/copy functions") Signed-off-by: Ben Gainey <ben.gainey@arm.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Tested-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20231207140911.3240408-1-ben.gainey@arm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-01-04  perf tests: Add perf script test  Ian Rogers
Start a new set of shell tests for testing perf script. The initial contribution is checking that some perf db-export functionality works as reported in this regression by Ben Gainey <ben.gainey@arm.com>: https://lore.kernel.org/lkml/20231207140911.3240408-1-ben.gainey@arm.com/ Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Tested-by: Ben Gainey <ben.gainey@arm.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20231207174057.1482161-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-01-04  libsubcmd: Fix memory leak in uniq()  Ian Rogers
uniq() will write one command name over another causing the overwritten string to be leaked. Fix by doing a pass that removes duplicates and a second that removes the holes. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Chenyuan Mi <cymi20@fudan.edu.cn> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20231208000515.1693746-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
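The two-pass approach can be sketched on a plain array of heap-allocated strings. This is an illustration only: `uniq_names()` is a hypothetical function, and it works on `char *` entries rather than the command-name structures libsubcmd actually uses. It assumes the array is already sorted, so duplicates are adjacent.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Remove adjacent duplicates from a sorted array of heap-allocated
 * strings without leaking any of them. Returns the new element count.
 */
size_t uniq_names(char **names, size_t cnt)
{
	size_t i, j;

	/* Pass 1: free each string equal to its successor, leaving a hole. */
	for (i = 1; i < cnt; i++) {
		if (strcmp(names[i], names[i - 1]) == 0) {
			free(names[i - 1]);
			names[i - 1] = NULL;
		}
	}

	/* Pass 2: slide the surviving pointers left over the holes. */
	for (i = 0, j = 0; i < cnt; i++) {
		if (names[i])
			names[j++] = names[i];
	}

	/* Clear the stale tail so no pointer appears twice in the array. */
	for (i = j; i < cnt; i++)
		names[i] = NULL;

	return j;
}
```

Freeing in the first pass and compacting in the second means no pointer is ever overwritten while it still owns memory, which is the leak the commit describes.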
2024-01-04  perf TUI: Don't ignore job control  Ahelenia Ziemiańska
In its infinite wisdom, by default, SLang sets susp undef, and this can only be un-done by calling SLtty_set_suspend_state(true). After every SLang_init_tty(). Additionally, no provisions are made for maintaining the teletype attributes across suspend/continue (outside of curses emulation mode(?!), which provides full support, naturally), so we need to save and restore the flags ourselves, as well as reset the text colours when going under. We need to also re-draw the screen, and raising SIGWINCH, shockingly, Just Works. The correct solution would be to Not Use SLang, but as a stop-gap, this makes TUI 'perf report' usable. Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Tested-by: Namhyung Kim <namhyung@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: yaowenbin <yaowenbin1@huawei.com> Link: https://lore.kernel.org/r/0354dcae23a8713f75f4fed609e0caec3c6e3cd5.1672174189.git.nabijaczleweli@nabijaczleweli.xyz Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-01-04  net: phy: aquantia: switch to crc_itu_t()  Stephen Rothwell
After merging the net-next tree, today's linux-next build (x86_64 allmodconfig) failed like this:

    drivers/net/phy/aquantia/aquantia_firmware.c: In function 'aqr_fw_load_memory':
    drivers/net/phy/aquantia/aquantia_firmware.c:135:23: error: implicit declaration of function 'crc_ccitt_false'; did you mean 'crc_ccitt_byte'? [-Werror=implicit-function-declaration]
      135 |                 crc = crc_ccitt_false(crc, crc_data, sizeof(crc_data));
          |                       ^~~~~~~~~~~~~~~
          |                       crc_ccitt_byte

Caused by commit e93984ebc1c8 ("net: phy: aquantia: add firmware load support") interacting with commit ("lib: crc_ccitt_false() is identical to crc_itu_t()") from the mm tree.

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Link: https://lore.kernel.org/r/20231221130946.7ed9a805@canb.auug.org.au
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-01-04  um: document arch_futex_atomic_op_inuser  Anton Ivanov
arch_futex_atomic_op_inuser was not documented correctly resulting in build time warnings. Signed-off-by: Anton Ivanov <anton.ivanov@cambridgegreys.com> Signed-off-by: Richard Weinberger <richard@nod.at>
2024-01-04  Revert "octeon_ep_vf: add octeon_ep_vf driver"  Jakub Kicinski
This reverts commit c902ba322cfda8ebe54ffd53392ef7e2ef5d1c65. This reverts commit 50648968b3e3c193b45eaca07840111c9d4fdb74. This reverts commit 77cef1e02104529f54c5b8b4126317eda3ff132d. This reverts commit 8f8d322bc47c1c5ecab1f2238b644e30f69cc475. This reverts commit 6ca7b5486ebd5e7985f0c98a2ac7ae49078043a4. This reverts commit db468f92c3b9437dfeb1dcf55d9b7d1b97769a6c. This reverts commit 5f8c64c2344c888a03fa4b7fd8c3b5e0c235d879. This reverts commit ebdc193b2ce209bfc1ebec2f777cd7bac00b547c. The driver needs more work. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-01-04  perf vendor events intel: Update sapphirerapids events to v1.17  Ian Rogers
Update to v1.17 released in: https://github.com/intel/perfmon/pull/123 Add events FP_ARITH_DISPATCHED.V0, FP_ARITH_DISPATCHED.V1, FP_ARITH_DISPATCHED.V2, UNC_IIO_IOMMU0.1G_HITS, UNC_IIO_IOMMU0.2M_HITS and UNC_IIO_IOMMU0.4K_HITS. Description updates. Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Edward Baker <edward.baker@intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240104074259.653219-4-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-01-04  perf vendor events intel: Update icelakex events to v1.23  Ian Rogers
Update to v1.23 released in: https://github.com/intel/perfmon/pull/123 Updates to event descriptions. Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Edward Baker <edward.baker@intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240104074259.653219-3-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-01-04  perf vendor events intel: Update emeraldrapids events to v1.02  Ian Rogers
Update to v1.02 released in: https://github.com/intel/perfmon/pull/123 Removes events AMX_OPS_RETIRED.BF16 and AMX_OPS_RETIRED.INT8. Add events FP_ARITH_DISPATCHED.V0, FP_ARITH_DISPATCHED.V1, FP_ARITH_DISPATCHED.V2, UNC_IIO_IOMMU0.1G_HITS, UNC_IIO_IOMMU0.2M_HITS and UNC_IIO_IOMMU0.4K_HITS. Description updates. Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Edward Baker <edward.baker@intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240104074259.653219-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-01-04  perf vendor events intel: Alderlake/rocketlake metric fixes  Ian Rogers
Fix that the core PMU is being specified for 2 uncore events. Specify a PMU for the alderlake UNCORE_FREQ metric. Conversion script updated in: https://github.com/intel/perfmon/pull/126 Committer testing: Before this patch the "perf all metricgroups test" was failing, now: root@number:~# perf test metric 10: PMU events : 10.3: Parsing of PMU event table metrics : Ok 10.4: Parsing of PMU event table metrics with fake PMUs : Ok 10.5: Parsing of metric thresholds with fake PMUs : Ok 61: Parse and process metrics : Ok 98: perf stat metrics (shadow stat) test : Skip 101: perf all metricgroups test : Ok 102: perf all metrics test : FAILED! 107: perf metrics value validation : Ok root@number:~# Test 102 is failing for another reason, not being able to get as many counters as needed, Ian Rogers suggested disabling the NMI watchdog to have more counters available: root@number:/home/acme# cat /proc/sys/kernel/nmi_watchdog 1 root@number:/home/acme# echo 0 > /proc/sys/kernel/nmi_watchdog root@number:/home/acme# perf test 102 102: perf all metrics test : Ok root@number:/home/acme# Closes: https://lore.kernel.org/lkml/ZZWOdHXJJ_oecWwm@kernel.org/ Reported-by: Arnaldo Carvalho de Melo <acme@kernel.org> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Edward Baker <edward.baker@intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240104074259.653219-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-01-04  um: mmu: remove stub_pages  Johannes Berg
I removed all the users of this some time ago, but evidently forgot the pointers. Remove them from the data structure too. Fixes: bfc58e2b98e9 ("um: remove process stub VMA") Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Richard Weinberger <richard@nod.at>
2024-01-04  IB/iser: Prevent invalidating wrong MR  Sergey Gorenko
The iser_reg_resources structure has two pointers to MR but only one mr_valid field. The implementation assumes that we use only *sig_mr when pi_enable is true. Otherwise, we use only *mr. However, it is only sometimes correct. Read commands without protection information occur even when pi_enable is true. For example, the following SCSI commands have a Data-In buffer but never have protection information: READ CAPACITY (16), INQUIRY, MODE SENSE(6), MAINTENANCE IN. So, we use *sig_mr for some SCSI commands and *mr for the other SCSI commands.

In most cases, it works fine because the remote invalidation is applied. However, there are two cases when the remote invalidation is not applicable.
 1. Small write commands when all data is sent as an immediate.
 2. The target does not support the remote invalidation feature.

The lazy invalidation is used if the remote invalidation is impossible. Since, at the lazy invalidation, we always invalidate the MR we want to use, the wrong MR may be invalidated.

To fix the issue, we need a field per MR that indicates the MR needs invalidation. Since the ib_mr structure already has such a field, let's use ib_mr.need_inval instead of iser_reg_resources.mr_valid.

Fixes: b76a439982f8 ("IB/iser: Use IB_WR_REG_MR_INTEGRITY for PI handover")
Link: https://lore.kernel.org/r/20231219072311.40989-1-sergeygo@nvidia.com
Acked-by: Max Gurtovoy <mgurtovoy@nvidia.com>
Signed-off-by: Sergey Gorenko <sergeygo@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2024-01-04  um: Fix naming clash between UML and scheduler  Anton Ivanov
__cant_sleep was already used and exported by the scheduler. The name had to be changed to a UML specific one. Signed-off-by: Anton Ivanov <anton.ivanov@cambridgegreys.com> Reviewed-by: Peter Lafreniere <peter@n8pjl.ca> Signed-off-by: Richard Weinberger <richard@nod.at>
2024-01-04  um: virt-pci: fix platform map offset  Vincent Whitchurch
The offset is currently always zero so the backend can't distinguish between accesses to different ioremapped areas. Fixes: 522c532c4fe7 ("virt-pci: add platform bus support") Signed-off-by: Vincent Whitchurch <vincent.whitchurch@axis.com> Signed-off-by: Richard Weinberger <richard@nod.at>
2024-01-04  Input: da9063_onkey - avoid explicitly setting input's parent  Dmitry Torokhov
devm_input_allocate_device() already sets parent of the new input device, there's no need to set it up explicitly. Reviewed-by: Biju Das <biju.das.jz@bp.renesas.com> Link: https://lore.kernel.org/r/ZYOseYfVgg0Ve6Zl@google.com Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2024-01-04  regulator: event: Ensure atomicity for sequence number  Naresh Solanki
Previously, the sequence number in the regulator event subsystem was updated without atomic operations, potentially leading to race conditions. This commit addresses the issue by making the sequence number atomic. Signed-off-by: Naresh Solanki <naresh.solanki@9elements.com> Link: https://msgid.link/r/20240104141314.3337037-1-naresh.solanki@9elements.com Signed-off-by: Mark Brown <broonie@kernel.org>
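As a userspace analogue of the change, the sketch below uses C11 `<stdatomic.h>` rather than the kernel's own atomic helpers, which this commit message does not name. `event_seq` and `next_event_seq()` are hypothetical names. The idea it illustrates is that the increment and the read of the new value happen as one indivisible step, so two concurrent callers cannot obtain the same sequence number.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical event counter; zero-initialized at program start. */
static atomic_uint_fast32_t event_seq;

/*
 * Return the next sequence number. atomic_fetch_add() returns the value
 * held before the addition, so adding 1 yields the freshly assigned one.
 */
uint32_t next_event_seq(void)
{
	return (uint32_t)(atomic_fetch_add(&event_seq, 1) + 1);
}
```

A non-atomic `seq++` compiles to a separate load, add and store, which is where two racing callers could both read the same old value.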
2024-01-04Input: da9063_onkey - avoid using OF-specific APIsDmitry Torokhov
There is nothing OF-specific in the driver, so switch from OF properties helpers to generic device helpers. Reviewed-by: Biju Das <biju.das.jz@bp.renesas.com> Link: https://lore.kernel.org/r/ZYOsUfKceOFXuCt5@google.com Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2024-01-04s390/bpf: Fix gotol with large offsetsIlya Leoshkevich
The gotol implementation uses a wrong data type for the offset: it should be s32, not s16. Fixes: c690191e23d8 ("s390/bpf: Implement unconditional jump with 32-bit offset") Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Acked-by: Yonghong Song <yonghong.song@linux.dev> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/r/20240102193531.3169422-2-iii@linux.ibm.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
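The effect of the wrong data type can be shown with a small userspace sketch. It is not the s390 JIT code, and both helper names are hypothetical. It only demonstrates what happens when a 32-bit jump offset is stored in a 16-bit signed variable.

```c
#include <stdint.h>

/* Buggy variant (hypothetical): the 32-bit offset is narrowed to s16. */
static int32_t jump_offset_s16(int32_t imm)
{
	/* Values outside [-32768, 32767] cannot be represented. */
	int16_t off = (int16_t)imm;

	return off;
}

/* Fixed variant (hypothetical): the full 32-bit offset is preserved. */
static int32_t jump_offset_s32(int32_t imm)
{
	int32_t off = imm;

	return off;
}
```

An offset such as 40000, which a program with more than 32k instructions can require, survives only in the second variant.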
2024-01-04KVM: arm64: Add missing memory barriers when switching to pKVM's hyp pgdWill Deacon
In commit f320bc742bc23 ("KVM: arm64: Prepare the creation of s1 mappings at EL2"), pKVM switches from a temporary host-provided page-table to its own page-table at EL2. Since there is only a single TTBR for the nVHE hypervisor, this involves disabling and re-enabling the MMU in __pkvm_init_switch_pgd(). Unfortunately, the memory barriers here are not quite correct. Specifically: - A DSB is required to complete the TLB invalidation executed while the MMU is disabled. - An ISB is required to make the new TTBR value visible to the page-table walker before the MMU is enabled in the SCTLR. An earlier version of the patch actually got this correct: https://lore.kernel.org/lkml/20210304184717.GB21795@willie-the-truck/ but thanks to some badly worded review comments from yours truly, these were dropped for the version that was eventually merged. Bring back the barriers and fix the potential issue (but note that this was found by code inspection). Cc: Quentin Perret <qperret@google.com> Fixes: f320bc742bc23 ("KVM: arm64: Prepare the creation of s1 mappings at EL2") Signed-off-by: Will Deacon <will@kernel.org> Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20240104164220.7968-1-will@kernel.org
2024-01-04Merge branch kvm-arm64/vgic-6.8 into kvmarm-master/nextMarc Zyngier
* kvm-arm64/vgic-6.8: : . : Fix for the GICv4.1 vSGI pending state being set/cleared from : userspace, and some cleanup to the MMIO and userspace accessors : for the pending state. : : Also a fix for a potential UAF in the ITS translation cache. : . KVM: arm64: vgic-its: Avoid potential UAF in LPI translation cache KVM: arm64: vgic-v3: Reinterpret user ISPENDR writes as I{C,S}PENDR KVM: arm64: vgic: Use common accessor for writes to ICPENDR KVM: arm64: vgic: Use common accessor for writes to ISPENDR KVM: arm64: vgic-v4: Restore pending state on host userspace write Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-01-04KVM: arm64: vgic-its: Avoid potential UAF in LPI translation cacheOliver Upton
There is a potential UAF scenario in the case of an LPI translation cache hit racing with an operation that invalidates the cache, such as a DISCARD ITS command. The root of the problem is that vgic_its_check_cache() does not elevate the refcount on the vgic_irq before dropping the lock that serializes refcount changes. Have vgic_its_check_cache() raise the refcount on the returned vgic_irq and add the corresponding decrement after queueing the interrupt. Cc: stable@vger.kernel.org Signed-off-by: Oliver Upton <oliver.upton@linux.dev> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20240104183233.3560639-1-oliver.upton@linux.dev
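The pattern of raising the refcount before dropping the lock can be sketched in userspace C. This is an illustration under stated assumptions, not the vgic code: the struct, the single-slot cache and both helpers are hypothetical, and a pthread mutex stands in for the kernel lock.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for a cached interrupt object. */
struct cached_irq {
	int refcount;	/* protected by cache_lock */
	int intid;
};

static pthread_mutex_t cache_lock = PTHREAD_MUTEX_INITIALIZER;
static struct cached_irq *cache_slot;	/* one-entry cache for brevity */

/* Look up the entry and take a reference while the lock is still held. */
static struct cached_irq *cache_lookup_get(void)
{
	struct cached_irq *irq;

	pthread_mutex_lock(&cache_lock);
	irq = cache_slot;
	if (irq)
		irq->refcount++;	/* pin it before the lock is dropped */
	pthread_mutex_unlock(&cache_lock);
	return irq;
}

/* Drop a reference; returns 1 when the caller released the last one. */
static int cache_put(struct cached_irq *irq)
{
	int last;

	pthread_mutex_lock(&cache_lock);
	last = (--irq->refcount == 0);
	pthread_mutex_unlock(&cache_lock);
	return last;
}
```

Because the reference is taken under the lock, a concurrent invalidation that drops the cache's own reference cannot free the object while the caller is still using it. The caller pairs each successful lookup with one cache_put() once it is done.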
2024-01-04io_uring: ensure local task_work is run on wait timeoutJens Axboe
A previous commit added an earlier break condition here, which is fine if we're using non-local task_work as it'll be run on return to userspace. However, if DEFER_TASKRUN is used, then we could be leaving local task_work that is ready to process in the ctx list until next time that we enter the kernel to wait for events. Move the break condition to _after_ we have run task_work. Cc: stable@vger.kernel.org Fixes: 846072f16eed ("io_uring: mimimise io_cqring_wait_schedule") Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-01-04Merge tag 'platform-drivers-x86-v6.7-7' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86 Pull x86 platform driver fix from Ilpo Järvinen: "Unfortunately the P2SB deadlock fix broke some older HW and we need some time to figure out the best way to fix the issue so reverting the deadlock fix for now" * tag 'platform-drivers-x86-v6.7-7' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86: Revert "platform/x86: p2sb: Allow p2sb_bar() calls during PCI device probe"
2024-01-04Merge tag 'sound-6.7-final' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound Pull sound fixes from Takashi Iwai: "It became more than wished, partly because of vacations. But all changes are fairly device-specific and should be safe to apply: - A regression fix for Oops at ASoC HD-audio probe - A series of TAS2781 HD-audio codec fixes - A random build regression fix with SPI helpers - Minor endianness fix for USB-audio mixer code - ASoC FSL driver error handling fix - ASoC Mediatek driver register fix - A series of ASoC meson g12a driver fixes - A few usual HD-audio oneliner quirks" * tag 'sound-6.7-final' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: ALSA: hda/realtek: Fix mute and mic-mute LEDs for HP ProBook 440 G6 ASoC: meson: g12a-tohdmitx: Fix event generation for S/PDIF mux ASoC: meson: g12a-toacodec: Fix event generation ASoC: meson: g12a-tohdmitx: Validate written enum values ASoC: meson: g12a-toacodec: Validate written enum values ASoC: SOF: Intel: hda-codec: Delay the codec device registration ALSA: hda: cs35l41: fix building without CONFIG_SPI ALSA: hda/realtek: fix mute/micmute LEDs for a HP ZBook ALSA: hda/realtek: enable SND_PCI_QUIRK for hp pavilion 14-ec1xxx series ASoC: mediatek: mt8186: fix AUD_PAD_TOP register and offset ALSA: scarlett2: Convert meter levels from little-endian ALSA: hda/tas2781: remove sound controls in unbind ALSA: hda/tas2781: move set_drv_data outside tasdevice_init ALSA: hda/tas2781: fix typos in comment ALSA: hda/tas2781: do not use regcache ASoC: fsl_rpmsg: Fix error handler with pm_runtime_enable
2024-01-04Merge tag 'drm-fixes-2024-01-04' of git://anongit.freedesktop.org/drm/drmLinus Torvalds
Pull drm fixes from Dave Airlie: "These were from over the holiday period, mainly i915, a couple of qaic, bridge and an mgag200. qaic: - fix GEM import - add quirk for soc version bridge: - parade-ps8640, ti-sn65dsi86: fix aux reads bounds mgag200: - fix gamma LUT init i915: - Fix bogus DPCD rev usage for DP phy test pattern setup - Fix handling of MMIO triggered reports in the OA buffer" * tag 'drm-fixes-2024-01-04' of git://anongit.freedesktop.org/drm/drm: drm/i915/perf: Update handling of MMIO triggered reports drm/i915/dp: Fix passing the correct DPCD_REV for drm_dp_set_phy_test_pattern drm/mgag200: Fix gamma lut not initialized for G200ER, G200EV, G200SE drm/bridge: ps8640: Fix size mismatch warning w/ len drm/bridge: ti-sn65dsi86: Never store more than msg->size bytes in AUX xfer drm/bridge: parade-ps8640: Never store more than msg->size bytes in AUX xfer accel/qaic: Implement quirk for SOC_HW_VERSION accel/qaic: Fix GEM import path code
2024-01-04bpfilter: remove bpfilterQuentin Deslandes
bpfilter was supposed to convert iptables filtering rules into BPF programs on the fly, from the kernel, through a usermode helper. The base code for the UMH was introduced in 2018 [1], and a couple of attempts [2], [3] tried to introduce the BPF program generation features but were abandoned. bpfilter now sits in the kernel tree unused and unusable, occasionally causing confusion amongst Linux users [4], [5]. As bpfilter is now developed in a dedicated repository on GitHub [6], it was suggested a couple of times this year (LSFMM/BPF 2023, LPC 2023) to remove the deprecated kernel part of the project. This is the purpose of this patch. [1]: https://lore.kernel.org/lkml/20180522022230.2492505-1-ast@kernel.org/ [2]: https://lore.kernel.org/bpf/20210829183608.2297877-1-me@ubique.spb.ru/#t [3]: https://lore.kernel.org/lkml/20221224000402.476079-1-qde@naccy.de/ [4]: https://dxuuu.xyz/bpfilter.html [5]: https://github.com/linuxkit/linuxkit/pull/3904 [6]: https://github.com/facebook/bpfilter Signed-off-by: Quentin Deslandes <qde@naccy.de> Link: https://lore.kernel.org/r/20231226130745.465988-1-qde@naccy.de Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-01-04bpf: Remove unnecessary cpu == 0 check in memallocYonghong Song
After merging the patch set [1] to reduce memory usage for bpf_global_percpu_ma, Alexei found a redundant check (cpu == 0) in function bpf_mem_alloc_percpu_unit_init() ([2]). Indeed, the check is unnecessary since c->unit_size will be all NULL or all non-NULL for all cpus before for_each_possible_cpu() loop. Removing the check makes code less confusing. [1] https://lore.kernel.org/all/20231222031729.1287957-1-yonghong.song@linux.dev/ [2] https://lore.kernel.org/all/20231222031745.1289082-1-yonghong.song@linux.dev/ Signed-off-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20240104165744.702239-1-yonghong.song@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-01-04net/tcp: Only produce AO/MD5 logs if there are any keysDmitry Safonov
Users won't care about improper hash options in the TCP header if they use neither TCP-AO nor TCP-MD5. Yet, those logs can add up in syslog, while not being a real concern to the host admin: > kernel: TCP: TCP segment has incorrect auth options set for XX.20.239.12.54681->XX.XX.90.103.80 [S] Keep silent and avoid logging when there aren't any keys in the system. Side-note: I also defined static_branch_tcp_*() helpers to avoid more ifdeffery, going to remove more ifdeffery further with their help. Reported-by: Christian Kujau <lists@nerdbynature.de> Closes: https://lore.kernel.org/all/f6b59324-1417-566f-a976-ff2402718a8d@nerdbynature.de/ Signed-off-by: Dmitry Safonov <dima@arista.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Fixes: 2717b5adea9e ("net/tcp: Add tcp_hash_fail() ratelimited logs") Link: https://lore.kernel.org/r/20240104-tcp_hash_fail-logs-v1-1-ff3e1f6f9e72@arista.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-01-04Merge branch '40GbE' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2024-01-03 (i40e, ice, igc) This series contains updates to i40e, ice, and igc drivers. Ke Xiao fixes use after free for unicast filters on i40e. Andrii restores VF MSI-X flag after PCI reset on i40e. Paul corrects admin queue link status structure to fulfill firmware expectations for ice. Rodrigo Cataldo corrects value used for hicredit calculations on igc. * '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue: igc: Fix hicredit calculation ice: fix Get link status data length i40e: Restore VF MSI-X state during PCI reset i40e: fix use-after-free in i40e_aqc_add_filters() ==================== Link: https://lore.kernel.org/r/20240103193254.822968-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-01-04net: Implement missing SO_TIMESTAMPING_NEW cmsg supportThomas Lange
Commit 9718475e6908 ("socket: Add SO_TIMESTAMPING_NEW") added the new socket option SO_TIMESTAMPING_NEW. However, it was never implemented in __sock_cmsg_send thus breaking SO_TIMESTAMPING cmsg for platforms using SO_TIMESTAMPING_NEW. Fixes: 9718475e6908 ("socket: Add SO_TIMESTAMPING_NEW") Link: https://lore.kernel.org/netdev/6a7281bf-bc4a-4f75-bb88-7011908ae471@app.fastmail.com/ Signed-off-by: Thomas Lange <thomas@corelatus.se> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://lore.kernel.org/r/20240104085744.49164-1-thomas@corelatus.se Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-01-04drm/rockchip: vop2: Drop superfluous includeCristian Ciocaltea
The rockchip_drm_fb.h header contains just a single function which is not directly used by the VOP2 driver. Drop the unnecessary include. Signed-off-by: Cristian Ciocaltea <cristian.ciocaltea@collabora.com> Signed-off-by: Heiko Stuebner <heiko@sntech.de> Link: https://patchwork.freedesktop.org/patch/msgid/20240104143951.85219-1-cristian.ciocaltea@collabora.com
2024-01-04Revert "platform/x86: p2sb: Allow p2sb_bar() calls during PCI device probe"Shin'ichiro Kawasaki
This reverts commit b28ff7a7c3245d7f62acc20f15b4361292fe4117. The commit introduced P2SB device scan and resource cache during the boot process to avoid deadlock. But it caused detection failure of IDE controllers on old systems [1]. The IDE controllers on old systems and P2SB devices on newer systems have the same PCI DEVFN. It is suspected that confusion between the two is the cause of the failure. Revert the change for now, until a proper solution is ready. Link: https://lore.kernel.org/platform-driver-x86/CABq1_vjfyp_B-f4LAL6pg394bP6nDFyvg110TOLHHb0x4aCPeg@mail.gmail.com/T/#m07b30468d9676fc5e3bb2122371121e4559bb383 [1] Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com> Link: https://lore.kernel.org/r/20240104114050.3142690-1-shinichiro.kawasaki@wdc.com Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2024-01-04kernfs: convert kernfs_idr_lock to an irq safe raw spinlockAndrea Righi
bpf_cgroup_from_id() is basically a wrapper around cgroup_get_from_id(), which relies on kernfs to determine the right cgroup associated with the target id. As a kfunc, it has the potential to be attached to any function through BPF, particularly in contexts where certain locks are held. However, kernfs is not using an irq safe spinlock for kernfs_idr_lock, which means any kernfs function that is acquiring this lock can be interrupted and potentially hit bpf_cgroup_from_id() in the process, triggering a deadlock. For example, it is really easy to trigger a lockdep splat between kernfs_idr_lock and rq->__lock, attaching a small BPF program to __set_cpus_allowed_ptr_locked() that just calls bpf_cgroup_from_id(): ===================================================== WARNING: HARDIRQ-safe -> HARDIRQ-unsafe lock order detected 6.7.0-rc7-virtme #5 Not tainted ----------------------------------------------------- repro/131 [HC0[0]:SC0[0]:HE0:SE1] is trying to acquire: ffffffffb2dc4578 (kernfs_idr_lock){+.+.}-{2:2}, at: kernfs_find_and_get_node_by_id+0x1d/0x80 and this task is already holding: ffff911cbecaf218 (&rq->__lock){-.-.}-{2:2}, at: task_rq_lock+0x50/0xc0 which would create a new lock dependency: (&rq->__lock){-.-.}-{2:2} -> (kernfs_idr_lock){+.+.}-{2:2} but this new dependency connects a HARDIRQ-irq-safe lock: (&rq->__lock){-.-.}-{2:2} ... which became HARDIRQ-irq-safe at: lock_acquire+0xbf/0x2b0 _raw_spin_lock_nested+0x2e/0x40 scheduler_tick+0x5d/0x170 update_process_times+0x9c/0xb0 tick_periodic+0x27/0xe0 tick_handle_periodic+0x24/0x70 __sysvec_apic_timer_interrupt+0x64/0x1a0 sysvec_apic_timer_interrupt+0x6f/0x80 asm_sysvec_apic_timer_interrupt+0x1a/0x20 memcpy+0xc/0x20 arch_dup_task_struct+0x15/0x30 copy_process+0x1ce/0x1eb0 kernel_clone+0xac/0x390 kernel_thread+0x6f/0xa0 kthreadd+0x199/0x230 ret_from_fork+0x31/0x50 ret_from_fork_asm+0x1b/0x30 to a HARDIRQ-irq-unsafe lock: (kernfs_idr_lock){+.+.}-{2:2} ... which became HARDIRQ-irq-unsafe at: ... 
 lock_acquire+0xbf/0x2b0 _raw_spin_lock+0x30/0x40 __kernfs_new_node.isra.0+0x83/0x280 kernfs_create_root+0xf6/0x1d0 sysfs_init+0x1b/0x70 mnt_init+0xd9/0x2a0 vfs_caches_init+0xcf/0xe0 start_kernel+0x58a/0x6a0 x86_64_start_reservations+0x18/0x30 x86_64_start_kernel+0xc5/0xe0 secondary_startup_64_no_verify+0x178/0x17b other info that might help us debug this: Possible interrupt unsafe locking scenario: CPU0 CPU1 ---- ---- lock(kernfs_idr_lock); local_irq_disable(); lock(&rq->__lock); lock(kernfs_idr_lock); <Interrupt> lock(&rq->__lock); *** DEADLOCK *** Prevent this deadlock condition by converting kernfs_idr_lock to a raw irq safe spinlock. The performance impact of this change should be negligible and it also helps to prevent similar deadlock conditions with any other subsystems that may depend on kernfs. Fixes: 332ea1f697be ("bpf: Add bpf_cgroup_from_id() kfunc") Cc: stable <stable@kernel.org> Signed-off-by: Andrea Righi <andrea.righi@canonical.com> Acked-by: Tejun Heo <tj@kernel.org> Link: https://lore.kernel.org/r/20231229074916.53547-1-andrea.righi@canonical.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-01-04class: fix use-after-free in class_register()Jing Xia
The lock_class_key is still registered and can be found in the lock_keys_hash hlist after subsys_private is freed in the error handler path. A task that iterates over lock_keys_hash later may cause a use-after-free. So fix that up and unregister the lock_class_key before kfree(cp). On our platform, a driver fails to kset_register because of creating a duplicate filename '/class/xxx'. With KASAN enabled, it prints an invalid-access bug report. KASAN bug report: BUG: KASAN: invalid-access in lockdep_register_key+0x19c/0x1bc Write of size 8 at addr 15ffff808b8c0368 by task modprobe/252 Pointer tag: [15], memory tag: [fe] CPU: 7 PID: 252 Comm: modprobe Tainted: G W 6.6.0-mainline-maybe-dirty #1 Call trace: dump_backtrace+0x1b0/0x1e4 show_stack+0x2c/0x40 dump_stack_lvl+0xac/0xe0 print_report+0x18c/0x4d8 kasan_report+0xe8/0x148 __hwasan_store8_noabort+0x88/0x98 lockdep_register_key+0x19c/0x1bc class_register+0x94/0x1ec init_module+0xbc/0xf48 [rfkill] do_one_initcall+0x17c/0x72c do_init_module+0x19c/0x3f8 ... Memory state around the buggy address: ffffff808b8c0100: 8a 8a 8a 8a 8a 8a 8a 8a 8a 8a 8a 8a 8a 8a 8a 8a ffffff808b8c0200: 8a 8a 8a 8a 8a 8a 8a 8a fe fe fe fe fe fe fe fe >ffffff808b8c0300: fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe ^ ffffff808b8c0400: 03 03 03 03 03 03 03 03 03 03 03 03 03 03 03 03 As CONFIG_KASAN_GENERIC is not set, KASAN reports invalid-access rather than use-after-free here. In this case, modprobe is manipulating the corrupted lock_keys_hash hlist, where the lock_class_key has already been freed. It's worth noting that this can only happen if lockdep is enabled, which is not true for a normal system. Fixes: dcfbb67e48a2 ("driver core: class: use lock_class_key already present in struct subsys_private") Cc: stable <stable@kernel.org> Signed-off-by: Jing Xia <jing.xia@unisoc.com> Signed-off-by: Xuewen Yan <xuewen.yan@unisoc.com> Link: https://lore.kernel.org/r/20231220024603.186078-1-jing.xia@unisoc.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>