linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2023-06-22	igc: Work around HW bug causing missing timestamps	Vinicius Costa Gomes
	There's an hardware issue that can cause missing timestamps. The bug is that the interrupt is only cleared if the IGC_TXSTMPH_0 register is read. The bug can cause a race condition if a timestamp is captured at the wrong time, and we will miss that timestamp. To reduce the time window that the problem is able to happen, in case no timestamp was ready, we read the "previous" value of the timestamp registers, and we compare with the "current" one, if it didn't change we can be reasonably sure that no timestamp was captured. If they are different, we use the new value as the captured timestamp. The HW bug is not easy to reproduce, got to reproduce it when smashing the NIC with timestamping requests from multiple applications (e.g. multiple ntpperf instances + ptp4l), after 10s of minutes. This workaround has more impact when multiple timestamp registers are used, and the IGC_TXSTMPH_0 register always need to be read, so the interrupt is cleared. Fixes: 2c344ae24501 ("igc: Add support for TX timestamping") Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com> Tested-by: Naama Meir <naamax.meir@linux.intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-06-22	igc: Retrieve TX timestamp during interrupt handling	Vinicius Costa Gomes
	When the interrupt is handled, the TXTT_0 bit in the TSYNCTXCTL register should already be set and the timestamp value already loaded in the appropriate register. This simplifies the handling, and reduces the latency for retrieving the TX timestamp, which increase the amount of TX timestamps that can be handled in a given time period. As the "work" function doesn't run in a workqueue anymore, rename it to something more sensible, a event handler. Using ntpperf[1] we can see the following performance improvements: Before: $ sudo ./ntpperf -i enp3s0 -m 10:22:22:22:22:21 -d 192.168.1.3 -s 172.18.0.0/16 -I -H -o -37 \| responses \| TX timestamp offset (ns) rate clients \| lost invalid basic xleave \| min mean max stddev 1000 100 0.00% 0.00% 0.00% 100.00% -56 +9 +52 19 1500 150 0.00% 0.00% 0.00% 100.00% -40 +30 +75 22 2250 225 0.00% 0.00% 0.00% 100.00% -11 +29 +72 15 3375 337 0.00% 0.00% 0.00% 100.00% -18 +40 +88 22 5062 506 0.00% 0.00% 0.00% 100.00% -19 +23 +77 15 7593 759 0.00% 0.00% 0.00% 100.00% +7 +47 +5168 43 11389 1138 0.00% 0.00% 0.00% 100.00% -11 +41 +5240 39 17083 1708 0.00% 0.00% 0.00% 100.00% +19 +60 +5288 50 25624 2562 0.00% 0.00% 0.00% 100.00% +1 +56 +5368 58 38436 3843 0.00% 0.00% 0.00% 100.00% -84 +12 +8847 66 57654 5765 0.00% 0.00% 100.00% 0.00% 86481 8648 0.00% 0.00% 100.00% 0.00% 129721 12972 0.00% 0.00% 100.00% 0.00% 194581 16384 0.00% 0.00% 100.00% 0.00% 291871 16384 27.35% 0.00% 72.65% 0.00% 437806 16384 50.05% 0.00% 49.95% 0.00% After: $ sudo ./ntpperf -i enp3s0 -m 10:22:22:22:22:21 -d 192.168.1.3 -s 172.18.0.0/16 -I -H -o -37 \| responses \| TX timestamp offset (ns) rate clients \| lost invalid basic xleave \| min mean max stddev 1000 100 0.00% 0.00% 0.00% 100.00% -44 +0 +61 19 1500 150 0.00% 0.00% 0.00% 100.00% -6 +39 +81 16 2250 225 0.00% 0.00% 0.00% 100.00% -22 +25 +69 15 3375 337 0.00% 0.00% 0.00% 100.00% -28 +15 +56 14 5062 506 0.00% 0.00% 0.00% 100.00% +7 +78 +143 27 7593 759 0.00% 0.00% 0.00% 100.00% -54 +24 +144 47 11389 1138 0.00% 0.00% 0.00% 100.00% -90 -33 +28 21 17083 1708 0.00% 0.00% 0.00% 100.00% -50 -2 +35 14 25624 2562 0.00% 0.00% 0.00% 100.00% -62 +7 +66 23 38436 3843 0.00% 0.00% 0.00% 100.00% -33 +30 +5395 36 57654 5765 0.00% 0.00% 100.00% 0.00% 86481 8648 0.00% 0.00% 100.00% 0.00% 129721 12972 0.00% 0.00% 100.00% 0.00% 194581 16384 19.50% 0.00% 80.50% 0.00% 291871 16384 35.81% 0.00% 64.19% 0.00% 437806 16384 55.40% 0.00% 44.60% 0.00% [1] https://github.com/mlichvar/ntpperf Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com> Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de> Tested-by: Naama Meir <naamax.meir@linux.intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-06-22	igc: Check if hardware TX timestamping is enabled earlier	Vinicius Costa Gomes
	Before requesting a packet transmission to be hardware timestamped, check if the user has TX timestamping enabled. Fixes an issue that if a packet was internally forwarded to the NIC, and it had the SKBTX_HW_TSTAMP flag set, the driver would mark that timestamp as skipped. In reality, that timestamp was "not for us", as TX timestamp could never be enabled in the NIC. Checking if the TX timestamping is enabled earlier has a secondary effect that when TX timestamping is disabled, there's no need to check for timestamp timeouts. We should only take care to free any pending timestamp when TX timestamping is disabled, as that skb would never be released otherwise. Fixes: 2c344ae24501 ("igc: Add support for TX timestamping") Suggested-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com> Tested-by: Naama Meir <naamax.meir@linux.intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-06-22	igc: Fix race condition in PTP tx code	Vinicius Costa Gomes
	Currently, the igc driver supports timestamping only one tx packet at a time. During the transmission flow, the skb that requires hardware timestamping is saved in adapter->ptp_tx_skb. Once hardware has the timestamp, an interrupt is delivered, and adapter->ptp_tx_work is scheduled. In igc_ptp_tx_work(), we read the timestamp register, update adapter->ptp_tx_skb, and notify the network stack. While the thread executing the transmission flow (the user process running in kernel mode) and the thread executing ptp_tx_work don't access adapter->ptp_tx_skb concurrently, there are two other places where adapter->ptp_tx_skb is accessed: igc_ptp_tx_hang() and igc_ptp_suspend(). igc_ptp_tx_hang() is executed by the adapter->watchdog_task worker thread which runs periodically so it is possible we have two threads accessing ptp_tx_skb at the same time. Consider the following scenario: right after __IGC_PTP_TX_IN_PROGRESS is set in igc_xmit_frame_ring(), igc_ptp_tx_hang() is executed. Since adapter->ptp_tx_start hasn't been written yet, this is considered a timeout and adapter->ptp_tx_skb is cleaned up. This patch fixes the issue described above by adding the ptp_tx_lock to protect access to ptp_tx_skb and ptp_tx_start fields from igc_adapter. Since igc_xmit_frame_ring() called in atomic context by the networking stack, ptp_tx_lock is defined as a spinlock, and the irq safe variants of lock/unlock are used. With the introduction of the ptp_tx_lock, the __IGC_PTP_TX_IN_PROGRESS flag doesn't provide much of a use anymore so this patch gets rid of it. Fixes: 2c344ae24501 ("igc: Add support for TX timestamping") Signed-off-by: Andre Guedes <andre.guedes@intel.com> Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com> Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de> Tested-by: Naama Meir <naamax.meir@linux.intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-06-22	block: don't return -EINVAL for not found names in devt_from_devname	Christoph Hellwig
	When we didn't find a device and didn't guess it might be a partition, it might still show up later, so don't disable rootwait for it by returning -EINVAL. Fixes: 079caa35f786 ("init: clear root_wait on all invalid root= strings") Reported-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20230622150644.600327-1-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-06-22	btrfs: fix remaining u32 overflows when left shifting stripe_nr	Qu Wenruo
	There was regression caused by a97699d1d610 ("btrfs: replace map_lookup->stripe_len by BTRFS_STRIPE_LEN") and supposedly fixed by a7299a18a179 ("btrfs: fix u32 overflows when left shifting stripe_nr"). To avoid code churn the fix was open coding the type casts but unfortunately missed one which was still possible to hit [1]. The missing place was assignment of bioc->full_stripe_logical inside btrfs_map_block(). Fix it by adding a helper that does the safe calculation of the offset and use it everywhere even though it may not be strictly necessary due to already using u64 types. This replaces all remaining "<< BTRFS_STRIPE_LEN_SHIFT" calls. [1] https://lore.kernel.org/linux-btrfs/20230622065438.86402-1-wqu@suse.com/ Fixes: a7299a18a179 ("btrfs: fix u32 overflows when left shifting stripe_nr") Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> [ update changelog ] Signed-off-by: David Sterba <dsterba@suse.com>
2023-06-22	clk: nuvoton: Use clk_parent_data instead of string for parent clock	Jacky Huang
	For the declaration of parent clocks, use struct clk_parent_data instead of a string. Due to the change in the passed arguments, replace the usage of devm_clk_hw_register_mux() with clk_hw_register_mux_parent_data() for all cases. Signed-off-by: Jacky Huang <ychuang3@nuvoton.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2023-06-22	clk: nuvoton: Update all constant hex values to lowercase	Jacky Huang
	The constant hex values used to define register offsets were written in uppercase. This patch update all these constant hex values to be lowercase. Signed-off-by: Jacky Huang <ychuang3@nuvoton.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2023-06-22	clk: nuvoton: Add clk-ma35d1.h for driver extern functions	Jacky Huang
	Moved the declaration of extern functions ma35d1_reg_clk_pll() and ma35d1_reg_adc_clkdiv() from the .c files to the newly created header file clk-ma35d1.h. Signed-off-by: Jacky Huang <ychuang3@nuvoton.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2023-06-22	riscv: ftrace: Enable HAVE_FUNCTION_GRAPH_RETVAL	Donglin Peng
	The previous patch ("function_graph: Support recording and printing the return value of function") has laid the groundwork for the for the funcgraph-retval, and this modification makes it available on the RISC-V platform. We introduce a new structure called fgraph_ret_regs for the RISC-V platform to hold return registers and the frame pointer. We then fill its content in the return_to_handler and pass its address to the function ftrace_return_to_handler to record the return value. Link: https://lore.kernel.org/linux-trace-kernel/a8d71b12259f90e7e63d0ea654fcac95b0232bbc.1680954589.git.pengdonglin@sangfor.com.cn Signed-off-by: Donglin Peng <pengdonglin@sangfor.com.cn> Acked-by: Palmer Dabbelt <palmer@rivosinc.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2023-06-22	tracing/boot: Replace strlcpy with strscpy	Azeem Shaikh
	strlcpy() reads the entire source buffer first. This read may exceed the destination size limit. This is both inefficient and can lead to linear read overflows if a source string is not NUL-terminated [1]. In an effort to remove strlcpy() completely [2], replace strlcpy() here with strscpy(). Direct replacement is safe here since return value of -E2BIG is used to check for truncation instead of sizeof(dest). [1] https://www.kernel.org/doc/html/latest/process/deprecated.html#strlcpy [2] https://github.com/KSPP/linux/issues/89 Link: https://lore.kernel.org/linux-trace-kernel/20230613004125.3539934-1-azeemshaikh38@gmail.com Cc: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Azeem Shaikh <azeemshaikh38@gmail.com> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2023-06-22	tracing/timerlat: Add user-space interface	Daniel Bristot de Oliveira
	Going a step further, we propose a way to use any user-space workload as the task waiting for the timerlat timer. This is done via a per-CPU file named osnoise/cpu$id/timerlat_fd file. The tracef_fd allows a task to open at a time. When a task reads the file, the timerlat timer is armed for future osnoise/timerlat_period_us time. When the timer fires, it prints the IRQ latency and wakes up the user-space thread waiting in the timerlat_fd. The thread then starts to run, executes the timerlat measurement, prints the thread scheduling latency and returns to user-space. When the thread rereads the timerlat_fd, the tracer will print the user-ret(urn) latency, which is an additional metric. This additional metric is also traced by the tracer and can be used, for example of measuring the context switch overhead from kernel-to-user and user-to-kernel, or the response time for an arbitrary execution in user-space. The tracer supports one thread per CPU, the thread must be pinned to the CPU, and it cannot migrate while holding the timerlat_fd. The reason is that the tracer is per CPU (nothing prohibits the tracer from allowing migrations in the future). The tracer monitors the migration of the thread and disables the tracer if detected. The timerlat_fd is only available for opening/reading when timerlat tracer is enabled, and NO_OSNOISE_WORKLOAD is set. The simplest way to activate this feature from user-space is: -------------------------------- %< ----------------------------------- int main(void) { char buffer[1024]; int timerlat_fd; int retval; long cpu = 0; /* place in CPU 0 */ cpu_set_t set; CPU_ZERO(&set); CPU_SET(cpu, &set); if (sched_setaffinity(gettid(), sizeof(set), &set) == -1) return 1; snprintf(buffer, sizeof(buffer), "/sys/kernel/tracing/osnoise/per_cpu/cpu%ld/timerlat_fd", cpu); timerlat_fd = open(buffer, O_RDONLY); if (timerlat_fd < 0) { printf("error opening %s: %s\n", buffer, strerror(errno)); exit(1); } for (;;) { retval = read(timerlat_fd, buffer, 1024); if (retval < 0) break; } close(timerlat_fd); exit(0); } -------------------------------- >% ----------------------------------- When disabling timerlat, if there is a workload holding the timerlat_fd, the SIGKILL will be sent to the thread. Link: https://lkml.kernel.org/r/69fe66a863d2792ff4c3a149bf9e32e26468bb3a.1686063934.git.bristot@kernel.org Cc: Juri Lelli <juri.lelli@redhat.com> Cc: William White <chwhite@redhat.com> Cc: Daniel Bristot de Oliveira <bristot@kernel.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Jonathan Corbet <corbet@lwn.net> Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2023-06-22	tracing/osnoise: Skip running osnoise if all instances are off	Daniel Bristot de Oliveira
	In the case of all tracing instances being off, sleep for the entire period. Q: Why not kill all threads so? A: It is valid and useful to start the threads with tracing off. For example, rtla disables tracing, starts the tracer, applies the scheduling setup to the threads, e.g., sched priority and cgroup, and then begin tracing with all set. Skipping the period helps to speed up rtla setup and save the trace after a stop tracing. Link: https://lkml.kernel.org/r/aa4dd9b7e76fcb63901fe5407e15ec002b318599.1686063934.git.bristot@kernel.org Cc: Juri Lelli <juri.lelli@redhat.com> Cc: William White <chwhite@redhat.com> Cc: Daniel Bristot de Oliveira <bristot@kernel.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Jonathan Corbet <corbet@lwn.net> Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2023-06-22	tracing/osnoise: Switch from PF_NO_SETAFFINITY to migrate_disable	Daniel Bristot de Oliveira
	Currently, osnoise/timerlat threads run with PF_NO_SETAFFINITY set. It works well, however, cgroups do not allow PF_NO_SETAFFINITY threads to be accepted, and this creates a limitation to osnoise/timerlat. To avoid this limitation, disable migration of the threads as soon as they start to run, and then clean the PF_NO_SETAFFINITY flag (still) used during thread creation. If for some reason a thread migration is requested, e.g., via sched_settafinity, the tracer thread will notice and exit. Link: https://lkml.kernel.org/r/8ba8bc9c15b3ea40cf73cf67a9bc061a264609f0.1686063934.git.bristot@kernel.org Cc: Juri Lelli <juri.lelli@redhat.com> Cc: William White <chwhite@redhat.com> Cc: Daniel Bristot de Oliveira <bristot@kernel.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Jonathan Corbet <corbet@lwn.net> Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2023-06-22	ftrace: Show all functions with addresses in available_filter_functions_addrs	Jiri Olsa
	Adding new available_filter_functions_addrs file that shows all available functions (same as available_filter_functions) together with addresses, like: # cat available_filter_functions_addrs \| head ffffffff81000770 __traceiter_initcall_level ffffffff810007c0 __traceiter_initcall_start ffffffff81000810 __traceiter_initcall_finish ffffffff81000860 trace_initcall_finish_cb ... Note displayed address is the patch-site address and can differ from /proc/kallsyms address. It's useful to have address avilable for traceable symbols, so we don't need to allways cross check kallsyms with available_filter_functions (or the other way around) and have all the data in single file. For backwards compatibility reasons we can't change the existing available_filter_functions file output, but we need to add new file. The problem is that we need to do 2 passes: - through available_filter_functions and find out if the function is traceable - through /proc/kallsyms to get the address for traceable function Having available_filter_functions symbols together with addresses allow us to skip the kallsyms step and we are ok with the address in available_filter_functions_addr not being the function entry, because kprobe_multi uses fprobe and that handles both entry and patch-site address properly. We have 2 interfaces how to create kprobe_multi link: a) passing symbols to kernel 1) user gathers symbols and need to ensure that they are trace-able -> pass through available_filter_functions file 2) kernel takes those symbols and translates them to addresses through kallsyms api 3) addresses are passed to fprobe/ftrace through: register_fprobe_ips -> ftrace_set_filter_ips b) passing addresses to kernel 1) user gathers symbols and needs to ensure that they are trace-able -> pass through available_filter_functions file 2) user takes those symbols and translates them to addresses through /proc/kallsyms 3) addresses are passed to the kernel and kernel calls: register_fprobe_ips -> ftrace_set_filter_ips The new available_filter_functions_addrs file helps us with option b), because we can make 'b 1' and 'b 2' in one step - while filtering traceable functions, we get the address directly. Link: https://lore.kernel.org/linux-trace-kernel/20230611130029.1202298-1-jolsa@kernel.org Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com> Tested-by: Jackie Liu <liuyun01@kylinos.cn> # x86 Suggested-by: Steven Rostedt (Google) <rostedt@goodmis.org> Suggested-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2023-06-22	Merge tag 'amlogic-drivers-for-v6.5' of ↵	Arnd Bergmann
	https://git.kernel.org/pub/scm/linux/kernel/git/amlogic/linux into soc/drivers Amlogic Drivers changes for v6.5: - tag some powers domains as always-on for secure-pwrc - fix MAINTAINERS entry for PHY drivers & bindings - Amlogic Meson GPIO interrupt controller binding to yaml conversion * tag 'amlogic-drivers-for-v6.5' of https://git.kernel.org/pub/scm/linux/kernel/git/amlogic/linux: dt-bindings: interrupt-controller: Convert Amlogic Meson GPIO interrupt controller binding MAINTAINERS: add PHY-related files to Amlogic SoC file list drivers: meson: secure-pwrc: always enable DMA domain Link: https://lore.kernel.org/r/a10ea420-7599-3f41-dfd8-1742ef436ca0@linaro.org Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2023-06-22	spi: Create a helper to derive adaptive timeouts	Miquel Raynal
	Big transfers might take a bit of time, too constraining timeouts might lead to false positives. In order to simplify the drivers work and with the goal of factorizing code in mind, let's add a helper that can be used by any spi controller driver to derive a relevant per-transfer timeout value. The logic is simple: we know how much time it would take to transfer a byte, we can easily derive the total theoretical amount of time involved for each transfer. We multiply it by two to have a bit of margin and enforce a minimum of 500ms. Suggested-by: Mark Brown <broonie@kernel.org> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Link: https://lore.kernel.org/r/Message-Id: <20230622090634.3411468-2-miquel.raynal@bootlin.com> Signed-off-by: Mark Brown <broonie@kernel.org>
2023-06-22	cdrom: Fix spectre-v1 gadget	Jordy Zomer
	This patch fixes a spectre-v1 gadget in cdrom. The gadget could be triggered by speculatively bypassing the cdi->capacity check. Signed-off-by: Jordy Zomer <jordyzomer@google.com> Link: https://lore.kernel.org/all/20230612110040.849318-2-jordyzomer@google.com Reviewed-by: Phillip Potter <phil@philpotter.co.uk> Link: https://lore.kernel.org/all/ZI1+1OG9Ut1MqsUC@equinox Signed-off-by: Phillip Potter <phil@philpotter.co.uk> Link: https://lore.kernel.org/r/20230617113828.1230-2-phil@philpotter.co.uk Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-06-22	block: make sure local irq is disabled when calling __blkcg_rstat_flush	Ming Lei
	When __blkcg_rstat_flush() is called from cgroup_rstat_flush*() code path, interrupt is always disabled. When we start to flush blkcg per-cpu stats list in __blkg_release() for avoiding to leak blkcg_gq's reference in commit 20cb1c2fb756 ("blk-cgroup: Flush stats before releasing blkcg_gq"), local irq isn't disabled yet, then lockdep warning may be triggered because the dependent cgroup locks may be acquired from irq(soft irq) handler. Fix the issue by disabling local irq always. Fixes: 20cb1c2fb756 ("blk-cgroup: Flush stats before releasing blkcg_gq") Reported-by: Shinichiro Kawasaki <shinichiro.kawasaki@wdc.com> Closes: https://lore.kernel.org/linux-block/pz2wzwnmn5tk3pwpskmjhli6g3qly7eoknilb26of376c7kwxy@qydzpvt6zpis/T/#u Cc: stable@vger.kernel.org Cc: Jay Shin <jaeshin@redhat.com> Cc: Tejun Heo <tj@kernel.org> Cc: Waiman Long <longman@redhat.com> Signed-off-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Waiman Long <longman@redhat.com> Link: https://lore.kernel.org/r/20230622084249.1208005-1-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-06-22	erofs: clean up zmap.c	Gao Xiang
	Several trivial cleanups which aren't quite necessary to split: - Rename lcluster load functions as well as justify full indexes since they are typically used for global deduplication for compressed data; - Avoid unnecessary lines, comments for simplicity. No logic changes. Reviewed-by: Guo Xuenan <guoxuenan@huaweicloud.com> Reviewed-by: Yue Hu <huyue2@coolpad.com> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com> Link: https://lore.kernel.org/r/20230615064421.103178-1-hsiangkao@linux.alibaba.com
2023-06-22	erofs: remove unnecessary goto	Yangtao Li
	It's redundant, let's remove it. Signed-off-by: Yangtao Li <frank.li@vivo.com> Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com> Reviewed-by: Jingbo Xu <jefflexu@linux.alibaba.com> Link: https://lore.kernel.org/r/20230615034539.14286-1-frank.li@vivo.com Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
2023-06-22	erofs: Fix detection of atomic context	Sandeep Dhavale
	Current check for atomic context is not sufficient as z_erofs_decompressqueue_endio can be called under rcu lock from blk_mq_flush_plug_list(). See the stacktrace [1] In such case we should hand off the decompression work for async processing rather than trying to do sync decompression in current context. Patch fixes the detection by checking for rcu_read_lock_any_held() and while at it use more appropriate !in_task() check than in_atomic(). Background: Historically erofs would always schedule a kworker for decompression which would incur the scheduling cost regardless of the context. But z_erofs_decompressqueue_endio() may not always be in atomic context and we could actually benefit from doing the decompression in z_erofs_decompressqueue_endio() if we are in thread context, for example when running with dm-verity. This optimization was later added in patch [2] which has shown improvement in performance benchmarks. ============================================== [1] Problem stacktrace [name:core&]BUG: sleeping function called from invalid context at kernel/locking/mutex.c:291 [name:core&]in_atomic(): 0, irqs_disabled(): 0, non_block: 0, pid: 1615, name: CpuMonitorServi [name:core&]preempt_count: 0, expected: 0 [name:core&]RCU nest depth: 1, expected: 0 CPU: 7 PID: 1615 Comm: CpuMonitorServi Tainted: G S W OE 6.1.25-android14-5-maybe-dirty-mainline #1 Hardware name: MT6897 (DT) Call trace: dump_backtrace+0x108/0x15c show_stack+0x20/0x30 dump_stack_lvl+0x6c/0x8c dump_stack+0x20/0x48 __might_resched+0x1fc/0x308 __might_sleep+0x50/0x88 mutex_lock+0x2c/0x110 z_erofs_decompress_queue+0x11c/0xc10 z_erofs_decompress_kickoff+0x110/0x1a4 z_erofs_decompressqueue_endio+0x154/0x180 bio_endio+0x1b0/0x1d8 __dm_io_complete+0x22c/0x280 clone_endio+0xe4/0x280 bio_endio+0x1b0/0x1d8 blk_update_request+0x138/0x3a4 blk_mq_plug_issue_direct+0xd4/0x19c blk_mq_flush_plug_list+0x2b0/0x354 __blk_flush_plug+0x110/0x160 blk_finish_plug+0x30/0x4c read_pages+0x2fc/0x370 page_cache_ra_unbounded+0xa4/0x23c page_cache_ra_order+0x290/0x320 do_sync_mmap_readahead+0x108/0x2c0 filemap_fault+0x19c/0x52c __do_fault+0xc4/0x114 handle_mm_fault+0x5b4/0x1168 do_page_fault+0x338/0x4b4 do_translation_fault+0x40/0x60 do_mem_abort+0x60/0xc8 el0_da+0x4c/0xe0 el0t_64_sync_handler+0xd4/0xfc el0t_64_sync+0x1a0/0x1a4 [2] Link: https://lore.kernel.org/all/20210317035448.13921-1-huangjianan@oppo.com/ Reported-by: Will Shiu <Will.Shiu@mediatek.com> Suggested-by: Gao Xiang <xiang@kernel.org> Signed-off-by: Sandeep Dhavale <dhavale@google.com> Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com> Reviewed-by: Alexandre Mergnat <amergnat@baylibre.com> Link: https://lore.kernel.org/r/20230621220848.3379029-1-dhavale@google.com Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
2023-06-22	s390/defconfigs: set CONFIG_NET_TC_SKB_EXT=y	Niklas Schnelle
	As made explicit by commit 03a283cdc8c8 ("net/mlx5: Kconfig: Make tc offload depend on tc skb extension") tc skb extension is required for offloading tc as well as bridges on switchdev capable ConnectX devices. Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2023-06-22	Merge tag 'nf-23-06-21' of ↵	Paolo Abeni
	git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf Pablo Neira Ayuso says: ==================== Netfilter/IPVS fixes for net This is v3, including a crash fix for patch 01/14. The following patchset contains Netfilter/IPVS fixes for net: 1) Fix UDP segmentation with IPVS tunneled traffic, from Terin Stock. 2) Fix chain binding transaction logic, add a bound flag to rule transactions. Remove incorrect logic in nft_data_hold() and nft_data_release(). 3) Add a NFT_TRANS_PREPARE_ERROR deactivate state to deal with releasing the set/chain as a follow up to 1240eb93f061 ("netfilter: nf_tables: incorrect error path handling with NFT_MSG_NEWRULE") 4) Drop map element references from preparation phase instead of set destroy path, otherwise bogus EBUSY with transactions such as: flush chain ip x y delete chain ip x w where chain ip x y contains jump/goto from set elements. 5) Pipapo set type does not regard generation mask from the walk iteration. 6) Fix reference count underflow in set element reference to stateful object. 7) Several patches to tighten the nf_tables API: - disallow set element updates of bound anonymous set - disallow unbound anonymous set/chain at the end of transaction. - disallow updates of anonymous set. - disallow timeout configuration for anonymous sets. 8) Fix module reference leak in chain updates. 9) Fix nfnetlink_osf module autoload. 10) Fix deletion of basechain when NFTA_CHAIN_HOOK is specified as in iptables-nft. This Netfilter batch is larger than usual at this stage, I am aware we are fairly late in the -rc cycle, if you prefer to route them through net-next, please let me know. netfilter pull request 23-06-21 * tag 'nf-23-06-21' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf: netfilter: nf_tables: Fix for deleting base chains with payload netfilter: nfnetlink_osf: fix module autoload netfilter: nf_tables: drop module reference after updating chain netfilter: nf_tables: disallow timeout for anonymous sets netfilter: nf_tables: disallow updates of anonymous sets netfilter: nf_tables: reject unbound chain set before commit phase netfilter: nf_tables: reject unbound anonymous set before commit phase netfilter: nf_tables: disallow element updates of bound anonymous sets netfilter: nf_tables: fix underflow in object reference counter netfilter: nft_set_pipapo: .walk does not deal with generations netfilter: nf_tables: drop map element references from preparation phase netfilter: nf_tables: add NFT_TRANS_PREPARE_ERROR to deal with bound set/chain netfilter: nf_tables: fix chain binding transaction logic ipvs: align inner_mac_header for encapsulation ==================== Link: https://lore.kernel.org/r/20230621100731.68068-1-pablo@netfilter.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-06-22	s390/cpum_cf: rework PER_CPU_DEFINE of struct cpu_cf_events	Thomas Richter
	Struct cpu_cf_events is a large data structure and is statically defined for each possible CPU. Rework this and replace it by dynamically allocated data structures created when a perf_event_open() system call is invoked or an access via character device /dev/hwctr takes place. It is replaced by an array of pointers to all possible CPUs and reference counting. The array of pointers is allocated when the first event is created. For each online CPU an event is installed on, a struct cpu_cf_events is allocated and a pointer to struct cpu_cf_events is stored in the array: CPU 0 1 2 3 ... N +---+---+---+---+---+---+ cpu_cf_root::cpucf--> \| * \| \| \| \|...\| \| +-\|-+---+---+---+---+---+ \| \| \\|/ +-------------+ \|cpu_cf_events\| \| \| +-------------+ With this approach the large data structure is only allocated when an event is actually installed and used. Also implement proper reference counting for allocation and removal. During interrupt processing make sure the pointer to cpu_cf_events is valid. The interrupt handler is shared and might be called when no event is active. This requires checking for a valid pointer to struct cpu_cf_events. When the pointer to the per-cpu cpu_cf_events is NULL, simply return. Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Acked-by: Sumanth Korikkar <sumanthk@linux.ibm.com> Acked-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2023-06-22	linux/export.h: rename 'sec' argument to 'license'	Masahiro Yamada
	Now, EXPORT_SYMBOL() is populated in two stages. In the first stage, all of EXPORT_SYMBOL/EXPORT_SYMBOL_GPL go into the same section, '.export_symbol'. 'sec' does not make sense any more. Rename it to 'license'. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
2023-06-22	modpost: show offset from symbol for section mismatch warnings	Masahiro Yamada
	Currently, modpost only shows the symbol names and section names, so it repeats the same message if there are multiple relocations in the same symbol. It is common the relocation spans across multiple instructions. It is better to show the offset from the symbol. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
2023-06-22	modpost: merge two similar section mismatch warnings	Masahiro Yamada
	In case of section mismatch, modpost shows slightly different messages. For extable section mismatch: "%s(%s+0x%lx): Section mismatch in reference to the %s:%s\n" For the other cases: "%s: section mismatch in reference: %s (section: %s) -> %s (section: %s)\n" They are similar. Merge them. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
2023-06-22	kbuild: implement CONFIG_TRIM_UNUSED_KSYMS without recursion	Masahiro Yamada
	When CONFIG_TRIM_UNUSED_KSYMS is enabled, Kbuild recursively traverses the directory tree to determine which EXPORT_SYMBOL to trim. If an EXPORT_SYMBOL turns out to be unused by anyone, Kbuild begins the second traverse, where some source files are recompiled with their EXPORT_SYMBOL() tuned into a no-op. Linus stated negative opinions about this slowness in commits: - 5cf0fd591f2e ("Kbuild: disable TRIM_UNUSED_KSYMS option") - a555bdd0c58c ("Kbuild: enable TRIM_UNUSED_KSYMS again, with some guarding") We can do this better now. The final data structures of EXPORT_SYMBOL are generated by the modpost stage, so modpost can selectively emit KSYMTAB entries that are really used by modules. Commit f73edc8951b2 ("kbuild: unify two modpost invocations") is another ground-work to do this in a one-pass algorithm. With the list of modules, modpost sets sym->used if it is used by a module. modpost emits KSYMTAB only for symbols with sym->used==true. BTW, Nicolas explained why the trimming was implemented with recursion: https://lore.kernel.org/all/2o2rpn97-79nq-p7s2-nq5-8p83391473r@syhkavp.arg/ Actually, we never achieved that level of optimization where the chain reaction of trimming comes into play because: - CONFIG_LTO_CLANG cannot remove any unused symbols - CONFIG_LD_DEAD_CODE_DATA_ELIMINATION is enabled only for vmlinux, but not modules If deeper trimming is required, we need to revisit this, but I guess that is unlikely to happen. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2023-06-22	modpost: use null string instead of NULL pointer for default namespace	Masahiro Yamada
	The default namespace is the null string, "". When set, the null string "" is converted to NULL: s->namespace = namespace[0] ? NOFAIL(strdup(namespace)) : NULL; When printed, the NULL pointer is get back to the null string: sym->namespace ?: "" This saves 1 byte memory allocated for "", but loses the readability. In kernel-space, we strive to save memory, but modpost is a userspace tool used to build the kernel. On modern systems, such small piece of memory is not a big deal. Handle the namespace string as is. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
2023-06-22	modpost: squash sym_update_namespace() into sym_add_exported()	Masahiro Yamada
	Pass a set of the name, license, and namespace to sym_add_exported(). sym_update_namespace() is unneeded. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
2023-06-22	modpost: check static EXPORT_SYMBOL* by modpost again	Masahiro Yamada
	Commit 31cb50b5590f ("kbuild: check static EXPORT_SYMBOL* by script instead of modpost") moved the static EXPORT_SYMBOL* check from the mostpost to a shell script because I thought it must be checked per compilation unit to avoid false negatives. I came up with an idea to do this in modpost, against combined ELF files. The relocation entries in ELF will find the correct exported symbol even if there exist symbols with the same name in different compilation units. Again, the same sample code. Makefile: obj-y += foo1.o foo2.o foo1.c: #include <linux/export.h> static void foo(void) {} EXPORT_SYMBOL(foo); foo2.c: void foo(void) {} Then, modpost can catch it correctly. MODPOST Module.symvers ERROR: modpost: vmlinux: local symbol 'foo' was exported Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
2023-06-22	ia64,export.h: replace EXPORT_DATA_SYMBOL* with EXPORT_SYMBOL*	Masahiro Yamada
	With the previous refactoring, you can always use EXPORT_SYMBOL. Replace two instances in ia64, then remove EXPORT_DATA_SYMBOL. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
2023-06-22	kbuild: generate KSYMTAB entries by modpost	Masahiro Yamada
	Commit 7b4537199a4a ("kbuild: link symbol CRCs at final link, removing CONFIG_MODULE_REL_CRCS") made modpost output CRCs in the same way whether the EXPORT_SYMBOL() is placed in .c or .S. For further cleanups, this commit applies a similar approach to the entire data structure of EXPORT_SYMBOL(). The EXPORT_SYMBOL() compilation is split into two stages. When a source file is compiled, EXPORT_SYMBOL() will be converted into a dummy symbol in the .export_symbol section. For example, EXPORT_SYMBOL(foo); EXPORT_SYMBOL_NS_GPL(bar, BAR_NAMESPACE); will be encoded into the following assembly code: .section ".export_symbol","a" __export_symbol_foo: .asciz "" /* license / .asciz "" / name space / .balign 8 .quad foo / symbol reference / .previous .section ".export_symbol","a" __export_symbol_bar: .asciz "GPL" / license / .asciz "BAR_NAMESPACE" / name space / .balign 8 .quad bar / symbol reference / .previous They are mere markers to tell modpost the name, license, and namespace of the symbols. They will be dropped from the final vmlinux and modules because the (.export_symbol) will go into /DISCARD/ in the linker script. Then, modpost extracts all the information about EXPORT_SYMBOL() from the .export_symbol section, and generates the final C code: KSYMTAB_FUNC(foo, "", ""); KSYMTAB_FUNC(bar, "_gpl", "BAR_NAMESPACE"); KSYMTAB_FUNC() (or KSYMTAB_DATA() if it is data) is expanded to struct kernel_symbol that will be linked to the vmlinux or a module. With this change, EXPORT_SYMBOL() works in the same way for .c and .S files, providing the following benefits. [1] Deprecate EXPORT_DATA_SYMBOL() In the old days, EXPORT_SYMBOL() was only available in C files. To export a symbol in .S, EXPORT_SYMBOL() was placed in a separate .c file. arch/arm/kernel/armksyms.c is one example written in the classic manner. Commit 22823ab419d8 ("EXPORT_SYMBOL() for asm") removed this limitation. Since then, EXPORT_SYMBOL() can be placed close to the symbol definition in .S files. It was a nice improvement. However, as that commit mentioned, you need to use EXPORT_DATA_SYMBOL() for data objects on some architectures. In the new approach, modpost checks symbol's type (STT_FUNC or not), and outputs KSYMTAB_FUNC() or KSYMTAB_DATA() accordingly. There are only two users of EXPORT_DATA_SYMBOL: EXPORT_DATA_SYMBOL_GPL(empty_zero_page) (arch/ia64/kernel/head.S) EXPORT_DATA_SYMBOL(ia64_ivt) (arch/ia64/kernel/ivt.S) They are transformed as follows and output into .vmlinux.export.c KSYMTAB_DATA(empty_zero_page, "_gpl", ""); KSYMTAB_DATA(ia64_ivt, "", ""); The other EXPORT_SYMBOL users in ia64 assembly are output as KSYMTAB_FUNC(). EXPORT_DATA_SYMBOL() is now deprecated. [2] merge <linux/export.h> and <asm-generic/export.h> There are two similar header implementations: include/linux/export.h for .c files include/asm-generic/export.h for .S files Ideally, the functionality should be consistent between them, but they tend to diverge. Commit 8651ec01daed ("module: add support for symbol namespaces.") did not support the namespace for .S files. This commit shifts the essential implementation part to C, which supports EXPORT_SYMBOL_NS() for *.S files. <asm/export.h> and <asm-generic/export.h> will remain as a wrapper of <linux/export.h> for a while. They will be removed after #include <asm/export.h> directives are all replaced with #include <linux/export.h>. [3] Implement CONFIG_TRIM_UNUSED_KSYMS in one-pass algorithm (by a later commit) When CONFIG_TRIM_UNUSED_KSYMS is enabled, Kbuild recursively traverses the directory tree to determine which EXPORT_SYMBOL to trim. If an EXPORT_SYMBOL turns out to be unused by anyone, Kbuild begins the second traverse, where some source files are recompiled with their EXPORT_SYMBOL() tuned into a no-op. We can do this better now; modpost can selectively emit KSYMTAB entries that are really used by modules. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
2023-06-22	selftests/bpf: Fix compilation failure for prog vrf_socket_lookup	Yonghong Song
	When building the latest kernel/selftest with clang17 compiler: make LLVM=1 -j <== for kernel make -C tools/testing/selftests/bpf LLVM=1 -j <== for selftest I hit the following compilation error: [...] In file included from progs/vrf_socket_lookup.c:3: In file included from /usr/include/linux/ip.h:21: In file included from /usr/include/asm/byteorder.h:5: In file included from /usr/include/linux/byteorder/little_endian.h:13: /usr/include/linux/swab.h:136:8: error: unknown type name '__always_inline' 136 \| static __always_inline unsigned long __swab(const unsigned long y) \| ^ /usr/include/linux/swab.h:171:8: error: unknown type name '__always_inline' 171 \| static __always_inline __u16 __swab16p(const __u16 p) \| ^ /usr/include/linux/swab.h:171:29: error: expected ';' after top level declarator 171 \| static __always_inline __u16 __swab16p(const __u16 p) \| ^ [...] Basically, with header files in my local host which is based on 5.12 kernel, __always_inline is not defined and this caused compilation failure. Since __always_inline is defined in bpf_helpers.h, let us move bpf_helpers.h to an early position which fixed the problem. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20230622061921.816772-1-yhs@fb.com
2023-06-22	revert "net: align SO_RCVMARK required privileges with SO_MARK"	Maciej Żenczykowski
	This reverts commit 1f86123b9749 ("net: align SO_RCVMARK required privileges with SO_MARK") because the reasoning in the commit message is not really correct: SO_RCVMARK is used for 'reading' incoming skb mark (via cmsg), as such it is more equivalent to 'getsockopt(SO_MARK)' which has no priv check and retrieves the socket mark, rather than 'setsockopt(SO_MARK) which sets the socket mark and does require privs. Additionally incoming skb->mark may already be visible if sysctl_fwmark_reflect and/or sysctl_tcp_fwmark_accept are enabled. Furthermore, it is easier to block the getsockopt via bpf (either cgroup setsockopt hook, or via syscall filters) then to unblock it if it requires CAP_NET_RAW/ADMIN. On Android the socket mark is (among other things) used to store the network identifier a socket is bound to. Setting it is privileged, but retrieving it is not. We'd like unprivileged userspace to be able to read the network id of incoming packets (where mark is set via iptables [to be moved to bpf])... An alternative would be to add another sysctl to control whether setting SO_RCVMARK is privilged or not. (or even a MASK of which bits in the mark can be exposed) But this seems like over-engineering... Note: This is a non-trivial revert, due to later merged commit e42c7beee71d ("bpf: net: Consider has_current_bpf_ctx() when testing capable() in sk_setsockopt()") which changed both 'ns_capable' into 'sockopt_ns_capable' calls. Fixes: 1f86123b9749 ("net: align SO_RCVMARK required privileges with SO_MARK") Cc: Larysa Zaremba <larysa.zaremba@intel.com> Cc: Simon Horman <simon.horman@corigine.com> Cc: Paolo Abeni <pabeni@redhat.com> Cc: Eyal Birger <eyal.birger@gmail.com> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Eric Dumazet <edumazet@google.com> Cc: Patrick Rohr <prohr@google.com> Signed-off-by: Maciej Żenczykowski <maze@google.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://lore.kernel.org/r/20230618103130.51628-1-maze@google.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-06-22	net: wwan: iosm: Convert single instance struct member to flexible array	Kees Cook
	struct mux_adth actually ends with multiple struct mux_adth_dg members. This is seen both in the comments about the member: /** * struct mux_adth - Structure of the Aggregated Datagram Table Header. ... * @dg: datagramm table with variable length / and in the preparation for populating it: adth_dg_size = offsetof(struct mux_adth, dg) + ul_adb->dg_count[i] sizeof(*dg); ... adth_dg_size -= offsetof(struct mux_adth, dg); memcpy(&adth->dg, ul_adb->dg[i], adth_dg_size); This was reported as a run-time false positive warning: memcpy: detected field-spanning write (size 16) of single field "&adth->dg" at drivers/net/wwan/iosm/iosm_ipc_mux_codec.c:852 (size 8) Adjust the struct mux_adth definition and associated sizeof() math; no binary output differences are observed in the resulting object file. Reported-by: Florian Klink <flokli@flokli.de> Closes: https://lore.kernel.org/lkml/dbfa25f5-64c8-5574-4f5d-0151ba95d232@gmail.com/ Fixes: 1f52d7b62285 ("net: wwan: iosm: Enable M.2 7360 WWAN card support") Cc: M Chetan Kumar <m.chetan.kumar@intel.com> Cc: Bagas Sanjaya <bagasdotme@gmail.com> Cc: Intel Corporation <linuxwwan@intel.com> Cc: Loic Poulain <loic.poulain@linaro.org> Cc: Sergey Ryazanov <ryazanov.s.a@gmail.com> Cc: Johannes Berg <johannes@sipsolutions.net> Cc: "David S. Miller" <davem@davemloft.net> Cc: Eric Dumazet <edumazet@google.com> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Paolo Abeni <pabeni@redhat.com> Cc: "Gustavo A. R. Silva" <gustavoars@kernel.org> Cc: netdev@vger.kernel.org Signed-off-by: Kees Cook <keescook@chromium.org> Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org> Reviewed-by: Simon Horman <simon.horman@corigine.com> Link: https://lore.kernel.org/r/20230620194234.never.023-kees@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-06-22	dt-bindings: mmc: fsl-imx-esdhc: Add imx6ul support	Oleksij Rempel
	Add the 'fsl,imx6ul-usdhc' value to the compatible properties list in the fsl-imx-esdhc.yaml file. This is required to match the compatible strings present in the 'mmc@2190000' node of 'imx6ul-prti6g.dtb'. This commit addresses the following dtbs_check warning: imx6ul-prti6g.dtb:0:0: /soc/bus@2100000/mmc@2190000: failed to match any schema with compatible: ['fsl,imx6ul-usdhc', 'fsl,imx6sx-usdhc'] Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Acked-by: Conor Dooley <conor.dooley@microchip.com> Link: https://lore.kernel.org/r/20230621093245.78130-2-o.rempel@pengutronix.de Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2023-06-22	mmc: mmci: Add support for SW busy-end timeouts	Ulf Hansson
	The ux500 variant doesn't have a HW based timeout to use for busy-end IRQs. To avoid hanging and waiting for the card to stop signaling busy, let's schedule a delayed work, according to the corresponding cmd->busy_timeout for the command. If the work gets to run, let's kick the IRQ handler to complete the currently running request/command. Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Tested-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Link: https://lore.kernel.org/r/20230620091113.33393-1-ulf.hansson@linaro.org
2023-06-22	sch_netem: acquire qdisc lock in netem_change()	Eric Dumazet
	syzbot managed to trigger a divide error [1] in netem. It could happen if q->rate changes while netem_enqueue() is running, since q->rate is read twice. It turns out netem_change() always lacked proper synchronization. [1] divide error: 0000 [#1] SMP KASAN CPU: 1 PID: 7867 Comm: syz-executor.1 Not tainted 6.1.30-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 05/25/2023 RIP: 0010:div64_u64 include/linux/math64.h:69 [inline] RIP: 0010:packet_time_ns net/sched/sch_netem.c:357 [inline] RIP: 0010:netem_enqueue+0x2067/0x36d0 net/sched/sch_netem.c:576 Code: 89 e2 48 69 da 00 ca 9a 3b 42 80 3c 28 00 4c 8b a4 24 88 00 00 00 74 0d 4c 89 e7 e8 c3 4f 3b fd 48 8b 4c 24 18 48 89 d8 31 d2 <49> f7 34 24 49 01 c7 4c 8b 64 24 48 4d 01 f7 4c 89 e3 48 c1 eb 03 RSP: 0018:ffffc9000dccea60 EFLAGS: 00010246 RAX: 000001a442624200 RBX: 000001a442624200 RCX: ffff888108a4f000 RDX: 0000000000000000 RSI: 000000000000070d RDI: 000000000000070d RBP: ffffc9000dcceb90 R08: ffffffff849c5e26 R09: fffffbfff10e1297 R10: 0000000000000000 R11: dffffc0000000001 R12: ffff888108a4f358 R13: dffffc0000000000 R14: 0000001a8cd9a7ec R15: 0000000000000000 FS: 00007fa73fe18700(0000) GS:ffff8881f6b00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fa73fdf7718 CR3: 000000011d36e000 CR4: 0000000000350ee0 Call Trace: <TASK> [<ffffffff84714385>] __dev_xmit_skb net/core/dev.c:3931 [inline] [<ffffffff84714385>] __dev_queue_xmit+0xcf5/0x3370 net/core/dev.c:4290 [<ffffffff84d22df2>] dev_queue_xmit include/linux/netdevice.h:3030 [inline] [<ffffffff84d22df2>] neigh_hh_output include/net/neighbour.h:531 [inline] [<ffffffff84d22df2>] neigh_output include/net/neighbour.h:545 [inline] [<ffffffff84d22df2>] ip_finish_output2+0xb92/0x10d0 net/ipv4/ip_output.c:235 [<ffffffff84d21e63>] __ip_finish_output+0xc3/0x2b0 [<ffffffff84d10a81>] ip_finish_output+0x31/0x2a0 net/ipv4/ip_output.c:323 [<ffffffff84d10f14>] NF_HOOK_COND include/linux/netfilter.h:298 [inline] [<ffffffff84d10f14>] ip_output+0x224/0x2a0 net/ipv4/ip_output.c:437 [<ffffffff84d123b5>] dst_output include/net/dst.h:444 [inline] [<ffffffff84d123b5>] ip_local_out net/ipv4/ip_output.c:127 [inline] [<ffffffff84d123b5>] __ip_queue_xmit+0x1425/0x2000 net/ipv4/ip_output.c:542 [<ffffffff84d12fdc>] ip_queue_xmit+0x4c/0x70 net/ipv4/ip_output.c:556 Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Stephen Hemminger <stephen@networkplumber.org> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: Cong Wang <xiyou.wangcong@gmail.com> Cc: Jiri Pirko <jiri@resnulli.us> Reviewed-by: Jamal Hadi Salim <jhs@mojatatu.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Link: https://lore.kernel.org/r/20230620184425.1179809-1-edumazet@google.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-06-22	can: isotp: isotp_sendmsg(): fix return error fix on TX path	Oliver Hartkopp
	With commit d674a8f123b4 ("can: isotp: isotp_sendmsg(): fix return error on FC timeout on TX path") the missing correct return value in the case of a protocol error was introduced. But the way the error value has been read and sent to the user space does not follow the common scheme to clear the error after reading which is provided by the sock_error() function. This leads to an error report at the following write() attempt although everything should be working. Fixes: d674a8f123b4 ("can: isotp: isotp_sendmsg(): fix return error on FC timeout on TX path") Reported-by: Carsten Schmidt <carsten.schmidt-achim@t-online.de> Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net> Link: https://lore.kernel.org/all/20230607072708.38809-1-socketcan@hartkopp.net Cc: stable@vger.kernel.org Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2023-06-22	ARM: mvebu: fix unit address on armada-390-db flash	Arnd Bergmann
	The unit address needs to be changed to match the reg property: arch/arm/boot/dts/marvell/armada-390-db.dts:84.10-106.4: Warning (spi_bus_reg): /soc/spi@10680/flash@1: SPI bus unit address format error, expected "0" Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2023-06-22	hrtimer: Add missing sparse annotations to hrtimer locking	Ben Dooks
	Sparse warns about lock imbalance vs. the hrtimer_base lock due to missing sparse annotations: kernel/time/hrtimer.c:175:33: warning: context imbalance in 'lock_hrtimer_base' - wrong count at exit kernel/time/hrtimer.c:1301:28: warning: context imbalance in 'hrtimer_start_range_ns' - unexpected unlock kernel/time/hrtimer.c:1336:28: warning: context imbalance in 'hrtimer_try_to_cancel' - unexpected unlock kernel/time/hrtimer.c:1457:9: warning: context imbalance in '__hrtimer_get_remaining' - unexpected unlock Add the annotations to the relevant functions. Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20230621075928.394481-1-ben.dooks@codethink.co.uk
2023-06-22	platform/x86/amd/pmf: Register notify handler only if SPS is enabled	Shyam Sundar S K
	Power source notify handler is getting registered even when none of the PMF feature in enabled leading to a crash. ... [ 22.592162] Call Trace: [ 22.592164] <TASK> [ 22.592164] ? rcu_note_context_switch+0x5e0/0x660 [ 22.592166] ? __warn+0x81/0x130 [ 22.592171] ? rcu_note_context_switch+0x5e0/0x660 [ 22.592172] ? report_bug+0x171/0x1a0 [ 22.592175] ? prb_read_valid+0x1b/0x30 [ 22.592177] ? handle_bug+0x3c/0x80 [ 22.592178] ? exc_invalid_op+0x17/0x70 [ 22.592179] ? asm_exc_invalid_op+0x1a/0x20 [ 22.592182] ? rcu_note_context_switch+0x5e0/0x660 [ 22.592183] ? acpi_ut_delete_object_desc+0x86/0xb0 [ 22.592186] ? acpi_ut_update_ref_count.part.0+0x22d/0x930 [ 22.592187] __schedule+0xc0/0x1410 [ 22.592189] ? ktime_get+0x3c/0xa0 [ 22.592191] ? lapic_next_event+0x1d/0x30 [ 22.592193] ? hrtimer_start_range_ns+0x25b/0x350 [ 22.592196] schedule+0x5e/0xd0 [ 22.592197] schedule_hrtimeout_range_clock+0xbe/0x140 [ 22.592199] ? __pfx_hrtimer_wakeup+0x10/0x10 [ 22.592200] usleep_range_state+0x64/0x90 [ 22.592203] amd_pmf_send_cmd+0x106/0x2a0 [amd_pmf bddfe0fe3712aaa99acce3d5487405c5213c6616] [ 22.592207] amd_pmf_update_slider+0x56/0x1b0 [amd_pmf bddfe0fe3712aaa99acce3d5487405c5213c6616] [ 22.592210] amd_pmf_set_sps_power_limits+0x72/0x80 [amd_pmf bddfe0fe3712aaa99acce3d5487405c5213c6616] [ 22.592213] amd_pmf_pwr_src_notify_call+0x49/0x90 [amd_pmf bddfe0fe3712aaa99acce3d5487405c5213c6616] [ 22.592216] notifier_call_chain+0x5a/0xd0 [ 22.592218] atomic_notifier_call_chain+0x32/0x50 ... Fix this by moving the registration of source change notify handler only when SPS(Static Slider) is advertised as supported. Reported-by: Allen Zhong <allen@atr.me> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217571 Fixes: 4c71ae414474 ("platform/x86/amd/pmf: Add support SPS PMF feature") Tested-by: Patil Rajesh Reddy <Patil.Reddy@amd.com> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com> Link: https://lore.kernel.org/r/20230622060309.310001-1-Shyam-sundar.S-k@amd.com Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2023-06-22	selftests: forwarding: Fix race condition in mirror installation	Danielle Ratson
	When mirroring to a gretap in hardware the device expects to be programmed with the egress port and all the encapsulating headers. This requires the driver to resolve the path the packet will take in the software data path and program the device accordingly. If the path cannot be resolved (in this case because of an unresolved neighbor), then mirror installation fails until the path is resolved. This results in a race that causes the test to sometimes fail. Fix this by setting the neighbor's state to permanent in a couple of tests, so that it is always valid. Fixes: 35c31d5c323f ("selftests: forwarding: Test mirror-to-gretap w/ UL 802.1d") Fixes: 239e754af854 ("selftests: forwarding: Test mirror-to-gretap w/ UL 802.1q") Signed-off-by: Danielle Ratson <danieller@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Link: https://lore.kernel.org/r/268816ac729cb6028c7a34d4dda6f4ec7af55333.1687264607.git.petrm@nvidia.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-06-22	remoteproc: stm32: use correct format strings on 64-bit	Arnd Bergmann
	With CONFIG_ARCH_STM32 making it into arch/arm64, a couple of format strings no longer work, since they rely on size_t being compatible with %x, or they print an 'int' using %z: drivers/remoteproc/stm32_rproc.c: In function 'stm32_rproc_mem_alloc': drivers/remoteproc/stm32_rproc.c:122:22: error: format '%x' expects argument of type 'unsigned int', but argument 5 has type 'size_t' {aka 'long unsigned int'} [-Werror=format=] drivers/remoteproc/stm32_rproc.c:122:40: note: format string is defined here 122 \| dev_dbg(dev, "map memory: %pa+%x\n", &mem->dma, mem->len); \| ~^ \| \| \| unsigned int \| %lx drivers/remoteproc/stm32_rproc.c:125:30: error: format '%x' expects argument of type 'unsigned int', but argument 4 has type 'size_t' {aka 'long unsigned int'} [-Werror=format=] drivers/remoteproc/stm32_rproc.c:125:65: note: format string is defined here 125 \| dev_err(dev, "Unable to map memory region: %pa+%x\n", \| ~^ \| \| \| unsigned int \| %lx drivers/remoteproc/stm32_rproc.c: In function 'stm32_rproc_get_loaded_rsc_table': drivers/remoteproc/stm32_rproc.c:646:30: error: format '%zx' expects argument of type 'size_t', but argument 4 has type 'int' [-Werror=format=] drivers/remoteproc/stm32_rproc.c:646:66: note: format string is defined here 646 \| dev_err(dev, "Unable to map memory region: %pa+%zx\n", \| ~~^ \| \| \| long unsigned int \| %x Fix up all three instances to work across architectures, and enable compile testing for this driver to ensure it builds everywhere. Reviewed-by: Arnaud Pouliquen <arnaud.pouliquen@foss.st.com> Acked-by: Randy Dunlap <rdunlap@infradead.org> Tested-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2023-06-22	Merge patch series "can: kvaser_pciefd: Fixes and improvements"	Marc Kleine-Budde
	Jimmy Assarsson <extja@kvaser.com> says: This patch series contains various non critical fixes and improvements for the kvaser_pciefd driver. Link: https://lore.kernel.org/all/20230529134248.752036-1-extja@kvaser.com Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2023-06-22	can: kvaser_pciefd: Use TX FIFO size read from CAN controller	Jimmy Assarsson
	Use the TX FIFO size read from CAN controller register, instead of using hard coded value. Signed-off-by: Jimmy Assarsson <extja@kvaser.com> Reviewed-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr> Link: https://lore.kernel.org/all/20230529134248.752036-15-extja@kvaser.com Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2023-06-22	can: kvaser_pciefd: Refactor code	Jimmy Assarsson
	Refactor code; - Format code - Rename variables and macros - Remove intermediate variables - Add/remove blank lines - Reduce scope of variables - Add helper functions Signed-off-by: Jimmy Assarsson <extja@kvaser.com> Reviewed-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr> Link: https://lore.kernel.org/all/20230529134248.752036-14-extja@kvaser.com Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2023-06-22	can: kvaser_pciefd: Add len8_dlc support	Jimmy Assarsson
	Add support for the Classical CAN raw DLC functionality to send and receive DLC values from 9 .. 15. Signed-off-by: Jimmy Assarsson <extja@kvaser.com> Reviewed-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr> Link: https://lore.kernel.org/all/20230529134248.752036-13-extja@kvaser.com Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>