summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2023-02-17mm/migrate: fix wrongly apply write bit after mkdirty on sparc64Peter Xu
Nick Bowler reported another sparc64 breakage after the young/dirty persistent work for page migration (per "Link:" below). That's after a similar report [2]. It turns out page migration was overlooked, and it wasn't failing before because page migration was not enabled in the initial report test environment. David proposed another way [2] to fix this from sparc64 side, but that patch didn't land somehow. Neither did I check whether there's any other arch that has similar issues. Let's fix it for now as simple as moving the write bit handling to be after dirty, like what we did before. Note: this is based on mm-unstable, because the breakage was since 6.1 and we're at a very late stage of 6.2 (-rc8), so I assume for this specific case we should target this at 6.3. [1] https://lore.kernel.org/all/20221021160603.GA23307@u164.east.ru/ [2] https://lore.kernel.org/all/20221212130213.136267-1-david@redhat.com/ Link: https://lkml.kernel.org/r/20230216153059.256739-1-peterx@redhat.com Fixes: 2e3468778dbe ("mm: remember young/dirty bit for page migrations") Link: https://lore.kernel.org/all/CADyTPExpEqaJiMGoV+Z6xVgL50ZoMJg49B10LcZ=8eg19u34BA@mail.gmail.com/ Signed-off-by: Peter Xu <peterx@redhat.com> Reported-by: Nick Bowler <nbowler@draconx.ca> Acked-by: David Hildenbrand <david@redhat.com> Tested-by: Nick Bowler <nbowler@draconx.ca> Cc: <regressions@lists.linux.dev> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-02-17Merge tag 'powerpc-6.2-6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fix from Michael Ellerman: - Prevent fallthrough to hash TLB flush when using radix Thanks to Benjamin Gray and Erhard Furtner. * tag 'powerpc-6.2-6' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/64s: Prevent fallthrough to hash TLB flush when using radix
2023-02-17Merge tag 'nfs-for-6.2-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfsLinus Torvalds
Pull NFS client fix from Trond Myklebust: "Unfortunately, we found another bug in the NFSv4.2 READ_PLUS code. Since it has not been possible to fix the bug in time for the 6.2 release, let's just revert the Kconfig change that enables it: - Revert 'NFSv4.2: Change the default KConfig value for READ_PLUS'" * tag 'nfs-for-6.2-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: Revert "NFSv4.2: Change the default KConfig value for READ_PLUS"
2023-02-17Merge tag 'sound-fix-6.2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound Pull sound fixes from Takashi Iwai: "A few last-minute fixes. The significant ones are two ASoC SOF regression fixes while the rest are trivial HD-audio quirks. All are small / one-liners and should be pretty safe to take" * tag 'sound-fix-6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: ASoC: SOF: Intel: hda-dai: fix possible stream_tag leak ALSA: hda/realtek: Enable mute/micmute LEDs and speaker support for HP Laptops ALSA: hda/realtek: fix mute/micmute LEDs don't work for a HP platform. ALSA: hda/realtek - fixed wrong gpio assigned ALSA: hda: Fix codec device field initializan ALSA: hda/conexant: add a new hda codec SN6180 ASoC: SOF: ops: refine parameters order in function snd_sof_dsp_update8
2023-02-17Merge remote-tracking branch 'spi/for-6.3' into spi-nextMark Brown
2023-02-17Merge tag 'gpio-fixes-for-v6.2-part2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux Pull gpio fix from Bartosz Golaszewski: - fix a memory leak in gpio-sim that was triggered every time libgpiod tests are run in user-space * tag 'gpio-fixes-for-v6.2-part2' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux: gpio: sim: fix a memory leak
2023-02-17Merge tag 'ata-6.2-rc8' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata Pull ata fixes from Damien Le Moal: "Three small fixes for 6.2 final: - Disable READ LOG DMA EXT for Samsung MZ7LH drives as these drives choke on that command, from Patrick. - Add Intel Tiger Lake UP{3,4} to the list of supported AHCI controllers (this is not technically a bug fix, but it is trivial enough that I add it here), from Simon. - Fix code comments in the pata_octeon_cf driver as incorrect formatting was causing warnings from kernel-doc, from Randy" * tag 'ata-6.2-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata: ata: pata_octeon_cf: drop kernel-doc notation ata: ahci: Add Tiger Lake UP{3,4} AHCI controller ata: libata-core: Disable READ LOG DMA EXT for Samsung MZ7LH
2023-02-17Merge tag 'mmc-v6.2-rc5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc Pull MMC fixes from Ulf Hansson: "MMC core: - Fix potential resource leaks in SDIO card detection error path MMC host: - jz4740: Decrease maximum clock rate to workaround bug on JZ4760(B) - meson-gx: Fix SDIO support to get some WiFi modules to work again - mmc_spi: Fix error handling in ->probe()" * tag 'mmc-v6.2-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc: mmc: jz4740: Work around bug on JZ4760(B) mmc: mmc_spi: fix error handling in mmc_spi_probe() mmc: sdio: fix possible resource leaks in some error paths mmc: meson-gx: fix SDIO mode if cap_sdio_irq isn't set
2023-02-17Merge tag 'sched-urgent-2023-02-17' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler fixes from Ingo Molnar: - Fix user-after-free bug in call_usermodehelper_exec() - Fix missing user_cpus_ptr update in __set_cpus_allowed_ptr_locked() - Fix PSI use-after-free bug in ep_remove_wait_queue() * tag 'sched-urgent-2023-02-17' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched/psi: Fix use-after-free in ep_remove_wait_queue() sched/core: Fix a missed update of user_cpus_ptr freezer,umh: Fix call_usermode_helper_exec() vs SIGKILL
2023-02-17selftests/bpf: Add bpf_fib_lookup testMartin KaFai Lau
This patch tests the bpf_fib_lookup helper when looking up a neigh in NUD_FAILED and NUD_STALE state. It also adds test for the new BPF_FIB_LOOKUP_SKIP_NEIGH flag. Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20230217205515.3583372-2-martin.lau@linux.dev
2023-02-17bpf: Add BPF_FIB_LOOKUP_SKIP_NEIGH for bpf_fib_lookupMartin KaFai Lau
The bpf_fib_lookup() also looks up the neigh table. This was done before bpf_redirect_neigh() was added. In the use case that does not manage the neigh table and requires bpf_fib_lookup() to lookup a fib to decide if it needs to redirect or not, the bpf prog can depend only on using bpf_redirect_neigh() to lookup the neigh. It also keeps the neigh entries fresh and connected. This patch adds a bpf_fib_lookup flag, SKIP_NEIGH, to avoid the double neigh lookup when the bpf prog always call bpf_redirect_neigh() to do the neigh lookup. The params->smac output is skipped together when SKIP_NEIGH is set because bpf_redirect_neigh() will figure out the smac also. Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20230217205515.3583372-1-martin.lau@linux.dev
2023-02-17riscv, bpf: Add bpf trampoline support for RV64Pu Lehui
BPF trampoline is the critical infrastructure of the BPF subsystem, acting as a mediator between kernel functions and BPF programs. Numerous important features, such as using BPF program for zero overhead kernel introspection, rely on this key component. We can't wait to support bpf trampoline on RV64. The related tests have passed, as well as the test_verifier with no new failure ceses. Signed-off-by: Pu Lehui <pulehui@huawei.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Tested-by: Björn Töpel <bjorn@rivosinc.com> Acked-by: Björn Töpel <bjorn@rivosinc.com> Link: https://lore.kernel.org/bpf/20230215135205.1411105-5-pulehui@huaweicloud.com
2023-02-17riscv, bpf: Add bpf_arch_text_poke support for RV64Pu Lehui
Implement bpf_arch_text_poke for RV64. For call scenario, to make BPF trampoline compatible with the kernel and BPF context, we follow the framework of RV64 ftrace to reserve 4 nops for BPF programs as function entry, and use auipc+jalr instructions for function call. However, since auipc+jalr call instruction is non-atomic operation, we need to use stop-machine to make sure instructions patching in atomic context. Also, we use auipc+jalr pair and need to patch in stop-machine context for jump scenario. Signed-off-by: Pu Lehui <pulehui@huawei.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Tested-by: Björn Töpel <bjorn@rivosinc.com> Acked-by: Björn Töpel <bjorn@rivosinc.com> Link: https://lore.kernel.org/bpf/20230215135205.1411105-4-pulehui@huaweicloud.com
2023-02-17riscv, bpf: Factor out emit_call for kernel and bpf contextPu Lehui
The current emit_call function is not suitable for kernel function call as it store return value to bpf R0 register. We can separate it out for common use. Meanwhile, simplify judgment logic, that is, fixed function address can use jal or auipc+jalr, while the unfixed can use only auipc+jalr. Signed-off-by: Pu Lehui <pulehui@huawei.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Tested-by: Björn Töpel <bjorn@rivosinc.com> Acked-by: Björn Töpel <bjorn@rivosinc.com> Link: https://lore.kernel.org/bpf/20230215135205.1411105-3-pulehui@huaweicloud.com
2023-02-17riscv: Extend patch_text for multiple instructionsPu Lehui
Extend patch_text for multiple instructions. This is the preparaiton for multiple instructions text patching in riscv BPF trampoline, and may be useful for other scenario. Signed-off-by: Pu Lehui <pulehui@huawei.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Tested-by: Björn Töpel <bjorn@rivosinc.com> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Acked-by: Björn Töpel <bjorn@rivosinc.com> Link: https://lore.kernel.org/bpf/20230215135205.1411105-2-pulehui@huaweicloud.com
2023-02-17Revert "bpf, test_run: fix &xdp_frame misplacement for LIVE_FRAMES"Martin KaFai Lau
This reverts commit 6c20822fada1b8adb77fa450d03a0d449686a4a9. build bot failed on arch with different cache line size: https://lore.kernel.org/bpf/50c35055-afa9-d01e-9a05-ea5351280e4f@intel.com/ Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-02-17perf tests stat_all_metrics: Change true workload to sleep workload for ↵Kajol Jain
system wide check Testcase stat_all_metrics.sh fails in powerpc: 98: perf all metrics test : FAILED! Logs with verbose: [command]# ./perf test 98 -vv 98: perf all metrics test : --- start --- test child forked, pid 13262 Testing BRU_STALL_CPI Testing COMPLETION_STALL_CPI ---- Testing TOTAL_LOCAL_NODE_PUMPS_P23 Metric 'TOTAL_LOCAL_NODE_PUMPS_P23' not printed in: Error: Invalid event (hv_24x7/PM_PB_LNS_PUMP23,chip=3/) in per-thread mode, enable system wide with '-a'. Testing TOTAL_LOCAL_NODE_PUMPS_RETRIES_P01 Metric 'TOTAL_LOCAL_NODE_PUMPS_RETRIES_P01' not printed in: Error: Invalid event (hv_24x7/PM_PB_RTY_LNS_PUMP01,chip=3/) in per-thread mode, enable system wide with '-a'. ---- Based on above logs, we could see some of the hv-24x7 metric events fails, and logs suggest to run the metric event with -a option. This change happened after the commit a4b8cfcabb1d90ec ("perf stat: Delay metric parsing"), which delayed the metric parsing phase and now before metric parsing phase perf tool identifies, whether target is system-wide or not. With this change, perf_event_open will fails with workload monitoring for uncore events as expected. The perf all metric test case fails as some of the hv-24x7 metric events may need bigger workload with system wide monitoring to get the data. Fix this issue by changing current system wide check from true workload to sleep 0.01 workload. Result with the patch changes in powerpc: 98: perf all metrics test : Ok Fixes: a4b8cfcabb1d90ec ("perf stat: Delay metric parsing") Suggested-by: Ian Rogers <irogers@google.com> Reviewed-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Signed-off-by: Kajol Jain <kjain@linux.ibm.com> Tested-by: Disha Goel <disgoel@linux.ibm.com> Tested-by: Ian Rogers <irogers@google.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Nageswara R Sastry <rnsastry@linux.ibm.com> Cc: linuxppc-dev@lists.ozlabs.org Link: https://lore.kernel.org/r/20230215093827.124921-1-kjain@linux.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-17selftests/bpf: Add global subprog context passing testsAndrii Nakryiko
Add tests validating that it's possible to pass context arguments into global subprogs for various types of programs, including a particularly tricky KPROBE programs (which cover kprobes, uprobes, USDTs, a vast and important class of programs). Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Stanislav Fomichev <sdf@google.com> Link: https://lore.kernel.org/bpf/20230216045954.3002473-4-andrii@kernel.org
2023-02-17selftests/bpf: Convert test_global_funcs test to test_loader frameworkAndrii Nakryiko
Convert 17 test_global_funcs subtests into test_loader framework for easier maintenance and more declarative way to define expected failures/successes. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Stanislav Fomichev <sdf@google.com> Link: https://lore.kernel.org/bpf/20230216045954.3002473-3-andrii@kernel.org
2023-02-17bpf: Fix global subprog context argument resolution logicAndrii Nakryiko
KPROBE program's user-facing context type is defined as typedef bpf_user_pt_regs_t. This leads to a problem when trying to passing kprobe/uprobe/usdt context argument into global subprog, as kernel always strip away mods and typedefs of user-supplied type, but takes expected type from bpf_ctx_convert as is, which causes mismatch. Current way to work around this is to define a fake struct with the same name as expected typedef: struct bpf_user_pt_regs_t {}; __noinline my_global_subprog(struct bpf_user_pt_regs_t *ctx) { ... } This patch fixes the issue by resolving expected type, if it's not a struct. It still leaves the above work-around working for backwards compatibility. Fixes: 91cc1a99740e ("bpf: Annotate context types") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Stanislav Fomichev <sdf@google.com> Link: https://lore.kernel.org/bpf/20230216045954.3002473-2-andrii@kernel.org
2023-02-17perf vendor events power10: Add JSON metric events to present CPI stall ↵Athira Rajeev
cycles in powerpc Power10 Performance Monitoring Unit (PMU) provides events to understand stall cycles of different pipeline stages. These events along with completed instructions provides useful metrics for application tuning. Patch implements the JSON changes to collect counter statistics to present the high level CPI stall breakdown metrics. New metric group is named as "CPI_STALL_RATIO" and this new metric group presents these stall metrics: - DISPATCHED_CPI ( Dispatch stall cycles per insn ) - ISSUE_STALL_CPI ( Issue stall cycles per insn ) - EXECUTION_STALL_CPI ( Execution stall cycles per insn ) - COMPLETION_STALL_CPI ( Completition stall cycles per insn ) To avoid multipling of events, PM_RUN_INST_CMPL event has been modified to use PMC5(performance monitoring counter5) instead of PMC4. This change is needed, since completion stall event is using PMC4. Usage example: ./perf stat --metric-no-group -M CPI_STALL_RATIO <workload> Performance counter stats for 'workload': 63,056,817,982 PM_CMPL_STALL # 0.28 COMPLETION_STALL_CPI 1,743,988,038,896 PM_ISSUE_STALL # 7.73 ISSUE_STALL_CPI 225,597,495,030 PM_RUN_INST_CMPL # 6.18 DISPATCHED_CPI # 37.48 EXECUTION_STALL_CPI 1,393,916,546,654 PM_DISP_STALL_CYC 8,455,376,836,463 PM_EXEC_STALL "--metric-no-group" is used for forcing PM_RUN_INST_CMPL to be scheduled in all group for more accuracy. Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Disha Goel <disgoel@linux.ibm.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Nageswara R Sastry <rnsastry@linux.ibm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: linuxppc-dev@lists.ozlabs.org Link: https://lore.kernel.org/r/20230216061240.18067-1-atrajeev@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-17dm ioctl: remove unnecessary check when using dm_get_mdptr()Hou Tao
__hash_remove() removes hash_cell with _hash_lock locked, so acquiring _hash_lock can guarantee no-NULL hc returned from dm_get_mdptr() must have not been removed and hc->md must still be md. __hash_remove() also acquires dm_hash_cells_mutex before setting mdptr as NULL. So in dm_copy_name_and_uuid(), after acquiring dm_hash_cells_mutex and ensuring returned hc is not NULL, the returned hc must still be alive and hc->md must still be md. Remove the unnecessary hc->md != md checks when using dm_get_mdptr() with _hash_lock or dm_hash_cells_mutex acquired. Signed-off-by: Hou Tao <houtao1@huawei.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2023-02-17dm ioctl: assert _hash_lock is held in __hash_removeMike Snitzer
Also update dm_early_create() to take _hash_lock when calling both __get_name_cell and __hash_remove -- given dm_early_create()'s early boot usecase this locking isn't about correctness but it allows lockdep_assert_held() to be added to __hash_remove. Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2023-02-17dm cache: add cond_resched() to various workqueue loopsMike Snitzer
Otherwise on resource constrained systems these workqueues may be too greedy. Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2023-02-17dm thin: add cond_resched() to various workqueue loopsMike Snitzer
Otherwise on resource constrained systems these workqueues may be too greedy. Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2023-02-17spi: dt-bindings: qcom,spi-qcom-qspi: document OPP and power-domainsKrzysztof Kozlowski
QSPI on Qualcomm SDM845, SC7180 and SC7280 SoCs uses OPP table (both in DTS and Linux driver) and is suuplied by CX power domain. Document missing properties to fix: sc7280-idp2.dtb: spi@88dc000: Unevaluated properties are not allowed ('operating-points-v2', 'power-domains' were unexpected) Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Link: https://lore.kernel.org/r/20230217155802.848178-1-krzysztof.kozlowski@linaro.org Signed-off-by: Mark Brown <broonie@kernel.org>
2023-02-17LoongArch, bpf: Use 4 instructions for function address in JITHengqi Chen
This patch fixes the following issue of function calls in JIT, like: [ 29.346981] multi-func JIT bug 105 != 103 The issus can be reproduced by running the "inline simple bpf_loop call" verifier test. This is because we are emiting 2-4 instructions for 64-bit immediate moves. During the first pass of JIT, the placeholder address is zero, emiting two instructions for it. In the extra pass, the function address is in XKVRANGE, emiting four instructions for it. This change the instruction index in JIT context. Let's always use 4 instructions for function address in JIT. So that the instruction sequences don't change between the first pass and the extra pass for function calls. Fixes: 5dc615520c4d ("LoongArch: Add BPF JIT support") Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Tested-by: Tiezhu Yang <yangtiezhu@loongson.cn> Link: https://lore.kernel.org/bpf/20230214152633.2265699-1-hengqi.chen@gmail.com
2023-02-17wifi: rtl8xxxu: add LEDS_CLASS dependencyArnd Bergmann
rtl8xxxu now unconditionally uses LEDS_CLASS, so a Kconfig dependency is required to avoid link errors: aarch64-linux-ld: drivers/net/wireless/realtek/rtl8xxxu/rtl8xxxu_core.o: in function `rtl8xxxu_disconnect': rtl8xxxu_core.c:(.text+0x730): undefined reference to `led_classdev_unregister' ERROR: modpost: "led_classdev_unregister" [drivers/net/wireless/realtek/rtl8xxxu/rtl8xxxu.ko] undefined! ERROR: modpost: "led_classdev_register_ext" [drivers/net/wireless/realtek/rtl8xxxu/rtl8xxxu.ko] undefined! Fixes: 3be01622995b ("wifi: rtl8xxxu: Register the LED and make it blink") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://lore.kernel.org/r/20230217095910.2480356-1-arnd@kernel.org
2023-02-17Merge tag 'nvme-6.2-2022-02-17' of git://git.infradead.org/nvme into block-6.2Jens Axboe
Pull NVMe fix from Christoph: "nvme fix for Linux 6.2 - fix visibility of the CMB sysfs attributes (Keith Busch)" * tag 'nvme-6.2-2022-02-17' of git://git.infradead.org/nvme: nvme-pci: refresh visible attrs for cmb attributes
2023-02-17bpf: bpf_fib_lookup should not return neigh in NUD_FAILED stateMartin KaFai Lau
The bpf_fib_lookup() helper does not only look up the fib (ie. route) but it also looks up the neigh. Before returning the neigh, the helper does not check for NUD_VALID. When a neigh state (neigh->nud_state) is in NUD_FAILED, its dmac (neigh->ha) could be all zeros. The helper still returns SUCCESS instead of NO_NEIGH in this case. Because of the SUCCESS return value, the bpf prog directly uses the returned dmac and ends up filling all zero in the eth header. This patch checks for NUD_VALID and returns NO_NEIGH if the neigh is not valid. Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20230217004150.2980689-3-martin.lau@linux.dev
2023-02-17bpf: Disable bh in bpf_test_run for xdp and tc progMartin KaFai Lau
Some of the bpf helpers require bh disabled. eg. The bpf_fib_lookup helper that will be used in a latter selftest. In particular, it calls ___neigh_lookup_noref that expects the bh disabled. This patch disables bh before calling bpf_prog_run[_xdp], so the testing prog can also use those helpers. Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20230217004150.2980689-2-martin.lau@linux.dev
2023-02-17xsk: check IFF_UP earlier in Tx pathMaciej Fijalkowski
Xsk Tx can be triggered via either sendmsg() or poll() syscalls. These two paths share a call to common function xsk_xmit() which has two sanity checks within. A pseudo code example to show the two paths: __xsk_sendmsg() : xsk_poll(): if (unlikely(!xsk_is_bound(xs))) if (unlikely(!xsk_is_bound(xs))) return -ENXIO; return mask; if (unlikely(need_wait)) (...) return -EOPNOTSUPP; xsk_xmit() mark napi id (...) xsk_xmit() xsk_xmit(): if (unlikely(!(xs->dev->flags & IFF_UP))) return -ENETDOWN; if (unlikely(!xs->tx)) return -ENOBUFS; As it can be observed above, in sendmsg() napi id can be marked on interface that was not brought up and this causes a NULL ptr dereference: [31757.505631] BUG: kernel NULL pointer dereference, address: 0000000000000018 [31757.512710] #PF: supervisor read access in kernel mode [31757.517936] #PF: error_code(0x0000) - not-present page [31757.523149] PGD 0 P4D 0 [31757.525726] Oops: 0000 [#1] PREEMPT SMP NOPTI [31757.530154] CPU: 26 PID: 95641 Comm: xdpsock Not tainted 6.2.0-rc5+ #40 [31757.536871] Hardware name: Intel Corporation S2600WFT/S2600WFT, BIOS SE5C620.86B.02.01.0008.031920191559 03/19/2019 [31757.547457] RIP: 0010:xsk_sendmsg+0xde/0x180 [31757.551799] Code: 00 75 a2 48 8b 00 a8 04 75 9b 84 d2 74 69 8b 85 14 01 00 00 85 c0 75 1b 48 8b 85 28 03 00 00 48 8b 80 98 00 00 00 48 8b 40 20 <8b> 40 18 89 85 14 01 00 00 8b bd 14 01 00 00 81 ff 00 01 00 00 0f [31757.570840] RSP: 0018:ffffc90034f27dc0 EFLAGS: 00010246 [31757.576143] RAX: 0000000000000000 RBX: ffffc90034f27e18 RCX: 0000000000000000 [31757.583389] RDX: 0000000000000001 RSI: ffffc90034f27e18 RDI: ffff88984cf3c100 [31757.590631] RBP: ffff88984714a800 R08: ffff88984714a800 R09: 0000000000000000 [31757.597877] R10: 0000000000000001 R11: 0000000000000000 R12: 00000000fffffffa [31757.605123] R13: 0000000000000000 R14: 0000000000000003 R15: 0000000000000000 [31757.612364] FS: 00007fb4c5931180(0000) GS:ffff88afdfa00000(0000) knlGS:0000000000000000 [31757.620571] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [31757.626406] CR2: 0000000000000018 CR3: 000000184b41c003 CR4: 00000000007706e0 [31757.633648] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [31757.640894] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [31757.648139] PKRU: 55555554 [31757.650894] Call Trace: [31757.653385] <TASK> [31757.655524] sock_sendmsg+0x8f/0xa0 [31757.659077] ? sockfd_lookup_light+0x12/0x70 [31757.663416] __sys_sendto+0xfc/0x170 [31757.667051] ? do_sched_setscheduler+0xdb/0x1b0 [31757.671658] __x64_sys_sendto+0x20/0x30 [31757.675557] do_syscall_64+0x38/0x90 [31757.679197] entry_SYSCALL_64_after_hwframe+0x72/0xdc [31757.687969] Code: 8e f6 ff 44 8b 4c 24 2c 4c 8b 44 24 20 41 89 c4 44 8b 54 24 28 48 8b 54 24 18 b8 2c 00 00 00 48 8b 74 24 10 8b 7c 24 08 0f 05 <48> 3d 00 f0 ff ff 77 3a 44 89 e7 48 89 44 24 08 e8 b5 8e f6 ff 48 [31757.707007] RSP: 002b:00007ffd49c73c70 EFLAGS: 00000293 ORIG_RAX: 000000000000002c [31757.714694] RAX: ffffffffffffffda RBX: 000055a996565380 RCX: 00007fb4c5727c16 [31757.721939] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000003 [31757.729184] RBP: 0000000000000040 R08: 0000000000000000 R09: 0000000000000000 [31757.736429] R10: 0000000000000040 R11: 0000000000000293 R12: 0000000000000000 [31757.743673] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 [31757.754940] </TASK> To fix this, let's make xsk_xmit a function that will be responsible for generic Tx, where RCU is handled accordingly and pull out sanity checks and xs->zc handling. Populate sanity checks to __xsk_sendmsg() and xsk_poll(). Fixes: ca2e1a627035 ("xsk: Mark napi_id on sendmsg()") Fixes: 18b1ab7aa76b ("xsk: Fix race at socket teardown") Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com> Link: https://lore.kernel.org/r/20230215143309.13145-1-maciej.fijalkowski@intel.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2023-02-17Revert "NFSv4.2: Change the default KConfig value for READ_PLUS"Anna Schumaker
This reverts commit 7fd461c47c6cfab4ca4d003790ec276209e52978. Unfortunately, it has come to our attention that there is still a bug somewhere in the READ_PLUS code that can result in nfsroot systems on ARM to crash during boot. Let's do the right thing and revert this change so we don't break people's nfsroot setups. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2023-02-17perf intel-pt: Synthesize cycle eventsSteinar H. Gunderson
There is no good reason why we cannot synthesize "cycle" events from Intel PT just as we can synthesize "instruction" events, in particular when CYC packets are available. This enables using PT to getting much more accurate cycle profiles than regular sampling (record -e cycles) when the work last for very short periods (<10 ms). Thus, add support for this, based off of the existing IPC calculation framework. The new option to --itrace is "y" (for cYcles), as c was taken for calls. Cycle and instruction events can be synthesized together, and are by default. The only real caveat is that CYC packets are only emitted whenever some other packet is, which in practice is when a branch instruction is encountered (and not even all branches). Thus, even at no subsampling (e.g. --itrace=y0ns), it is impossible to get more accuracy than a single basic block, and all cycles spent executing that block will get attributed to the branch instruction that ends the packet. Thus, one cannot know whether the cycles came from e.g. a specific load, a mispredicted branch, or something else. When subsampling (which is the default), the cycle events will get smeared out even more, but will still be generally useful to attribute cycle counts to functions. Reviewed-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Steinar H. Gunderson <sesse@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20220322082452.1429091-1-sesse@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-17brd: use radix_tree_maybe_preload instead of radix_tree_preloadPankaj Raghav
Unconditionally calling radix_tree_preload_end() results in a OOPS message as the preload is only conditionally called for gfpflags_allow_blocking(). [ 20.267323] BUG: using smp_processor_id() in preemptible [00000000] code: fio/416 [ 20.267837] caller is brd_insert_page.part.0+0xbe/0x190 [brd] [ 20.269436] Call Trace: [ 20.269598] <TASK> [ 20.269742] dump_stack_lvl+0x32/0x50 [ 20.269982] check_preemption_disabled+0xd1/0xe0 [ 20.270289] brd_insert_page.part.0+0xbe/0x190 [brd] [ 20.270664] brd_submit_bio+0x33f/0xf40 [brd] Use radix_tree_maybe_preload() which does preload only if gfpflags_allow_blocking() is true but also takes the lock. Therefore, unconditionally calling radix_tree_preload_end() should not create any issues and the message disappears. Fixes: 6ded703c56c2 ("brd: check for REQ_NOWAIT and set correct page allocation mask") Signed-off-by: Pankaj Raghav <p.raghav@samsung.com> Link: https://lore.kernel.org/r/20230217121442.33914-1-p.raghav@samsung.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-02-17netfilter: let reset rules clean out conntrack entriesFlorian Westphal
iptables/nftables support responding to tcp packets with tcp resets. The generated tcp reset packet passes through both output and postrouting netfilter hooks, but conntrack will never see them because the generated skb has its ->nfct pointer copied over from the packet that triggered the reset rule. If the reset rule is used for established connections, this may result in the conntrack entry to be around for a very long time (default timeout is 5 days). One way to avoid this would be to not copy the nf_conn pointer so that the rest packet passes through conntrack too. Problem is that output rules might not have the same conntrack zone setup as the prerouting ones, so its possible that the reset skb won't find the correct entry. Generating a template entry for the skb seems error prone as well. Add an explicit "closing" function that switches a confirmed conntrack entry to closed state and wire this up for tcp. If the entry isn't confirmed, no action is needed because the conntrack entry will never be committed to the table. Reported-by: Russel King <linux@armlinux.org.uk> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2023-02-17Merge ra.kernel.org:/pub/scm/linux/kernel/git/netdev/netDavid S. Miller
Some of the devlink bits were tricky, but I think I got it right. Signed-off-by: David S. Miller <davem@davemloft.net>
2023-02-17gpio: sim: fix a memory leakBartosz Golaszewski
Fix an inverted logic bug in gpio_sim_remove_hogs() that leads to GPIO hog structures never being freed. Fixes: cb8c474e79be ("gpio: sim: new testing module") Reported-by: Mirsad Goran Todorovac <mirsad.todorovac@alu.unizg.hr> Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2023-02-17wifi: iwlegacy: avoid fortify warningJohannes Berg
There are two different alive messages, the "init" one is bigger than the other one, so we have a fortify read warn here. Avoid it by copying from the variable-sized 'raw' instead. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://lore.kernel.org/r/20230216203444.134310-1-johannes@sipsolutions.net
2023-02-17wifi: iwlwifi: mvm: remove unused iwl_dbgfs_is_match()Johannes Berg
This inline function is unused, remove it. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://lore.kernel.org/r/20230216205754.d500dcc2e90c.Id87df297263f86b5bba002f7cbb387abc13adf53@changeid
2023-02-17wifi: rtw89: fix AP mode authentication transmission failedPo-Hao Huang
For some ICs, packets can't be sent correctly without initializing CMAC table first. Previous flow do this initialization after associated, results in authentication response fails to transmit. Move the initialization up front to a proper place to solve this. Signed-off-by: Po-Hao Huang <phhuang@realtek.com> Signed-off-by: Ping-Ke Shih <pkshih@realtek.com> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://lore.kernel.org/r/20230216082807.22285-1-pkshih@realtek.com
2023-02-17wifi: rtw88: use RTW_FLAG_POWERON flag to prevent to power on/off twicePing-Ke Shih
Use power state to decide whether we can enter or leave IPS accurately, and then prevent to power on/off twice. The commit 6bf3a083407b ("wifi: rtw88: add flag check before enter or leave IPS") would like to prevent this as well, but it still can't entirely handle all cases. The exception is that WiFi gets connected and does suspend/resume, it will power on twice and cause it failed to power on after resuming, like: rtw_8723de 0000:03:00.0: failed to poll offset=0x6 mask=0x2 value=0x2 rtw_8723de 0000:03:00.0: mac power on failed rtw_8723de 0000:03:00.0: failed to power on mac rtw_8723de 0000:03:00.0: leave idle state failed rtw_8723de 0000:03:00.0: failed to leave ips state rtw_8723de 0000:03:00.0: failed to leave idle state rtw_8723de 0000:03:00.0: failed to send h2c command To fix this, introduce new flag RTW_FLAG_POWERON to reflect power state, and call rtw_mac_pre_system_cfg() to configure registers properly between power-off/-on. Reported-by: Paul Gover <pmw.gover@yahoo.co.uk> Link: https://bugzilla.kernel.org/show_bug.cgi?id=217016 Fixes: 6bf3a083407b ("wifi: rtw88: add flag check before enter or leave IPS") Cc: <Stable@vger.kernel.org> Signed-off-by: Ping-Ke Shih <pkshih@realtek.com> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://lore.kernel.org/r/20230216053633.20366-1-pkshih@realtek.com
2023-02-17dma-buf: make kobj_type structure constantThomas Weißschuh
Since commit ee6d3dd4ed48 ("driver core: make kobj_type constant.") the driver core allows the usage of const struct kobj_type. Take advantage of this to constify the structure definition to prevent modification at runtime. Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20230217-kobj_type-dma-buf-v1-1-b84a3616522c@weissschuh.net
2023-02-17Merge tag 'asoc-fix-v6.2-rc8-2' of ↵Takashi Iwai
https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus ASoC: One more fix for v6.2 One more fix from Peter which he'd very much like to get into v6.2.
2023-02-17nvme-pci: refresh visible attrs for cmb attributesKeith Busch
The sysfs group containing the cmb attributes is registered before the driver knows if they need to be visible or not. Update the group when cmb attributes are known to exist so the visibility setting is correct. Link: https://bugzilla.kernel.org/show_bug.cgi?id=217037 Fixes: 86adbf0cdb9ec65 ("nvme: simplify transport specific device attribute handling") Signed-off-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
2023-02-16mm: memcontrol: rename memcg_kmem_enabled()Roman Gushchin
Currently there are two kmem-related helper functions with a confusing semantics: memcg_kmem_enabled() and mem_cgroup_kmem_disabled(). The problem is that an obvious expectation memcg_kmem_enabled() == !mem_cgroup_kmem_disabled(), can be false. mem_cgroup_kmem_disabled() is similar to mem_cgroup_disabled(): it returns true only if CONFIG_MEMCG_KMEM is not set or the kmem accounting is disabled using a boot time kernel option "cgroup.memory=nokmem". It never changes the value dynamically. memcg_kmem_enabled() is different: it always returns false until the first non-root memory cgroup will get online (assuming the kernel memory accounting is enabled). It's goal is to improve the performance on systems without the cgroupfs mounted/memory controller enabled or on the systems with only the root memory cgroup. To make things more obvious and avoid potential bugs, let's rename memcg_kmem_enabled() to memcg_kmem_online(). Link: https://lkml.kernel.org/r/20230213192922.1146370-1-roman.gushchin@linux.dev Signed-off-by: Roman Gushchin <roman.gushchin@linux.dev> Acked-by: Muchun Song <songmuchun@bytedance.com> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Shakeel Butt <shakeelb@google.com> Cc: Dennis Zhou <dennis@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-02-16sh: initialize max_mapnrMike Rapoport (IBM)
sh never initializes max_mapnr which is used by the generic implementation of pfn_valid(). Initialize max_mapnr with set_max_mapnr() in sh::paging_init(). Link: https://lkml.kernel.org/r/20230214140729.1649961-3-rppt@kernel.org Fixes: e5080a967785 ("mm, arch: add generic implementation of pfn_valid() for FLATMEM") Reported-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Mike Rapoport (IBM) <rppt@kernel.org> Acked-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Reviewed-by: David Hildenbrand <david@redhat.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Greg Ungerer <gerg@linux-m68k.org> Cc: Rich Felker <dalias@libc.org> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-02-16m68k/nommu: add missing definition of ARCH_PFN_OFFSETMike Rapoport (IBM)
Patch series "fixups for generic implementation of pfn_valid()". Guenter reported boot failures on m68k-nommu and sh caused by the switch to the generic implementation of pfn_valid(): https://lore.kernel.org/all/20230212173513.GA4052259@roeck-us.net https://lore.kernel.org/all/20230212161320.GA3784076@roeck-us.net These are small fixups that address the issues. This patch (of 2): On m68k/nommu RAM does not necessarily start at 0x0 and when it does not pfn_valid() uses a wrong offset into the memory map which causes silent boot failures. Define ARCH_PFN_OFFSET to make pfn_valid() use the correct offset. Link: https://lkml.kernel.org/r/20230214140729.1649961-1-rppt@kernel.org Link: https://lkml.kernel.org/r/20230214140729.1649961-2-rppt@kernel.org Fixes: d82f07f06cf8 ("m68k: use asm-generic/memory_model.h for both MMU and !MMU") Reported-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Mike Rapoport (IBM) <rppt@kernel.org> Acked-by: Greg Ungerer <gerg@linux-m68k.org> Reviewed-by: David Hildenbrand <david@redhat.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Rich Felker <dalias@libc.org> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-02-16mm: percpu: fix incorrect size in pcpu_obj_full_size()Yafang Shao
The extra space which is used to store the obj_cgroup membership is only valid when kmemcg is enabled. The kmemcg can be disabled via the kernel parameter "cgroup.memory=nokmem" at boot time. This helper is also used in non-memcg code, for example the tracepoint, so we should fix it. It was found by code review when I was implementing bpf memory usage[1]. No real issue happens in production environment. [1]. https://lwn.net/Articles/921991/ Link: https://lkml.kernel.org/r/20230214153549.12291-1-laoar.shao@gmail.com Signed-off-by: Yafang Shao <laoar.shao@gmail.com> Reviewed-by: Roman Gushchin <roman.gushchin@linux.dev> Acked-by: Dennis Zhou <dennis@kernel.org> Cc: Tejun Heo <tj@kernel.org> Cc: Christoph Lameter <cl@linux.com> Cc: Vasily Averin <vvs@openvz.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-02-16maple_tree: reduce stack usage with gcc-9 and earlierArnd Bergmann
gcc-10 changed the way inlining works to be less aggressive, but older versions run into an oversized stack frame warning whenever CONFIG_KASAN_STACK is enabled, as that forces variables from inlined callees to be non-overlapping: lib/maple_tree.c: In function 'mas_wr_bnode': lib/maple_tree.c:4320:1: error: the frame size of 1424 bytes is larger than 1024 bytes [-Werror=frame-larger-than=] Change the annotations on mas_store_b_node() and mas_commit_b_node() to explicitly forbid inlining in this configuration, which is the same behavior that newer versions already have. Link: https://lkml.kernel.org/r/20230214103030.1051950-1-arnd@kernel.org Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com> Cc: Alexander Potapenko <glider@google.com> Cc: Andrey Konovalov <andreyknvl@gmail.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Vincenzo Frascino <vincenzo.frascino@arm.com> Cc: Vernon Yang <vernon2gm@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>