summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2023-02-15KVM: x86/pmu: Disable vPMU support on hybrid CPUs (host PMUs)Sean Christopherson
Disable KVM support for virtualizing PMUs on hosts with hybrid PMUs until KVM gains a sane way to enumeration the hybrid vPMU to userspace and/or gains a mechanism to let userspace opt-in to the dangers of exposing a hybrid vPMU to KVM guests. Virtualizing a hybrid PMU, or at least part of a hybrid PMU, is possible, but it requires careful, deliberate configuration from userspace. E.g. to expose full functionality, vCPUs need to be pinned to pCPUs to prevent migrating a vCPU between a big core and a little core, userspace must enumerate a reasonable topology to the guest, and guest CPUID must be curated per vCPU to enumerate accurate vPMU capabilities. The last point is especially problematic, as KVM doesn't control which pCPU it runs on when enumerating KVM's vPMU capabilities to userspace, i.e. userspace can't rely on KVM_GET_SUPPORTED_CPUID in it's current form. Alternatively, userspace could enable vPMU support by enumerating the set of features that are common and coherent across all cores, e.g. by filtering PMU events and restricting guest capabilities. But again, that requires userspace to take action far beyond reflecting KVM's supported feature set into the guest. For now, simply disable vPMU support on hybrid CPUs to avoid inducing seemingly random #GPs in guests, and punt support for hybrid CPUs to a future enabling effort. Reported-by: Jianfeng Gao <jianfeng.gao@intel.com> Cc: stable@vger.kernel.org Cc: Andrew Cooper <Andrew.Cooper3@citrix.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Link: https://lore.kernel.org/all/20220818181530.2355034-1-kan.liang@linux.intel.com Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20230208204230.1360502-2-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-02-15x86/build: Make 64-bit defconfig the defaultArnd Bergmann
Running 'make ARCH=x86 defconfig' on anything other than an x86_64 machine currently results in a 32-bit build, which is rarely what anyone wants these days. Change the default so that the 64-bit config gets used unless the user asks for i386_defconfig, uses ARCH=i386 or runs on a system that "uname -m" identifies as i386/i486/i586/i686. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20230215091706.1623070-1-arnd@kernel.org
2023-02-15sched/psi: Fix use-after-free in ep_remove_wait_queue()Munehisa Kamata
If a non-root cgroup gets removed when there is a thread that registered trigger and is polling on a pressure file within the cgroup, the polling waitqueue gets freed in the following path: do_rmdir cgroup_rmdir kernfs_drain_open_files cgroup_file_release cgroup_pressure_release psi_trigger_destroy However, the polling thread still has a reference to the pressure file and will access the freed waitqueue when the file is closed or upon exit: fput ep_eventpoll_release ep_free ep_remove_wait_queue remove_wait_queue This results in use-after-free as pasted below. The fundamental problem here is that cgroup_file_release() (and consequently waitqueue's lifetime) is not tied to the file's real lifetime. Using wake_up_pollfree() here might be less than ideal, but it is in line with the comment at commit 42288cb44c4b ("wait: add wake_up_pollfree()") since the waitqueue's lifetime is not tied to file's one and can be considered as another special case. While this would be fixable by somehow making cgroup_file_release() be tied to the fput(), it would require sizable refactoring at cgroups or higher layer which might be more justifiable if we identify more cases like this. BUG: KASAN: use-after-free in _raw_spin_lock_irqsave+0x60/0xc0 Write of size 4 at addr ffff88810e625328 by task a.out/4404 CPU: 19 PID: 4404 Comm: a.out Not tainted 6.2.0-rc6 #38 Hardware name: Amazon EC2 c5a.8xlarge/, BIOS 1.0 10/16/2017 Call Trace: <TASK> dump_stack_lvl+0x73/0xa0 print_report+0x16c/0x4e0 kasan_report+0xc3/0xf0 kasan_check_range+0x2d2/0x310 _raw_spin_lock_irqsave+0x60/0xc0 remove_wait_queue+0x1a/0xa0 ep_free+0x12c/0x170 ep_eventpoll_release+0x26/0x30 __fput+0x202/0x400 task_work_run+0x11d/0x170 do_exit+0x495/0x1130 do_group_exit+0x100/0x100 get_signal+0xd67/0xde0 arch_do_signal_or_restart+0x2a/0x2b0 exit_to_user_mode_prepare+0x94/0x100 syscall_exit_to_user_mode+0x20/0x40 do_syscall_64+0x52/0x90 entry_SYSCALL_64_after_hwframe+0x63/0xcd </TASK> Allocated by task 4404: kasan_set_track+0x3d/0x60 __kasan_kmalloc+0x85/0x90 psi_trigger_create+0x113/0x3e0 pressure_write+0x146/0x2e0 cgroup_file_write+0x11c/0x250 kernfs_fop_write_iter+0x186/0x220 vfs_write+0x3d8/0x5c0 ksys_write+0x90/0x110 do_syscall_64+0x43/0x90 entry_SYSCALL_64_after_hwframe+0x63/0xcd Freed by task 4407: kasan_set_track+0x3d/0x60 kasan_save_free_info+0x27/0x40 ____kasan_slab_free+0x11d/0x170 slab_free_freelist_hook+0x87/0x150 __kmem_cache_free+0xcb/0x180 psi_trigger_destroy+0x2e8/0x310 cgroup_file_release+0x4f/0xb0 kernfs_drain_open_files+0x165/0x1f0 kernfs_drain+0x162/0x1a0 __kernfs_remove+0x1fb/0x310 kernfs_remove_by_name_ns+0x95/0xe0 cgroup_addrm_files+0x67f/0x700 cgroup_destroy_locked+0x283/0x3c0 cgroup_rmdir+0x29/0x100 kernfs_iop_rmdir+0xd1/0x140 vfs_rmdir+0xfe/0x240 do_rmdir+0x13d/0x280 __x64_sys_rmdir+0x2c/0x30 do_syscall_64+0x43/0x90 entry_SYSCALL_64_after_hwframe+0x63/0xcd Fixes: 0e94682b73bf ("psi: introduce psi monitor") Signed-off-by: Munehisa Kamata <kamatam@amazon.com> Signed-off-by: Mengchi Cheng <mengcc@amazon.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Suren Baghdasaryan <surenb@google.com> Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/lkml/20230106224859.4123476-1-kamatam@amazon.com/ Link: https://lore.kernel.org/r/20230214212705.4058045-1-kamatam@amazon.com
2023-02-15ASoC: cs35l45: Remove separate namespace for tablesCharles Keepax
Now tables isn't a separate module, definitely no need to have a separate namespace for it. Signed-off-by: Charles Keepax <ckeepax@opensource.cirrus.com> Link: https://lore.kernel.org/r/20230215105818.3315925-2-ckeepax@opensource.cirrus.com Signed-off-by: Mark Brown <broonie@kernel.org>
2023-02-15ASoC: cs35l45: Remove separate tables moduleCharles Keepax
There is no reason to have a separate module for the tables file it just holds regmap callbacks and register patches used by the main part of the driver. Remove the separate module and merge it into the main driver module. Signed-off-by: Charles Keepax <ckeepax@opensource.cirrus.com> Link: https://lore.kernel.org/r/20230215105818.3315925-1-ckeepax@opensource.cirrus.com Signed-off-by: Mark Brown <broonie@kernel.org>
2023-02-15spi: intel: Check number of chip selects after reading the descriptorMika Westerberg
The flash decriptor contains the number of flash components that we use to figure out how many flash chips there are connected. Therefore we need to read it first before deciding how many chip selects the controller has. Reported-by: Marcin Witkowski <marcin.witkowski@intel.com> Fixes: 3f03c618bebb ("spi: intel: Add support for second flash chip") Cc: stable@vger.kernel.org Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com> Link: https://lore.kernel.org/r/20230215110040.42186-1-mika.westerberg@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org>
2023-02-15perf record: Fix segfault with --overwrite and --max-sizeYang Jihong
When --overwrite and --max-size options of perf record are used together, a segmentation fault occurs. The following is an example: # perf record -e sched:sched* --overwrite --max-size 1K -a -- sleep 1 [ perf record: Woken up 1 times to write data ] perf: Segmentation fault Obtained 12 stack frames. ./perf/perf(+0x197673) [0x55f99710b673] /lib/x86_64-linux-gnu/libc.so.6(+0x3ef0f) [0x7fa45f3cff0f] ./perf/perf(+0x8eb40) [0x55f997002b40] ./perf/perf(+0x1f6882) [0x55f99716a882] ./perf/perf(+0x794c2) [0x55f996fed4c2] ./perf/perf(+0x7b7c7) [0x55f996fef7c7] ./perf/perf(+0x9074b) [0x55f99700474b] ./perf/perf(+0x12e23c) [0x55f9970a223c] ./perf/perf(+0x12e54a) [0x55f9970a254a] ./perf/perf(+0x7db60) [0x55f996ff1b60] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xe6) [0x7fa45f3b2c86] ./perf/perf(+0x7dfe9) [0x55f996ff1fe9] Segmentation fault (core dumped) backtrace of the core file is as follows: (gdb) bt #0 record__bytes_written (rec=0x55f99755a200 <record>) at builtin-record.c:234 #1 record__output_max_size_exceeded (rec=0x55f99755a200 <record>) at builtin-record.c:242 #2 record__write (map=0x0, size=12816, bf=0x55f9978da2e0, rec=0x55f99755a200 <record>) at builtin-record.c:263 #3 process_synthesized_event (tool=tool@entry=0x55f99755a200 <record>, event=event@entry=0x55f9978da2e0, sample=sample@entry=0x0, machine=machine@entry=0x55f997893658) at builtin-record.c:618 #4 0x000055f99716a883 in __perf_event__synthesize_id_index (tool=tool@entry=0x55f99755a200 <record>, process=process@entry=0x55f997002aa0 <process_synthesized_event>, evlist=0x55f9978928b0, machine=machine@entry=0x55f997893658, from=from@entry=0) at util/synthetic-events.c:1895 #5 0x000055f99716a91f in perf_event__synthesize_id_index (tool=tool@entry=0x55f99755a200 <record>, process=process@entry=0x55f997002aa0 <process_synthesized_event>, evlist=<optimized out>, machine=machine@entry=0x55f997893658) at util/synthetic-events.c:1905 #6 0x000055f996fed4c3 in record__synthesize (tail=tail@entry=true, rec=0x55f99755a200 <record>) at builtin-record.c:1997 #7 0x000055f996fef7c8 in __cmd_record (argc=argc@entry=2, argv=argv@entry=0x7ffc67551260, rec=0x55f99755a200 <record>) at builtin-record.c:2802 #8 0x000055f99700474c in cmd_record (argc=<optimized out>, argv=0x7ffc67551260) at builtin-record.c:4258 #9 0x000055f9970a223d in run_builtin (p=0x55f997564d88 <commands+264>, argc=10, argv=0x7ffc67551260) at perf.c:330 #10 0x000055f9970a254b in handle_internal_command (argc=10, argv=0x7ffc67551260) at perf.c:384 #11 0x000055f996ff1b61 in run_argv (argcp=<synthetic pointer>, argv=<synthetic pointer>) at perf.c:428 #12 main (argc=<optimized out>, argv=0x7ffc67551260) at perf.c:562 The reason is that record__bytes_written accesses the freed memory rec->thread_data, The process is as follows: __cmd_record -> record__free_thread_data -> zfree(&rec->thread_data) // free rec->thread_data -> record__synthesize -> perf_event__synthesize_id_index -> process_synthesized_event -> record__write -> record__bytes_written // access rec->thread_data We add a member variable "thread_bytes_written" in the struct "record" to save the data size written by the threads. Fixes: 6d57581659f72299 ("perf record: Add support for limit perf output file size") Signed-off-by: Yang Jihong <yangjihong1@huawei.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Jiwei Sun <jiwei.sun@windriver.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/CAM9d7ci_TRrqBQVQNW8=GwakUr7SsZpYxaaty-S4bxF8zJWyqw@mail.gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-15Documentation/hw-vuln: Fix rST warningPaolo Bonzini
The following warning: Documentation/admin-guide/hw-vuln/cross-thread-rsb.rst:92: ERROR: Unexpected indentation. was introduced by commit 493a2c2d23ca. Fix it by placing everything in the same paragraph and also use a monospace font. Fixes: 493a2c2d23ca ("Documentation/hw-vuln: Add documentation for Cross-Thread Return Predictions") Reported-by: Stephen Rothwell <sfr@canb@auug.org.au> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-02-15powerpc/kexec_file: print error string on usable memory property update failureSourabh Jain
Print the FDT error description along with the error message if failed to set the "linux,drconf-usable-memory" property in the kdump kernel's FDT. Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com> Acked-by: Hari Bathini <hbathini@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20221216122708.182154-1-sourabhjain@linux.ibm.com
2023-02-15powerpc/machdep: warn when machine_is() used too earlyNathan Lynch
machine_is() can't provide correct results before probe_machine() has run. Warn when it's used too early in boot, placing the WARN_ON() in a helper function so the reported file:line indicates exactly what went wrong. checkpatch complains about __attribute__((weak)) in the patch, so change that to __weak, and align the line continuations as well. Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com> Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20230210-warn-on-machine-is-before-probe-machine-v2-1-b57f8243c51c@linux.ibm.com
2023-02-15powerpc/64: Replace -mcpu=e500mc64 by -mcpu=e5500Christophe Leroy
E500MC64 is a processor pre-dating E5500 that has never been commercialised. Use -mcpu=e5500 for E5500 core. More details at https://gcc.gnu.org/PR108149 Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Acked-by: Pali Rohár <pali@kernel.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/fa71ed20d22c156225436374f0ab847daac893bc.1671475543.git.christophe.leroy@csgroup.eu
2023-02-15powerpc/eeh: Set channel state after notifying the driversGanesh Goudar
When a PCI error is encountered 6th time in an hour we set the channel state to perm_failure and notify the driver about the permanent failure. However, after upstream commit 38ddc011478e ("powerpc/eeh: Make permanently failed devices non-actionable"), EEH handler stops calling any routine once the device is marked as permanent failure. This issue can lead to fatal consequences like kernel hang with certain PCI devices. Following log is observed with lpfc driver, with and without this change, Without this change kernel hangs, If PCI error is encountered 6 times for a device in an hour. Without the change EEH: Beginning: 'error_detected(permanent failure)' PCI 0132:60:00.0#600000: EEH: not actionable (1,1,1) PCI 0132:60:00.1#600000: EEH: not actionable (1,1,1) EEH: Finished:'error_detected(permanent failure)' With the change EEH: Beginning: 'error_detected(permanent failure)' EEH: Invoking lpfc->error_detected(permanent failure) EEH: lpfc driver reports: 'disconnect' EEH: Invoking lpfc->error_detected(permanent failure) EEH: lpfc driver reports: 'disconnect' EEH: Finished:'error_detected(permanent failure)' To fix the issue, set channel state to permanent failure after notifying the drivers. Fixes: 38ddc011478e ("powerpc/eeh: Make permanently failed devices non-actionable") Suggested-by: Mahesh Salgaonkar <mahesh@linux.ibm.com> Signed-off-by: Ganesh Goudar <ganeshgr@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20230209105649.127707-1-ganeshgr@linux.ibm.com
2023-02-15selftests/powerpc: Fix incorrect kernel headers search pathMathieu Desnoyers
Use $(KHDR_INCLUDES) as lookup path for kernel headers. This prevents building against kernel headers from the build environment in scenarios where kernel headers are installed into a specific output directory (O=...). Cc: stable@vger.kernel.org # v5.18+ Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Acked-by: Shuah Khan <skhan@linuxfoundation.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20230127135755.79929-22-mathieu.desnoyers@efficios.com
2023-02-15iw_cxgb4: Fix potential NULL dereference in c4iw_fill_res_cm_id_entry()Dan Carpenter
This condition needs to match the previous "if (epcp->state == LISTEN) {" exactly to avoid a NULL dereference of either "listen_ep" or "ep". The problem is that "epcp" has been re-assigned so just testing "if (epcp->state == LISTEN) {" a second time is not sufficient. Fixes: 116aeb887371 ("iw_cxgb4: provide detailed provider-specific CM_ID information") Signed-off-by: Dan Carpenter <error27@gmail.com> Link: https://lore.kernel.org/r/Y+usKuWIKr4dimZh@kili Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-02-15usb: gadget: u_ether: Don't warn in gether_setup_name_default()Jon Hunter
The function gether_setup_name_default() is called by various USB ethernet gadget drivers. Calling this function will select a random host and device MAC addresses. A properly working driver should be silent and not warn the user about default MAC addresses selection. Given that the MAC addresses are also printed when the function gether_register_netdev() is called, remove these unnecessary warnings. Signed-off-by: Jon Hunter <jonathanh@nvidia.com> Reviewed-by: Andrzej Pietrasiewicz <andrzej.p@collabora.com> Link: https://lore.kernel.org/r/20230209125319.18589-2-jonathanh@nvidia.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-02-15usb: gadget: u_ether: Convert prints to device printsJon Hunter
The USB ethernet gadget driver implements its own print macros which call printk. Device drivers should use the device prints that print the device name. Fortunately, the same macro names are defined in the header file 'linux/usb/composite.h' and these use the device prints. Therefore, remove the local definitions in the USB ethernet gadget driver and use those in 'linux/usb/composite.h'. The only difference is that now the device name is printed instead of the ethernet interface name. Tested using ethernet gadget on Jetson AGX Orin. Signed-off-by: Jon Hunter <jonathanh@nvidia.com> Tested-by: Jon Hunter <jonathanh@nvidia.com> Link: https://lore.kernel.org/r/20230209125319.18589-1-jonathanh@nvidia.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-02-15dma-mapping: no need to pass a bus_type into get_arch_dma_ops()Greg Kroah-Hartman
The get_arch_dma_ops() arch-specific function never does anything with the struct bus_type that is passed into it, so remove it entirely as it is not needed. Cc: Richard Henderson <richard.henderson@linaro.org> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Matt Turner <mattst88@gmail.com> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com> Cc: Helge Deller <deller@gmx.de> Cc: "David S. Miller" <davem@davemloft.net> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: x86@kernel.org Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Marek Szyprowski <m.szyprowski@samsung.com> Cc: Robin Murphy <robin.murphy@arm.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: linux-alpha@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: linux-ia64@vger.kernel.org Cc: linux-mips@vger.kernel.org Cc: linux-parisc@vger.kernel.org Cc: sparclinux@vger.kernel.org Cc: iommu@lists.linux.dev Cc: linux-arch@vger.kernel.org Acked-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Link: https://lore.kernel.org/r/20230214140121.131859-1-gregkh@linuxfoundation.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-02-15RDMA/mlx5: Use rdma_umem_for_each_dma_block()Jason Gunthorpe
Replace an open coding of rdma_umem_for_each_dma_block() with the proper function. Fixes: b3d47ebd4908 ("RDMA/mlx5: Use mlx5_umr_post_send_wait() to update MR pas") Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/0-v1-c13a5b88359b+556d0-mlx5_umem_block_jgg@nvidia.com Reviewed-by: Devesh Sharma <devesh.s.sharma@oracle.com> Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-02-15HID: logitech-hidpp: Add myself to authorsBastien Nocera
As discussed with HID maintainer Benjamin Tissoires, add myself to the authors list and MAINTAINERS file. Signed-off-by: Bastien Nocera <hadess@hadess.net> Link: https://lore.kernel.org/r/20230209154916.462158-2-hadess@hadess.net Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
2023-02-15HID: logitech-hidpp: Retry commands when device is busyBastien Nocera
Handle the busy error coming from the device or receiver. The documentation says a busy error can be returned when: " Device (or receiver) cannot answer immediately to this request for any reason i.e: - already processing a request from the same or another SW - pipe full " Signed-off-by: Bastien Nocera <hadess@hadess.net> Link: https://lore.kernel.org/r/20230209154916.462158-1-hadess@hadess.net Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
2023-02-15net: phylink: support validated pause and autoneg in fixed-linkIvan Bornyakov
In fixed-link setup phylink_parse_fixedlink() unconditionally sets Pause, Asym_Pause and Autoneg bits to "supported" bitmap, while MAC may not support these. This leads to ethtool reporting: > Supported pause frame use: Symmetric Receive-only > Supports auto-negotiation: Yes regardless of what is actually supported. Instead of unconditionally set Pause, Asym_Pause and Autoneg it is sensible to set them according to validated "supported" bitmap, i.e. the result of phylink_validate(). Signed-off-by: Ivan Bornyakov <i.bornyakov@metrotek.ru> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-02-15net: mpls: fix stale pointer if allocation fails during device renameJakub Kicinski
lianhui reports that when MPLS fails to register the sysctl table under new location (during device rename) the old pointers won't get overwritten and may be freed again (double free). Handle this gracefully. The best option would be unregistering the MPLS from the device completely on failure, but unfortunately mpls_ifdown() can fail. So failing fully is also unreliable. Another option is to register the new table first then only remove old one if the new one succeeds. That requires more code, changes order of notifications and two tables may be visible at the same time. sysctl point is not used in the rest of the code - set to NULL on failures and skip unregister if already NULL. Reported-by: lianhui tang <bluetlh@gmail.com> Fixes: 0fae3bf018d9 ("mpls: handle device renames for per-device sysctls") Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-02-15net: no longer support SOCK_REFCNT_DEBUG featureJason Xing
Commit e48c414ee61f ("[INET]: Generalise the TCP sock ID lookup routines") commented out the definition of SOCK_REFCNT_DEBUG in 2005 and later another commit 463c84b97f24 ("[NET]: Introduce inet_connection_sock") removed it. Since we could track all of them through bpf and kprobe related tools and the feature could print loads of information which might not be that helpful even under a little bit pressure, the whole feature which has been inactive for many years is no longer supported. Link: https://lore.kernel.org/lkml/20230211065153.54116-1-kerneljasonxing@gmail.com/ Suggested-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: Jason Xing <kernelxing@tencent.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Acked-by: Wenjia Zhang <wenjia@linux.ibm.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Acked-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-02-15net/sched: tcindex: search key must be 16 bitsPedro Tammela
Syzkaller found an issue where a handle greater than 16 bits would trigger a null-ptr-deref in the imperfect hash area update. general protection fault, probably for non-canonical address 0xdffffc0000000015: 0000 [#1] PREEMPT SMP KASAN KASAN: null-ptr-deref in range [0x00000000000000a8-0x00000000000000af] CPU: 0 PID: 5070 Comm: syz-executor456 Not tainted 6.2.0-rc7-syzkaller-00112-gc68f345b7c42 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/21/2023 RIP: 0010:tcindex_set_parms+0x1a6a/0x2990 net/sched/cls_tcindex.c:509 Code: 01 e9 e9 fe ff ff 4c 8b bd 28 fe ff ff e8 0e 57 7d f9 48 8d bb a8 00 00 00 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 <80> 3c 02 00 0f 85 94 0c 00 00 48 8b 85 f8 fd ff ff 48 8b 9b a8 00 RSP: 0018:ffffc90003d3ef88 EFLAGS: 00010202 RAX: dffffc0000000000 RBX: 0000000000000000 RCX: 0000000000000000 RDX: 0000000000000015 RSI: ffffffff8803a102 RDI: 00000000000000a8 RBP: ffffc90003d3f1d8 R08: 0000000000000001 R09: 0000000000000000 R10: 0000000000000001 R11: 0000000000000000 R12: ffff88801e2b10a8 R13: dffffc0000000000 R14: 0000000000030000 R15: ffff888017b3be00 FS: 00005555569af300(0000) GS:ffff8880b9800000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000056041c6d2000 CR3: 000000002bfca000 CR4: 00000000003506f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> tcindex_change+0x1ea/0x320 net/sched/cls_tcindex.c:572 tc_new_tfilter+0x96e/0x2220 net/sched/cls_api.c:2155 rtnetlink_rcv_msg+0x959/0xca0 net/core/rtnetlink.c:6132 netlink_rcv_skb+0x165/0x440 net/netlink/af_netlink.c:2574 netlink_unicast_kernel net/netlink/af_netlink.c:1339 [inline] netlink_unicast+0x547/0x7f0 net/netlink/af_netlink.c:1365 netlink_sendmsg+0x91b/0xe10 net/netlink/af_netlink.c:1942 sock_sendmsg_nosec net/socket.c:714 [inline] sock_sendmsg+0xd3/0x120 net/socket.c:734 ____sys_sendmsg+0x334/0x8c0 net/socket.c:2476 ___sys_sendmsg+0x110/0x1b0 net/socket.c:2530 __sys_sendmmsg+0x18f/0x460 net/socket.c:2616 __do_sys_sendmmsg net/socket.c:2645 [inline] __se_sys_sendmmsg net/socket.c:2642 [inline] __x64_sys_sendmmsg+0x9d/0x100 net/socket.c:2642 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x39/0xb0 arch/x86/entry/common.c:80 Fixes: ee059170b1f7 ("net/sched: tcindex: update imperfect hash filters respecting rcu") Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: Pedro Tammela <pctammela@mojatatu.com> Reported-by: syzbot <syzkaller@googlegroups.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-02-15s390/irq,idle: simplify idle checkHeiko Carstens
Use the per-cpu CIF_ENABLED_WAIT flag to decide if an interrupt occurred while a cpu was idle, instead of checking two conditions within the old psw. Also move clearing of the CIF_ENABLED_WAIT bit to the early interrupt handler, which in turn makes arch_vcpu_is_preempted() also a bit more precise, since the flag is now cleared before interrupt handlers have been called. Reviewed-by: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-02-15s390/processor: add test_and_set_cpu_flag() and test_and_clear_cpu_flag()Heiko Carstens
Add test_and_set_cpu_flag() and test_and_clear_cpu_flag() helper functions. Reviewed-by: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-02-15s390/processor: let cpu helper functions return boolean valuesHeiko Carstens
Let cpu helper functions return boolean values. This also allows to make the code a bit simpler by getting rid of the "!!" construct. Reviewed-by: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-02-15s390/kfence: fix page fault reportingHeiko Carstens
Baoquan He reported lots of KFENCE reports when /proc/kcore is read, e.g. with crash or even simpler with dd: BUG: KFENCE: invalid read in copy_from_kernel_nofault+0x5e/0x120 Invalid read at 0x00000000f4f5149f: copy_from_kernel_nofault+0x5e/0x120 read_kcore+0x6b2/0x870 proc_reg_read+0x9a/0xf0 vfs_read+0x94/0x270 ksys_read+0x70/0x100 __do_syscall+0x1d0/0x200 system_call+0x82/0xb0 The reason for this is that read_kcore() simply reads memory that might have been unmapped by KFENCE with copy_from_kernel_nofault(). Any fault due to pages being unmapped by KFENCE would be handled gracefully by the fault handler (exception table fixup). However the s390 fault handler first reports the fault, and only afterwards would perform the exception table fixup. Most architectures have this in reversed order, which also avoids the false positive KFENCE reports when an unmapped page is accessed. Therefore change the s390 fault handler so it handles exception table fixups before KFENCE page faults are reported. Reported-by: Baoquan He <bhe@redhat.com> Tested-by: Baoquan He <bhe@redhat.com> Acked-by: Alexander Potapenko <glider@google.com> Link: https://lore.kernel.org/r/20230213183858.1473681-1-hca@linux.ibm.com Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-02-15s390/zcrypt: introduce ctfm field in struct CPRBXHarald Freudenberger
Modify the CPRBX struct to expose a new field ctfm for use with hardware command filtering within a CEX8 crypto card in CCA coprocessor mode. The field replaces a reserved byte padding field so that the layout of the struct and the size does not change. The new field is used only by user space applications which may use this to expose the HW filtering facilities in the crypto firmware layers. Signed-off-by: Harald Freudenberger <freude@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-02-15drm: panel-orientation-quirks: Add quirk for Lenovo IdeaPad Duet 3 10IGL5Darrell Kavanagh
Another Lenovo convertable where the panel is installed landscape but is reported to the kernel as portrait. Signed-off-by: Darrell Kavanagh <darrell.kavanagh@gmail.com> Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230214164659.3583-1-darrell.kavanagh@gmail.com
2023-02-15net/mlx5: Configure IPsec steering for egress RoCEv2 trafficMark Zhang
Add steering table/rule in RDMA_TX domain, to forward all traffic to IPsec crypto table in NIC domain. Signed-off-by: Mark Zhang <markzhang@nvidia.com> Signed-off-by: Patrisious Haddad <phaddad@nvidia.com> Reviewed-by: Raed Salem <raeds@nvidia.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-02-15net/mlx5: Configure IPsec steering for ingress RoCEv2 trafficMark Zhang
Add steering tables/rules to check if the decrypted traffic is RoCEv2, if so then forward it to RDMA_RX domain. Signed-off-by: Mark Zhang <markzhang@nvidia.com> Signed-off-by: Patrisious Haddad <phaddad@nvidia.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Reviewed-by: Raed Salem <raeds@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-02-15net/mlx5: Add IPSec priorities in RDMA namespacesMark Zhang
Add IPSec flow steering priorities in RDMA namespaces. This allows adding tables/rules to forward RoCEv2 traffic to the IPSec crypto tables in NIC_TX domain, and accept RoCEv2 traffic from NIC_RX domain. Signed-off-by: Mark Zhang <markzhang@nvidia.com> Signed-off-by: Patrisious Haddad <phaddad@nvidia.com> Reviewed-by: Maor Gottlieb <maorg@nvidia.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-02-15net/mlx5: Implement new destination type TABLE_TYPEMark Zhang
Implement new destination type to support flow transition between different table types. e.g. from NIC_RX to RDMA_RX or from RDMA_TX to NIC_TX. The new destination is described in the tracepoint as follows: "mlx5_fs_add_rule: rule=00000000d53cd0ed fte=0000000048a8a6ed index=0 sw_action=<> [dst] flow_table_type=7 id:262152" Signed-off-by: Mark Zhang <markzhang@nvidia.com> Signed-off-by: Patrisious Haddad <phaddad@nvidia.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-02-15net/mlx5: Introduce new destination type TABLE_TYPEPatrisious Haddad
This new destination type supports flow transition between different table types, e.g. from NIC_RX to RDMA_RX or from RDMA_TX to NIC_TX. Signed-off-by: Patrisious Haddad <phaddad@nvidia.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-02-15wifi: rtw89: move H2C of del_pkt_offload before polling FW status readyChin-Yen Lee
The H2C of del_pkt_offload must be called before polling FW status ready, otherwise the following downloading normal FW will fail. Fixes: 5c12bb66b79d ("wifi: rtw89: refine packet offload flow") Signed-off-by: Chin-Yen Lee <timlee@realtek.com> Signed-off-by: Ping-Ke Shih <pkshih@realtek.com> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://lore.kernel.org/r/20230214114314.5268-1-pkshih@realtek.com
2023-02-15wifi: rtw89: use readable return 0 in rtw89_mac_cfg_ppdu_status()Ping-Ke Shih
For normal (successful) flow, it must return 0. The original code uses 'return ret', and then we need to backward reference to initial value to know 'ret = 0'. Changing them to 'return 0', because it will be more readable and intuitive. This patch doesn't change logic at all. Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <error27@gmail.com> Link: https://lore.kernel.org/r/202302101023.ctlih5q0-lkp@intel.com/ Signed-off-by: Ping-Ke Shih <pkshih@realtek.com> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://lore.kernel.org/r/20230213091328.25481-1-pkshih@realtek.com
2023-02-15wifi: rtw88: usb: drop now unnecessary URB size checkSascha Hauer
Now that we send URBs with the URB_ZERO_PACKET flag set we no longer need to make sure that the URB sizes are not multiple of the bulkout_size. Drop the check. Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de> Reviewed-by: Ping-Ke Shih <pkshih@realtek.com> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://lore.kernel.org/r/20230210111632.1985205-4-s.hauer@pengutronix.de
2023-02-15wifi: rtw88: usb: send Zero length packets if necessarySascha Hauer
Zero length packets are necessary when sending URBs with size multiple of bulkout_size, otherwise the hardware just stalls. Fixes: a82dfd33d1237 ("wifi: rtw88: Add common USB chip support") Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de> Reviewed-by: Ping-Ke Shih <pkshih@realtek.com> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://lore.kernel.org/r/20230210111632.1985205-3-s.hauer@pengutronix.de
2023-02-15wifi: rtw88: usb: Set qsel correctlySascha Hauer
We have to extract qsel from the skb before doing skb_push() on it, otherwise qsel will always be 0. Fixes: a82dfd33d1237 ("wifi: rtw88: Add common USB chip support") Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de> Reviewed-by: Ping-Ke Shih <pkshih@realtek.com> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://lore.kernel.org/r/20230210111632.1985205-2-s.hauer@pengutronix.de
2023-02-15ksmbd: do not allow the actual frame length to be smaller than the rfc1002 ↵Namjae Jeon
length ksmbd allowed the actual frame length to be smaller than the rfc1002 length. If allowed, it is possible to allocates a large amount of memory that can be limited by credit management and can eventually cause memory exhaustion problem. This patch do not allow it except SMB2 Negotiate request which will be validated when message handling proceeds. Also, Allow a message that padded to 8byte boundary. Fixes: e2f34481b24d ("cifsd: add server-side procedures for SMB3") Cc: stable@vger.kernel.org Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2023-02-15ksmbd: fix wrong data area length for smb2 lock requestNamjae Jeon
When turning debug mode on, The following error message from ksmbd_smb2_check_message() is coming. ksmbd: cli req padded more than expected. Length 112 not 88 for cmd:10 mid:14 data area length calculation for smb2 lock request in smb2_get_data_area_len() is incorrect. Fixes: e2f34481b24d ("cifsd: add server-side procedures for SMB3") Cc: stable@vger.kernel.org Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2023-02-15ksmbd: Fix parameter name and comment mismatchJiapeng Chong
fs/ksmbd/vfs.c:965: warning: Function parameter or member 'attr_value' not described in 'ksmbd_vfs_setxattr'. Reported-by: Abaci Robot <abaci@linux.alibaba.com> Link: https://bugzilla.openanolis.cn/show_bug.cgi?id=3946 Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Reviewed-by: Sergey Senozhatsky <senozhatsky@chromium.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2023-02-14Merge patch series "Remove toolchain dependencies for Zicbom"Palmer Dabbelt
Conor Dooley <conor@kernel.org> says: From: Conor Dooley <conor.dooley@microchip.com> I've yoinked patch 1 from Drew's series adding support for Zicboz & attached two more patches here that remove the need for, and then drop the toolchain support checks for Zicbom. The goal is to remove the need for checking the presence of toolchain Zicbom support in the work being done to support non instruction based CMOs [1]. I've tested compliation on a number of different configurations with the Zicbom config option enabled. The important ones to call out I guess are: - clang/llvm 14 w/ LLVM=1 which doesn't support Zicbom atm. - gcc 11 w/ binutils 2.37 which doesn't support Zicbom atm either. - clang/llvm 15 w/ LLVM=1 BUT with binutils 2.37's ld. This is the configuration that prompted adding the LD checks as cc/as supports Zicbom, but ld doesn't [2]. - gcc 12 w/ binutils 2.39 & clang 15 w/ LLVM=1, both of these supported Zicbom before and still do. I also checked building the THEAD errata etc with CONFIG_RISCV_ISA_ZICBOM disabled, and there were no build issues there either. * b4-shazam-merge: RISC-V: remove toolchain version checks for Zicbom RISC-V: replace cbom instructions with an insn-def RISC-V: insn-def: Add I-type insn-def Link: https://lore.kernel.org/r/20230108163356.3063839-1-conor@kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-02-14RISC-V: remove toolchain version checks for ZicbomConor Dooley
Commit b8c86872d1dc ("riscv: fix detection of toolchain Zicbom support") fixed building on systems where Zicbom was supported by the compiler/assembler but not by the linker in an easily backportable manner. Now that the we have insn-defs for the 3 instructions, toolchain support is no longer required for Zicbom. Stop emitting "_zicbom" in -march when Zicbom is enabled & drop the version checks entirely. Signed-off-by: Conor Dooley <conor.dooley@microchip.com> Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Link: https://lore.kernel.org/r/20230108163356.3063839-4-conor@kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-02-14RISC-V: replace cbom instructions with an insn-defConor Dooley
Using the cbom instructions directly in ALT_CMO_OP, requires toolchain support for the instructions. Using an insn-def will allow for removal of toolchain version checks in the build system & simplification of the proposed [1] function-based CMO scheme. Link: https://lore.kernel.org/linux-riscv/fb3b34ae-e35e-4dc2-a8f4-19984a2f58a8@app.fastmail.com/ [1] Signed-off-by: Conor Dooley <conor.dooley@microchip.com> Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Link: https://lore.kernel.org/r/20230108163356.3063839-3-conor@kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-02-14RISC-V: insn-def: Add I-type insn-defAndrew Jones
CBO instructions use the I-type of instruction format where the immediate is used to identify the CBO instruction type. Add I-type instruction encoding support to insn-def. Signed-off-by: Andrew Jones <ajones@ventanamicro.com> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Reviewed-by: Heiko Stuebner <heiko@sntech.de> Signed-off-by: Conor Dooley <conor.dooley@microchip.com> Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Link: https://lore.kernel.org/r/20230108163356.3063839-2-conor@kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-02-14netlink-specs: add rx-push to ethtool familyJakub Kicinski
Commit 5b4e9a7a71ab ("net: ethtool: extend ringparam set/get APIs for rx_push") added a new attr for configuring rx-push, right after tx-push. Add it to the spec, the ring param operation is covered by the otherwise sparse ethtool spec. Reviewed-by: Shannon Nelson <shannon.nelson@amd.com> Link: https://lore.kernel.org/r/20230214043246.230518-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-02-14Merge branch 'net-make-kobj_type-structures-constant'Jakub Kicinski
Thomas Weißschuh says: ==================== net: make kobj_type structures constant Since commit ee6d3dd4ed48 ("driver core: make kobj_type constant.") the driver core allows the usage of const struct kobj_type. Take advantage of this to constify the structure definitions to prevent modification at runtime. ==================== Link: https://lore.kernel.org/r/20230211-kobj_type-net-v2-0-013b59e59bf3@weissschuh.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-02-14net-sysfs: make kobj_type structures constantThomas Weißschuh
Since commit ee6d3dd4ed48 ("driver core: make kobj_type constant.") the driver core allows the usage of const struct kobj_type. Take advantage of this to constify the structure definitions to prevent modification at runtime. Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Signed-off-by: Jakub Kicinski <kuba@kernel.org>