summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2022-05-27objtool: Add CONFIG_HAVE_UACCESS_VALIDATIONJosh Poimboeuf
Allow an arch specify that it has objtool uaccess validation with CONFIG_HAVE_UACCESS_VALIDATION. For now, doing so unconditionally selects CONFIG_OBJTOOL. Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/d393d5e2fe73aec6e8e41d5c24f4b6fe8583f2d8.1650384225.git.jpoimboe@redhat.com
2022-05-27x86/mm: Use PAGE_ALIGNED(x) instead of IS_ALIGNED(x, PAGE_SIZE)Fanjun Kong
The <linux/mm.h> already provides the PAGE_ALIGNED() macro. Let's use this macro instead of IS_ALIGNED() and passing PAGE_SIZE directly. No change in functionality. [ mingo: Tweak changelog. ] Signed-off-by: Fanjun Kong <bh1scw@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20220526142038.1582839-1-bh1scw@gmail.com
2022-05-27x86: Fix all occurences of the "the the" typoBo Liu
Rather than waiting for the bots to fix these one-by-one, fix all occurences of "the the" throughout arch/x86. Signed-off-by: Bo Liu <liubo03@inspur.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Link: https://lore.kernel.org/r/20220527061400.5694-1-liubo03@inspur.com
2022-05-27perf/core: Remove unused local variableHaowen Bai
Drop LIST_HEAD() where the variable it declares is never used. Compiler probably never warned us, because the LIST_HEAD() initializer is technically 'usage'. [ mingo: Tweak changelog. ] Signed-off-by: Haowen Bai <baihaowen@meizu.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/1653645835-29206-1-git-send-email-baihaowen@meizu.com
2022-05-27Merge git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nfDavid S. Miller
Pablo Neira Ayuso says: ==================== Netfilter fixes for net The following contain more Netfilter fixes for net: 1) syzbot warning in nfnetlink bind, from Florian. 2) Refetch conntrack after __nf_conntrack_confirm(), from Florian Westphal. 3) Move struct nf_ct_timeout back at the bottom of the ctnl_time, to where it before recent update, also from Florian. 4) Add NL_SET_BAD_ATTR() to nf_tables netlink for proper set element commands error reporting. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2022-05-27netfilter: nf_tables: set element extended ACK reporting supportPablo Neira Ayuso
Report the element that causes problems via netlink extended ACK for set element commands. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2022-05-27netfilter: cttimeout: fix slab-out-of-bounds read in cttimeout_net_exitFlorian Westphal
syzbot reports: BUG: KASAN: slab-out-of-bounds in __list_del_entry_valid+0xcc/0xf0 lib/list_debug.c:42 [..] list_del include/linux/list.h:148 [inline] cttimeout_net_exit+0x211/0x540 net/netfilter/nfnetlink_cttimeout.c:617 No reproducer so far. Looking at recent changes in this area its clear that the free_head must not be at the end of the structure because nf_ct_timeout structure has variable size. Reported-by: <syzbot+92968395eedbdbd3617d@syzkaller.appspotmail.com> Fixes: 78222bacfca9 ("netfilter: cttimeout: decouple unlink and free on netns destruction") Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2022-05-27netfilter: conntrack: re-fetch conntrack after insertionFlorian Westphal
In case the conntrack is clashing, insertion can free skb->_nfct and set skb->_nfct to the already-confirmed entry. This wasn't found before because the conntrack entry and the extension space used to free'd after an rcu grace period, plus the race needs events enabled to trigger. Reported-by: <syzbot+793a590957d9c1b96620@syzkaller.appspotmail.com> Fixes: 71d8c47fc653 ("netfilter: conntrack: introduce clash resolution on insertion race") Fixes: 2ad9d7747c10 ("netfilter: conntrack: free extension area immediately") Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2022-05-27netfilter: nfnetlink: fix warn in nfnetlink_unbindFlorian Westphal
syzbot reports following warn: WARNING: CPU: 0 PID: 3600 at net/netfilter/nfnetlink.c:703 nfnetlink_unbind+0x357/0x3b0 net/netfilter/nfnetlink.c:694 The syzbot generated program does this: socket(AF_NETLINK, SOCK_RAW, NETLINK_NETFILTER) = 3 setsockopt(3, SOL_NETLINK, NETLINK_DROP_MEMBERSHIP, [1], 4) = 0 ... which triggers 'WARN_ON_ONCE(nfnlnet->ctnetlink_listeners == 0)' check. Instead of counting, just enable reporting for every bind request and check if we still have listeners on unbind. While at it, also add the needed bounds check on nfnl_group2type[] access. Reported-by: <syzbot+4903218f7fba0a2d6226@syzkaller.appspotmail.com> Reported-by: <syzbot+afd2d80e495f96049571@syzkaller.appspotmail.com> Fixes: 2794cdb0b97b ("netfilter: nfnetlink: allow to detect if ctnetlink listeners exist") Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2022-05-27xen: switch gnttab_end_foreign_access() to take a struct page pointerJuergen Gross
Instead of a virtual kernel address use a pointer of the associated struct page as second parameter of gnttab_end_foreign_access(). Most users have that pointer available already and are creating the virtual address from it, risking problems in case the memory is located in highmem. gnttab_end_foreign_access() itself won't need to get the struct page from the address again. Suggested-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Juergen Gross <jgross@suse.com>
2022-05-27Merge tag 'timers-v5.19-rc1' of ↵Thomas Gleixner
https://git.linaro.org/people/daniel.lezcano/linux into timers/core Pull clockevent/clocksource driver updates from Daniel Lezcano: - Add Mediatek MT8186 DT bindings (Allen-KH Cheng) - Remove dead code corresponding of the IXP4xx board removal (Linus Walleij) - Add CLOCK_EVT_FEAT_C3STOP flag for the RISC-V SBI timer (Samuel Holland) - Do not return an error if there are multiple definitions of the sp804 timers in the DT (Andre Przywara) - Add the missing SPDX identifier (Thomas Gleixner) - Remove an unncessary NULL check as it is done right before at probe time for the timer-ti-dm (Dan Carpenter) - Fix the irq_of_parse_and_map() return code check on onexas-nps (Krzysztof Kozlowski) Link: https://lore.kernel.org/lkml/b5a83e54-1ee1-f910-4be4-bc3bf1015243@linaro.org
2022-05-27kbuild: replace $(if A,A,B) with $(or A,B) in scripts/Makefile.modpostMasahiro Yamada
Similar cleanup to commit 5c8166419acf ("kbuild: replace $(if A,A,B) with $(or A,B)"). Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
2022-05-27modpost: squash if...else-if in find_elf_symbol2()Masahiro Yamada
if ((addr - sym->st_value) < distance) { distance = addr - sym->st_value; near = sym; } else if ((addr - sym->st_value) == distance) { near = sym; } is equivalent to: if (addr - sym->st_value <= distance) { distance = addr - sym->st_value; near = sym; } (The else-if block can overwrite 'distance' with the same value). Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
2022-05-27modpost: reuse ARRAY_SIZE() macro for section_mismatch()Masahiro Yamada
Move ARRAY_SIZE() from file2alias.c to modpost.h to reuse it in section_mismatch(). Also, move the variable 'check' inside the for-loop. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
2022-05-27modpost: remove the unused argument of check_sec_ref()Masahiro Yamada
check_sec_ref() does not use the first parameter 'mod'. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
2022-05-27modpost: fix undefined behavior of is_arm_mapping_symbol()Masahiro Yamada
The return value of is_arm_mapping_symbol() is unpredictable when "$" is passed in. strchr(3) says: The strchr() and strrchr() functions return a pointer to the matched character or NULL if the character is not found. The terminating null byte is considered part of the string, so that if c is specified as '\0', these functions return a pointer to the terminator. When str[1] is '\0', strchr("axtd", str[1]) is not NULL, and str[2] is referenced (i.e. buffer overrun). Test code --------- char str1[] = "abc"; char str2[] = "ab"; strcpy(str1, "$"); strcpy(str2, "$"); printf("test1: %d\n", is_arm_mapping_symbol(str1)); printf("test2: %d\n", is_arm_mapping_symbol(str2)); Result ------ test1: 0 test2: 1 Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
2022-05-27modpost: fix removing numeric suffixesAlexander Lobakin
With the `-z unique-symbol` linker flag or any similar mechanism, it is possible to trigger the following: ERROR: modpost: "param_set_uint.0" [vmlinux] is a static EXPORT_SYMBOL The reason is that for now the condition from remove_dot(): if (m && (s[n + m] == '.' || s[n + m] == 0)) which was designed to test if it's a dot or a '\0' after the suffix is never satisfied. This is due to that `s[n + m]` always points to the last digit of a numeric suffix, not on the symbol next to it (from a custom debug print added to modpost): param_set_uint.0, s[n + m] is '0', s[n + m + 1] is '\0' So it's off-by-one and was like that since 2014. Fix this for the sake of any potential upcoming features, but don't bother stable-backporting, as it's well hidden -- apart from that LD flag, it can be triggered only with GCC LTO which never landed upstream. Fixes: fcd38ed0ff26 ("scripts: modpost: fix compilation warning") Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com> Reviewed-by: Petr Mladek <pmladek@suse.com> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2022-05-27um: Fix out-of-bounds read in LDT setupVincent Whitchurch
syscall_stub_data() expects the data_count parameter to be the number of longs, not bytes. ================================================================== BUG: KASAN: stack-out-of-bounds in syscall_stub_data+0x70/0xe0 Read of size 128 at addr 000000006411f6f0 by task swapper/1 CPU: 0 PID: 1 Comm: swapper Not tainted 5.18.0+ #18 Call Trace: show_stack.cold+0x166/0x2a7 __dump_stack+0x3a/0x43 dump_stack_lvl+0x1f/0x27 print_report.cold+0xdb/0xf81 kasan_report+0x119/0x1f0 kasan_check_range+0x3a3/0x440 memcpy+0x52/0x140 syscall_stub_data+0x70/0xe0 write_ldt_entry+0xac/0x190 init_new_ldt+0x515/0x960 init_new_context+0x2c4/0x4d0 mm_init.constprop.0+0x5ed/0x760 mm_alloc+0x118/0x170 0x60033f48 do_one_initcall+0x1d7/0x860 0x60003e7b kernel_init+0x6e/0x3d4 new_thread_handler+0x1e7/0x2c0 The buggy address belongs to stack of task swapper/1 and is located at offset 64 in frame: init_new_ldt+0x0/0x960 This frame has 2 objects: [32, 40) 'addr' [64, 80) 'desc' ================================================================== Fixes: 858259cf7d1c443c83 ("uml: maintain own LDT entries") Signed-off-by: Vincent Whitchurch <vincent.whitchurch@axis.com> Cc: stable@vger.kernel.org Signed-off-by: Richard Weinberger <richard@nod.at>
2022-05-27um: chan_user: Fix winch_tramp() return valueJohannes Berg
The previous fix here was only partially correct, it did result in returning a proper error value in case of error, but it also clobbered the pid that we need to return from this function (not just zero for success). As a result, it returned 0 here, but later this is treated as a pid and used to kill the process, but since it's now 0 we kill(0, SIGKILL), which makes UML kill itself rather than just the helper thread. Fix that and make it more obvious by using a separate variable for the pid. Fixes: ccf1236ecac4 ("um: fix error return code in winch_tramp()") Reported-and-tested-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Cc: stable@vger.kernel.org Signed-off-by: Richard Weinberger <richard@nod.at>
2022-05-27um: virtio_uml: Fix broken device handling in time-travelJohannes Berg
If a device implementation crashes, virtio_uml will mark it as dead by calling virtio_break_device() and scheduling the work that will remove it. This still seems like the right thing to do, but it's done directly while reading the message, and if time-travel is used, this is in the time-travel handler, outside of the normal Linux machinery. Therefore, we cannot acquire locks or do normal "linux-y" things because e.g. lockdep will be confused about the context. Move handling this situation out of the read function and into the actual IRQ handler and response handling instead, so that in the case of time-travel we don't call it in the wrong context. Chances are the system will still crash immediately, since the device implementation crashing may also cause the time- travel controller to go down, but at least all of that now happens without strange warnings from lockdep. Fixes: c8177aba37ca ("um: time-travel: rework interrupt handling in ext mode") Cc: stable@vger.kernel.org Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Richard Weinberger <richard@nod.at>
2022-05-27um: line: Use separate IRQs per lineJohannes Berg
Today, all possible serial lines (ssl*=) as well as all possible consoles (con*=) each share a single interrupt (with a fixed number) with others of the same type. Now, if you have two lines, say ssl0 and ssl1, and one of them is connected to an fd you cannot read (e.g. a file), but the other gets a read interrupt, then both of them get the interrupt since it's shared. Then, the read() call will return EOF, since it's a file being written and there's nothing to read (at least not at the current offset, at the end). Unfortunately, this is treated as a read error, and we close this line, losing all the possible output. It might be possible to work around this and make the IRQ sharing work, however, now that we have dynamically allocated IRQs that are easy to use, simply use that to achieve separating between the events; then there's no interrupt for that line and we never attempt the read in the first place, thus not closing the line. This manifested itself in the wifi hostap/hwsim tests where the parallel script communicates via one serial console and the kernel messages go to another (a file) and sending data on the communication console caused the kernel messages to stop flowing into the file. Reported-by: Jouni Malinen <j@w1.fi> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Acked-By: anton ivanov <anton.ivanov@cambridgegreys.com> Signed-off-by: Richard Weinberger <richard@nod.at>
2022-05-27net: dsa: mv88e6xxx: Fix refcount leak in mv88e6xxx_mdios_registerMiaoqian Lin
of_get_child_by_name() returns a node pointer with refcount incremented, we should use of_node_put() on it when done. mv88e6xxx_mdio_register() pass the device node to of_mdiobus_register(). We don't need the device node after it. Add missing of_node_put() to avoid refcount leak. Fixes: a3c53be55c95 ("net: dsa: mv88e6xxx: Support multiple MDIO busses") Signed-off-by: Miaoqian Lin <linmq006@gmail.com> Reviewed-by: Marek Behún <kabel@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-05-27um: Enable ARCH_HAS_GCOV_PROFILE_ALLVincent Whitchurch
Enable ARCH_HAS_GCOV_PROFILE_ALL so that CONFIG_GCOV_PROFILE_ALL can be selected on UML. I didn't need to explicitly disable GCOV on anything to get this to work on the configs I tested. Signed-off-by: Vincent Whitchurch <vincent.whitchurch@axis.com> Signed-off-by: Richard Weinberger <richard@nod.at>
2022-05-27um: Use asm-generic/dma-mapping.hJohannes Berg
If DMA (PCI over virtio) is enabled, then some drivers may enable CONFIG_DMA_OPS as well, and then we pull in the x86 definition of get_arch_dma_ops(), which uses the dma_ops symbol, which isn't defined. Since we don't have real DMA ops nor any kind of IOMMU fix this in the simplest possible way: pull in the asm-generic file instead of inheriting the x86 one. It's not clear why those drivers that do (e.g. VDPA) "select DMA_OPS", and if they'd even work with this, but chances are nobody will be wanting to do that anyway, so fixing the build failure is good enough. Reported-by: Randy Dunlap <rdunlap@infradead.org> Fixes: 68f5d3f3b654 ("um: add PCI over virtio emulation driver") Signed-off-by: Johannes Berg <johannes.berg@intel.com> Tested-by: Randy Dunlap <rdunlap@infradead.org> Acked-by: Randy Dunlap <rdunlap@infradead.org> Cc: stable@vger.kernel.org Signed-off-by: Richard Weinberger <richard@nod.at>
2022-05-27net: ethernet: ti: am65-cpsw-nuss: Fix some refcount leaksMiaoqian Lin
of_get_child_by_name() returns a node pointer with refcount incremented, we should use of_node_put() on it when not need anymore. am65_cpsw_init_cpts() and am65_cpsw_nuss_probe() don't release the refcount in error case. Add missing of_node_put() to avoid refcount leak. Fixes: b1f66a5bee07 ("net: ethernet: ti: am65-cpsw-nuss: enable packet timestamping support") Fixes: 93a76530316a ("net: ethernet: ti: introduce am65x/j721e gigabit eth subsystem driver") Signed-off-by: Miaoqian Lin <linmq006@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-05-27um: daemon: Make default socket configurableJohannes Berg
Even if daemon network is deprecated, some configurations may still use it (e.g. Debian), and not want to default to the /tmp/uml.ctl socket location. Allow configuring the default socket location. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Acked-By: Anton Ivanov <anton.ivanov@cambridgegreys.com> Tested-by: Ritesh Raj Sarraf <ritesh@collabora.com> Signed-off-by: Richard Weinberger <richard@nod.at>
2022-05-27net: ethernet: mtk_eth_soc: out of bounds read in mtk_hwlro_get_fdir_entry()Dan Carpenter
The "fsp->location" variable comes from user via ethtool_get_rxnfc(). Check that it is valid to prevent an out of bounds read. Fixes: 7aab747e5563 ("net: ethernet: mediatek: add ethtool functions to configure RX flows of HW LRO") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-05-27scripts/kallsyms: update usage message of the kallsyms programYuntao Wang
The kallsyms program supports --absolute-percpu option but does not display it in the usage message, fix it. Signed-off-by: Yuntao Wang <ytcoode@gmail.com> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2022-05-27kbuild: Fix include path in scripts/Makefile.modpostJing Leng
When building an external module, if users don't need to separate the compilation output and source code, they run the following command: "make -C $(LINUX_SRC_DIR) M=$(PWD)". At this point, "$(KBUILD_EXTMOD)" and "$(src)" are the same. If they need to separate them, they run "make -C $(KERNEL_SRC_DIR) O=$(KERNEL_OUT_DIR) M=$(OUT_DIR) src=$(PWD)". Before running the command, they need to copy "Kbuild" or "Makefile" to "$(OUT_DIR)" to prevent compilation failure. So the kernel should change the included path to avoid the copy operation. Signed-off-by: Jing Leng <jleng@ambarella.com> [masahiro: I do not think "M=$(OUT_DIR) src=$(PWD)" is the official way, but this patch is a nice clean up anyway.] Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2022-05-27um: xterm: Make default terminal emulator configurableJohannes Berg
Make the default terminal emulator configurable so e.g. Debian can set it to x-terminal-emulator instead of the current default of xterm. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Acked-By: Anton Ivanov <anton.ivanov@cambridgegreys.com> Tested-by: Ritesh Raj Sarraf <ritesh@collabora.com> Signed-off-by: Richard Weinberger <richard@nod.at>
2022-05-26Merge tag 'for-5.19/dm-changes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm Pull device mapper updates from Mike Snitzer: - Enable DM core bioset's per-cpu bio cache if QUEUE_FLAG_POLL set. This change improves DM's hipri bio polling (REQ_POLLED) performance by 7 - 20% depending on the system. - Update DM core to use jump_labels to further reduce cost of unlikely branches for zoned block devices, dm-stats and swap_bios throttling. - Various DM core changes to reduce bio-based DM overhead and simplify IO accounting. - Fundamental DM core improvements to dm_io reference counting and the elimination of using bio_split()+bio_chain() -- instead DM's bio-based IO accounting is updated to account that a split occurred. - Improve DM core's abnormal bio processing to do less work. - Improve DM core's hipri polling support to use a single list rather than an hlist. - Update DM core to pass NULL bdev to bio_alloc_clone() so that initialization that isn't useful for DM can be elided. - Add cond_resched to DM stats' various loops that loop over all entries. - Fix incorrect error code return from DM integrity's constructor. - Make DM crypt's printing of the key constant-time. - Update bio-based DM multipath to provide high-resolution timer to the Historical Service Time (HST) path selector. * tag 'for-5.19/dm-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: (26 commits) dm: pass NULL bdev to bio_alloc_clone dm cache metadata: remove unnecessary variable in __dump_mapping dm mpath: provide high-resolution timer to HST for bio-based dm crypt: make printing of the key constant-time dm integrity: fix error code in dm_integrity_ctr() dm stats: add cond_resched when looping over entries dm: improve abnormal bio processing dm: simplify bio-based IO accounting further dm: put all polled dm_io instances into a single list dm: improve dm_io reference counting dm: don't grab target io reference in dm_zone_map_bio dm: improve bio splitting and associated IO accounting dm: switch to bdev based IO accounting interfaces dm: pass dm_io instance to dm_io_acct directly dm: don't pass bio to __dm_start_io_acct and dm_end_io_acct dm: use bio_sectors in dm_aceept_partial_bio dm: simplify basic targets dm: conditionally enable branching for less used features dm: introduce dm_{get,put}_live_table_bio called from dm_submit_bio dm: move hot dm_io members to same cacheline as dm_target_io ...
2022-05-26Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdmaLinus Torvalds
Pull rdma updates from Jason Gunthorpe: "Small collection of incremental improvement patches: - Minor code cleanup patches, comment improvements, etc from static tools - Clean the some of the kernel caps, reducing the historical stealth uAPI leftovers - Bug fixes and minor changes for rdmavt, hns, rxe, irdma - Remove unimplemented cruft from rxe - Reorganize UMR QP code in mlx5 to avoid going through the IB verbs layer - flush_workqueue(system_unbound_wq) removal - Ensure rxe waits for objects to be unused before allowing the core to free them - Several rc quality bug fixes for hfi1" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (67 commits) RDMA/rtrs-clt: Fix one kernel-doc comment RDMA/hfi1: Remove all traces of diagpkt support RDMA/hfi1: Consolidate software versions RDMA/hfi1: Remove pointless driver version RDMA/hfi1: Fix potential integer multiplication overflow errors RDMA/hfi1: Prevent panic when SDMA is disabled RDMA/hfi1: Prevent use of lock before it is initialized RDMA/rxe: Fix an error handling path in rxe_get_mcg() IB/core: Fix typo in comment RDMA/core: Fix typo in comment IB/hf1: Fix typo in comment IB/qib: Fix typo in comment IB/iser: Fix typo in comment RDMA/mlx4: Avoid flush_scheduled_work() usage IB/isert: Avoid flush_scheduled_work() usage RDMA/mlx5: Remove duplicate pointer assignment in mlx5_ib_alloc_implicit_mr() RDMA/qedr: Remove unnecessary synchronize_irq() before free_irq() RDMA/hns: Use hr_reg_read() instead of remaining roce_get_xxx() RDMA/hns: Use hr_reg_xxx() instead of remaining roce_set_xxx() RDMA/irdma: Add SW mechanism to generate completions on error ...
2022-05-26Merge tag 'hardening-v5.19-rc1-fix1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull kernel hardening fix from Kees Cook: "This fixes an unlucky build race condition when using the GCC plugins, noticed by a few folks. - Avoid GCC plugins needing utsrelease.h build target (Masahiro Yamada)" * tag 'hardening-v5.19-rc1-fix1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: gcc-plugins: use KERNELVERSION for plugin version
2022-05-26Merge tag 'nfsd-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linuxLinus Torvalds
Pull nfsd updates from Chuck Lever: "We introduce 'courteous server' in this release. Previously NFSD would purge open and lock state for an unresponsive client after one lease period (typically 90 seconds). Now, after one lease period, another client can open and lock those files and the unresponsive client's lease is purged; otherwise if the unresponsive client's open and lock state is uncontended, the server retains that open and lock state for up to 24 hours, allowing the client's workload to resume after a lengthy network partition. A longstanding issue with NFSv4 file creation is also addressed. Previously a file creation can fail internally, returning an error to the client, but leave the newly created file in place as an artifact. The file creation code path has been reorganized so that internal failures and race conditions are less likely to result in an unwanted file creation. A fault injector has been added to help exercise paths that are run during kernel metadata cache invalidation. These caches contain information maintained by user space about exported filesystems. Many of our test workloads do not trigger cache invalidation. There is one patch that is needed to support PREEMPT_RT and a fix for an ancient 'sleep while spin-locked' splat that seems to have become easier to hit since v5.18-rc3" * tag 'nfsd-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: (36 commits) NFSD: nfsd_file_put() can sleep NFSD: Add documenting comment for nfsd4_release_lockowner() NFSD: Modernize nfsd4_release_lockowner() NFSD: Fix possible sleep during nfsd4_release_lockowner() nfsd: destroy percpu stats counters after reply cache shutdown nfsd: Fix null-ptr-deref in nfsd_fill_super() nfsd: Unregister the cld notifier when laundry_wq create failed SUNRPC: Use RMW bitops in single-threaded hot paths NFSD: Clean up the show_nf_flags() macro NFSD: Trace filecache opens NFSD: Move documenting comment for nfsd4_process_open2() NFSD: Fix whitespace NFSD: Remove dprintk call sites from tail of nfsd4_open() NFSD: Instantiate a struct file when creating a regular NFSv4 file NFSD: Clean up nfsd_open_verified() NFSD: Remove do_nfsd_create() NFSD: Refactor NFSv4 OPEN(CREATE) NFSD: Refactor NFSv3 CREATE NFSD: Refactor nfsd_create_setattr() NFSD: Avoid calling fh_drop_write() twice in do_nfsd_create() ...
2022-05-26net: sched: fixed barrier to prevent skbuff sticking in qdisc backlogVincent Ray
In qdisc_run_begin(), smp_mb__before_atomic() used before test_bit() does not provide any ordering guarantee as test_bit() is not an atomic operation. This, added to the fact that the spin_trylock() call at the beginning of qdisc_run_begin() does not guarantee acquire semantics if it does not grab the lock, makes it possible for the following statement : if (test_bit(__QDISC_STATE_MISSED, &qdisc->state)) to be executed before an enqueue operation called before qdisc_run_begin(). As a result the following race can happen : CPU 1 CPU 2 qdisc_run_begin() qdisc_run_begin() /* true */ set(MISSED) . /* returns false */ . . /* sees MISSED = 1 */ . /* so qdisc not empty */ . __qdisc_run() . . . pfifo_fast_dequeue() ----> /* may be done here */ . | . clear(MISSED) | . . | . smp_mb __after_atomic(); | . . | . /* recheck the queue */ | . /* nothing => exit */ | enqueue(skb1) | . | qdisc_run_begin() | . | spin_trylock() /* fail */ | . | smp_mb__before_atomic() /* not enough */ | . ---- if (test_bit(MISSED)) return false; /* exit */ In the above scenario, CPU 1 and CPU 2 both try to grab the qdisc->seqlock at the same time. Only CPU 2 succeeds and enters the bypass code path, where it emits its skb then calls __qdisc_run(). CPU1 fails, sets MISSED and goes down the traditionnal enqueue() + dequeue() code path. But when executing qdisc_run_begin() for the second time, after enqueuing its skbuff, it sees the MISSED bit still set (by itself) and consequently chooses to exit early without setting it again nor trying to grab the spinlock again. Meanwhile CPU2 has seen MISSED = 1, cleared it, checked the queue and found it empty, so it returned. At the end of the sequence, we end up with skb1 enqueued in the backlog, both CPUs out of __dev_xmit_skb(), the MISSED bit not set, and no __netif_schedule() called made. skb1 will now linger in the qdisc until somebody later performs a full __qdisc_run(). Associated to the bypass capacity of the qdisc, and the ability of the TCP layer to avoid resending packets which it knows are still in the qdisc, this can lead to serious traffic "holes" in a TCP connection. We fix this by replacing the smp_mb__before_atomic() / test_bit() / set_bit() / smp_mb__after_atomic() sequence inside qdisc_run_begin() by a single test_and_set_bit() call, which is more concise and enforces the needed memory barriers. Fixes: 89837eb4b246 ("net: sched: add barrier to ensure correct ordering for lockless qdisc") Signed-off-by: Vincent Ray <vray@kalrayinc.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20220526001746.2437669-1-eric.dumazet@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-26net: lan966x: check devm_of_phy_get() for -EDEFER_PROBEMichael Walle
At the moment, if devm_of_phy_get() returns an error the serdes simply isn't set. While it is bad to ignore an error in general, there is a particular bug that network isn't working if the serdes driver is compiled as a module. In that case, devm_of_phy_get() returns -EDEFER_PROBE and the error is silently ignored. The serdes is optional, it is not there if the port is using RGMII, in which case devm_of_phy_get() returns -ENODEV. Rearrange the error handling so that -ENODEV will be handled but other error codes will abort the probing. Fixes: d28d6d2e37d1 ("net: lan966x: add port module support") Signed-off-by: Michael Walle <michael@walle.cc> Link: https://lore.kernel.org/r/20220525231239.1307298-1-michael@walle.cc Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-26Merge git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nfJakub Kicinski
Pablo Neira Ayuso says: ==================== Netfilter fixes for net 1) Fix UAF when creating non-stateful expression in set. 2) Set limit cost when cloning expression accordingly, from Phil Sutter. * git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf: netfilter: nft_limit: Clone packet limits' cost value netfilter: nf_tables: disallow non-stateful expression in sets earlier ==================== Link: https://lore.kernel.org/r/20220526205411.315136-1-pablo@netfilter.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-26tracing: Fix comments for event_trigger_separate_filter()sunliming
The parameter name in comments of event_trigger_separate_filter() is inconsistent with actual parameter name, fix it. Link: https://lkml.kernel.org/r/20220526072957.165655-1-sunliming@kylinos.cn Signed-off-by: sunliming <sunliming@kylinos.cn> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2022-05-26x86/traceponit: Fix comment about irq vector tracepointssunliming
Commit: 4b9a8dca0e58 ("x86/idt: Remove the tracing IDT completely") removed the 'tracing IDT' from arch/x86/kernel/tracepoint.c, but left related comment. So that the comment become anachronistic. Just remove the comment. Link: https://lkml.kernel.org/r/20220526110831.175743-1-sunliming@kylinos.cn Signed-off-by: sunliming <sunliming@kylinos.cn> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2022-05-26x86,tracing: Remove unused headerssunliming
Commit 4b9a8dca0e58 ("x86/idt: Remove the tracing IDT completely") removed the tracing IDT from the file arch/x86/kernel/tracepoint.c, but left the related headers unused, remove it. Link: https://lkml.kernel.org/r/20220525012827.93464-1-sunliming@kylinos.cn Signed-off-by: sunliming <sunliming@kylinos.cn> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2022-05-26ftrace: Clean up hash direct_functions on register failuresSong Liu
We see the following GPF when register_ftrace_direct fails: [ ] general protection fault, probably for non-canonical address \ 0x200000000000010: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC PTI [...] [ ] RIP: 0010:ftrace_find_rec_direct+0x53/0x70 [ ] Code: 48 c1 e0 03 48 03 42 08 48 8b 10 31 c0 48 85 d2 74 [...] [ ] RSP: 0018:ffffc9000138bc10 EFLAGS: 00010206 [ ] RAX: 0000000000000000 RBX: ffffffff813e0df0 RCX: 000000000000003b [ ] RDX: 0200000000000000 RSI: 000000000000000c RDI: ffffffff813e0df0 [ ] RBP: ffffffffa00a3000 R08: ffffffff81180ce0 R09: 0000000000000001 [ ] R10: ffffc9000138bc18 R11: 0000000000000001 R12: ffffffff813e0df0 [ ] R13: ffffffff813e0df0 R14: ffff888171b56400 R15: 0000000000000000 [ ] FS: 00007fa9420c7780(0000) GS:ffff888ff6a00000(0000) knlGS:000000000 [ ] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ ] CR2: 000000000770d000 CR3: 0000000107d50003 CR4: 0000000000370ee0 [ ] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ ] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ ] Call Trace: [ ] <TASK> [ ] register_ftrace_direct+0x54/0x290 [ ] ? render_sigset_t+0xa0/0xa0 [ ] bpf_trampoline_update+0x3f5/0x4a0 [ ] ? 0xffffffffa00a3000 [ ] bpf_trampoline_link_prog+0xa9/0x140 [ ] bpf_tracing_prog_attach+0x1dc/0x450 [ ] bpf_raw_tracepoint_open+0x9a/0x1e0 [ ] ? find_held_lock+0x2d/0x90 [ ] ? lock_release+0x150/0x430 [ ] __sys_bpf+0xbd6/0x2700 [ ] ? lock_is_held_type+0xd8/0x130 [ ] __x64_sys_bpf+0x1c/0x20 [ ] do_syscall_64+0x3a/0x80 [ ] entry_SYSCALL_64_after_hwframe+0x44/0xae [ ] RIP: 0033:0x7fa9421defa9 [ ] Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 9 f8 [...] [ ] RSP: 002b:00007ffed743bd78 EFLAGS: 00000246 ORIG_RAX: 0000000000000141 [ ] RAX: ffffffffffffffda RBX: 00000000069d2480 RCX: 00007fa9421defa9 [ ] RDX: 0000000000000078 RSI: 00007ffed743bd80 RDI: 0000000000000011 [ ] RBP: 00007ffed743be00 R08: 0000000000bb7270 R09: 0000000000000000 [ ] R10: 00000000069da210 R11: 0000000000000246 R12: 0000000000000001 [ ] R13: 00007ffed743c4b0 R14: 00000000069d2480 R15: 0000000000000001 [ ] </TASK> [ ] Modules linked in: klp_vm(OK) [ ] ---[ end trace 0000000000000000 ]--- One way to trigger this is: 1. load a livepatch that patches kernel function xxx; 2. run bpftrace -e 'kfunc:xxx {}', this will fail (expected for now); 3. repeat #2 => gpf. This is because the entry is added to direct_functions, but not removed. Fix this by remove the entry from direct_functions when register_ftrace_direct fails. Also remove the last trailing space from ftrace.c, so we don't have to worry about it anymore. Link: https://lkml.kernel.org/r/20220524170839.900849-1-song@kernel.org Cc: stable@vger.kernel.org Fixes: 763e34e74bb7 ("ftrace: Add register_ftrace_direct()") Signed-off-by: Song Liu <song@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2022-05-26tracing: Fix comments of create_filter()sunliming
The name in comments of parameter "filter_string" in function create_filter is annotated as "filter_str", just fix it. Link: https://lkml.kernel.org/r/20220524063937.52873-1-sunliming@kylinos.cn Signed-off-by: sunliming <sunliming@kylinos.cn> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2022-05-26tracing: Disable kcov on trace_preemptirq.cCongyu Liu
Functions in trace_preemptirq.c could be invoked from early interrupt code that bypasses kcov trace function's in_task() check. Disable kcov on this file to reduce random code coverage. Link: https://lkml.kernel.org/r/20220523063033.1778974-1-liu3101@purdue.edu Acked-by: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Congyu Liu <liu3101@purdue.edu> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2022-05-26tracing: Initialize integer variable to prevent garbage return valueGautam Menghani
Initialize the integer variable to 0 to fix the clang scan warning: Undefined or garbage value returned to caller [core.uninitialized.UndefReturn] return ret; Link: https://lkml.kernel.org/r/20220522061826.1751-1-gautammenghani201@gmail.com Cc: stable@vger.kernel.org Fixes: 8993665abcce ("tracing/boot: Support multiple handlers for per-event histogram") Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Gautam Menghani <gautammenghani201@gmail.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2022-05-26ftrace: Fix typo in commentJulia Lawall
Spelling mistake (triple letters) in comment. Detected with the help of Coccinelle. Link: https://lkml.kernel.org/r/20220521111145.81697-81-Julia.Lawall@inria.fr Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2022-05-26ftrace: Remove return value of ftrace_arch_modify_*()Li kunyu
All instances of the function ftrace_arch_modify_prepare() and ftrace_arch_modify_post_process() return zero. There's no point in checking their return value. Just have them be void functions. Link: https://lkml.kernel.org/r/20220518023639.4065-1-kunyu@nfschina.com Signed-off-by: Li kunyu <kunyu@nfschina.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2022-05-26tracing: Cleanup code by removing init "char *name"liqiong
The pointer is assigned to "type->name" anyway. no need to initialize with "preemption". Link: https://lkml.kernel.org/r/20220513075221.26275-1-liqiong@nfschina.com Signed-off-by: liqiong <liqiong@nfschina.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2022-05-26tracing: Change "char *" string form to "char []"liqiong
The "char []" string form declares a single variable. It is better than "char *" which creates two variables in the final assembly. Link: https://lkml.kernel.org/r/20220512143230.28796-1-liqiong@nfschina.com Signed-off-by: liqiong <liqiong@nfschina.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2022-05-26tracing/timerlat: Do not wakeup the thread if the trace stops at the IRQDaniel Bristot de Oliveira
There is no need to wakeup the timerlat/ thread if stop tracing is hit at the timerlat's IRQ handler. Return before waking up timerlat's thread. Link: https://lkml.kernel.org/r/b392356c91b56aedd2b289513cc56a84cf87e60d.1652175637.git.bristot@kernel.org Cc: Juri Lelli <juri.lelli@redhat.com> Cc: Clark Williams <williams@redhat.com> Cc: Ingo Molnar <mingo@redhat.com> Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2022-05-26tracing/timerlat: Print stacktrace in the IRQ handler if neededDaniel Bristot de Oliveira
If print_stack and stop_tracing_us are set, and stop_tracing_us is hit with latency higher than or equal to print_stack, print the stack at the IRQ handler as it is useful to define the root cause for the IRQ latency. Link: https://lkml.kernel.org/r/fd04530ce98ae9270e41bb124ee5bf67b05ecfed.1652175637.git.bristot@kernel.org Cc: Juri Lelli <juri.lelli@redhat.com> Cc: Clark Williams <williams@redhat.com> Cc: Ingo Molnar <mingo@redhat.com> Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>