summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2021-07-16bpf, selftests: Add test cases for pointer alu from multiple pathsDaniel Borkmann
Add several test cases for checking update_alu_sanitation_state() under multiple paths: # ./test_verifier [...] #1061/u map access: known scalar += value_ptr unknown vs const OK #1061/p map access: known scalar += value_ptr unknown vs const OK #1062/u map access: known scalar += value_ptr const vs unknown OK #1062/p map access: known scalar += value_ptr const vs unknown OK #1063/u map access: known scalar += value_ptr const vs const (ne) OK #1063/p map access: known scalar += value_ptr const vs const (ne) OK #1064/u map access: known scalar += value_ptr const vs const (eq) OK #1064/p map access: known scalar += value_ptr const vs const (eq) OK #1065/u map access: known scalar += value_ptr unknown vs unknown (eq) OK #1065/p map access: known scalar += value_ptr unknown vs unknown (eq) OK #1066/u map access: known scalar += value_ptr unknown vs unknown (lt) OK #1066/p map access: known scalar += value_ptr unknown vs unknown (lt) OK #1067/u map access: known scalar += value_ptr unknown vs unknown (gt) OK #1067/p map access: known scalar += value_ptr unknown vs unknown (gt) OK [...] Summary: 1762 PASSED, 0 SKIPPED, 0 FAILED Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org>
2021-07-16bpf: Fix pointer arithmetic mask tightening under state pruningDaniel Borkmann
In 7fedb63a8307 ("bpf: Tighten speculative pointer arithmetic mask") we narrowed the offset mask for unprivileged pointer arithmetic in order to mitigate a corner case where in the speculative domain it is possible to advance, for example, the map value pointer by up to value_size-1 out-of- bounds in order to leak kernel memory via side-channel to user space. The verifier's state pruning for scalars leaves one corner case open where in the first verification path R_x holds an unknown scalar with an aux->alu_limit of e.g. 7, and in a second verification path that same register R_x, here denoted as R_x', holds an unknown scalar which has tighter bounds and would thus satisfy range_within(R_x, R_x') as well as tnum_in(R_x, R_x') for state pruning, yielding an aux->alu_limit of 3: Given the second path fits the register constraints for pruning, the final generated mask from aux->alu_limit will remain at 7. While technically not wrong for the non-speculative domain, it would however be possible to craft similar cases where the mask would be too wide as in 7fedb63a8307. One way to fix it is to detect the presence of unknown scalar map pointer arithmetic and force a deeper search on unknown scalars to ensure that we do not run into a masking mismatch. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org>
2021-07-16bpf: Remove superfluous aux sanitation on subprog rejectionDaniel Borkmann
Follow-up to fe9a5ca7e370 ("bpf: Do not mark insn as seen under speculative path verification"). The sanitize_insn_aux_data() helper does not serve a particular purpose in today's code. The original intention for the helper was that if function-by-function verification fails, a given program would be cleared from temporary insn_aux_data[], and then its verification would be re-attempted in the context of the main program a second time. However, a failure in do_check_subprogs() will skip do_check_main() and propagate the error to the user instead, thus such situation can never occur. Given its interaction is not compatible to the Spectre v1 mitigation (due to comparing aux->seen with env->pass_cnt), just remove sanitize_insn_aux_data() to avoid future bugs in this area. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org>
2021-07-16ASoC: rt5682: Fix the issue of garbled recording after powerd_dbus_suspendOder Chiou
While using the DMIC recording, the garbled data will be captured by the DMIC. It is caused by the critical power of PLL closed in the jack detect function. Signed-off-by: Oder Chiou <oder_chiou@realtek.com> Link: https://lore.kernel.org/r/20210716085853.20170-1-oder_chiou@realtek.com Signed-off-by: Mark Brown <broonie@kernel.org>
2021-07-16ASoC: amd: reverse stop sequence for stoneyridge platformVijendar Mukunda
For Stoneyridge platform, it is required to invoke DMA driver stop first rather than invoking DWC I2S controller stop. Enable dai_link structure stop_dma_fist flag to reverse the stop sequence. Signed-off-by: Vijendar Mukunda <Vijendar.Mukunda@amd.com> Link: https://lore.kernel.org/r/20210716123015.15697-2-vijendar.mukunda@amd.com Signed-off-by: Mark Brown <broonie@kernel.org>
2021-07-16ASoC: soc-pcm: add a flag to reverse the stop sequenceVijendar Mukunda
On stream stop, currently CPU DAI stop sequence invoked first followed by DMA. For Few platforms, it is required to stop the DMA first before stopping CPU DAI. Introduced new flag in dai_link structure for reordering stop sequence. Based on flag check, ASoC core will re-order the stop sequence. Fixes: 4378f1fbe92405 ("ASoC: soc-pcm: Use different sequence for start/stop trigger") Signed-off-by: Vijendar Mukunda <Vijendar.Mukunda@amd.com> Link: https://lore.kernel.org/r/20210716123015.15697-1-vijendar.mukunda@amd.com Signed-off-by: Mark Brown <broonie@kernel.org>
2021-07-16ASoC: codecs: wcd938x: setup irq during component bindSrinivas Kandagatla
SoundWire registers are only accessable after sdw components are succesfully binded. Setup irqs at that point instead of doing at probe. Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org> Link: https://lore.kernel.org/r/20210716105735.6073-1-srinivas.kandagatla@linaro.org Signed-off-by: Mark Brown <broonie@kernel.org>
2021-07-16reiserfs: check directory items on read from diskShreyansh Chouhan
While verifying the leaf item that we read from the disk, reiserfs doesn't check the directory items, this could cause a crash when we read a directory item from the disk that has an invalid deh_location. This patch adds a check to the directory items read from the disk that does a bounds check on deh_location for the directory entries. Any directory entry header with a directory entry offset greater than the item length is considered invalid. Link: https://lore.kernel.org/r/20210709152929.766363-1-chouhan.shreyansh630@gmail.com Reported-by: syzbot+c31a48e6702ccb3d64c9@syzkaller.appspotmail.com Signed-off-by: Shreyansh Chouhan <chouhan.shreyansh630@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz>
2021-07-16fs/ext2: Avoid page_address on pages returned by ext2_get_pageJavier Pello
Commit 782b76d7abdf02b12c46ed6f1e9bf715569027f7 ("fs/ext2: Replace kmap() with kmap_local_page()") replaced the kmap/kunmap calls in ext2_get_page/ext2_put_page with kmap_local_page/kunmap_local for efficiency reasons. As a necessary side change, the commit also made ext2_get_page (and ext2_find_entry and ext2_dotdot) return the mapping address along with the page itself, as it is required for kunmap_local, and converted uses of page_address on such pages to use the newly returned address instead. However, uses of page_address on such pages were missed in ext2_check_page and ext2_delete_entry, which triggers oopses if kmap_local_page happens to return an address from high memory. Fix this now by converting the remaining uses of page_address to use the right address, as returned by kmap_local_page. Link: https://lore.kernel.org/r/20210714185448.8707ac239e9f12b3a7f5b9f9@urjc.es Reviewed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Javier Pello <javier.pello@urjc.es> Fixes: 782b76d7abdf ("fs/ext2: Replace kmap() with kmap_local_page()") Signed-off-by: Jan Kara <jack@suse.cz>
2021-07-16reiserfs: add check for root_inode in reiserfs_fill_superYu Kuai
Our syzcaller report a NULL pointer dereference: BUG: kernel NULL pointer dereference, address: 0000000000000000 PGD 116e95067 P4D 116e95067 PUD 1080b5067 PMD 0 Oops: 0010 [#1] SMP KASAN CPU: 7 PID: 592 Comm: a.out Not tainted 5.13.0-next-20210629-dirty #67 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20190727_073836-buildvm-p4 RIP: 0010:0x0 Code: Unable to access opcode bytes at RIP 0xffffffffffffffd6. RSP: 0018:ffff888114e779b8 EFLAGS: 00010246 RAX: 0000000000000000 RBX: 1ffff110229cef39 RCX: ffffffffaa67e1aa RDX: 0000000000000000 RSI: ffff88810a58ee00 RDI: ffff8881233180b0 RBP: ffffffffac38e9c0 R08: ffffffffaa67e17e R09: 0000000000000001 R10: ffffffffb91c5557 R11: fffffbfff7238aaa R12: ffff88810a58ee00 R13: ffff888114e77aa0 R14: 0000000000000000 R15: ffff8881233180b0 FS: 00007f946163c480(0000) GS:ffff88839f1c0000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffffffffffffffd6 CR3: 00000001099c1000 CR4: 00000000000006e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: __lookup_slow+0x116/0x2d0 ? page_put_link+0x120/0x120 ? __d_lookup+0xfc/0x320 ? d_lookup+0x49/0x90 lookup_one_len+0x13c/0x170 ? __lookup_slow+0x2d0/0x2d0 ? reiserfs_schedule_old_flush+0x31/0x130 reiserfs_lookup_privroot+0x64/0x150 reiserfs_fill_super+0x158c/0x1b90 ? finish_unfinished+0xb10/0xb10 ? bprintf+0xe0/0xe0 ? __mutex_lock_slowpath+0x30/0x30 ? __kasan_check_write+0x20/0x30 ? up_write+0x51/0xb0 ? set_blocksize+0x9f/0x1f0 mount_bdev+0x27c/0x2d0 ? finish_unfinished+0xb10/0xb10 ? reiserfs_kill_sb+0x120/0x120 get_super_block+0x19/0x30 legacy_get_tree+0x76/0xf0 vfs_get_tree+0x49/0x160 ? capable+0x1d/0x30 path_mount+0xacc/0x1380 ? putname+0x97/0xd0 ? finish_automount+0x450/0x450 ? kmem_cache_free+0xf8/0x5a0 ? putname+0x97/0xd0 do_mount+0xe2/0x110 ? path_mount+0x1380/0x1380 ? copy_mount_options+0x69/0x140 __x64_sys_mount+0xf0/0x190 do_syscall_64+0x35/0x80 entry_SYSCALL_64_after_hwframe+0x44/0xae This is because 'root_inode' is initialized with wrong mode, and it's i_op is set to 'reiserfs_special_inode_operations'. Thus add check for 'root_inode' to fix the problem. Link: https://lore.kernel.org/r/20210702040743.1918552-1-yukuai3@huawei.com Signed-off-by: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Jan Kara <jack@suse.cz>
2021-07-16dma-mapping: handle vmalloc addresses in dma_common_{mmap,get_sgtable}Roman Skakun
xen-swiotlb can use vmalloc backed addresses for dma coherent allocations and uses the common helpers. Properly handle them to unbreak Xen on ARM platforms. Fixes: 1b65c4e5a9af ("swiotlb-xen: use xen_alloc/free_coherent_pages") Signed-off-by: Roman Skakun <roman_skakun@epam.com> Reviewed-by: Andrii Anisov <andrii_anisov@epam.com> [hch: split the patch, renamed the helpers] Signed-off-by: Christoph Hellwig <hch@lst.de>
2021-07-16drm/amdgpu: workaround failed COW checks for Thunk VMAsFelix Kuehling
KFD Thunk maps invisible VRAM BOs with PROT_NONE, MAP_PRIVATE. is_cow_mapping returns true for these mappings, which causes mmap to fail in ttm_bo_mmap_obj. As a workaround, clear VM_MAYWRITE for PROT_NONE-COW mappings. This should prevent the mapping from ever becoming writable and makes is_cow_mapping(vm_flags) false. Fixes: f91142c62161 ("drm/ttm: nuke VM_MIXEDMAP on BO mappings v3") Suggested-by: Daniel Vetter <daniel.vetter@intel.com> Tested-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210715190537.585456-1-Felix.Kuehling@amd.com Signed-off-by: Christian König <christian.koenig@amd.com>
2021-07-15Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-nextDavid S. Miller
Alexei Starovoitov says: ==================== pull-request: bpf-next 2021-07-15 The following pull-request contains BPF updates for your *net-next* tree. We've added 45 non-merge commits during the last 15 day(s) which contain a total of 52 files changed, 3122 insertions(+), 384 deletions(-). The main changes are: 1) Introduce bpf timers, from Alexei. 2) Add sockmap support for unix datagram socket, from Cong. 3) Fix potential memleak and UAF in the verifier, from He. 4) Add bpf_get_func_ip helper, from Jiri. 5) Improvements to generic XDP mode, from Kumar. 6) Support for passing xdp_md to XDP programs in bpf_prog_run, from Zvi. =================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-16cifs: do not share tcp sessions of dfs connectionsPaulo Alcantara
Make sure that we do not share tcp sessions of dfs mounts when mounting regular shares that connect to same server. DFS connections rely on a single instance of tcp in order to do failover properly in cifs_reconnect(). Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz> Signed-off-by: Steve French <stfrench@microsoft.com>
2021-07-16zonefs: remove redundant null bio checkXianting Tian
bio_alloc() with __GFP_DIRECT_RECLAIM, which is included in GFP_NOFS, never fails, see comments in bio_alloc_bioset(). Signed-off-by: Xianting Tian <xianting.tian@linux.alibaba.com> Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
2021-07-15Merge branch 'sockmap: add sockmap support for unix datagram socket'Alexei Starovoitov
Cong Wang says: ==================== From: Cong Wang <cong.wang@bytedance.com> This is the last patchset of the original large patchset. In the previous patchset, a new BPF sockmap program BPF_SK_SKB_VERDICT was introduced and UDP began to support it too. In this patchset, we add BPF_SK_SKB_VERDICT support to Unix datagram socket, so that we can finally splice Unix datagram socket and UDP socket. Please check each patch description for more details. To see the big picture, the previous patchsets are available here: https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git/commit/?id=1e0ab70778bd86a90de438cc5e1535c115a7c396 https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git/commit/?id=89d69c5d0fbcabd8656459bc8b1a476d6f1efee4 and this patchset is available here: https://github.com/congwang/linux/tree/sockmap3 Acked-by: John Fastabend <john.fastabend@gmail.com> --- v5: lift socket state check for dgram remove ->unhash() case add retries for EAGAIN in all test cases remove an unused parameter of __unix_dgram_recvmsg() rebase on the latest bpf-next v4: fix af_unix disconnect case add unix_unhash() split out two small patches reduce u->iolock critical section remove an unused parameter of __unix_dgram_recvmsg() v3: fix Kconfig dependency make unix_read_sock() static fix a UAF in unix_release() add a missing header unix_bpf.c v2: separate out from the original large patchset rebase to the latest bpf-next clean up unix_read_sock() export sock_map_close() factor out some helpers in selftests for code reuse ==================== Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2021-07-15selftests/bpf: Add test cases for redirection between udp and unixCong Wang
Add two test cases to ensure redirection between udp and unix work bidirectionally. Signed-off-by: Cong Wang <cong.wang@bytedance.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210704190252.11866-12-xiyou.wangcong@gmail.com
2021-07-15selftests/bpf: Add a test case for unix sockmapCong Wang
Add a test case to ensure redirection between two AF_UNIX datagram sockets work. Signed-off-by: Cong Wang <cong.wang@bytedance.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210704190252.11866-11-xiyou.wangcong@gmail.com
2021-07-15selftests/bpf: Factor out add_to_sockmap()Cong Wang
Factor out a common helper add_to_sockmap() which adds two sockets into a sockmap. Signed-off-by: Cong Wang <cong.wang@bytedance.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210704190252.11866-10-xiyou.wangcong@gmail.com
2021-07-15selftests/bpf: Factor out udp_socketpair()Cong Wang
Factor out a common helper udp_socketpair() which creates a pair of connected UDP sockets. Signed-off-by: Cong Wang <cong.wang@bytedance.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210704190252.11866-9-xiyou.wangcong@gmail.com
2021-07-15af_unix: Implement unix_dgram_bpf_recvmsg()Cong Wang
We have to implement unix_dgram_bpf_recvmsg() to replace the original ->recvmsg() to retrieve skmsg from ingress_msg. AF_UNIX is again special here because the lack of sk_prot->recvmsg(). I simply add a special case inside unix_dgram_recvmsg() to call sk->sk_prot->recvmsg() directly. Signed-off-by: Cong Wang <cong.wang@bytedance.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210704190252.11866-8-xiyou.wangcong@gmail.com
2021-07-15af_unix: Implement ->psock_update_sk_prot()Cong Wang
Now we can implement unix_bpf_update_proto() to update sk_prot, especially prot->close(). Signed-off-by: Cong Wang <cong.wang@bytedance.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210704190252.11866-7-xiyou.wangcong@gmail.com
2021-07-15af_unix: Add a dummy ->close() for sockmapCong Wang
Unlike af_inet, unix_proto is very different, it does not even have a ->close(). We have to add a dummy implementation to satisfy sockmap. Normally it is just a nop, it is introduced only for sockmap to replace it. Signed-off-by: Cong Wang <cong.wang@bytedance.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210704190252.11866-6-xiyou.wangcong@gmail.com
2021-07-15af_unix: Set TCP_ESTABLISHED for datagram sockets tooCong Wang
Currently only unix stream socket sets TCP_ESTABLISHED, datagram socket can set this too when they connect to its peer socket. At least __ip4_datagram_connect() does the same. This will be used to determine whether an AF_UNIX datagram socket can be redirected to in sockmap. Signed-off-by: Cong Wang <cong.wang@bytedance.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210704190252.11866-5-xiyou.wangcong@gmail.com
2021-07-15af_unix: Implement ->read_sock() for sockmapCong Wang
Implement ->read_sock() for AF_UNIX datagram socket, it is pretty much similar to udp_read_sock(). Signed-off-by: Cong Wang <cong.wang@bytedance.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210704190252.11866-4-xiyou.wangcong@gmail.com
2021-07-15sock_map: Lift socket state restriction for datagram socketsCong Wang
TCP and other connection oriented sockets have accept() for each incoming connection on the server side, hence they can just insert those fd's from accept() to sockmap, which are of course established. Now with datagram sockets begin to support sockmap and redirection, the restriction is no longer applicable to them, as they have no accept(). So we have to lift this restriction for them. This is fine, because inside bpf_sk_redirect_map() we still have another socket status check, sock_map_redirect_allowed(), as a guard. This also means they do not have to be removed from sockmap when disconnecting. Signed-off-by: Cong Wang <cong.wang@bytedance.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210704190252.11866-3-xiyou.wangcong@gmail.com
2021-07-15sock_map: Relax config dependency to CONFIG_NETCong Wang
Currently sock_map still has Kconfig dependency on CONFIG_INET, but there is no actual functional dependency on it after we introduce ->psock_update_sk_prot(). We have to extend it to CONFIG_NET now as we are going to support AF_UNIX. Signed-off-by: Cong Wang <cong.wang@bytedance.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210704190252.11866-2-xiyou.wangcong@gmail.com
2021-07-15Revert "Makefile: Enable -Wimplicit-fallthrough for Clang"Linus Torvalds
This reverts commit b7eb335e26a9c7f258c96b3962c283c379d3ede0. It turns out that the problem with the clang -Wimplicit-fallthrough warning is not about the kernel source code, but about clang itself, and that the warning is unusable until clang fixes its broken ways. In particular, when you enable this warning for clang, you not only get warnings about implicit fallthroughs. You also get this: warning: fallthrough annotation in unreachable code [-Wimplicit-fallthrough] which is completely broken becasue it (a) doesn't even tell you where the problem is (seriously: no line numbers, no filename, no nothing). (b) is fundamentally broken anyway, because there are perfectly valid reasons to have a fallthrough statement even if it turns out that it can perhaps not be reached. In the kernel, an example of that second case is code in the scheduler: switch (state) { case cpuset: if (IS_ENABLED(CONFIG_CPUSETS)) { cpuset_cpus_allowed_fallback(p); state = possible; break; } fallthrough; case possible: where if CONFIG_CPUSETS is enabled you actually never hit the fallthrough case at all. But that in no way makes the fallthrough wrong. So the warning is completely broken, and enabling it for clang is a very bad idea. In the meantime, we can keep the gcc option enabled, and make the gcc build use -Wimplicit-fallthrough=5 which means that we will at least continue to require a proper fallthrough statement, and that gcc won't silently accept the magic comment versions. Because gcc does this all correctly, and while the odd "=5" part is kind of obscure, it's documented in [1]: "-Wimplicit-fallthrough=5 doesn’t recognize any comments as fallthrough comments, only attributes disable the warning" so if clang ever fixes its bad behavior we can try enabling it there again. Link: https://gcc.gnu.org/onlinedocs/gcc/Warning-Options.html [1] Cc: Kees Cook <keescook@chromium.org> Cc: Gustavo A. R. Silva <gustavoars@kernel.org> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-07-15Merge branch 'Add bpf_get_func_ip helper'Alexei Starovoitov
Jiri Olsa says: ==================== Add bpf_get_func_ip helper that returns IP address of the caller function for trampoline and krobe programs. There're 2 specific implementation of the bpf_get_func_ip helper, one for trampoline progs and one for kprobe/kretprobe progs. The trampoline helper call is replaced/inlined by the verifier with simple move instruction. The kprobe/kretprobe is actual helper call that returns prepared caller address. Also available at: https://git.kernel.org/pub/scm/linux/kernel/git/jolsa/perf.git bpf/get_func_ip v4 changes: - dropped jit/x86 check for get_func_ip tracing check [Alexei] - added code to bpf_get_func_ip_tracing [Alexei] and tested that it works without inlining [Alexei] - changed has_get_func_ip to check_get_func_ip [Andrii] - replaced test assert loop with explicit asserts [Andrii] - adde bpf_program__attach_kprobe_opts function and use it for offset setup [Andrii] - used bpf_program__set_autoload(false) for test6 [Andrii] - added Masami's ack v3 changes: - resend with Masami in cc and v3 in each patch subject v2 changes: - use kprobe_running to get kprobe instead of cpu var [Masami] - added support to add kprobe on function+offset and test for that [Alan] ==================== Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2021-07-15selftests/bpf: Add test for bpf_get_func_ip in kprobe+offset probeJiri Olsa
Adding test for bpf_get_func_ip in kprobe+ofset probe. Because of the offset value it's arch specific, enabling the new test only for x86_64 architecture. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210714094400.396467-9-jolsa@kernel.org
2021-07-15libbpf: Allow specification of "kprobe/function+offset"Alan Maguire
kprobes can be placed on most instructions in a function, not just entry, and ftrace and bpftrace support the function+offset notification for probe placement. Adding parsing of func_name into func+offset to bpf_program__attach_kprobe() allows the user to specify SEC("kprobe/bpf_fentry_test5+0x6") ...for example, and the offset can be passed to perf_event_open_probe() to support kprobe attachment. Signed-off-by: Alan Maguire <alan.maguire@oracle.com> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210714094400.396467-8-jolsa@kernel.org
2021-07-15libbpf: Add bpf_program__attach_kprobe_opts functionJiri Olsa
Adding bpf_program__attach_kprobe_opts that does the same as bpf_program__attach_kprobe, but takes opts argument. Currently opts struct holds just retprobe bool, but we will add new field in following patch. The function is not exported, so there's no need to add size to the struct bpf_program_attach_kprobe_opts for now. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210714094400.396467-7-jolsa@kernel.org
2021-07-15selftests/bpf: Add test for bpf_get_func_ip helperJiri Olsa
Adding test for bpf_get_func_ip helper for fentry, fexit, kprobe, kretprobe and fmod_ret programs. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210714094400.396467-6-jolsa@kernel.org
2021-07-15bpf: Add bpf_get_func_ip helper for kprobe programsJiri Olsa
Adding bpf_get_func_ip helper for BPF_PROG_TYPE_KPROBE programs, so it's now possible to call bpf_get_func_ip from both kprobe and kretprobe programs. Taking the caller's address from 'struct kprobe::addr', which is defined for both kprobe and kretprobe. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org> Link: https://lore.kernel.org/bpf/20210714094400.396467-5-jolsa@kernel.org
2021-07-15bpf: Add bpf_get_func_ip helper for tracing programsJiri Olsa
Adding bpf_get_func_ip helper for BPF_PROG_TYPE_TRACING programs, specifically for all trampoline attach types. The trampoline's caller IP address is stored in (ctx - 8) address. so there's no reason to actually call the helper, but rather fixup the call instruction and return [ctx - 8] value directly. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210714094400.396467-4-jolsa@kernel.org
2021-07-16Merge tag 'drm-intel-fixes-2021-07-15' of ↵Dave Airlie
git://anongit.freedesktop.org/drm/drm-intel into drm-fixes Two regression fixes targeting stable: - Fix -EDEADLK handling regression (Ville) - Drop the page table optimisation (Matt) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/YPA8y1DSCp2EbtpC@intel.com
2021-07-15Merge tag 'configfs-5.13-1' of git://git.infradead.org/users/hch/configfsLinus Torvalds
Pull configfs fix from Christoph Hellwig: - fix the read and write iterators (Bart Van Assche) * tag 'configfs-5.13-1' of git://git.infradead.org/users/hch/configfs: configfs: fix the read and write iterators
2021-07-16Merge tag 'drm-misc-fixes-2021-07-15' of ↵Dave Airlie
git://anongit.freedesktop.org/drm/drm-misc into drm-fixes Short summary of fixes pull (less than what git shortlog provides): * fbdev: Avoid use-after-free by not deleting current video mode * ttm: Avoid NULL-ptr deref in ttm_range_man_fini() * vmwgfx: Fix a merge commit Signed-off-by: Dave Airlie <airlied@redhat.com> From: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patchwork.freedesktop.org/patch/msgid/YO/yoFO+iSEqnIH0@linux-uq9g
2021-07-15Merge tag 'pwm/for-5.14-rc2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/thierry.reding/linux-pwm Pull pwm fixes from Thierry Reding: "A couple of fixes from Uwe that I missed for v5.14-rc1" * tag 'pwm/for-5.14-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/thierry.reding/linux-pwm: pwm: ep93xx: Ensure configuring period and duty_cycle isn't wrongly skipped pwm: berlin: Ensure configuring period and duty_cycle isn't wrongly skipped pwm: tiecap: Ensure configuring period and duty_cycle isn't wrongly skipped pwm: spear: Ensure configuring period and duty_cycle isn't wrongly skipped pwm: sprd: Ensure configuring period and duty_cycle isn't wrongly skipped
2021-07-15bpf: Enable BPF_TRAMP_F_IP_ARG for trampolines with call_get_func_ipJiri Olsa
Enabling BPF_TRAMP_F_IP_ARG for trampolines that actually need it. The BPF_TRAMP_F_IP_ARG adds extra 3 instructions to trampoline code and is used only by programs with bpf_get_func_ip helper, which is added in following patch and sets call_get_func_ip bit. This patch ensures that BPF_TRAMP_F_IP_ARG flag is used only for trampolines that have programs with call_get_func_ip set. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210714094400.396467-3-jolsa@kernel.org
2021-07-15bpf, x86: Store caller's ip in trampoline stackJiri Olsa
Storing caller's ip in trampoline's stack. Trampoline programs can reach the IP in (ctx - 8) address, so there's no change in program's arguments interface. The IP address is takes from [fp + 8], which is return address from the initial 'call fentry' call to trampoline. This IP address will be returned via bpf_get_func_ip helper helper, which is added in following patches. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210714094400.396467-2-jolsa@kernel.org
2021-07-15SMB3.1.1: fix mount failure to some servers when compression enabledSteve French
When sending the compression context to some servers, they rejected the SMB3.1.1 negotiate protocol because they expect the compression context to have a data length of a multiple of 8. Reviewed-by: Shyam Prasad N <sprasad@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2021-07-15cifs: added WARN_ON for all the count decrementsShyam Prasad N
We have a few ref counters srv_count, ses_count and tc_count which we use for ref counting. Added a WARN_ON during the decrement of each of these counters to make sure that they don't go below their minimum values. Signed-off-by: Shyam Prasad N <sprasad@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2021-07-15cifs: fix missing null session check in mountSteve French
Although it is unlikely to be have ended up with a null session pointer calling cifs_try_adding_channels in cifs_mount. Coverity correctly notes that we are already checking for it earlier (when we return from do_dfs_failover), so at a minimum to clarify the code we should make sure we also check for it when we exit the loop so we don't end up calling cifs_try_adding_channels or mount_setup_tlink with a null ses pointer. Addresses-Coverity: 1505608 ("Derefernce after null check") Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz> Signed-off-by: Steve French <stfrench@microsoft.com>
2021-07-15cifs: handle reconnect of tcon when there is no cached dfs referralPaulo Alcantara
When there is no cached DFS referral of tcon->dfs_path, then reconnect to same share. Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz> Cc: <stable@vger.kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2021-07-16Merge tag 'amd-drm-fixes-5.14-2021-07-14' of ↵Dave Airlie
https://gitlab.freedesktop.org/agd5f/linux into drm-fixes amd-drm-fixes-5.14-2021-07-14: amdgpu: - SR-IOV fixes - RAS fixes - eDP fixes - SMU13 code unification to facilitate fixes in the future - Add new renoir DID - Yellow Carp fixes - Beige Goby fixes - Revert a bunch of TLB fixes that caused regressions - Revert an LTTPR display regression amdkfd - Fix VRAM access regression - SVM fixes Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210714220858.5553-1-alexander.deucher@amd.com
2021-07-15Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfDavid S. Miller
Andrii Nakryiko says: ==================== pull-request: bpf 2021-07-15 The following pull-request contains BPF updates for your *net* tree. We've added 9 non-merge commits during the last 5 day(s) which contain a total of 9 files changed, 37 insertions(+), 15 deletions(-). The main changes are: 1) Fix NULL pointer dereference in BPF_TEST_RUN for BPF_XDP_DEVMAP and BPF_XDP_CPUMAP programs, from Xuan Zhuo. 2) Fix use-after-free of net_device in XDP bpf_link, from Xuan Zhuo. 3) Follow-up fix to subprog poke descriptor use-after-free problem, from Daniel Borkmann and John Fastabend. 4) Fix out-of-range array access in s390 BPF JIT backend, from Colin Ian King. 5) Fix memory leak in BPF sockmap, from John Fastabend. 6) Fix for sockmap to prevent proc stats reporting bug, from John Fastabend and Jakub Sitnicki. 7) Fix NULL pointer dereference in bpftool, from Tobias Klauser. 8) AF_XDP documentation fixes, from Baruch Siach. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-15tracing: Do not reference char * as a string in histogramsSteven Rostedt (VMware)
The histogram logic was allowing events with char * pointers to be used as normal strings. But it was easy to crash the kernel with: # echo 'hist:keys=filename' > events/syscalls/sys_enter_openat/trigger And open some files, and boom! BUG: unable to handle page fault for address: 00007f2ced0c3280 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 1173fa067 P4D 1173fa067 PUD 1171b6067 PMD 1171dd067 PTE 0 Oops: 0000 [#1] PREEMPT SMP CPU: 6 PID: 1810 Comm: cat Not tainted 5.13.0-rc5-test+ #61 Hardware name: Hewlett-Packard HP Compaq Pro 6300 SFF/339A, BIOS K01 v03.03 07/14/2016 RIP: 0010:strlen+0x0/0x20 Code: f6 82 80 2a 0b a9 20 74 11 0f b6 50 01 48 83 c0 01 f6 82 80 2a 0b a9 20 75 ef c3 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 <80> 3f 00 74 10 48 89 f8 48 83 c0 01 80 38 00 75 f7 48 29 f8 c3 RSP: 0018:ffffbdbf81567b50 EFLAGS: 00010246 RAX: 0000000000000003 RBX: ffff93815cdb3800 RCX: ffff9382401a22d0 RDX: 0000000000000100 RSI: 0000000000000000 RDI: 00007f2ced0c3280 RBP: 0000000000000100 R08: ffff9382409ff074 R09: ffffbdbf81567c98 R10: ffff9382409ff074 R11: 0000000000000000 R12: ffff9382409ff074 R13: 0000000000000001 R14: ffff93815a744f00 R15: 00007f2ced0c3280 FS: 00007f2ced0f8580(0000) GS:ffff93825a800000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f2ced0c3280 CR3: 0000000107069005 CR4: 00000000001706e0 Call Trace: event_hist_trigger+0x463/0x5f0 ? find_held_lock+0x32/0x90 ? sched_clock_cpu+0xe/0xd0 ? lock_release+0x155/0x440 ? kernel_init_free_pages+0x6d/0x90 ? preempt_count_sub+0x9b/0xd0 ? kernel_init_free_pages+0x6d/0x90 ? get_page_from_freelist+0x12c4/0x1680 ? __rb_reserve_next+0xe5/0x460 ? ring_buffer_lock_reserve+0x12a/0x3f0 event_triggers_call+0x52/0xe0 ftrace_syscall_enter+0x264/0x2c0 syscall_trace_enter.constprop.0+0x1ee/0x210 do_syscall_64+0x1c/0x80 entry_SYSCALL_64_after_hwframe+0x44/0xae Where it triggered a fault on strlen(key) where key was the filename. The reason is that filename is a char * to user space, and the histogram code just blindly dereferenced it, with obvious bad results. I originally tried to use strncpy_from_user/kernel_nofault() but found that there's other places that its dereferenced and not worth the effort. Just do not allow "char *" to act like strings. Link: https://lkml.kernel.org/r/20210715000206.025df9d2@rorschach.local.home Cc: Ingo Molnar <mingo@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Tzvetomir Stoyanov <tz.stoyanov@gmail.com> Cc: stable@vger.kernel.org Acked-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Tom Zanussi <zanussi@kernel.org> Fixes: 79e577cbce4c4 ("tracing: Support string type key properly") Fixes: 5967bd5c4239 ("tracing: Let filter_assign_type() detect FILTER_PTR_STRING") Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-07-15Merge tag 'Wimplicit-fallthrough-clang-5.14-rc2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux Pull fallthrough fixes from Gustavo Silva: "This fixes many fall-through warnings when building with Clang and -Wimplicit-fallthrough, and also enables -Wimplicit-fallthrough for Clang, globally. It's also important to notice that since we have adopted the use of the pseudo-keyword macro fallthrough, we also want to avoid having more /* fall through */ comments being introduced. Contrary to GCC, Clang doesn't recognize any comments as implicit fall-through markings when the -Wimplicit-fallthrough option is enabled. So, in order to avoid having more comments being introduced, we use the option -Wimplicit-fallthrough=5 for GCC, which similar to Clang, will cause a warning in case a code comment is intended to be used as a fall-through marking. The patch for Makefile also enforces this. We had almost 4,000 of these issues for Clang in the beginning, and there might be a couple more out there when building some architectures with certain configurations. However, with the recent fixes I think we are in good shape and it is now possible to enable the warning for Clang" * tag 'Wimplicit-fallthrough-clang-5.14-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux: (27 commits) Makefile: Enable -Wimplicit-fallthrough for Clang powerpc/smp: Fix fall-through warning for Clang dmaengine: mpc512x: Fix fall-through warning for Clang usb: gadget: fsl_qe_udc: Fix fall-through warning for Clang powerpc/powernv: Fix fall-through warning for Clang MIPS: Fix unreachable code issue MIPS: Fix fall-through warnings for Clang ASoC: Mediatek: MT8183: Fix fall-through warning for Clang power: supply: Fix fall-through warnings for Clang dmaengine: ti: k3-udma: Fix fall-through warning for Clang s390: Fix fall-through warnings for Clang dmaengine: ipu: Fix fall-through warning for Clang iommu/arm-smmu-v3: Fix fall-through warning for Clang mmc: jz4740: Fix fall-through warning for Clang PCI: Fix fall-through warning for Clang scsi: libsas: Fix fall-through warning for Clang video: fbdev: Fix fall-through warning for Clang math-emu: Fix fall-through warning cpufreq: Fix fall-through warning for Clang drm/msm: Fix fall-through warning in msm_gem_new_impl() ...
2021-07-15Merge branch 'bpf-timers'Daniel Borkmann
Alexei Starovoitov says: ==================== The first request to support timers in bpf was made in 2013 before sys_bpf syscall was added. That use case was periodic sampling. It was address with attaching bpf programs to perf_events. Then during XDP development the timers were requested to do garbage collection and health checks. They were worked around by implementing timers in user space and triggering progs with BPF_PROG_RUN command. The user space timers and perf_event+bpf timers are not armed by the bpf program. They're done asynchronously vs program execution. The XDP program cannot send a packet and arm the timer at the same time. The tracing prog cannot record an event and arm the timer right away. This large class of use cases remained unaddressed. The jiffy based and hrtimer based timers are essential part of the kernel development and with this patch set the hrtimer based timers will be available to bpf programs. TLDR: bpf timers is a wrapper of hrtimers with all the extra safety added to make sure bpf progs cannot crash the kernel. v6->v7: - address Andrii's comments and add his Acks. v5->v6: - address code review feedback from Martin and add his Acks. - add usercnt > 0 check to bpf_timer_init and remove timers_cancel_and_free second loop in map_free callbacks. - add cond_resched_rcu. v4->v5: - Martin noticed the following issues: . prog could be reallocated bpf_patch_insn_data(). Fixed by passing 'aux' into bpf_timer_set_callback, since 'aux' is stable during insn patching. . Added missing rcu_read_lock. . Removed redundant record_map. - Discovered few bugs with stress testing: . One cpu does htab_free_prealloced_timers->bpf_timer_cancel_and_free->hrtimer_cancel while another is trying to do something with the timer like bpf_timer_start/set_callback. Those ops try to acquire bpf_spin_lock that is already taken by bpf_timer_cancel_and_free, so both cpus spin forever. The same problem existed in bpf_timer_cancel(). One bpf prog on one cpu might call bpf_timer_cancel and wait, while another cpu is in the timer callback that tries to do bpf_timer_*() helper on the same timer. The fix is to do drop_prog_refcnt() and unlock. And only then hrtimer_cancel. Because of this had to add callback_fn != NULL check to bpf_timer_cb(). Also removed redundant bpf_prog_inc/put from bpf_timer_cb() and replaced with rcu_dereference_check similar to recent rcu_read_lock-removal from drivers. bpf_timer_cb is in softirq. . Managed to hit refcnt==0 while doing bpf_prog_put from bpf_timer_cancel_and_free(). That exposed the issue that bpf_prog_put wasn't ready to be called from irq context. Fixed similar to bpf_map_put which is irq ready. - Refactored BPF_CALL_1(bpf_spin_lock) into __bpf_spin_lock_irqsave() to make the main logic more clear, since Martin and Yonghong brought up this concern. v3->v4: 1. Split callback_fn from bpf_timer_start into bpf_timer_set_callback as suggested by Martin. That makes bpf timer api match one to one to kernel hrtimer api and provides greater flexibility. 2. Martin also discovered the following issue with uref approach: bpftool prog load xdp_timer.o /sys/fs/bpf/xdp_timer type xdp bpftool net attach xdpgeneric pinned /sys/fs/bpf/xdp_timer dev lo rm /sys/fs/bpf/xdp_timer nc -6 ::1 8888 bpftool net detach xdpgeneric dev lo The timer callback stays active in the kernel though the prog was detached and map usercnt == 0. It happened because 'bpftool prog load' pinned the prog only. The map usercnt went to zero. Subsequent attach and runs didn't affect map usercnt. The timer was able to start and bpf_prog_inc itself. When the prog was detached the prog stayed active. To address this issue added if (!atomic64_read(&(t->map->usercnt))) return -EPERM; to the first patch. Which means that timers are allowed only in the maps that are held by user space with open file descriptor or maps pinned in bpffs. 3. Discovered that timers in inner maps were broken. The inner map pointers are dynamic. Therefore changed bpf_timer_init() to accept explicit map pointer supplied by the program instead of hidden map pointer supplied by the verifier. To make sure that pointer to a timer actually belongs to that map added the verifier check in patch 3. 4. Addressed Yonghong's feedback. Improved comments and added dynamic in_nmi() check. Added Acks. v2->v3: The v2 approach attempted to bump bpf_prog refcnt when bpf_timer_start is called to make sure callback code doesn't disappear when timer is active and drop refcnt when timer cb is done. That led to a ton of race conditions between callback running and concurrent bpf_timer_init/start/cancel on another cpu, and concurrent bpf_map_update/delete_elem, and map destroy. Then v2.5 approach skipped prog refcnt altogether. Instead it remembered all timers that bpf prog armed in a link list and canceled them when prog refcnt went to zero. The race conditions disappeared, but timers in map-in-map could not be supported cleanly, since timers in inner maps have inner map's life time and don't match prog's life time. This v3 approach makes timers to be owned by maps. It allows timers in inner maps to be supported from the start. This apporach relies on "user refcnt" scheme used in prog_array that stores bpf programs for bpf_tail_call. The bpf_timer_start() increments prog refcnt, but unlike 1st approach the timer callback does decrement the refcnt. The ops->map_release_uref is responsible for cancelling the timers and dropping prog refcnt when user space reference to a map is dropped. That addressed all the races and simplified locking. Andrii presented a use case where specifying callback_fn in bpf_timer_init() is inconvenient vs specifying in bpf_timer_start(). The bpf_timer_init() typically is called outside for timer callback, while bpf_timer_start() most likely will be called from the callback. timer_cb() { ... bpf_timer_start(timer_cb); ...} looks like recursion and as infinite loop to the verifier. The verifier had to be made smarter to recognize such async callbacks. Patches 7,8,9 addressed that. Patch 1 and 2 refactoring. Patch 3 implements bpf timer helpers and locking. Patch 4 implements map side of bpf timer support. Patch 5 prevent pointer mismatch in bpf_timer_init. Patch 6 adds support for BTF in inner maps. Patch 7 teaches check_cfg() pass to understand async callbacks. Patch 8 teaches do_check() pass to understand async callbacks. Patch 9 teaches check_max_stack_depth() pass to understand async callbacks. Patches 10 and 11 are the tests. v1->v2: - Addressed great feedback from Andrii and Toke. - Fixed race between parallel bpf_timer_*() ops. - Fixed deadlock between timer callback and LRU eviction or bpf_map_delete/update. - Disallowed mmap and global timers. - Allow spin_lock and bpf_timer in an element. - Fixed memory leaks due to map destruction and LRU eviction. - A ton more tests. ==================== Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>