summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2020-01-20samples/bpf: Use consistent include paths for libbpfToke Høiland-Jørgensen
Fix all files in samples/bpf to include libbpf header files with the bpf/ prefix, to be consistent with external users of the library. Also ensure that all includes of exported libbpf header files (those that are exported on 'make install' of the library) use bracketed includes instead of quoted. To make sure no new files are introduced that doesn't include the bpf/ prefix in its include, remove tools/lib/bpf from the include path entirely, and use tools/lib instead. Fixes: 6910d7d3867a ("selftests/bpf: Ensure bpf_helper_defs.h are taken from selftests dir") Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/157952560911.1683545.8795966751309534150.stgit@toke.dk
2020-01-20perf: Use consistent include paths for libbpfToke Høiland-Jørgensen
Fix perf to include libbpf header files with the bpf/ prefix, to be consistent with external users of the library. Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/157952560797.1683545.7685921032671386301.stgit@toke.dk
2020-01-20bpftool: Use consistent include paths for libbpfToke Høiland-Jørgensen
Fix bpftool to include libbpf header files with the bpf/ prefix, to be consistent with external users of the library. Also ensure that all includes of exported libbpf header files (those that are exported on 'make install' of the library) use bracketed includes instead of quoted. To make sure no new files are introduced that doesn't include the bpf/ prefix in its include, remove tools/lib/bpf from the include path entirely, and use tools/lib instead. Fixes: 6910d7d3867a ("selftests/bpf: Ensure bpf_helper_defs.h are taken from selftests dir") Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/157952560684.1683545.4765181397974997027.stgit@toke.dk
2020-01-20selftests: Use consistent include paths for libbpfToke Høiland-Jørgensen
Fix all selftests to include libbpf header files with the bpf/ prefix, to be consistent with external users of the library. Also ensure that all includes of exported libbpf header files (those that are exported on 'make install' of the library) use bracketed includes instead of quoted. To not break the build, keep the old include path until everything has been changed to the new one; a subsequent patch will remove that. Fixes: 6910d7d3867a ("selftests/bpf: Ensure bpf_helper_defs.h are taken from selftests dir") Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/157952560568.1683545.9649335788846513446.stgit@toke.dk
2020-01-20tools/runqslower: Use consistent include paths for libbpfToke Høiland-Jørgensen
Fix the runqslower tool to include libbpf header files with the bpf/ prefix, to be consistent with external users of the library. Also ensure that all includes of exported libbpf header files (those that are exported on 'make install' of the library) use bracketed includes instead of quoted. To not break the build, keep the old include path until everything has been changed to the new one; a subsequent patch will remove that. Fixes: 6910d7d3867a ("selftests/bpf: Ensure bpf_helper_defs.h are taken from selftests dir") Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/157952560457.1683545.9913736511685743625.stgit@toke.dk
2020-01-20selftests: Pass VMLINUX_BTF to runqslower MakefileToke Høiland-Jørgensen
Add a VMLINUX_BTF variable with the locally-built path when calling the runqslower Makefile from selftests. This makes sure a simple 'make' invocation in the selftests dir works even when there is no BTF information for the running kernel. Do a wildcard expansion and include the same paths for BTF for the running kernel as in the runqslower Makefile, to make it possible to build selftests without having a vmlinux in the local tree. Also fix the make invocation to use $(OUTPUT)/tools as the destination directory instead of $(CURDIR)/tools. Fixes: 3a0d3092a4ed ("selftests/bpf: Build runqslower from selftests") Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/157952560344.1683545.2723631988771664417.stgit@toke.dk
2020-01-20tools/bpf/runqslower: Fix override option for VMLINUX_BTFToke Høiland-Jørgensen
The runqslower tool refuses to build without a file to read vmlinux BTF from. The build fails with an error message to override the location by setting the VMLINUX_BTF variable if autodetection fails. However, the Makefile doesn't actually work with that override - the error message is still emitted. Fix this by including the value of VMLINUX_BTF in the expansion, and only emitting the error message if the *result* is empty. Also permit running 'make clean' even though no VMLINUX_BTF is set. Fixes: 9c01546d26d2 ("tools/bpf: Add runqslower tool to tools/bpf") Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/157952560237.1683545.17771785178857224877.stgit@toke.dk
2020-01-20samples/bpf: Don't try to remove user's homedir on cleanToke Høiland-Jørgensen
The 'clean' rule in the samples/bpf Makefile tries to remove backup files (ending in ~). However, if no such files exist, it will instead try to remove the user's home directory. While the attempt is mostly harmless, it does lead to a somewhat scary warning like this: rm: cannot remove '~': Is a directory Fix this by using find instead of shell expansion to locate any actual backup files that need to be removed. Fixes: b62a796c109c ("samples/bpf: allow make to be run from samples/bpf/ directory") Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Link: https://lore.kernel.org/bpf/157952560126.1683545.7273054725976032511.stgit@toke.dk
2020-01-20io_uring: fix compat for IORING_REGISTER_FILES_UPDATEEugene Syromiatnikov
fds field of struct io_uring_files_update is problematic with regards to compat user space, as pointer size is different in 32-bit, 32-on-64-bit, and 64-bit user space. In order to avoid custom handling of compat in the syscall implementation, make fds __u64 and use u64_to_user_ptr in order to retrieve it. Also, align the field naturally and check that no garbage is passed there. Fixes: c3a31e605620c279 ("io_uring: add support for IORING_REGISTER_FILES_UPDATE") Signed-off-by: Eugene Syromiatnikov <esyr@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-01-20selftests/bpf: Skip perf hw events test if the setup disabled itHangbin Liu
The same with commit 4e59afbbed96 ("selftests/bpf: skip nmi test when perf hw events are disabled"), it would make more sense to skip the test_stacktrace_build_id_nmi test if the setup (e.g. virtual machines) has disabled hardware perf events. Fixes: 13790d1cc72c ("bpf: add selftest for stackmap with build_id in NMI context") Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20200117100656.10359-1-liuhangbin@gmail.com
2020-01-20selftests/bpf: Don't check for btf fd in test_btfStanislav Fomichev
After commit 0d13bfce023a ("libbpf: Don't require root for bpf_object__open()") we no longer load BTF during bpf_object__open(), so let's remove the expectation from test_btf that the fd is not -1. The test currently fails. Before: BTF libbpf test[1] (test_btf_haskv.o): do_test_file:4152:FAIL bpf_object__btf_fd: -1 BTF libbpf test[2] (test_btf_newkv.o): do_test_file:4152:FAIL bpf_object__btf_fd: -1 BTF libbpf test[3] (test_btf_nokv.o): do_test_file:4152:FAIL bpf_object__btf_fd: -1 After: BTF libbpf test[1] (test_btf_haskv.o): OK BTF libbpf test[2] (test_btf_newkv.o): OK BTF libbpf test[3] (test_btf_nokv.o): OK Fixes: 0d13bfce023a ("libbpf: Don't require root for bpf_object__open()") Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20200118010546.74279-1-sdf@google.com
2020-01-20bpf: Fix memory leaks in generic update/delete batch opsBrian Vazquez
Generic update/delete batch ops functions were using __bpf_copy_key without properly freeing the memory. Handle the memory allocation and copy_from_user separately. Fixes: aa2e93b8e58e ("bpf: Add generic support for update and delete batch ops") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Brian Vazquez <brianvv@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20200119194040.128369-1-brianvv@google.com
2020-01-20tracing: Do not set trace clock if tracefs lockdown is in effectMasami Ichikawa
When trace_clock option is not set and unstable clcok detected, tracing_set_default_clock() sets trace_clock(ThinkPad A285 is one of case). In that case, if lockdown is in effect, null pointer dereference error happens in ring_buffer_set_clock(). Link: http://lkml.kernel.org/r/20200116131236.3866925-1-masami256@gmail.com Cc: stable@vger.kernel.org Fixes: 17911ff38aa58 ("tracing: Add locked_down checks to the open calls of files created for tracefs") Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1788488 Signed-off-by: Masami Ichikawa <masami256@gmail.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-01-20tracing: Fix histogram code when expression has same var as valueSteven Rostedt (VMware)
While working on a tool to convert SQL syntex into the histogram language of the kernel, I discovered the following bug: # echo 'first u64 start_time u64 end_time pid_t pid u64 delta' >> synthetic_events # echo 'hist:keys=pid:start=common_timestamp' > events/sched/sched_waking/trigger # echo 'hist:keys=next_pid:delta=common_timestamp-$start,start2=$start:onmatch(sched.sched_waking).trace(first,$start2,common_timestamp,next_pid,$delta)' > events/sched/sched_switch/trigger Would not display any histograms in the sched_switch histogram side. But if I were to swap the location of "delta=common_timestamp-$start" with "start2=$start" Such that the last line had: # echo 'hist:keys=next_pid:start2=$start,delta=common_timestamp-$start:onmatch(sched.sched_waking).trace(first,$start2,common_timestamp,next_pid,$delta)' > events/sched/sched_switch/trigger The histogram works as expected. What I found out is that the expressions clear out the value once it is resolved. As the variables are resolved in the order listed, when processing: delta=common_timestamp-$start The $start is cleared. When it gets to "start2=$start", it errors out with "unresolved symbol" (which is silent as this happens at the location of the trace), and the histogram is dropped. When processing the histogram for variable references, instead of adding a new reference for a variable used twice, use the same reference. That way, not only is it more efficient, but the order will no longer matter in processing of the variables. From Tom Zanussi: "Just to clarify some more about what the problem was is that without your patch, we would have two separate references to the same variable, and during resolve_var_refs(), they'd both want to be resolved separately, so in this case, since the first reference to start wasn't part of an expression, it wouldn't get the read-once flag set, so would be read normally, and then the second reference would do the read-once read and also be read but using read-once. So everything worked and you didn't see a problem: from: start2=$start,delta=common_timestamp-$start In the second case, when you switched them around, the first reference would be resolved by doing the read-once, and following that the second reference would try to resolve and see that the variable had already been read, so failed as unset, which caused it to short-circuit out and not do the trigger action to generate the synthetic event: to: delta=common_timestamp-$start,start2=$start With your patch, we only have the single resolution which happens correctly the one time it's resolved, so this can't happen." Link: https://lore.kernel.org/r/20200116154216.58ca08eb@gandalf.local.home Cc: stable@vger.kernel.org Fixes: 067fe038e70f6 ("tracing: Add variable reference handling to hist triggers") Reviewed-by: Tom Zanuss <zanussi@kernel.org> Tested-by: Tom Zanussi <zanussi@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-01-20Merge tag 'fixes_for_v5.5-rc8' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs Pull reiserfs fix from Jan Kara: "A fixup of a recently merged reiserfs fix which has caused problem when xattrs were not compiled in" * tag 'fixes_for_v5.5-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs: reiserfs: fix handling of -EOPNOTSUPP in reiserfs_for_each_xattr
2020-01-20irqdomain: Fix a memory leak in irq_domain_push_irq()Kevin Hao
Fix a memory leak reported by kmemleak: unreferenced object 0xffff000bc6f50e80 (size 128): comm "kworker/23:2", pid 201, jiffies 4294894947 (age 942.132s) hex dump (first 32 bytes): 00 00 00 00 41 00 00 00 86 c0 03 00 00 00 00 00 ....A........... 00 a0 b2 c6 0b 00 ff ff 40 51 fd 10 00 80 ff ff ........@Q...... backtrace: [<00000000e62d2240>] kmem_cache_alloc_trace+0x1a4/0x320 [<00000000279143c9>] irq_domain_push_irq+0x7c/0x188 [<00000000d9f4c154>] thunderx_gpio_probe+0x3ac/0x438 [<00000000fd09ec22>] pci_device_probe+0xe4/0x198 [<00000000d43eca75>] really_probe+0xdc/0x320 [<00000000d3ebab09>] driver_probe_device+0x5c/0xf0 [<000000005b3ecaa0>] __device_attach_driver+0x88/0xc0 [<000000004e5915f5>] bus_for_each_drv+0x7c/0xc8 [<0000000079d4db41>] __device_attach+0xe4/0x140 [<00000000883bbda9>] device_initial_probe+0x18/0x20 [<000000003be59ef6>] bus_probe_device+0x98/0xa0 [<0000000039b03d3f>] deferred_probe_work_func+0x74/0xa8 [<00000000870934ce>] process_one_work+0x1c8/0x470 [<00000000e3cce570>] worker_thread+0x1f8/0x428 [<000000005d64975e>] kthread+0xfc/0x128 [<00000000f0eaa764>] ret_from_fork+0x10/0x18 Fixes: 495c38d3001f ("irqdomain: Add irq_domain_{push,pop}_irq() functions") Signed-off-by: Kevin Hao <haokexin@gmail.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20200120043547.22271-1-haokexin@gmail.com
2020-01-20irqchip: Add NXP INTMUX interrupt multiplexer supportJoakim Zhang
The Interrupt Multiplexer (INTMUX) expands the number of peripherals that can interrupt the core: * The INTMUX has 8 channels that are assigned to 8 NVIC interrupt slots. * Each INTMUX channel can receive up to 32 interrupt sources and has 1 interrupt output. * The INTMUX routes the interrupt sources to the interrupt outputs. Signed-off-by: Shengjiu Wang <shengjiu.wang@nxp.com> Signed-off-by: Joakim Zhang <qiangqing.zhang@nxp.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20200117060653.27485-3-qiangqing.zhang@nxp.com
2020-01-20dt-bindings: interrupt-controller: Add binding for NXP INTMUX interrupt ↵Joakim Zhang
multiplexer This patch adds the DT bindings for the NXP INTMUX interrupt multiplexer for i.MX8 family SoCs. Signed-off-by: Joakim Zhang <qiangqing.zhang@nxp.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/r/20200117060653.27485-2-qiangqing.zhang@nxp.com
2020-01-20irqchip: Define EXYNOS_IRQ_COMBINERHyunki Koo
This patch is written to clean up dependency of ARCH_EXYNOS Not all exynos device have IRQ_COMBINER, especially aarch64 EXYNOS but it is built for all exynos devices. Thus add the config for EXYNOS_IRQ_COMBINER remove direct dependency between ARCH_EXYNOS and exynos-combiner.c and only selected on the aarch32 devices Signed-off-by: Hyunki Koo <hyunki00.koo@samsung.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Krzysztof Kozlowski <krzk@kernel.org> Link: https://lore.kernel.org/r/20191224211108.7128-1-hyunki00.koo@gmail.com
2020-01-20irqchip/meson-gpio: Add support for meson a1 SoCsQianggui Song
The meson a1 Socs have some changes compared with previous chips. For A113L, it contains 62 pins and can be spied on: - 62:128 undefined - 61:50 12 pins on bank A - 49:37 13 pins on bank F - 36:20 17 pins on bank X - 19:13 7 pins on bank B - 12:0 13 pins on bank P There are five relative registers for gpio interrupt controller, details are as below: - PADCTRL_GPIO_IRQ_CTRL0 bit[31]: enable/disable the whole irq lines bit[16-23]: both edge trigger bit[8-15]: single edge trigger bit[0-7]: pol trigger - PADCTRL_GPIO_IRQ_CTRL[X] bit[0-6]: 7 bits to choose gpio source for irq line 2*[X] - 2 bit[16-22]: 7 bits to choose gpio source for irq line 2*[X] - 1 where X =1,2,3,4 Signed-off-by: Qianggui Song <qianggui.song@amlogic.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20191216123645.10099-4-qianggui.song@amlogic.com
2020-01-20irqchip/meson-gpio: Rework meson irqchip driver to support meson-A1 SoCsQianggui Song
Since Meson-A1 SoCs register layout of gpio interrupt controller has difference with previous chips, registers to decide irq line and offset of trigger method are all changed, the current driver should be modified. Signed-off-by: Qianggui Song <qianggui.song@amlogic.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20191216123645.10099-3-qianggui.song@amlogic.com
2020-01-20dt-bindings: interrupt-controller: New binding for Meson-A1 SoCsQianggui Song
Update dt-binding document for GPIO interrupt controller of Meson-A1 SoCs Signed-off-by: Qianggui Song <qianggui.song@amlogic.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Acked-by: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/r/20191216123645.10099-2-qianggui.song@amlogic.com
2020-01-20irqchip/mbigen: Set driver .suppress_bind_attrs to avoid remove problemsJohn Garry
The following crash can be seen for setting CONFIG_DEBUG_TEST_DRIVER_REMOVE=y for DT FW (which some people still use): Hisilicon MBIGEN-V2 60080000.interrupt-controller: Failed to create mbi-gen irqdomain Hisilicon MBIGEN-V2: probe of 60080000.interrupt-controller failed with error -12 [...] Unable to handle kernel paging request at virtual address 0000000000005008 Mem abort info: ESR = 0x96000004 EC = 0x25: DABT (current EL), IL = 32 bits SET = 0, FnV = 0 EA = 0, S1PTW = 0 Data abort info: ISV = 0, ISS = 0x00000004 CM = 0, WnR = 0 user pgtable: 4k pages, 48-bit VAs, pgdp=0000041fb9990000 [0000000000005008] pgd=0000000000000000 Internal error: Oops: 96000004 [#1] PREEMPT SMP Modules linked in: CPU: 7 PID: 1 Comm: swapper/0 Not tainted 5.5.0-rc6-00002-g3fc42638a506-dirty #1622 Hardware name: Huawei Taishan 2280 /D05, BIOS Hisilicon D05 IT21 Nemo 2.0 RC0 04/18/2018 pstate: 40000085 (nZcv daIf -PAN -UAO) pc : mbigen_set_type+0x38/0x60 lr : __irq_set_trigger+0x6c/0x188 sp : ffff800014b4b400 x29: ffff800014b4b400 x28: 0000000000000007 x27: 0000000000000000 x26: 0000000000000000 x25: ffff041fd83bd0d4 x24: ffff041fd83bd188 x23: 0000000000000000 x22: ffff80001193ce00 x21: 0000000000000004 x20: 0000000000000000 x19: ffff041fd83bd000 x18: ffffffffffffffff x17: 0000000000000000 x16: 0000000000000000 x15: ffff8000119098c8 x14: ffff041fb94ec91c x13: ffff041fb94ec1a1 x12: 0000000000000030 x11: 0101010101010101 x10: 0000000000000040 x9 : 0000000000000000 x8 : ffff041fb98c6680 x7 : ffff800014b4b380 x6 : ffff041fd81636c8 x5 : 0000000000000000 x4 : 000000000000025f x3 : 0000000000005000 x2 : 0000000000005008 x1 : 0000000000000004 x0 : 0000000080000000 Call trace: mbigen_set_type+0x38/0x60 __setup_irq+0x744/0x900 request_threaded_irq+0xe0/0x198 pcie_pme_probe+0x98/0x118 pcie_port_probe_service+0x38/0x78 really_probe+0xa0/0x3e0 driver_probe_device+0x58/0x100 __device_attach_driver+0x90/0xb0 bus_for_each_drv+0x64/0xc8 __device_attach+0xd8/0x138 device_initial_probe+0x10/0x18 bus_probe_device+0x90/0x98 device_add+0x4c4/0x770 device_register+0x1c/0x28 pcie_port_device_register+0x1e4/0x4f0 pcie_portdrv_probe+0x34/0xd8 local_pci_probe+0x3c/0xa0 pci_device_probe+0x128/0x1c0 really_probe+0xa0/0x3e0 driver_probe_device+0x58/0x100 __device_attach_driver+0x90/0xb0 bus_for_each_drv+0x64/0xc8 __device_attach+0xd8/0x138 device_attach+0x10/0x18 pci_bus_add_device+0x4c/0xb8 pci_bus_add_devices+0x38/0x88 pci_host_probe+0x3c/0xc0 pci_host_common_probe+0xf0/0x208 hisi_pcie_almost_ecam_probe+0x24/0x30 platform_drv_probe+0x50/0xa0 really_probe+0xa0/0x3e0 driver_probe_device+0x58/0x100 device_driver_attach+0x6c/0x90 __driver_attach+0x84/0xc8 bus_for_each_dev+0x74/0xc8 driver_attach+0x20/0x28 bus_add_driver+0x148/0x1f0 driver_register+0x60/0x110 __platform_driver_register+0x40/0x48 hisi_pcie_almost_ecam_driver_init+0x1c/0x24 The specific problem here is that the mbigen driver real probe has failed as the mbigen_of_create_domain()->of_platform_device_create() call fails, the reason for that being that we never destroyed the platform device created during the remove test dry run and there is some conflict. Since we generally would never want to unbind this driver, and to save adding a driver tear down path for that, just set the driver .suppress_bind_attrs member to avoid this possibility. Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Hanjun Guo <guohanjun@huawei.com> Link: https://lore.kernel.org/r/1579196323-180137-1-git-send-email-john.garry@huawei.com
2020-01-20irqchip: Add Aspeed SCU interrupt controllerEddie James
The Aspeed SOCs provide some interrupts through the System Control Unit registers. Add an interrupt controller that provides these interrupts to the system. Signed-off-by: Eddie James <eajames@linux.ibm.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Andrew Jeffery <andrew@aj.id.au> Link: https://lore.kernel.org/r/1579123790-6894-3-git-send-email-eajames@linux.ibm.com
2020-01-20dt-bindings: interrupt-controller: Add Aspeed SCU interrupt controllerEddie James
Document the Aspeed SCU interrupt controller and add an include file for the interrupts it provides. Signed-off-by: Eddie James <eajames@linux.ibm.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Rob Herring <robh@kernel.org> Acked-by: Andrew Jeffery <andrew@aj.id.au> Link: https://lore.kernel.org/r/1579123790-6894-2-git-send-email-eajames@linux.ibm.com
2020-01-20gpio/sifive: Add GPIO driver for SiFive SoCsYash Shah
Adds the GPIO driver for SiFive RISC-V SoCs. Signed-off-by: Wesley W. Terpstra <wesley@sifive.com> [Atish: Various fixes and code cleanup] Signed-off-by: Atish Patra <atish.patra@wdc.com> Signed-off-by: Yash Shah <yash.shah@sifive.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Bartosz Golaszewski <bgolaszewski@baylibre.com> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Link: https://lore.kernel.org/r/1575976274-13487-6-git-send-email-yash.shah@sifive.com
2020-01-20regulator: core: Fix exported symbols to the exported GPL versionEnric Balletbo i Serra
Change the exported symbols introduced by commit e9153311491da ("regulator: vctrl-regulator: Avoid deadlock getting and setting the voltage") from EXPORT_SYMBOL() to EXPORT_SYMBOL_GPL(), like is used for all the core parts. Fixes: e9153311491da ("regulator: vctrl-regulator: Avoid deadlock getting and setting the voltage") Reported-by: Dmitry Osipenko <digetx@gmail.com> Signed-off-by: Enric Balletbo i Serra <enric.balletbo@collabora.com> Link: https://lore.kernel.org/r/20200120123921.1204339-1-enric.balletbo@collabora.com Signed-off-by: Mark Brown <broonie@kernel.org>
2020-01-20ubifs: use IS_ENCRYPTED() instead of ubifs_crypt_is_encrypted()Eric Biggers
There's no need for the ubifs_crypt_is_encrypted() function anymore. Just use IS_ENCRYPTED() instead, like ext4 and f2fs do. IS_ENCRYPTED() checks the VFS-level flag instead of the UBIFS-specific flag, but it shouldn't change any behavior since the flags are kept in sync. Link: https://lore.kernel.org/r/20191209212721.244396-1-ebiggers@kernel.org Acked-by: Richard Weinberger <richard@nod.at> Signed-off-by: Eric Biggers <ebiggers@google.com>
2020-01-20netfilter: ipset: use bitmap infrastructure completelyKadlecsik József
The bitmap allocation did not use full unsigned long sizes when calculating the required size and that was triggered by KASAN as slab-out-of-bounds read in several places. The patch fixes all of them. Reported-by: syzbot+fabca5cbf5e54f3fe2de@syzkaller.appspotmail.com Reported-by: syzbot+827ced406c9a1d9570ed@syzkaller.appspotmail.com Reported-by: syzbot+190d63957b22ef673ea5@syzkaller.appspotmail.com Reported-by: syzbot+dfccdb2bdb4a12ad425e@syzkaller.appspotmail.com Reported-by: syzbot+df0d0f5895ef1f41a65b@syzkaller.appspotmail.com Reported-by: syzbot+b08bd19bb37513357fd4@syzkaller.appspotmail.com Reported-by: syzbot+53cdd0ec0bbabd53370a@syzkaller.appspotmail.com Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-01-20btrfs: device stats, log when stats are zeroedAnand Jain
We had a report indicating that some read errors aren't reported by the device stats in the userland. It is important to have the errors reported in the device stat as user land scripts might depend on it to take the reasonable corrective actions. But to debug these issue we need to be really sure that request to reset the device stat did not come from the userland itself. So log an info message when device error reset happens. For example: BTRFS info (device sdc): device stats zeroed by btrfs(9223) Reported-by: philip@philip-seeger.de Link: https://www.spinics.net/lists/linux-btrfs/msg96528.html Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2020-01-20btrfs: fix improper setting of scanned for range cyclic write cache pagesJosef Bacik
We noticed that we were having regular CG OOM kills in cases where there was still enough dirty pages to avoid OOM'ing. It turned out there's this corner case in btrfs's handling of range_cyclic where files that were being redirtied were not getting fully written out because of how we do range_cyclic writeback. We unconditionally were setting scanned = 1; the first time we found any pages in the inode. This isn't actually what we want, we want it to be set if we've scanned the entire file. For range_cyclic we could be starting in the middle or towards the end of the file, so we could write one page and then not write any of the other dirty pages in the file because we set scanned = 1. Fix this by not setting scanned = 1 if we find pages. The rules for setting scanned should be 1) !range_cyclic. In this case we have a specified range to write out. 2) range_cyclic && index == 0. In this case we've started at the beginning and there is no need to loop around a second time. 3) range_cyclic && we started at index > 0 and we've reached the end of the file without satisfying our nr_to_write. This patch fixes both of our writepages implementations to make sure these rules hold true. This fixed our over zealous CG OOMs in production. Fixes: d1310b2e0cd9 ("Btrfs: Split the extent_map code into two parts") Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> [ add comment ] Signed-off-by: David Sterba <dsterba@suse.com>
2020-01-20btrfs: safely advance counter when looking up bio csumsDavid Sterba
Dan's smatch tool reports fs/btrfs/file-item.c:295 btrfs_lookup_bio_sums() warn: should this be 'count == -1' which points to the while (count--) loop. With count == 0 the check itself could decrement it to -1. There's a WARN_ON a few lines below that has never been seen in practice though. It turns out that the value of page_bytes_left matches the count (by sectorsize multiples). The loop never reaches the state where count would go to -1, because page_bytes_left == 0 is found first and this breaks out. For clarity, use only plain check on count (and only for positive value), decrement safely inside the loop. Any other discrepancy after the whole bio list processing should be reported by the exising WARN_ON_ONCE as well. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: David Sterba <dsterba@suse.com>
2020-01-20btrfs: remove unused member btrfs_device::workDavid Sterba
This is a leftover from recently removed bio scheduling framework. Fixes: ba8a9d079543 ("Btrfs: delete the entire async bio submission framework") Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2020-01-20btrfs: remove unnecessary wrapper get_alloc_profileJohannes Thumshirn
btrfs_get_alloc_profile() is a simple wrapper over get_alloc_profile(). The only difference is btrfs_get_alloc_profile() is visible to other functions in btrfs while get_alloc_profile() is static and thus only visible to functions in block-group.c. Let's just fold get_alloc_profile() into btrfs_get_alloc_profile() to get rid of the unnecessary second function. Reviewed-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: Johannes Thumshirn <jth@kernel.org> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2020-01-20btrfs: add correction to handle -1 edge case in async discardDennis Zhou
From Dave's testing described below, it's possible to drive a file system to have bogus values of discardable_extents and _bytes. As btrfs_discard_calc_delay() is the only user of discardable_extents, we can correct here for any negative discardable_extents/discardable_bytes. The problem is not reliably reproducible. The workload that created it was based on linux git tree, switching between release tags, then everytihng deleted followed by a full rebalance. At this state the values of discardable_bytes was 16K and discardable_extents was -1, expected values 0 and 0. Repeating the workload again did not correct the bogus values so the offset seems to be stable once it happens. Reported-by: David Sterba <dsterba@suse.com> Signed-off-by: Dennis Zhou <dennis@kernel.org> Signed-off-by: David Sterba <dsterba@suse.com>
2020-01-20btrfs: ensure removal of discardable_* in free_bitmap()Dennis Zhou
Most callers of free_bitmap() only call it if bitmap_info->bytes is 0. However, there are certain cases where we may free the free space cache via __btrfs_remove_free_space_cache(). This exposes a path where free_bitmap() is called regardless. This may result in a bad accounting situation for discardable_bytes and discardable_extents. So, remove the stats and call btrfs_discard_update_discardable(). Signed-off-by: Dennis Zhou <dennis@kernel.org> Signed-off-by: David Sterba <dsterba@suse.com>
2020-01-20btrfs: make smaller extents more likely to go into bitmapsDennis Zhou
It's less than ideal for small extents to eat into our extent budget, so force extents <= 32KB into the bitmaps save for the first handful. Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Dennis Zhou <dennis@kernel.org> Signed-off-by: David Sterba <dsterba@suse.com>
2020-01-20btrfs: increase the metadata allowance for the free_space_cacheDennis Zhou
Currently, there is no way for the free space cache to recover from being serviced by purely bitmaps because the extent threshold is set to 0 in recalculate_thresholds() when we surpass the metadata allowance. This adds a recovery mechanism by keeping large extents out of the bitmaps and increases the metadata upper bound to 64KB. The recovery mechanism bypasses this upper bound, thus making it a soft upper bound. But, with the bypass being 1MB or greater, it shouldn't add unbounded overhead. Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Dennis Zhou <dennis@kernel.org> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2020-01-20btrfs: add async discard implementation overviewDennis Zhou
Give a brief overview for how async discard is implemented. Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Dennis Zhou <dennis@kernel.org> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2020-01-20btrfs: keep track of discard reuse statsDennis Zhou
Keep track of how much we are discarding and how often we are reusing with async discard. The discard_*_bytes values don't need any special protection because the work item provides the single threaded access. Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Dennis Zhou <dennis@kernel.org> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2020-01-20btrfs: only keep track of data extents for async discardDennis Zhou
As mentioned earlier, discarding data can be done either by issuing an explicit discard or implicitly by reusing the LBA. Metadata block_groups see much more frequent reuse due to well it being metadata. So instead of explicitly discarding metadata block_groups, just leave them be and let the latter implicit discarding be done for them. For mixed block_groups, block_groups which contain both metadata and data, we let them be as higher fragmentation is expected. Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Dennis Zhou <dennis@kernel.org> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2020-01-20btrfs: have multiple discard listsDennis Zhou
Non-block group destruction discarding currently only had a single list with no minimum discard length. This can lead to caravaning more meaningful discards behind a heavily fragmented block group. This adds support for multiple lists with minimum discard lengths to prevent the caravan effect. We promote block groups back up when we exceed the BTRFS_ASYNC_DISCARD_MAX_FILTER size, currently we support only 2 lists with filters of 1MB and 32KB respectively. Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Dennis Zhou <dennis@kernel.org> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2020-01-20btrfs: make max async discard size tunableDennis Zhou
Expose max_discard_size as a tunable via sysfs and switch the current fixed maximum to the default value. Signed-off-by: Dennis Zhou <dennis@kernel.org> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2020-01-20btrfs: limit max discard size for async discardDennis Zhou
Throttle the maximum size of a discard so that we can provide an upper bound for the rate of async discard. While the block layer is able to split discards into the appropriate sized discards, we want to be able to account more accurately the rate at which we are consuming NCQ slots as well as limit the upper bound of work for a discard. Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Dennis Zhou <dennis@kernel.org> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2020-01-20btrfs: add kbps discard rate limit for async discardDennis Zhou
Provide the ability to rate limit based on kbps in addition to iops as additional guides for the target discard rate. The delay used ends up being max(kbps_delay, iops_delay). Signed-off-by: Dennis Zhou <dennis@kernel.org> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2020-01-20btrfs: calculate discard delay based on number of extentsDennis Zhou
An earlier patch keeps track of discardable_extents. These are undiscarded extents managed by the free space cache. Here, we will use this to dynamically calculate the discard delay interval. There are 3 rate to consider. The first is the target convergence rate, the rate to discard all discardable_extents over the BTRFS_DISCARD_TARGET_MSEC time frame. This is clamped by the lower limit, the iops limit or BTRFS_DISCARD_MIN_DELAY (1ms), and the upper limit, BTRFS_DISCARD_MAX_DELAY (1s). We reevaluate this delay every transaction commit. Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Dennis Zhou <dennis@kernel.org> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2020-01-20btrfs: keep track of discardable_bytes for async discardDennis Zhou
Keep track of this metric so that we can understand how ahead or behind we are in discarding rate. This uses the same accounting method as discardable_extents, deltas between previous/current values and propagating them up. Signed-off-by: Dennis Zhou <dennis@kernel.org> Reviewed-by: David Sterba <dsterba@suse.com> [ update changelog ] Signed-off-by: David Sterba <dsterba@suse.com>
2020-01-20btrfs: track discardable extents for async discardDennis Zhou
The number of discardable extents will serve as the rate limiting metric for how often we should discard. This keeps track of discardable extents in the free space caches by maintaining deltas and propagating them to the global count. The deltas are calculated from 2 values stored in PREV and CURR entries, then propagated up to the global discard ctl. The current counter value becomes the previous counter value after update. Signed-off-by: Dennis Zhou <dennis@kernel.org> Reviewed-by: David Sterba <dsterba@suse.com> [ update changelog ] Signed-off-by: David Sterba <dsterba@suse.com>
2020-01-20btrfs: sysfs: add UUID/debug/discard directoryDennis Zhou
Setup base sysfs directory for discard stats + tunables. Reviewed-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: Dennis Zhou <dennis@kernel.org> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2020-01-20btrfs: sysfs: make UUID/debug have its own kobjectDennis Zhou
Btrfs only allowed attributes to be exposed in debug/. Let's let other groups be created by making debug its own kobject. This also makes the per-fs debug options separate from the global features mount attributes. This seems to be needed as sysfs_create_files() requires const struct attribute * while sysfs_create_group() can take struct attribute *. This seems nicer as per file system, you'll probably use to_fs_info(). Reviewed-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: Dennis Zhou <dennis@kernel.org> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>