summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2019-07-05net: netsec: Sync dma for device on buffer allocationIlias Apalodimas
Quoting Arnd, We have to do a sync_single_for_device /somewhere/ before the buffer is given to the device. On a non-cache-coherent machine with a write-back cache, there may be dirty cache lines that get written back after the device DMA's data into it (e.g. from a previous memset from before the buffer got freed), so you absolutely need to flush any dirty cache lines on it first. Since the coherency is configurable in this device make sure we cover all configurations by explicitly syncing the allocated buffer for the device before refilling it's descriptors Signed-off-by: Ilias Apalodimas <ilias.apalodimas@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-05Merge branch 'hns3-next'David S. Miller
Huazhong Tan says: ==================== net: hns3: some cleanups & bugfixes This patch-set includes cleanups and bugfixes for the HNS3 ethernet controller driver. [patch 1/9] fixes VF's broadcast promisc mode not enabled after initializing. [patch 2/9] adds hints for fibre port not support flow control. [patch 3/9] fixes a port capbility updating issue. [patch 4/9 - 9/9] adds some cleanups for HNS3 driver. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-05net: hns3: set maximum length to resp_data_len for exceptional casePeng Li
If HCLGE_MBX_MAX_RESP_DATA_SIZE > HCLGE_MBX_MAX_RESP_DATA_SIZE, the memcpy will cause out of memory. So this patch just set resp_data_len to the maximum length for this case. Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-05net: hns3: bitwise operator should use unsigned typeYonglong Liu
There are some bitwise operator used signed type, this patch fixes them with unsigned type. Signed-off-by: Yonglong Liu <liuyonglong@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-05net: hns3: add default value for tc_size and tc_offsetPeng Li
This patch adds default value for tc_size and tc_offset, or it may get random value and used later. Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-05net: hns3: check msg_data before memcpy in hclgevf_send_mbx_msgWeihang Li
The value of msg_data may be NULL in some cases, which will cause errors reported by some compiler. So this patch adds a check to fix it. Signed-off-by: Weihang Li <liweihang@hisilicon.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-05net: hns3: set default value for param "type" in hclgevf_bind_ring_to_vectorPeng Li
The value of param type is always not changed in hclgevf_bind_ring_to_vector, move the assignment to front of "for {}" can reduce the redundant assignment. Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-05net: hns3: add all IMP return codePeng Li
Currently, the HNS3 driver just defines part of IMP return code, This patch supplements all the remaining IMP return code, and adds a function to convert this code to the error number. Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-05net: hns3: fix port capbility updating issueJian Shen
Currently, the driver queries the media port information, and updates the port capability periodically. But it sets an error mac->speed_type value, which stops update port capability. Fixes: 88d10bd6f730 ("net: hns3: add support for multiple media type") Signed-off-by: Jian Shen <shenjian15@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-05net: hns3: fix flow control configure issue for fibre portJian Shen
Flow control autoneg is unsupported for fibre port. It takes no effect for flow control when restart autoneg. This patch fixes it, return -EOPNOTSUPP when user tries to enable flow control autoneg. Signed-off-by: Jian Shen <shenjian15@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-05net: hns3: enable broadcast promisc mode when initializing VFJian Shen
For revision 0x20, the broadcast promisc is enabled by firmware, it's unnecessary to enable it when initializing VF. For revision 0x21, it's necessary to enable broadcast promisc mode when initializing or re-initializing VF, otherwise, it will be unable to send and receive promisc packets. Fixes: f01f5559cac8 ("net: hns3: don't allow vf to enable promisc mode") Signed-off-by: Jian Shen <shenjian15@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-05r8152: set RTL8152_UNPLUG only for real disconnectionHayes Wang
Set the flag of RTL8152_UNPLUG if and only if the device is unplugged. Some error codes sometimes don't mean the real disconnection of usb device. For those situations, set the flag of RTL8152_UNPLUG causes the driver skips some flows of disabling the device, and it let the device stay at incorrect state. Signed-off-by: Hayes Wang <hayeswang@realtek.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-05net: remove unused parameter from skb_checksum_try_convertLi RongQing
the check parameter is never used Signed-off-by: Li RongQing <lirongqing@baidu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-05Merge branch 'mlxsw-Enable-disable-PTP-shapers'David S. Miller
Ido Schimmel says: ==================== mlxsw: Enable/disable PTP shapers Shalom says: In order to get more accurate hardware time stamping in Spectrum-1, the driver needs to apply a shaper on the port for speeds lower than 40Gbps. This shaper is called a PTP shaper and it is applied on hierarchy 0, which is the port hierarchy. This shaper may affect the shaper rates of all hierarchies. This patchset adds the ability to enable or disable the PTP shaper on the port in two scenarios: 1. When the user wants to enable/disable the hardware time stamping 2. When the port is brought up or down (including port speed change) Patch #1 adds the QEEC.ptps field that is used for enabling or disabling the PTP shaper on a port. Patch #2 adds a note about disabling the PTP shaper when calling to mlxsw_sp_port_ets_maxrate_set(). Patch #3 adds the QPSC register that is responsible for configuring the PTP shaper parameters per speed. Patch #4 sets the PTP shaper parameters during the ptp_init(). Patch #5 adds new operation for getting the port's speed. Patch #6 enables/disables the PTP shaper when turning on or off the hardware time stamping. Patch #7 enables/disables the PTP shaper when the port's status has changed (including port speed change). Patch #8 applies the PTP shaper enable/disable logic by filling the PTP shaper parameters array. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-05mlxsw: spectrum_ptp: Apply the PTP shaper enable/disable logicShalom Toledo
Apply by filling the PTP shaper parameters array. Signed-off-by: Shalom Toledo <shalomt@mellanox.com> Reviewed-by: Petr Machata <petrm@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-05mlxsw: spectrum: Set up PTP shaper when port status has changedShalom Toledo
When getting port up down event (PUDE), change the PTP shaper configuration based on hardware time stamping on/off and the port's speed. Signed-off-by: Shalom Toledo <shalomt@mellanox.com> Reviewed-by: Petr Machata <petrm@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-05mlxsw: spectrum_ptp: Enable/disable PTP shaper on a port when getting ↵Shalom Toledo
HWTSTAMP on/off In order to get more accurate hardware time stamping, the driver needs to enable PTP shaper on the port, for speeds lower than 40 Gbps. Enable the PTP shaper on the port when the user turns on the hardware time stamping, and disable it when the user turns off the hardware time stamping. Signed-off-by: Shalom Toledo <shalomt@mellanox.com> Reviewed-by: Petr Machata <petrm@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-05mlxsw: spectrum: Add new operation for getting the port's speedShalom Toledo
New operation for getting the port's speed as part of port-type-speed operations. Signed-off-by: Shalom Toledo <shalomt@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Petr Machata <petrm@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-05mlxsw: spectrum_ptp: Set the PTP shaper parametersShalom Toledo
Set the PTP shaper parameters during the ptp_init(). For different speeds, there are different parameters. When the port's speed changes and PTP shaper is enabled, the firmware changes the ETS shaper values according to the PTP shaper parameters for this new speed. The PTP shaper parameters array is left empty for now, will be filled in a follow-up patch. Signed-off-by: Shalom Toledo <shalomt@mellanox.com> Reviewed-by: Petr Machata <petrm@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-05mlxsw: reg: Add QoS PTP Shaper Configuration RegisterShalom Toledo
The QPSC allows advanced configuration of the PTP shapers. Signed-off-by: Shalom Toledo <shalomt@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Petr Machata <petrm@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-05mlxsw: spectrum: Add note about the PTP shaperShalom Toledo
Add note about disabling the PTP shaper when calling to mlxsw_sp_port_ets_maxrate_set(). Signed-off-by: Shalom Toledo <shalomt@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Petr Machata <petrm@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-05mlxsw: reg: Add ptps field in QoS ETS Element Configuration RegisterShalom Toledo
The PTP Shaper field is used for enabling and disabling of port-rate based shaper which is slightly lower than port rate. Signed-off-by: Shalom Toledo <shalomt@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Petr Machata <petrm@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-05net: socionext: remove set but not used variable 'pkts'YueHaibing
Fixes gcc '-Wunused-but-set-variable' warning: drivers/net/ethernet/socionext/netsec.c: In function 'netsec_clean_tx_dring': drivers/net/ethernet/socionext/netsec.c:637:15: warning: variable 'pkts' set but not used [-Wunused-but-set-variable] It is not used since commit ba2b232108d3 ("net: netsec: add XDP support") Signed-off-by: YueHaibing <yuehaibing@huawei.com> Acked-by: Ilias Apalodimas <ilias.apalodimas@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-05net: ethernet: allwinner: Remove unneeded memsetHariprasad Kelam
Remove unneeded memset as alloc_etherdev is using kvzalloc which uses __GFP_ZERO flag Signed-off-by: Hariprasad Kelam <hariprasad.kelam@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-05net/ethernet: using dev_get_drvdata directlyFuqian Huang
Several drivers cast a struct device pointer to a struct platform_device pointer only to then call platform_get_drvdata(). To improve readability, these constructs can be simplified by using dev_get_drvdata() directly. Signed-off-by: Fuqian Huang <huangfq.daxian@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-05net/can: using dev_get_drvdata directlyFuqian Huang
Several drivers cast a struct device pointer to a struct platform_device pointer only to then call platform_get_drvdata(). To improve readability, these constructs can be simplified by using dev_get_drvdata() directly. Signed-off-by: Fuqian Huang <huangfq.daxian@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-05Merge branch 'hsr-bug-fixes'David S. Miller
Cong Wang says: ==================== hsr: a few bug fixes This patchset contains 3 bug fixes for hsr triggered by a syzbot reproducer, please check each patch for details. ==================== Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
2019-07-05hsr: fix a NULL pointer deref in hsr_dev_xmit()Cong Wang
hsr_port_get_hsr() could return NULL and kernel could crash: BUG: kernel NULL pointer dereference, address: 0000000000000010 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 8000000074b84067 P4D 8000000074b84067 PUD 7057d067 PMD 0 Oops: 0000 [#1] SMP PTI CPU: 0 PID: 754 Comm: a.out Not tainted 5.2.0-rc6+ #718 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-2.fc30 04/01/2014 RIP: 0010:hsr_dev_xmit+0x20/0x31 Code: 48 8b 1b eb e0 5b 5d 41 5c c3 66 66 66 66 90 55 48 89 fd 48 8d be 40 0b 00 00 be 04 00 00 00 e8 ee f2 ff ff 48 89 ef 48 89 c6 <48> 8b 40 10 48 89 45 10 e8 6c 1b 00 00 31 c0 5d c3 66 66 66 66 90 RSP: 0018:ffffb5b400003c48 EFLAGS: 00010246 RAX: 0000000000000000 RBX: ffff9821b4509a88 RCX: 0000000000000000 RDX: ffff9821b4509a88 RSI: 0000000000000000 RDI: ffff9821bc3fc7c0 RBP: ffff9821bc3fc7c0 R08: 0000000000000000 R09: 00000000000c2019 R10: 0000000000000000 R11: 0000000000000002 R12: ffff9821bc3fc7c0 R13: ffff9821b4509a88 R14: 0000000000000000 R15: 000000000000006e FS: 00007fee112a1800(0000) GS:ffff9821bd800000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000010 CR3: 000000006e9ce000 CR4: 00000000000406f0 Call Trace: <IRQ> netdev_start_xmit+0x1b/0x38 dev_hard_start_xmit+0x121/0x21e ? validate_xmit_skb.isra.0+0x19/0x1e3 __dev_queue_xmit+0x74c/0x823 ? lockdep_hardirqs_on+0x12b/0x17d ip6_finish_output2+0x3d3/0x42c ? ip6_mtu+0x55/0x5c ? mld_sendpack+0x191/0x229 mld_sendpack+0x191/0x229 mld_ifc_timer_expire+0x1f7/0x230 ? mld_dad_timer_expire+0x58/0x58 call_timer_fn+0x12e/0x273 __run_timers.part.0+0x174/0x1b5 ? mld_dad_timer_expire+0x58/0x58 ? sched_clock_cpu+0x10/0xad ? mark_lock+0x26/0x1f2 ? __lock_is_held+0x40/0x71 run_timer_softirq+0x26/0x48 __do_softirq+0x1af/0x392 irq_exit+0x53/0xa2 smp_apic_timer_interrupt+0x1c4/0x1d9 apic_timer_interrupt+0xf/0x20 </IRQ> Cc: Arvid Brodin <arvid.brodin@alten.se> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-05hsr: implement dellink to clean up resourcesCong Wang
hsr_link_ops implements ->newlink() but not ->dellink(), which leads that resources not released after removing the device, particularly the entries in self_node_db and node_db. So add ->dellink() implementation to replace the priv_destructor. This also makes the code slightly easier to understand. Reported-by: syzbot+c6167ec3de7def23d1e8@syzkaller.appspotmail.com Cc: Arvid Brodin <arvid.brodin@alten.se> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-05hsr: fix a memory leak in hsr_del_port()Cong Wang
hsr_del_port() should release all the resources allocated in hsr_add_port(). As a consequence of this change, hsr_for_each_port() is no longer safe to work with hsr_del_port(), switch to list_for_each_entry_safe() as we always hold RTNL lock. Cc: Arvid Brodin <arvid.brodin@alten.se> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-06selftests/bpf: add test_tcp_rtt to .gitignoreStanislav Fomichev
Forgot to add it in the original patch. Fixes: b55873984dab ("selftests/bpf: test BPF_SOCK_OPS_RTT_CB") Reported-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Stanislav Fomichev <sdf@google.com> Acked-by: Andrii Nakryiko <andriin@fb.com> Acked-by: Yonghong Song <yhs@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-07-06selftests/bpf: fix test_align liveliness expectationsStanislav Fomichev
Commit 2589726d12a1 ("bpf: introduce bounded loops") caused a change in the way some registers liveliness is reported in the test_align. Add missing "_w" to a couple of tests. Note, there are no offset changes! Fixes: 2589726d12a1 ("bpf: introduce bounded loops") Signed-off-by: Stanislav Fomichev <sdf@google.com> Acked-by: Yonghong Song <yhs@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-07-05Merge branch 'master' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next Steffen Klassert says: ==================== pull request (net-next): ipsec-next 2019-07-05 1) A lot of work to remove indirections from the xfrm code. From Florian Westphal. 2) Fix a WARN_ON with ipv6 that triggered because of a forgotten break statement. From Florian Westphal. 3) Remove xfrmi_init_net, it is not needed. From Li RongQing. Please pull or let me know if there are problems. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-05Merge branch 'master' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec Steffen Klassert says: ==================== pull request (net): ipsec 2019-07-05 1) Fix xfrm selector prefix length validation for inter address family tunneling. From Anirudh Gupta. 2) Fix a memleak in pfkey. From Jeremy Sowden. 3) Fix SA selector validation to allow empty selectors again. From Nicolas Dichtel. 4) Select crypto ciphers for xfrm_algo, this fixes some randconfig builds. From Arnd Bergmann. 5) Remove a duplicated assignment in xfrm_bydst_resize. From Cong Wang. 6) Fix a hlist corruption on hash rebuild. From Florian Westphal. 7) Fix a memory leak when creating xfrm interfaces. From Nicolas Dichtel. Please pull or let me know if there are problems. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-05bpf, riscv: Enable zext optimization for more RV64G ALU opsLuke Nelson
Commit 66d0d5a854a6 ("riscv: bpf: eliminate zero extension code-gen") added the new zero-extension optimization for some BPF ALU operations. Since then, bugs in the JIT that have been fixed in the bpf tree require this optimization to be added to other operations: commit 1e692f09e091 ("bpf, riscv: clear high 32 bits for ALU32 add/sub/neg/lsh/rsh/arsh"), and commit fe121ee531d1 ("bpf, riscv: clear target register high 32-bits for and/or/xor on ALU32"). Now that these have been merged to bpf-next, the zext optimization can be enabled for the fixed operations. Signed-off-by: Luke Nelson <luke.r.nels@gmail.com> Cc: Song Liu <liu.song.a23@gmail.com> Cc: Jiong Wang <jiong.wang@netronome.com> Cc: Xi Wang <xi.wang@gmail.com> Acked-by: Björn Töpel <bjorn.topel@gmail.com> Acked-by: Jiong Wang <jiong.wang@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-07-05netfilter: nf_tables: __nft_expr_type_get() selects specific family typePablo Neira Ayuso
In case that there are two types, prefer the family specify extension. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-07-05tools: bpftool: Fix json dump crash on powerpcJiri Olsa
Michael reported crash with by bpf program in json mode on powerpc: # bpftool prog -p dump jited id 14 [{ "name": "0xd00000000a9aa760", "insns": [{ "pc": "0x0", "operation": "nop", "operands": [null ] },{ "pc": "0x4", "operation": "nop", "operands": [null ] },{ "pc": "0x8", "operation": "mflr", Segmentation fault (core dumped) The code is assuming char pointers in format, which is not always true at least for powerpc. Fixing this by dumping the whole string into buffer based on its format. Please note that libopcodes code does not check return values from fprintf callback, but as per Jakub suggestion returning -1 on allocation failure so we do the best effort to propagate the error. Fixes: 107f041212c1 ("tools: bpftool: add JSON output for `bpftool prog dump jited *` command") Reported-by: Michael Petlan <mpetlan@redhat.com> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-07-05netfilter: nf_tables: add nft_expr_type_request_module()Pablo Neira Ayuso
This helper function makes sure the family specific extension is loaded. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-07-05tools: bpftool: add "prog run" subcommand to test-run programsQuentin Monnet
Add a new "bpftool prog run" subcommand to run a loaded program on input data (and possibly with input context) passed by the user. Print output data (and output context if relevant) into a file or into the console. Print return value and duration for the test run into the console. A "repeat" argument can be passed to run the program several times in a row. The command does not perform any kind of verification based on program type (Is this program type allowed to use an input context?) or on data consistency (Can I work with empty input data?), this is left to the kernel. Example invocation: # perl -e 'print "\x0" x 14' | ./bpftool prog run \ pinned /sys/fs/bpf/sample_ret0 \ data_in - data_out - repeat 5 0000000 0000 0000 0000 0000 0000 0000 0000 | ........ ...... Return value: 0, duration (average): 260ns When one of data_in or ctx_in is "-", bpftool reads from standard input, in binary format. Other formats (JSON, hexdump) might be supported (via an optional command line keyword like "data_fmt_in") in the future if relevant, but this would require doing more parsing in bpftool. v2: - Fix argument names for function check_single_stdin(). (Yonghong) Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Acked-by: Yonghong Song <yhs@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-07-05blk-iolatency: fix STS_AGAIN handlingDennis Zhou
The iolatency controller is based on rq_qos. It increments on rq_qos_throttle() and decrements on either rq_qos_cleanup() or rq_qos_done_bio(). a3fb01ba5af0 fixes the double accounting issue where blk_mq_make_request() may call both rq_qos_cleanup() and rq_qos_done_bio() on REQ_NO_WAIT. So checking STS_AGAIN prevents the double decrement. The above works upstream as the only way we can get STS_AGAIN is from blk_mq_get_request() failing. The STS_AGAIN handling isn't a real problem as bio_endio() skipping only happens on reserved tag allocation failures which can only be caused by driver bugs and already triggers WARN. However, the fix creates a not so great dependency on how STS_AGAIN can be propagated. Internally, we (Facebook) carry a patch that kills read ahead if a cgroup is io congested or a fatal signal is pending. This combined with chained bios progagate their bi_status to the parent is not already set can can cause the parent bio to not clean up properly even though it was successful. This consequently leaks the inflight counter and can hang all IOs under that blkg. To nip the adverse interaction early, this removes the rq_qos_cleanup() callback in iolatency in favor of cleaning up always on the rq_qos_done_bio() path. Fixes: a3fb01ba5af0 ("blk-iolatency: only account submitted bios") Debugged-by: Tejun Heo <tj@kernel.org> Debugged-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Dennis Zhou <dennis@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-07-05Merge branch 'bpf-libbpf-int-btf-map'Daniel Borkmann
Andrii Nakryiko says: ==================== This patch set implements an update to how BTF-defined maps are specified. The change is in how integer attributes, e.g., type, max_entries, map_flags, are specified: now they are captured as part of map definition struct's BTF type information (using array dimension), eliminating the need for compile-time data initialization and keeping all the metadata in one place. All existing selftests that were using BTF-defined maps are updated, along with some other selftests, that were switched to new syntax. v4->v5: - revert sample_map_ret0.c, which is loaded with iproute2 (kernel test robot); v3->v4: - add acks; - fix int -> uint type in commit message; v2->v3: - rename __int into __uint (Yonghong); v1->v2: - split bpf_helpers.h change from libbpf change (Song). ==================== Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-07-05selftests/bpf: convert legacy BPF maps to BTF-defined onesAndrii Nakryiko
Convert selftests that were originally left out and new ones added recently to consistently use BTF-defined maps. Reported-by: kernel test robot <rong.a.chen@intel.com> Signed-off-by: Andrii Nakryiko <andriin@fb.com> Acked-by: Song Liu <songliubraving@fb.com> Acked-by: Yonghong Song <yhs@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-07-05selftests/bpf: convert selftests using BTF-defined maps to new syntaxAndrii Nakryiko
Convert all the existing selftests that are already using BTF-defined maps to use new syntax (with no static data initialization). Signed-off-by: Andrii Nakryiko <andriin@fb.com> Acked-by: Song Liu <songliubraving@fb.com> Acked-by: Yonghong Song <yhs@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-07-05selftests/bpf: add __uint and __type macro for BTF-defined mapsAndrii Nakryiko
Add simple __uint and __type macro that hide details of how type and integer values are captured in BTF-defined maps. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Acked-by: Song Liu <songliubraving@fb.com> Acked-by: Yonghong Song <yhs@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-07-05libbpf: capture value in BTF type info for BTF-defined map defsAndrii Nakryiko
Change BTF-defined map definitions to capture compile-time integer values as part of BTF type definition, to avoid split of key/value type information and actual type/size/flags initialization for maps. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Acked-by: Song Liu <songliubraving@fb.com> Acked-by: Yonghong Song <yhs@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-07-05Merge branch 'bpf-libbpf-link-trace'Daniel Borkmann
Andrii Nakryiko says: ==================== This patchset adds the following APIs to allow attaching BPF programs to tracing entities: - bpf_program__attach_perf_event for attaching to any opened perf event FD, allowing users full control; - bpf_program__attach_kprobe for attaching to kernel probes (both entry and return probes); - bpf_program__attach_uprobe for attaching to user probes (both entry/return); - bpf_program__attach_tracepoint for attaching to kernel tracepoints; - bpf_program__attach_raw_tracepoint for attaching to raw kernel tracepoint (wrapper around bpf_raw_tracepoint_open); This set of APIs makes libbpf more useful for tracing applications. All attach APIs return abstract struct bpf_link that encapsulates logic of detaching BPF program. See patch #2 for details. bpf_assoc was considered as an alternative name for this opaque "handle", but bpf_link seems to be appropriate semantically and is nice and short. Pre-patch #1 makes internal libbpf_strerror_r helper function work w/ negative error codes, lifting the burder off callers to keep track of error sign. Patch #2 adds bpf_link abstraction. Patch #3 adds attach_perf_event, which is the base for all other APIs. Patch #4 adds kprobe/uprobe APIs. Patch #5 adds tracepoint API. Patch #6 adds raw_tracepoint API. Patch #7 converts one existing test to use attach_perf_event. Patch #8 adds new kprobe/uprobe tests. Patch #9 converts some selftests currently using tracepoint to new APIs. v4->v5: - typo and small nits (Yonghong); - validate pfd in attach_perf_event (Yonghong); - parse_uint_from_file fixes (Yonghong); - check for malloc failure in attach_raw_tracepoint (Yonghong); - attach_probes selftests clean up fixes (Yonghong); v3->v4: - proper errno handling (Stanislav); - bpf_fd -> prog_fd (Stanislav); - switch to fprintf (Song); v2->v3: - added bpf_link concept (Daniel); - didn't add generic bpf_link__attach_program for reasons described in [0]; - dropped Stanislav's Reviewed-by from patches #2-#6, in case he doesn't like the change; v1->v2: - preserve errno before close() call (Stanislav); - use libbpf_perf_event_disable_and_close in selftest (Stanislav); - remove unnecessary memset (Stanislav); [0] https://lore.kernel.org/bpf/CAEf4BzZ7EM5eP2eaZn7T2Yb5QgVRiwAs+epeLR1g01TTx-6m6Q@mail.gmail.com/ ==================== Acked-by: Yonghong Song <yhs@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-07-05selftests/bpf: convert existing tracepoint tests to new APIsAndrii Nakryiko
Convert some existing tests that attach to tracepoints to use bpf_program__attach_tracepoint API instead. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Reviewed-by: Stanislav Fomichev <sdf@google.com> Acked-by: Song Liu <songliubraving@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-07-05selftests/bpf: add kprobe/uprobe selftestsAndrii Nakryiko
Add tests verifying kprobe/kretprobe/uprobe/uretprobe APIs work as expected. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Reviewed-by: Stanislav Fomichev <sdf@google.com> Acked-by: Song Liu <songliubraving@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-07-05selftests/bpf: switch test to new attach_perf_event APIAndrii Nakryiko
Use new bpf_program__attach_perf_event() in test previously relying on direct ioctl manipulations. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Reviewed-by: Stanislav Fomichev <sdf@google.com> Acked-by: Song Liu <songliubraving@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-07-05libbpf: add raw tracepoint attach APIAndrii Nakryiko
Add a wrapper utilizing bpf_link "infrastructure" to allow attaching BPF programs to raw tracepoints. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Acked-by: Song Liu <songliubraving@fb.com> Reviewed-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>