summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2018-06-03Merge branch 'ndo_xdp_xmit-cleanup'Alexei Starovoitov
Jesper Dangaard Brouer says: ==================== As I mentioned in merge commit 10f678683e4 ("Merge branch 'xdp_xmit-bulking'") I plan to change the API for ndo_xdp_xmit once more, by adding a flags argument, which is done in this patchset. I know it is late in the cycle (currently at rc7), but it would be nice to avoid changing NDOs over several kernel releases, as it is annoying to vendors and distro backporters, but it is not strictly UAPI so it is allowed (according to Alexei). The end-goal is getting rid of the ndo_xdp_flush operation, as it will make it possible for drivers to implement a TXQ synchronization mechanism that is not necessarily derived from the CPU id (smp_processor_id). This patchset removes all callers of the ndo_xdp_flush operation, but it doesn't take the last step of removing it from all drivers. This can be done later, or I can update the patchset on request. Micro-benchmarks only show a very small performance improvement, for map-redirect around ~2 ns, and for non-map redirect ~7 ns. I've not benchmarked this with CONFIG_RETPOLINE, but the performance benefit should be more visible given we end-up removing an indirect call. --- V2: Updated based on feedback from Song Liu <songliubraving@fb.com> ==================== Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-06-03bpf/xdp: devmap can avoid calling ndo_xdp_flushJesper Dangaard Brouer
The XDP_REDIRECT map devmap can avoid using ndo_xdp_flush, by instead instructing ndo_xdp_xmit to flush via XDP_XMIT_FLUSH flag in appropriate places. Notice after this patch it is possible to remove ndo_xdp_flush completely, as this is the last user of ndo_xdp_flush. This is left for later patches, to keep driver changes separate. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Acked-by: Song Liu <songliubraving@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-06-03bpf/xdp: non-map redirect can avoid calling ndo_xdp_flushJesper Dangaard Brouer
This is the first real user of the XDP_XMIT_FLUSH flag. As pointed out many times, XDP_REDIRECT without using BPF maps is significant slower than the map variant. This is primary due to the lack of bulking, as the ndo_xdp_flush operation is required after each frame (to avoid frames hanging on the egress device). It is still possible to optimize this case. Instead of invoking two NDO indirect calls, which are very expensive with CONFIG_RETPOLINE, instead instruct ndo_xdp_xmit to flush via XDP_XMIT_FLUSH flag. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Acked-by: Song Liu <songliubraving@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-06-03xdp: done implementing ndo_xdp_xmit flush flag for all driversJesper Dangaard Brouer
Removing XDP_XMIT_FLAGS_NONE as all driver now implement a flush operation in their ndo_xdp_xmit call. The compiler will catch if any users of XDP_XMIT_FLAGS_NONE remains. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Acked-by: Song Liu <songliubraving@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-06-03virtio_net: implement flush flag for ndo_xdp_xmitJesper Dangaard Brouer
When passed the XDP_XMIT_FLUSH flag virtnet_xdp_xmit now performs the same virtqueue_kick as virtnet_xdp_flush. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Acked-by: Song Liu <songliubraving@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-06-03tun: implement flush flag for ndo_xdp_xmitJesper Dangaard Brouer
When passed the XDP_XMIT_FLUSH flag tun_xdp_xmit now performs the same kind of socket wake up as in tun_xdp_flush(). The wake up code from tun_xdp_flush is generalized and shared with tun_xdp_xmit. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Acked-by: Song Liu <songliubraving@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-06-03ixgbe: implement flush flag for ndo_xdp_xmitJesper Dangaard Brouer
When passed the XDP_XMIT_FLUSH flag ixgbe_xdp_xmit now performs the same kind of ring tail update as in ixgbe_xdp_flush. The update tail code in ixgbe_xdp_flush is generalized and shared with ixgbe_xdp_xmit. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Acked-by: Song Liu <songliubraving@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-06-03i40e: implement flush flag for ndo_xdp_xmitJesper Dangaard Brouer
When passed the XDP_XMIT_FLUSH flag i40e_xdp_xmit now performs the same kind of ring tail update as in i40e_xdp_flush. The advantage is that all the necessary checks have been performed and xdp_ring can be updated, instead of having to perform the exact same steps/checks in i40e_xdp_flush Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-06-03xdp: add flags argument to ndo_xdp_xmit APIJesper Dangaard Brouer
This patch only change the API and reject any use of flags. This is an intermediate step that allows us to implement the flush flag operation later, for each individual driver in a separate patch. The plan is to implement flush operation via XDP_XMIT_FLUSH flag and then remove XDP_XMIT_FLAGS_NONE when done. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Acked-by: Song Liu <songliubraving@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-06-03Merge tag 'wireless-drivers-next-for-davem-2018-05-31' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers-next Kalle Valo says: ==================== wireless-drivers-next patches for 4.18 Hopefully the last pull request to 4.18 before the merge window. Nothing major here, we have smaller new features and of course a lots of fixes. Major changes: ath10k * add memory dump support for QCA9888 and QCA99X0 * add support to configure channel dwell time * support new DFS host confirmation feature in the firmware ath * update various regulatory mappings wcn36xx * various fixes to improve reliability * add Factory Test Mode support brmfmac * add debugfs file for reading firmware capabilities mwifiex * support sysfs initiated device coredump ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-03mlx4_core: restore optimal ICM memory allocationEric Dumazet
Commit 1383cb8103bb ("mlx4_core: allocate ICM memory in page size chunks") brought two regressions caught in our regression suite. The big one is an additional cost of 256 bytes of overhead per 4096 bytes, or 6.25 % which is unacceptable since ICM can be pretty large. This comes from having to allocate one struct mlx4_icm_chunk (256 bytes) per MLX4_TABLE_CHUNK, which the buggy commit shrank to 4KB (instead of prior 256KB) Note that mlx4_alloc_icm() is already able to try high order allocations and fallback to low-order allocations under high memory pressure. Most of these allocations happen right after boot time, when we get plenty of non fragmented memory, there is really no point being so pessimistic and break huge pages into order-0 ones just for fun. We only have to tweak gfp_mask a bit, to help falling back faster, without risking OOM killings. Second regression is an KASAN fault, that will need further investigations. Fixes: 1383cb8103bb ("mlx4_core: allocate ICM memory in page size chunks") Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Tariq Toukan <tariqt@mellanox.com> Cc: John Sperbeck <jsperbeck@google.com> Cc: Tarick Bedeir <tarick@google.com> Cc: Qing Huang <qing.huang@oracle.com> Cc: Daniel Jurgens <danielj@mellanox.com> Cc: Zhu Yanjun <yanjun.zhu@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-03vlan: use non-archaic spelling of failesThadeu Lima de Souza Cascardo
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-03net: axienet: remove stale comment of axienet_openYueHaibing
axienet_open no longer return -ENODEV when PHY cannot be connected to since commit d7cc3163e026 ("net: axienet: Support phy-less mode of operation") Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-03Merge branch 'misc-BPF-improvements'Alexei Starovoitov
Daniel Borkmann says: ==================== This set adds various patches I still had in my queue, first two are test cases to provide coverage for the recent two fixes that went to bpf tree, then a small improvement on the error message for gpl helpers. Next, we expose prog and map id into fdinfo in order to allow for inspection of these objections currently used in applications. Patch after that removes a retpoline call for map lookup/update/delete helpers. A new helper is added in the subsequent patch to lookup the skb's socket's cgroup v2 id which can be used in an efficient way for e.g. lookups on egress side. Next one is a fix to fully clear state info in tunnel/xfrm helpers. Given this is full cap_sys_admin from init ns and has same priv requirements like tracing, bpf-next should be okay. A small bug fix for bpf_asm follows, and next a fix for context access in tracing which was recently reported. Lastly, a small update in the maintainer's file to add patchwork url and missing files. Thanks! v2 -> v3: - Noticed a merge artefact inside uapi header comment, sigh, fixed now. v1 -> v2: - minor fix in getting context access work on 32 bit for tracing - add paragraph to uapi helper doc to better describe kernel build deps for cggroup helper ==================== Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-06-03bpf, doc: add missing patchwork url and libbpf to maintainersDaniel Borkmann
Add missing bits under tools/lib/bpf/ and also Q: entry in order to make it easier for people to retrieve current patch queue. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-06-03bpf: sync bpf uapi header with toolsDaniel Borkmann
Pull in recent changes from include/uapi/linux/bpf.h. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-06-03bpf: fix context access in tracing progs on 32 bit archsDaniel Borkmann
Wang reported that all the testcases for BPF_PROG_TYPE_PERF_EVENT program type in test_verifier report the following errors on x86_32: 172/p unpriv: spill/fill of different pointers ldx FAIL Unexpected error message! 0: (bf) r6 = r10 1: (07) r6 += -8 2: (15) if r1 == 0x0 goto pc+3 R1=ctx(id=0,off=0,imm=0) R6=fp-8,call_-1 R10=fp0,call_-1 3: (bf) r2 = r10 4: (07) r2 += -76 5: (7b) *(u64 *)(r6 +0) = r2 6: (55) if r1 != 0x0 goto pc+1 R1=ctx(id=0,off=0,imm=0) R2=fp-76,call_-1 R6=fp-8,call_-1 R10=fp0,call_-1 fp-8=fp 7: (7b) *(u64 *)(r6 +0) = r1 8: (79) r1 = *(u64 *)(r6 +0) 9: (79) r1 = *(u64 *)(r1 +68) invalid bpf_context access off=68 size=8 378/p check bpf_perf_event_data->sample_period byte load permitted FAIL Failed to load prog 'Permission denied'! 0: (b7) r0 = 0 1: (71) r0 = *(u8 *)(r1 +68) invalid bpf_context access off=68 size=1 379/p check bpf_perf_event_data->sample_period half load permitted FAIL Failed to load prog 'Permission denied'! 0: (b7) r0 = 0 1: (69) r0 = *(u16 *)(r1 +68) invalid bpf_context access off=68 size=2 380/p check bpf_perf_event_data->sample_period word load permitted FAIL Failed to load prog 'Permission denied'! 0: (b7) r0 = 0 1: (61) r0 = *(u32 *)(r1 +68) invalid bpf_context access off=68 size=4 381/p check bpf_perf_event_data->sample_period dword load permitted FAIL Failed to load prog 'Permission denied'! 0: (b7) r0 = 0 1: (79) r0 = *(u64 *)(r1 +68) invalid bpf_context access off=68 size=8 Reason is that struct pt_regs on x86_32 doesn't fully align to 8 byte boundary due to its size of 68 bytes. Therefore, bpf_ctx_narrow_access_ok() will then bail out saying that off & (size_default - 1) which is 68 & 7 doesn't cleanly align in the case of sample_period access from struct bpf_perf_event_data, hence verifier wrongly thinks we might be doing an unaligned access here though underlying arch can handle it just fine. Therefore adjust this down to machine size and check and rewrite the offset for narrow access on that basis. We also need to fix corresponding pe_prog_is_valid_access(), since we hit the check for off % size != 0 (e.g. 68 % 8 -> 4) in the first and last test. With that in place, progs for tracing work on x86_32. Reported-by: Wang YanQing <udknight@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Tested-by: Wang YanQing <udknight@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-06-03bpf: fix cbpf parser bug for octal numbersDaniel Borkmann
Range is 0-7, not 0-9, otherwise parser silently excludes it from the strtol() rather than throwing an error. Reported-by: Marc Boschma <marc@boschma.cx> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-06-03bpf: make sure to clear unused fields in tunnel/xfrm state fetchDaniel Borkmann
Since the remaining bits are not filled in struct bpf_tunnel_key resp. struct bpf_xfrm_state and originate from uninitialized stack space, we should make sure to clear them before handing control back to the program. Also add a padding element to struct bpf_xfrm_state for future use similar as we have in struct bpf_tunnel_key and clear it as well. struct bpf_xfrm_state { __u32 reqid; /* 0 4 */ __u32 spi; /* 4 4 */ __u16 family; /* 8 2 */ /* XXX 2 bytes hole, try to pack */ union { __u32 remote_ipv4; /* 4 */ __u32 remote_ipv6[4]; /* 16 */ }; /* 12 16 */ /* size: 28, cachelines: 1, members: 4 */ /* sum members: 26, holes: 1, sum holes: 2 */ /* last cacheline: 28 bytes */ }; Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-06-03bpf: add bpf_skb_cgroup_id helperDaniel Borkmann
Add a new bpf_skb_cgroup_id() helper that allows to retrieve the cgroup id from the skb's socket. This is useful in particular to enable bpf_get_cgroup_classid()-like behavior for cgroup v1 in cgroup v2 by allowing ID based matching on egress. This can in particular be used in combination with applying policy e.g. from map lookups, and also complements the older bpf_skb_under_cgroup() interface. In user space the cgroup id for a given path can be retrieved through the f_handle as demonstrated in [0] recently. [0] https://lkml.org/lkml/2018/5/22/1190 Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-06-03bpf: avoid retpoline for lookup/update/delete calls on mapsDaniel Borkmann
While some of the BPF map lookup helpers provide a ->map_gen_lookup() callback for inlining the map lookup altogether it is not available for every map, so the remaining ones have to call bpf_map_lookup_elem() helper which does a dispatch to map->ops->map_lookup_elem(). In times of retpolines, this will control and trap speculative execution rather than letting it do its work for the indirect call and will therefore cause a slowdown. Likewise, bpf_map_update_elem() and bpf_map_delete_elem() do not have an inlined version and need to call into their map->ops->map_update_elem() resp. map->ops->map_delete_elem() handlers. Before: # bpftool prog dump xlated id 1 0: (bf) r2 = r10 1: (07) r2 += -8 2: (7a) *(u64 *)(r2 +0) = 0 3: (18) r1 = map[id:1] 5: (85) call __htab_map_lookup_elem#232656 6: (15) if r0 == 0x0 goto pc+4 7: (71) r1 = *(u8 *)(r0 +35) 8: (55) if r1 != 0x0 goto pc+1 9: (72) *(u8 *)(r0 +35) = 1 10: (07) r0 += 56 11: (15) if r0 == 0x0 goto pc+4 12: (bf) r2 = r0 13: (18) r1 = map[id:1] 15: (85) call bpf_map_delete_elem#215008 <-- indirect call via 16: (95) exit helper After: # bpftool prog dump xlated id 1 0: (bf) r2 = r10 1: (07) r2 += -8 2: (7a) *(u64 *)(r2 +0) = 0 3: (18) r1 = map[id:1] 5: (85) call __htab_map_lookup_elem#233328 6: (15) if r0 == 0x0 goto pc+4 7: (71) r1 = *(u8 *)(r0 +35) 8: (55) if r1 != 0x0 goto pc+1 9: (72) *(u8 *)(r0 +35) = 1 10: (07) r0 += 56 11: (15) if r0 == 0x0 goto pc+4 12: (bf) r2 = r0 13: (18) r1 = map[id:1] 15: (85) call htab_lru_map_delete_elem#238240 <-- direct call 16: (95) exit In all three lookup/update/delete cases however we can use the actual address of the map callback directly if we find that there's only a single path with a map pointer leading to the helper call, meaning when the map pointer has not been poisoned from verifier side. Example code can be seen above for the delete case. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-06-03net/ncsi: Avoid GFP_KERNEL in response handlerSamuel Mendoza-Jonas
ncsi_rsp_handler_gc() allocates the filter arrays using GFP_KERNEL in softirq context, causing the below backtrace. This allocation is only a few dozen bytes during probing so allocate with GFP_ATOMIC instead. [ 42.813372] BUG: sleeping function called from invalid context at mm/slab.h:416 [ 42.820900] in_atomic(): 1, irqs_disabled(): 0, pid: 213, name: kworker/0:1 [ 42.827893] INFO: lockdep is turned off. [ 42.832023] CPU: 0 PID: 213 Comm: kworker/0:1 Tainted: G W 4.13.16-01441-gad99b38 #65 [ 42.841007] Hardware name: Generic DT based system [ 42.845966] Workqueue: events ncsi_dev_work [ 42.850251] [<8010a494>] (unwind_backtrace) from [<80107510>] (show_stack+0x20/0x24) [ 42.858046] [<80107510>] (show_stack) from [<80612770>] (dump_stack+0x20/0x28) [ 42.865309] [<80612770>] (dump_stack) from [<80148248>] (___might_sleep+0x230/0x2b0) [ 42.873241] [<80148248>] (___might_sleep) from [<80148334>] (__might_sleep+0x6c/0xac) [ 42.881129] [<80148334>] (__might_sleep) from [<80240d6c>] (__kmalloc+0x210/0x2fc) [ 42.888737] [<80240d6c>] (__kmalloc) from [<8060ad54>] (ncsi_rsp_handler_gc+0xd0/0x170) [ 42.896770] [<8060ad54>] (ncsi_rsp_handler_gc) from [<8060b454>] (ncsi_rcv_rsp+0x16c/0x1d4) [ 42.905314] [<8060b454>] (ncsi_rcv_rsp) from [<804d86c8>] (__netif_receive_skb_core+0x3c8/0xb50) [ 42.914158] [<804d86c8>] (__netif_receive_skb_core) from [<804d96cc>] (__netif_receive_skb+0x20/0x7c) [ 42.923420] [<804d96cc>] (__netif_receive_skb) from [<804de4b0>] (netif_receive_skb_internal+0x78/0x6a4) [ 42.932931] [<804de4b0>] (netif_receive_skb_internal) from [<804df980>] (netif_receive_skb+0x78/0x158) [ 42.942292] [<804df980>] (netif_receive_skb) from [<8042f204>] (ftgmac100_poll+0x43c/0x4e8) [ 42.950855] [<8042f204>] (ftgmac100_poll) from [<804e094c>] (net_rx_action+0x278/0x4c4) [ 42.958918] [<804e094c>] (net_rx_action) from [<801016a8>] (__do_softirq+0xe0/0x4c4) [ 42.966716] [<801016a8>] (__do_softirq) from [<8011cd9c>] (do_softirq.part.4+0x50/0x78) [ 42.974756] [<8011cd9c>] (do_softirq.part.4) from [<8011cebc>] (__local_bh_enable_ip+0xf8/0x11c) [ 42.983579] [<8011cebc>] (__local_bh_enable_ip) from [<804dde08>] (__dev_queue_xmit+0x260/0x890) [ 42.992392] [<804dde08>] (__dev_queue_xmit) from [<804df1f0>] (dev_queue_xmit+0x1c/0x20) [ 43.000689] [<804df1f0>] (dev_queue_xmit) from [<806099c0>] (ncsi_xmit_cmd+0x1c0/0x244) [ 43.008763] [<806099c0>] (ncsi_xmit_cmd) from [<8060dc14>] (ncsi_dev_work+0x2e0/0x4c8) [ 43.016725] [<8060dc14>] (ncsi_dev_work) from [<80133dfc>] (process_one_work+0x214/0x6f8) [ 43.024940] [<80133dfc>] (process_one_work) from [<80134328>] (worker_thread+0x48/0x558) [ 43.033070] [<80134328>] (worker_thread) from [<8013ba80>] (kthread+0x130/0x174) [ 43.040506] [<8013ba80>] (kthread) from [<80102950>] (ret_from_fork+0x14/0x24) Fixes: 062b3e1b6d4f ("net/ncsi: Refactor MAC, VLAN filters") Signed-off-by: Samuel Mendoza-Jonas <sam@mendozajonas.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-03bpf: show prog and map id in fdinfoDaniel Borkmann
Its trivial and straight forward to expose it for scripts that can then use it along with bpftool in order to inspect an individual application's used maps and progs. Right now we dump some basic information in the fdinfo file but with the help of the map/prog id full introspection becomes possible now. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-06-03bpf: fixup error message from gpl helpers on license mismatchDaniel Borkmann
Stating 'proprietary program' in the error is just silly since it can also be a different open source license than that which is just not compatible. Reference: https://twitter.com/majek04/status/998531268039102465 Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Acked-by: Song Liu <songliubraving@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-06-03bpf: add also cbpf long jump test cases with heavy expansionDaniel Borkmann
We have one triggering on eBPF but lets also add a cBPF example to make sure we keep tracking them. Also add anther cBPF test running max number of MSH ops. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-06-03bpf: test case for map pointer poison with calls/branchesDaniel Borkmann
Add several test cases where the same or different map pointers originate from different paths in the program and execute a map lookup or tail call at a common location. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-06-03net: netcp: ethss: remove unnecessary pointer set to NULLYueHaibing
If statement has make sure the 'slave->phy' is NULL Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-04powerpc: Remove core support for Marvell mv64x60 hostbridgesMark Greer
There are no longer any platforms that use Marvell's mv64x60 hostbridges so remove the supporting kernel code. CC: Dale Farnsworth <dale@farnsworth.org> Signed-off-by: Mark Greer <mgreer@animalcreek.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-06-04powerpc/boot: Remove core support for Marvell mv64x60 hostbridgesMark Greer
There are no longer any platforms that use Marvell's mv64x60 hostbridges so remove the supporting boot code. Signed-off-by: Mark Greer <mgreer@animalcreek.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-06-04powerpc/boot: Remove support for Marvell mv64x60 i2c controllerMark Greer
There are no longer any platforms that use Marvell's mv64x60's i2c controller so remove its driver. Signed-off-by: Mark Greer <mgreer@animalcreek.com> Acked-by: Dale Farnsworth <dale@farnsworth.org> Signed-off-by: Mark Greer &lt;<a href="mailto:mgreer@animalcreek.com">mgreer@animalcreek.com</a>&gt;<br> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-06-04powerpc/boot: Remove support for Marvell MPSC serial controllerMark Greer
There are no longer any platforms that use Marvell's MPSC serial controller so remove its driver. Signed-off-by: Mark Greer <mgreer@animalcreek.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-06-04powerpc/embedded6xx: Remove C2K board supportMark Greer
The C2K platform appears to be orphaned so remove code supporting it. CC: Remi Machet <rmachet@nvidia.com> Signed-off-by: Mark Greer <mgreer@animalcreek.com> Acked-by: Remi Machet <remi@machet.us> Signed-off-by: Mark Greer <mgreer@animalcreek.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-06-04powerpc/lib: optimise PPC32 memcmpChristophe Leroy
At the time being, memcmp() compares two chunks of memory byte per byte. This patch optimises the comparison by comparing word by word. On the same way as commit 15c2d45d17418 ("powerpc: Add 64bit optimised memcmp"), this patch moves memcmp() into a dedicated file named memcmp_32.S A small benchmark performed on an 8xx comparing two chuncks of 512 bytes performed 100000 times gives: Before : 5852274 TB ticks After: 1488638 TB ticks This is almost 4 times faster Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-06-04powerpc/lib: optimise 32 bits __clear_user()Christophe Leroy
Rewrite clear_user() on the same principle as memset(0), making use of dcbz to clear complete cache lines. This code is a copy/paste of memset(), with some modifications in order to retrieve remaining number of bytes to be cleared, as it needs to be returned in case of error. On the same way as done on PPC64 in commit 17968fbbd19f1 ("powerpc: 64bit optimised __clear_user"), the patch moves __clear_user() into a dedicated file string_32.S On a MPC885, throughput is almost doubled: Before: ~# dd if=/dev/zero of=/dev/null bs=1M count=1000 1048576000 bytes (1000.0MB) copied, 18.990779 seconds, 52.7MB/s After: ~# dd if=/dev/zero of=/dev/null bs=1M count=1000 1048576000 bytes (1000.0MB) copied, 9.611468 seconds, 104.0MB/s On a MPC8321, throughput is multiplied by 2.12: Before: root@vgoippro:~# dd if=/dev/zero of=/dev/null bs=1M count=1000 1048576000 bytes (1000.0MB) copied, 6.844352 seconds, 146.1MB/s After: root@vgoippro:~# dd if=/dev/zero of=/dev/null bs=1M count=1000 1048576000 bytes (1000.0MB) copied, 3.218854 seconds, 310.7MB/s Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-06-04powerpc/time: inline arch_vtime_task_switch()Christophe Leroy
arch_vtime_task_switch() is a small function which is called only from vtime_common_task_switch(), so it is worth inlining Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-06-04powerpc/Makefile: set -mcpu=860 flag for the 8xxChristophe Leroy
When compiled with GCC 8.1, vmlinux is significantly bigger than with GCC 4.8. When looking at the generated code with objdump, we notice that all functions and loops when a 16 bytes alignment. This significantly increases the size of the kernel. It is pointless and even counterproductive as on the 8xx 'nop' also consumes one clock cycle. Size of vmlinux with GCC 4.8: text data bss dec hex filename 5801948 1626076 457796 7885820 7853fc vmlinux Size of vmlinux with GCC 8.1: text data bss dec hex filename 6764592 1630652 456476 8851720 871108 vmlinux Size of vmlinux with GCC 8.1 and this patch: text data bss dec hex filename 6331544 1631756 456476 8419776 8079c0 vmlinux Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-06-04powerpc: Implement csum_ipv6_magic in assemblyChristophe Leroy
The generic csum_ipv6_magic() generates a pretty bad result 00000000 <csum_ipv6_magic>: (PPC32) 0: 81 23 00 00 lwz r9,0(r3) 4: 81 03 00 04 lwz r8,4(r3) 8: 7c e7 4a 14 add r7,r7,r9 c: 7d 29 38 10 subfc r9,r9,r7 10: 7d 4a 51 10 subfe r10,r10,r10 14: 7d 27 42 14 add r9,r7,r8 18: 7d 2a 48 50 subf r9,r10,r9 1c: 80 e3 00 08 lwz r7,8(r3) 20: 7d 08 48 10 subfc r8,r8,r9 24: 7d 4a 51 10 subfe r10,r10,r10 28: 7d 29 3a 14 add r9,r9,r7 2c: 81 03 00 0c lwz r8,12(r3) 30: 7d 2a 48 50 subf r9,r10,r9 34: 7c e7 48 10 subfc r7,r7,r9 38: 7d 4a 51 10 subfe r10,r10,r10 3c: 7d 29 42 14 add r9,r9,r8 40: 7d 2a 48 50 subf r9,r10,r9 44: 80 e4 00 00 lwz r7,0(r4) 48: 7d 08 48 10 subfc r8,r8,r9 4c: 7d 4a 51 10 subfe r10,r10,r10 50: 7d 29 3a 14 add r9,r9,r7 54: 7d 2a 48 50 subf r9,r10,r9 58: 81 04 00 04 lwz r8,4(r4) 5c: 7c e7 48 10 subfc r7,r7,r9 60: 7d 4a 51 10 subfe r10,r10,r10 64: 7d 29 42 14 add r9,r9,r8 68: 7d 2a 48 50 subf r9,r10,r9 6c: 80 e4 00 08 lwz r7,8(r4) 70: 7d 08 48 10 subfc r8,r8,r9 74: 7d 4a 51 10 subfe r10,r10,r10 78: 7d 29 3a 14 add r9,r9,r7 7c: 7d 2a 48 50 subf r9,r10,r9 80: 81 04 00 0c lwz r8,12(r4) 84: 7c e7 48 10 subfc r7,r7,r9 88: 7d 4a 51 10 subfe r10,r10,r10 8c: 7d 29 42 14 add r9,r9,r8 90: 7d 2a 48 50 subf r9,r10,r9 94: 7d 08 48 10 subfc r8,r8,r9 98: 7d 4a 51 10 subfe r10,r10,r10 9c: 7d 29 2a 14 add r9,r9,r5 a0: 7d 2a 48 50 subf r9,r10,r9 a4: 7c a5 48 10 subfc r5,r5,r9 a8: 7c 63 19 10 subfe r3,r3,r3 ac: 7d 29 32 14 add r9,r9,r6 b0: 7d 23 48 50 subf r9,r3,r9 b4: 7c c6 48 10 subfc r6,r6,r9 b8: 7c 63 19 10 subfe r3,r3,r3 bc: 7c 63 48 50 subf r3,r3,r9 c0: 54 6a 80 3e rotlwi r10,r3,16 c4: 7c 63 52 14 add r3,r3,r10 c8: 7c 63 18 f8 not r3,r3 cc: 54 63 84 3e rlwinm r3,r3,16,16,31 d0: 4e 80 00 20 blr 0000000000000000 <.csum_ipv6_magic>: (PPC64) 0: 81 23 00 00 lwz r9,0(r3) 4: 80 03 00 04 lwz r0,4(r3) 8: 81 63 00 08 lwz r11,8(r3) c: 7c e7 4a 14 add r7,r7,r9 10: 7f 89 38 40 cmplw cr7,r9,r7 14: 7d 47 02 14 add r10,r7,r0 18: 7d 30 10 26 mfocrf r9,1 1c: 55 29 f7 fe rlwinm r9,r9,30,31,31 20: 7d 4a 4a 14 add r10,r10,r9 24: 7f 80 50 40 cmplw cr7,r0,r10 28: 7d 2a 5a 14 add r9,r10,r11 2c: 80 03 00 0c lwz r0,12(r3) 30: 81 44 00 00 lwz r10,0(r4) 34: 7d 10 10 26 mfocrf r8,1 38: 55 08 f7 fe rlwinm r8,r8,30,31,31 3c: 7d 29 42 14 add r9,r9,r8 40: 81 04 00 04 lwz r8,4(r4) 44: 7f 8b 48 40 cmplw cr7,r11,r9 48: 7d 29 02 14 add r9,r9,r0 4c: 7d 70 10 26 mfocrf r11,1 50: 55 6b f7 fe rlwinm r11,r11,30,31,31 54: 7d 29 5a 14 add r9,r9,r11 58: 7f 80 48 40 cmplw cr7,r0,r9 5c: 7d 29 52 14 add r9,r9,r10 60: 7c 10 10 26 mfocrf r0,1 64: 54 00 f7 fe rlwinm r0,r0,30,31,31 68: 7d 69 02 14 add r11,r9,r0 6c: 7f 8a 58 40 cmplw cr7,r10,r11 70: 7c 0b 42 14 add r0,r11,r8 74: 81 44 00 08 lwz r10,8(r4) 78: 7c f0 10 26 mfocrf r7,1 7c: 54 e7 f7 fe rlwinm r7,r7,30,31,31 80: 7c 00 3a 14 add r0,r0,r7 84: 7f 88 00 40 cmplw cr7,r8,r0 88: 7d 20 52 14 add r9,r0,r10 8c: 80 04 00 0c lwz r0,12(r4) 90: 7d 70 10 26 mfocrf r11,1 94: 55 6b f7 fe rlwinm r11,r11,30,31,31 98: 7d 29 5a 14 add r9,r9,r11 9c: 7f 8a 48 40 cmplw cr7,r10,r9 a0: 7d 29 02 14 add r9,r9,r0 a4: 7d 70 10 26 mfocrf r11,1 a8: 55 6b f7 fe rlwinm r11,r11,30,31,31 ac: 7d 29 5a 14 add r9,r9,r11 b0: 7f 80 48 40 cmplw cr7,r0,r9 b4: 7d 29 2a 14 add r9,r9,r5 b8: 7c 10 10 26 mfocrf r0,1 bc: 54 00 f7 fe rlwinm r0,r0,30,31,31 c0: 7d 29 02 14 add r9,r9,r0 c4: 7f 85 48 40 cmplw cr7,r5,r9 c8: 7c 09 32 14 add r0,r9,r6 cc: 7d 50 10 26 mfocrf r10,1 d0: 55 4a f7 fe rlwinm r10,r10,30,31,31 d4: 7c 00 52 14 add r0,r0,r10 d8: 7f 80 30 40 cmplw cr7,r0,r6 dc: 7d 30 10 26 mfocrf r9,1 e0: 55 29 ef fe rlwinm r9,r9,29,31,31 e4: 7c 09 02 14 add r0,r9,r0 e8: 54 03 80 3e rotlwi r3,r0,16 ec: 7c 03 02 14 add r0,r3,r0 f0: 7c 03 00 f8 not r3,r0 f4: 78 63 84 22 rldicl r3,r3,48,48 f8: 4e 80 00 20 blr This patch implements it in assembly for both PPC32 and PPC64 Link: https://github.com/linuxppc/linux/issues/9 Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Reviewed-by: Segher Boessenkool <segher@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-06-04powerpc/32: Optimise __csum_partial()Christophe Leroy
Improve __csum_partial by interleaving loads and adds. On a 8xx, it brings neither improvement nor degradation. On a 83xx, it brings a 25% improvement. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Reviewed-by: Segher Boessenkool <segher@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-06-04powerpc/lib: Adjust .balign inside string functions for PPC32Christophe Leroy
commit 87a156fb18fe1 ("Align hot loops of some string functions") degraded the performance of string functions by adding useless nops A simple benchmark on an 8xx calling 100000x a memchr() that matches the first byte runs in 41668 TB ticks before this patch and in 35986 TB ticks after this patch. So this gives an improvement of approx 10% Another benchmark doing the same with a memchr() matching the 128th byte runs in 1011365 TB ticks before this patch and 1005682 TB ticks after this patch, so regardless on the number of loops, removing those useless nops improves the test by 5683 TB ticks. Fixes: 87a156fb18fe1 ("Align hot loops of some string functions") Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-06-04powerpc/signal32: Use fault_in_pages_readable() to prefault user contextChristophe Leroy
Use fault_in_pages_readable() to prefault user context instead of open coding Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Reviewed-by: Mathieu Malaterre <malat@debian.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-06-04powerpc/8xx: Remove RTC clock on 88xChristophe Leroy
The 885 familly processors don't have the Real Time Clock Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-06-04powerpc/boot: remove unused variable in mpc8xxChristophe Leroy
Variable div is set but never used. Remove it. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-06-04powerpc/misc: merge reloc_offset() and add_reloc_offset()Christophe Leroy
reloc_offset() is the same as add_reloc_offset(0) Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-06-04powerpc/64: optimises from64to32()Christophe Leroy
The current implementation of from64to32() gives a poor result: 0000000000000270 <.from64to32>: 270: 38 00 ff ff li r0,-1 274: 78 69 00 22 rldicl r9,r3,32,32 278: 78 00 00 20 clrldi r0,r0,32 27c: 7c 60 00 38 and r0,r3,r0 280: 7c 09 02 14 add r0,r9,r0 284: 78 09 00 22 rldicl r9,r0,32,32 288: 7c 00 4a 14 add r0,r0,r9 28c: 78 03 00 20 clrldi r3,r0,32 290: 4e 80 00 20 blr This patch modifies from64to32() to operate in the same spirit as csum_fold() It swaps the two 32-bit halves of sum then it adds it with the unswapped sum. If there is a carry from adding the two 32-bit halves, it will carry from the lower half into the upper half, giving us the correct sum in the upper half. The resulting code is: 0000000000000260 <.from64to32>: 260: 78 60 00 02 rotldi r0,r3,32 264: 7c 60 1a 14 add r3,r0,r3 268: 78 63 00 22 rldicl r3,r3,32,32 26c: 4e 80 00 20 blr Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-06-04powerpc/mm: Remove stale_map[] handling on non SMP processorsChristophe Leroy
stale_map[] bits are only set in steal_context_smp() so on UP processors this map is useless. Only manage it for SMP processors. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-06-04powerpc/mm: constify LAST_CONTEXT in mmu_context_nohashChristophe Leroy
last_context is 16 on the 8xx, 65535 on the 47x and 255 on other ones. The kernel is exclusively built for the 8xx, for the 47x or for another processor so the last context can be defined as a constant depending on the processor. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> [mpe: Reformat old comment] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-06-04powerpc/mm: Avoid unnecessary test and reduce code sizeChristophe Leroy
no_selective_tlbil hence the use of either steal_all_contexts() or steal_context_up() depends on the subarch, it won't change during run. Only the 8xx uses steal_all_contexts and CONFIG_PPC_8xx is exclusive of other processors. This patch replaces the test of no_selective_tlbil global var by a test of CONFIG_PPC_8xx selection. It avoids the test and removes unnecessary code. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-06-04powerpc/mm: constify FIRST_CONTEXT in mmu_context_nohashChristophe Leroy
First context is now 1 for all supported platforms, so it can be made a constant. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-06-04powerpc/dma: remove unnecessary BUG()Christophe Leroy
Direction is already checked in all calling functions in include/linux/dma-mapping.h and also in called function __dma_sync() So really no need to check it once more here. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-06-04powerpc/sstep: Fix emulate_step test if VSX not presentRavi Bangoria
emulate_step() tests are failing if VSX is not supported or disabled. emulate_step_test: lxvd2x : FAIL emulate_step_test: stxvd2x : FAIL If !CPU_FTR_VSX, emulate_step() failure is expected and testcase should PASS with a valid justification. After patch: emulate_step_test: lxvd2x : PASS (!CPU_FTR_VSX) emulate_step_test: stxvd2x : PASS (!CPU_FTR_VSX) Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>