summaryrefslogtreecommitdiff
path: root/drivers/net/xen-netfront.c
AgeCommit message (Collapse)Author
2021-12-16xen/netfront: harden netfront against event channel stormsJuergen Gross
The Xen netfront driver is still vulnerable for an attack via excessive number of events sent by the backend. Fix that by using lateeoi event channels. For being able to detect the case of no rx responses being added while the carrier is down a new lock is needed in order to update and test rsp_cons and the number of seen unconsumed responses atomically. This is part of XSA-391 Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> --- V2: - don't eoi irq in case of interface set broken (Jan Beulich) - handle carrier off + no new responses added (Jan Beulich) V3: - add rx_ prefix to rsp_unconsumed (Jan Beulich) - correct xennet_set_rx_rsp_cons() spelling (Jan Beulich)
2021-10-28Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
include/net/sock.h 7b50ecfcc6cd ("net: Rename ->stream_memory_read to ->sock_is_readable") 4c1e34c0dbff ("vsock: Enable y2038 safe timeval for timeout") drivers/net/ethernet/marvell/octeontx2/af/rvu_debugfs.c 0daa55d033b0 ("octeontx2-af: cn10k: debugfs for dumping LMTST map table") e77bcdd1f639 ("octeontx2-af: Display all enabled PF VF rsrc_alloc entries.") Adjacent code addition in both cases, keep both. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-25xen/netfront: stop tx queues during live migrationDongli Zhang
The tx queues are not stopped during the live migration. As a result, the ndo_start_xmit() may access netfront_info->queues which is freed by talk_to_netback()->xennet_destroy_queues(). This patch is to netif_device_detach() at the beginning of xen-netfront resuming, and netif_device_attach() at the end of resuming. CPU A CPU B talk_to_netback() -> if (info->queues) xennet_destroy_queues(info); to free netfront_info->queues xennet_start_xmit() to access netfront_info->queues -> err = xennet_create_queues(info, &num_queues); The idea is borrowed from virtio-net. Cc: Joe Jin <joe.jin@oracle.com> Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-22net: xen: use eth_hw_addr_set()Jakub Kicinski
Commit 406f42fa0d3c ("net-next: When a bond have a massive amount of VLANs...") introduced a rbtree for faster Ethernet address look up. To maintain netdev->dev_addr in this tree we need to make all the writes to it got through appropriate helpers. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-08-25xen/netfront: don't trust the backend response data blindlyJuergen Gross
Today netfront will trust the backend to send only sane response data. In order to avoid privilege escalations or crashes in case of malicious backends verify the data to be within expected limits. Especially make sure that the response always references an outstanding request. Note that only the tx queue needs special id handling, as for the rx queue the id is equal to the index in the ring page. Introduce a new indicator for the device whether it is broken and let the device stop working when it is set. Set this indicator in case the backend sets any weird data. Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-25xen/netfront: disentangle tx_skb_freelistJuergen Gross
The tx_skb_freelist elements are in a single linked list with the request id used as link reference. The per element link field is in a union with the skb pointer of an in use request. Move the link reference out of the union in order to enable a later reuse of it for requests which need a populated skb pointer. Rename add_id_to_freelist() and get_id_from_freelist() to add_id_to_list() and get_id_from_list() in order to prepare using those for other lists as well. Define ~0 as value to indicate the end of a list and place that value into the link for a request not being on the list. When freeing a skb zero the skb pointer in the request. Use a NULL value of the skb pointer instead of skb_entry_is_link() for deciding whether a request has a skb linked to it. Remove skb_entry_set_link() and open code it instead as it is really trivial now. Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-25xen/netfront: don't read data from request on the ring pageJuergen Gross
In order to avoid a malicious backend being able to influence the local processing of a request build the request locally first and then copy it to the ring page. Any reading from the request influencing the processing in the frontend needs to be done on the local instance. Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-25xen/netfront: read response from backend only onceJuergen Gross
In order to avoid problems in case the backend is modifying a response on the ring page while the frontend has already seen it, just read the response into a local buffer in one go and then operate on that buffer only. Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-18bpf, devmap: Move drop error path to devmap for XDP_REDIRECTLorenzo Bianconi
We want to change the current ndo_xdp_xmit drop semantics because it will allow us to implement better queue overflow handling. This is working towards the larger goal of a XDP TX queue-hook. Move XDP_REDIRECT error path handling from each XDP ethernet driver to devmap code. According to the new APIs, the driver running the ndo_xdp_xmit pointer, will break tx loop whenever the hw reports a tx error and it will just return to devmap caller the number of successfully transmitted frames. It will be devmap responsibility to free dropped frames. Move each XDP ndo_xdp_xmit capable driver to the new APIs: - veth - virtio-net - mvneta - mvpp2 - socionext - amazon ena - bnxt - freescale (dpaa2, dpaa) - xen-frontend - qede - ice - igb - ixgbe - i40e - mlx5 - ti (cpsw, cpsw-new) - tun - sfc Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Ioana Ciornei <ioana.ciornei@nxp.com> Reviewed-by: Ilias Apalodimas <ilias.apalodimas@linaro.org> Reviewed-by: Camelia Groza <camelia.groza@nxp.com> Acked-by: Edward Cree <ecree.xilinx@gmail.com> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Acked-by: Shay Agroskin <shayagr@amazon.com> Link: https://lore.kernel.org/bpf/ed670de24f951cfd77590decf0229a0ad7fd12f6.1615201152.git.lorenzo@kernel.org
2021-02-04drivers: net: xen-netfront: Simplify the calculation of variablesJiapeng Chong
Fix the following coccicheck warnings: ./drivers/net/xen-netfront.c:1816:52-54: WARNING !A || A && B is equivalent to !A || B. Reported-by: Abaci Robot <abaci@linux.alibaba.com> Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Reviewed-by: Juergen Gross <jgross@suse.com> Link: https://lore.kernel.org/r/1612261069-13315-1-git-send-email-jiapeng.chong@linux.alibaba.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-16net: ethernet: ibm: ibmvnic: Fix some kernel-doc misdemeanoursLee Jones
Fixes the following W=1 kernel build warning(s): from drivers/net/ethernet/ibm/ibmvnic.c:35: inlined from ‘handle_vpd_rsp’ at drivers/net/ethernet/ibm/ibmvnic.c:4124:3: drivers/net/ethernet/ibm/ibmvnic.c:1362: warning: Function parameter or member 'hdr_field' not described in 'build_hdr_data' drivers/net/ethernet/ibm/ibmvnic.c:1362: warning: Function parameter or member 'skb' not described in 'build_hdr_data' drivers/net/ethernet/ibm/ibmvnic.c:1362: warning: Function parameter or member 'hdr_len' not described in 'build_hdr_data' drivers/net/ethernet/ibm/ibmvnic.c:1362: warning: Function parameter or member 'hdr_data' not described in 'build_hdr_data' drivers/net/ethernet/ibm/ibmvnic.c:1423: warning: Function parameter or member 'hdr_field' not described in 'create_hdr_descs' drivers/net/ethernet/ibm/ibmvnic.c:1423: warning: Function parameter or member 'hdr_data' not described in 'create_hdr_descs' drivers/net/ethernet/ibm/ibmvnic.c:1423: warning: Function parameter or member 'len' not described in 'create_hdr_descs' drivers/net/ethernet/ibm/ibmvnic.c:1423: warning: Function parameter or member 'hdr_len' not described in 'create_hdr_descs' drivers/net/ethernet/ibm/ibmvnic.c:1423: warning: Function parameter or member 'scrq_arr' not described in 'create_hdr_descs' drivers/net/ethernet/ibm/ibmvnic.c:1474: warning: Function parameter or member 'txbuff' not described in 'build_hdr_descs_arr' drivers/net/ethernet/ibm/ibmvnic.c:1474: warning: Function parameter or member 'num_entries' not described in 'build_hdr_descs_arr' drivers/net/ethernet/ibm/ibmvnic.c:1474: warning: Function parameter or member 'hdr_field' not described in 'build_hdr_descs_arr' drivers/net/ethernet/ibm/ibmvnic.c:1832: warning: Function parameter or member 'adapter' not described in 'do_change_param_reset' drivers/net/ethernet/ibm/ibmvnic.c:1832: warning: Function parameter or member 'rwi' not described in 'do_change_param_reset' drivers/net/ethernet/ibm/ibmvnic.c:1832: warning: Function parameter or member 'reset_state' not described in 'do_change_param_reset' drivers/net/ethernet/ibm/ibmvnic.c:1911: warning: Function parameter or member 'adapter' not described in 'do_reset' drivers/net/ethernet/ibm/ibmvnic.c:1911: warning: Function parameter or member 'rwi' not described in 'do_reset' drivers/net/ethernet/ibm/ibmvnic.c:1911: warning: Function parameter or member 'reset_state' not described in 'do_reset' Signed-off-by: Lee Jones <lee.jones@linaro.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-08net, xdp: Introduce xdp_prepare_buff utility routineLorenzo Bianconi
Introduce xdp_prepare_buff utility routine to initialize per-descriptor xdp_buff fields (e.g. xdp_buff pointers). Rely on xdp_prepare_buff() in all XDP capable drivers. Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Alexander Duyck <alexanderduyck@fb.com> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Shay Agroskin <shayagr@amazon.com> Acked-by: Martin Habets <habetsm.xilinx@gmail.com> Acked-by: Camelia Groza <camelia.groza@nxp.com> Acked-by: Marcin Wojtas <mw@semihalf.com> Link: https://lore.kernel.org/bpf/45f46f12295972a97da8ca01990b3e71501e9d89.1608670965.git.lorenzo@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2021-01-08net, xdp: Introduce xdp_init_buff utility routineLorenzo Bianconi
Introduce xdp_init_buff utility routine to initialize xdp_buff fields const over NAPI iterations (e.g. frame_sz or rxq pointer). Rely on xdp_init_buff in all XDP capable drivers. Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Alexander Duyck <alexanderduyck@fb.com> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Shay Agroskin <shayagr@amazon.com> Acked-by: Martin Habets <habetsm.xilinx@gmail.com> Acked-by: Camelia Groza <camelia.groza@nxp.com> Acked-by: Marcin Wojtas <mw@semihalf.com> Link: https://lore.kernel.org/bpf/7f8329b6da1434dc2b05a77f2e800b29628a8913.1608670965.git.lorenzo@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2020-12-01xsk: Propagate napi_id to XDP socket Rx pathBjörn Töpel
Add napi_id to the xdp_rxq_info structure, and make sure the XDP socket pick up the napi_id in the Rx path. The napi_id is used to find the corresponding NAPI structure for socket busy polling. Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Ilias Apalodimas <ilias.apalodimas@linaro.org> Acked-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Tariq Toukan <tariqt@nvidia.com> Link: https://lore.kernel.org/bpf/20201130185205.196029-7-bjorn.topel@gmail.com
2020-11-02drivers: net: xen-netfront: Fixed W=1 set but unused warningsAndrew Lunn
drivers/net/xen-netfront.c:2416:16: warning: variable ‘target’ set but not used [-Wunused-but-set-variable] 2416 | unsigned long target; Remove target and just discard the return value from simple_strtoul(). This patch does give a checkpatch warning, but the warning was there before anyway, as this file has lots of checkpatch warnings. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Link: https://lore.kernel.org/r/20201031180435.1081127-1-andrew@lunn.ch Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-08-23treewide: Use fallthrough pseudo-keywordGustavo A. R. Silva
Replace the existing /* fall through */ comments and its variants with the new pseudo-keyword macro fallthrough[1]. Also, remove unnecessary fall-through markings when it is the case. [1] https://www.kernel.org/doc/html/v5.7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
2020-07-25bpf, xdp: Remove XDP_QUERY_PROG and XDP_QUERY_PROG_HW XDP commandsAndrii Nakryiko
Now that BPF program/link management is centralized in generic net_device code, kernel code never queries program id from drivers, so XDP_QUERY_PROG/XDP_QUERY_PROG_HW commands are unnecessary. This patch removes all the implementations of those commands in kernel, along the xdp_attachment_query(). This patch was compile-tested on allyesconfig. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200722064603.3350758-10-andriin@fb.com
2020-07-25Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netDavid S. Miller
The UDP reuseport conflict was a little bit tricky. The net-next code, via bpf-next, extracted the reuseport handling into a helper so that the BPF sk lookup code could invoke it. At the same time, the logic for reuseport handling of unconnected sockets changed via commit efc6b6f6c3113e8b203b9debfb72d81e0f3dcace which changed the logic to carry on the reuseport result into the rest of the lookup loop if we do not return immediately. This requires moving the reuseport_has_conns() logic into the callers. While we are here, get rid of inline directives as they do not belong in foo.c files. The other changes were cases of more straightforward overlapping modifications. Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-24xen-netfront: fix potential deadlock in xennet_remove()Andrea Righi
There's a potential race in xennet_remove(); this is what the driver is doing upon unregistering a network device: 1. state = read bus state 2. if state is not "Closed": 3. request to set state to "Closing" 4. wait for state to be set to "Closing" 5. request to set state to "Closed" 6. wait for state to be set to "Closed" If the state changes to "Closed" immediately after step 1 we are stuck forever in step 4, because the state will never go back from "Closed" to "Closing". Make sure to check also for state == "Closed" in step 4 to prevent the deadlock. Also add a 5 sec timeout any time we wait for the bus state to change, to avoid getting stuck forever in wait_event(). Signed-off-by: Andrea Righi <andrea.righi@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-03net/xen-netfront: add kernel TX timestampsDaniel Drown
This adds kernel TX timestamps to the xen-netfront driver. Tested with chrony on an AWS EC2 instance. Signed-off-by: Daniel Drown <dan-netdev@drown.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-02xen-netfront: remove redundant assignment to variable 'act'Colin Ian King
The variable act is being initialized with a value that is never read and it is being updated later with a new value. The initialization is redundant and can be removed. Addresses-Coverity: ("Unused value") Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-01xen networking: add basic XDP support for xen-netfrontDenis Kirjanov
The patch adds a basic XDP processing to xen-netfront driver. We ran an XDP program for an RX response received from netback driver. Also we request xen-netback to adjust data offset for bpf_xdp_adjust_head() header space for custom headers. synchronization between frontend and backend parts is done by using xenbus state switching: Reconfiguring -> Reconfigured- > Connected UDP packets drop rate using xdp program is around 310 kpps using ./pktgen_sample04_many_flows.sh and 160 kpps without the patch. Signed-off-by: Denis Kirjanov <kda@linux-powerpc.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-10-01xen-netfront: do not use ~0U as error return value for xennet_fill_frags()Dongli Zhang
xennet_fill_frags() uses ~0U as return value when the sk_buff is not able to cache extra fragments. This is incorrect because the return type of xennet_fill_frags() is RING_IDX and 0xffffffff is an expected value for ring buffer index. In the situation when the rsp_cons is approaching 0xffffffff, the return value of xennet_fill_frags() may become 0xffffffff which xennet_poll() (the caller) would regard as error. As a result, queue->rx.rsp_cons is set incorrectly because it is updated only when there is error. If there is no error, xennet_poll() would be responsible to update queue->rx.rsp_cons. Finally, queue->rx.rsp_cons would point to the rx ring buffer entries whose queue->rx_skbs[i] and queue->grant_rx_ref[i] are already cleared to NULL. This leads to NULL pointer access in the next iteration to process rx ring buffer entries. The symptom is similar to the one fixed in commit 00b368502d18 ("xen-netfront: do not assume sk_buff_head list is empty in error handling"). This patch changes the return type of xennet_fill_frags() to indicate whether it is successful or failed. The queue->rx.rsp_cons will be always updated inside this function. Fixes: ad4f15dc2c70 ("xen/netfront: don't bug in case of too many frags") Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com> Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-17Merge ra.kernel.org:/pub/scm/linux/kernel/git/netdev/netDavid S. Miller
Pull in bug fixes from 'net' tree for the merge window. Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-16xen-netfront: do not assume sk_buff_head list is empty in error handlingDongli Zhang
When skb_shinfo(skb) is not able to cache extra fragment (that is, skb_shinfo(skb)->nr_frags >= MAX_SKB_FRAGS), xennet_fill_frags() assumes the sk_buff_head list is already empty. As a result, cons is increased only by 1 and returns to error handling path in xennet_poll(). However, if the sk_buff_head list is not empty, queue->rx.rsp_cons may be set incorrectly. That is, queue->rx.rsp_cons would point to the rx ring buffer entries whose queue->rx_skbs[i] and queue->grant_rx_ref[i] are already cleared to NULL. This leads to NULL pointer access in the next iteration to process rx ring buffer entries. Below is how xennet_poll() does error handling. All remaining entries in tmpq are accounted to queue->rx.rsp_cons without assuming how many outstanding skbs are remained in the list. 985 static int xennet_poll(struct napi_struct *napi, int budget) ... ... 1032 if (unlikely(xennet_set_skb_gso(skb, gso))) { 1033 __skb_queue_head(&tmpq, skb); 1034 queue->rx.rsp_cons += skb_queue_len(&tmpq); 1035 goto err; 1036 } It is better to always have the error handling in the same way. Fixes: ad4f15dc2c70 ("xen/netfront: don't bug in case of too many frags") Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-30net: Use skb_frag_off accessorsJonathan Lemon
Use accessor functions for skb fragment's page_offset instead of direct references, in preparation for bvec conversion. Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-16xen-netfront: mark expected switch fall-throughGustavo A. R. Silva
In preparation to enabling -Wimplicit-fallthrough, mark switch cases where we are expecting to fall through. This patch fixes the following warning: drivers/net/xen-netfront.c: In function ‘netback_changed’: drivers/net/xen-netfront.c:2038:6: warning: this statement may fall through [-Wimplicit-fallthrough=] if (dev->state == XenbusStateClosed) ^ drivers/net/xen-netfront.c:2041:2: note: here case XenbusStateClosing: ^~~~ Warning level 3 was used: -Wimplicit-fallthrough=3 Notice that, in this particular case, the code comment is modified in accordance with what GCC is expecting to find. This patch is part of the ongoing efforts to enable -Wimplicit-fallthrough. Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-20net: remove 'fallback' argument from dev->ndo_select_queue()Paolo Abeni
After the previous patch, all the callers of ndo_select_queue() provide as a 'fallback' argument netdev_pick_tx. The only exceptions are nested calls to ndo_select_queue(), which pass down the 'fallback' available in the current scope - still netdev_pick_tx. We can drop such argument and replace fallback() invocation with netdev_pick_tx(). This avoids an indirect call per xmit packet in some scenarios (TCP syn, UDP unconnected, XDP generic, pktgen) with device drivers implementing such ndo. It also clean the code a bit. Tested with ixgbe and CONFIG_FCOE=m With pktgen using queue xmit: threads vanilla patched (kpps) (kpps) 1 2334 2428 2 4166 4278 4 7895 8100 v1 -> v2: - rebased after helper's name change Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-20Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
Lots of conflicts, by happily all cases of overlapping changes, parallel adds, things of that nature. Thanks to Stephen Rothwell, Saeed Mahameed, and others for their guidance in these resolutions. Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-18xen/netfront: tolerate frags with no dataJuergen Gross
At least old Xen net backends seem to send frags with no real data sometimes. In case such a fragment happens to occur with the frag limit already reached the frontend will BUG currently even if this situation is easily recoverable. Modify the BUG_ON() condition accordingly. Tested-by: Dietmar Hahn <dietmar.hahn@ts.fujitsu.com> Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-09xen/netfront: remove unnecessary wmbJacob Wen
RING_PUSH_REQUESTS_AND_CHECK_NOTIFY is already able to make sure backend sees requests before req_prod is updated. Signed-off-by: Jacob Wen <jian.w.wen@oracle.com> Reviewed-by: Juergen Gross <jgross@suse.com> Reviewed-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-13xen/netfront: don't bug in case of too many fragsJuergen Gross
Commit 57f230ab04d291 ("xen/netfront: raise max number of slots in xennet_get_responses()") raised the max number of allowed slots by one. This seems to be problematic in some configurations with netback using a larger MAX_SKB_FRAGS value (e.g. old Linux kernel with MAX_SKB_FRAGS defined as 18 instead of nowadays 17). Instead of BUG_ON() in this case just fall back to retransmission. Fixes: 57f230ab04d291 ("xen/netfront: raise max number of slots in xennet_get_responses()") Cc: stable@vger.kernel.org Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-07xen/netfront: fix waiting for xenbus state changeJuergen Gross
Commit 822fb18a82aba ("xen-netfront: wait xenbus state change when load module manually") added a new wait queue to wait on for a state change when the module is loaded manually. Unfortunately there is no wakeup anywhere to stop that waiting. Instead of introducing a new wait queue rename the existing module_unload_q to module_wq and use it for both purposes (loading and unloading). As any state change of the backend might be intended to stop waiting do the wake_up_all() in any case when netback_changed() is called. Fixes: 822fb18a82aba ("xen-netfront: wait xenbus state change when load module manually") Cc: <stable@vger.kernel.org> #4.18 Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-14xen-netfront: fix warn message as irq device name has '/'Xiao Liang
There is a call trace generated after commit 2d408c0d4574b01b9ed45e02516888bf925e11a9( xen-netfront: fix queue name setting). There is no 'device/vif/xx-q0-tx' file found under /proc/irq/xx/. This patch only picks up device type and id as its name. With the patch, now /proc/interrupts looks like below and the warning message gone: 70: 21 0 0 0 xen-dyn -event vif0-q0-tx 71: 15 0 0 0 xen-dyn -event vif0-q0-rx 72: 14 0 0 0 xen-dyn -event vif0-q1-tx 73: 33 0 0 0 xen-dyn -event vif0-q1-rx 74: 12 0 0 0 xen-dyn -event vif0-q2-tx 75: 24 0 0 0 xen-dyn -event vif0-q2-rx 76: 19 0 0 0 xen-dyn -event vif0-q3-tx 77: 21 0 0 0 xen-dyn -event vif0-q3-rx Below is call trace information without this patch: name 'device/vif/0-q0-tx' WARNING: CPU: 2 PID: 37 at fs/proc/generic.c:174 __xlate_proc_name+0x85/0xa0 RIP: 0010:__xlate_proc_name+0x85/0xa0 RSP: 0018:ffffb85c40473c18 EFLAGS: 00010286 RAX: 0000000000000000 RBX: 0000000000000006 RCX: 0000000000000006 RDX: 0000000000000007 RSI: 0000000000000096 RDI: ffff984c7f516930 RBP: ffffb85c40473cb8 R08: 000000000000002c R09: 0000000000000229 R10: 0000000000000000 R11: 0000000000000001 R12: ffffb85c40473c98 R13: ffffb85c40473cb8 R14: ffffb85c40473c50 R15: 0000000000000000 FS: 0000000000000000(0000) GS:ffff984c7f500000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f69b6899038 CR3: 000000001c20a006 CR4: 00000000001606e0 Call Trace: __proc_create+0x45/0x230 ? snprintf+0x49/0x60 proc_mkdir_data+0x35/0x90 register_handler_proc+0xef/0x110 ? proc_register+0xfc/0x110 ? proc_create_data+0x70/0xb0 __setup_irq+0x39b/0x660 ? request_threaded_irq+0xad/0x160 request_threaded_irq+0xf5/0x160 ? xennet_tx_buf_gc+0x1d0/0x1d0 [xen_netfront] bind_evtchn_to_irqhandler+0x3d/0x70 ? xenbus_alloc_evtchn+0x41/0xa0 netback_changed+0xa46/0xcda [xen_netfront] ? find_watch+0x40/0x40 xenwatch_thread+0xc5/0x160 ? finish_wait+0x80/0x80 kthread+0x112/0x130 ? kthread_create_worker_on_cpu+0x70/0x70 ret_from_fork+0x35/0x40 Code: 81 5c 00 48 85 c0 75 cc 5b 49 89 2e 31 c0 5d 4d 89 3c 24 41 5c 41 5d 41 5e 41 5f c3 4c 89 ee 48 c7 c7 40 4f 0e b4 e8 65 ea d8 ff <0f> 0b b8 fe ff ff ff 5b 5d 41 5c 41 5d 41 5e 41 5f c3 66 0f 1f ---[ end trace 650e5561b0caab3a ]--- Signed-off-by: Xiao Liang <xiliang@redhat.com> Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-11Merge ra.kernel.org:/pub/scm/linux/kernel/git/davem/netDavid S. Miller
2018-08-11xen/netfront: don't cache skb_shinfo()Juergen Gross
skb_shinfo() can change when calling __pskb_pull_tail(): Don't cache its return value. Cc: stable@vger.kernel.org Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-02Merge ra.kernel.org:/pub/scm/linux/kernel/git/davem/netDavid S. Miller
The BTF conflicts were simple overlapping changes. The virtio_net conflict was an overlap of a fix of statistics counter, happening alongisde a move over to a bonafide statistics structure rather than counting value on the stack. Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-30xen-netfront: wait xenbus state change when load module manuallyXiao Liang
When loading module manually, after call xenbus_switch_state to initializes the state of the netfront device, the driver state did not change so fast that may lead no dev created in latest kernel. This patch adds wait to make sure xenbus knows the driver is not in closed/unknown state. Current state: [vm]# ethtool eth0 Settings for eth0: Link detected: yes [vm]# modprobe -r xen_netfront [vm]# modprobe xen_netfront [vm]# ethtool eth0 Settings for eth0: Cannot get device settings: No such device Cannot get wake-on-lan settings: No such device Cannot get message level: No such device Cannot get link status: No such device No data available With the patch installed. [vm]# ethtool eth0 Settings for eth0: Link detected: yes [vm]# modprobe -r xen_netfront [vm]# modprobe xen_netfront [vm]# ethtool eth0 Settings for eth0: Link detected: yes Signed-off-by: Xiao Liang <xiliang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-22xen-netfront: fix queue name settingVitaly Kuznetsov
Commit f599c64fdf7d ("xen-netfront: Fix race between device setup and open") changed the initialization order: xennet_create_queues() now happens before we do register_netdev() so using netdev->name in xennet_init_queue() is incorrect, we end up with the following in /proc/interrupts: 60: 139 0 xen-dyn -event eth%d-q0-tx 61: 265 0 xen-dyn -event eth%d-q0-rx 62: 234 0 xen-dyn -event eth%d-q1-tx 63: 1 0 xen-dyn -event eth%d-q1-rx and this looks ugly. Actually, using early netdev name (even when it's already set) is also not ideal: nowadays we tend to rename eth devices and queue name may end up not corresponding to the netdev name. Use nodename from xenbus device for queue naming: this can't change in VM's lifetime. Now /proc/interrupts looks like 62: 202 0 xen-dyn -event device/vif/0-q0-tx 63: 317 0 xen-dyn -event device/vif/0-q0-rx 64: 262 0 xen-dyn -event device/vif/0-q1-tx 65: 17 0 xen-dyn -event device/vif/0-q1-rx Fixes: f599c64fdf7d ("xen-netfront: Fix race between device setup and open") Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Reviewed-by: Ross Lagerwall <ross.lagerwall@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-09net: allow ndo_select_queue to pass netdevAlexander Duyck
This patch makes it so that instead of passing a void pointer as the accel_priv we instead pass a net_device pointer as sb_dev. Making this change allows us to pass the subordinate device through to the fallback function eventually so that we can keep the actual code in the ndo_select_queue call as focused on possible on the exception cases. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-06-22xen-netfront: Update features after registering netdevRoss Lagerwall
Update the features after calling register_netdev() otherwise the device features are not set up correctly and it not possible to change the MTU of the device. After this change, the features reported by ethtool match the device's features before the commit which introduced the issue and it is possible to change the device's MTU. Fixes: f599c64fdf7d ("xen-netfront: Fix race between device setup and open") Reported-by: Liam Shepherd <liam@dancer.es> Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com> Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-22xen-netfront: Fix mismatched rtnl_unlockRoss Lagerwall
Fixes: f599c64fdf7d ("xen-netfront: Fix race between device setup and open") Reported-by: Ben Hutchings <ben.hutchings@codethink.co.uk> Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com> Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-12xen/netfront: raise max number of slots in xennet_get_responses()Juergen Gross
The max number of slots used in xennet_get_responses() is set to MAX_SKB_FRAGS + (rx->status <= RX_COPY_THRESHOLD). In old kernel-xen MAX_SKB_FRAGS was 18, while nowadays it is 17. This difference is resulting in frequent messages "too many slots" and a reduced network throughput for some workloads (factor 10 below that of a kernel-xen based guest). Replacing MAX_SKB_FRAGS by XEN_NETIF_NR_SLOTS_MIN for calculation of the max number of slots to use solves that problem (tests showed no more messages "too many slots" and throughput was as high as with the kernel-xen based guest system). Replace MAX_SKB_FRAGS-2 by XEN_NETIF_NR_SLOTS_MIN-1 in netfront_tx_slot_available() for making it clearer what is really being tested without actually modifying the tested value. Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-14xen-netfront: fix xennet_start_xmit()'s return typeLuc Van Oostenryck
The method ndo_start_xmit() is defined as returning an 'netdev_tx_t', which is a typedef for an enum type, but the implementation in this driver returns an 'int'. Fix this by returning 'netdev_tx_t' in this driver too. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com> Reviewed-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: Juergen Gross <jgross@suse.com>
2018-03-26drivers/net: Use octal not symbolic permissionsJoe Perches
Prefer the direct use of octal for permissions. Done with checkpatch -f --types=SYMBOLIC_PERMS --fix-inplace and some typing. Miscellanea: o Whitespace neatening around these conversions. Signed-off-by: Joe Perches <joe@perches.com> Reviewed-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-28xen-netfront: Fix hang on device removalJason Andryuk
A toolstack may delete the vif frontend and backend xenstore entries while xen-netfront is in the removal code path. In that case, the checks for xenbus_read_driver_state would return XenbusStateUnknown, and xennet_remove would hang indefinitely. This hang prevents system shutdown. xennet_remove must be able to handle XenbusStateUnknown, and netback_changed must also wake up the wake_queue for that state as well. Fixes: 5b5971df3bc2 ("xen-netfront: remove warning when unloading module") Signed-off-by: Jason Andryuk <jandryuk@gmail.com> Cc: Eduardo Otubo <otubo@redhat.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Juergen Gross <jgross@suse.com>
2018-02-06xen-netfront: Fix race between device setup and openRoss Lagerwall
When a netfront device is set up it registers a netdev fairly early on, before it has set up the queues and is actually usable. A userspace tool like NetworkManager will immediately try to open it and access its state as soon as it appears. The bug can be reproduced by hotplugging VIFs until the VM runs out of grant refs. It registers the netdev but fails to set up any queues (since there are no more grant refs). In the meantime, NetworkManager opens the device and the kernel crashes trying to access the queues (of which there are none). Fix this in two ways: * For initial setup, register the netdev much later, after the queues are setup. This avoids the race entirely. * During a suspend/resume cycle, the frontend reconnects to the backend and the queues are recreated. It is possible (though highly unlikely) to race with something opening the device and accessing the queues after they have been destroyed but before they have been recreated. Extend the region covered by the rtnl semaphore to protect against this race. There is a possibility that we fail to recreate the queues so check for this in the open function. Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Juergen Gross <jgross@suse.com>
2018-01-08xen-netfront: enable device after manual module loadEduardo Otubo
When loading the module after unloading it, the network interface would not be enabled and thus wouldn't have a backend counterpart and unable to be used by the guest. The guest would face errors like: [root@guest ~]# ethtool -i eth0 Cannot get driver information: No such device [root@guest ~]# ifconfig eth0 eth0: error fetching interface information: Device not found This patch initializes the state of the netfront device whenever it is loaded manually, this state would communicate the netback to create its device and establish the connection between them. Signed-off-by: Eduardo Otubo <otubo@redhat.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-29Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds
Pull networking fixes from David Miller: 1) The forcedeth conversion from pci_*() DMA interfaces to dma_*() ones missed one spot. From Zhu Yanjun. 2) Missing CRYPTO_SHA256 Kconfig dep in cfg80211, from Johannes Berg. 3) Fix checksum offloading in thunderx driver, from Sunil Goutham. 4) Add SPDX to vm_sockets_diag.h, from Stephen Hemminger. 5) Fix use after free of packet headers in TIPC, from Jon Maloy. 6) "sizeof(ptr)" vs "sizeof(*ptr)" bug in i40e, from Gustavo A R Silva. 7) Tunneling fixes in mlxsw driver, from Petr Machata. 8) Fix crash in fanout_demux_rollover() of AF_PACKET, from Mike Maloney. 9) Fix race in AF_PACKET bind() vs. NETDEV_UP notifier, from Eric Dumazet. 10) Fix regression in sch_sfq.c due to one of the timer_setup() conversions. From Paolo Abeni. 11) SCTP does list_for_each_entry() using wrong struct member, fix from Xin Long. 12) Don't use big endian netlink attribute read for IFLA_BOND_AD_ACTOR_SYSTEM, it is in cpu endianness. Also from Xin Long. 13) Fix mis-initialization of q->link.clock in CBQ scheduler, preventing adding filters there. From Jiri Pirko. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (67 commits) ethernet: dwmac-stm32: Fix copyright net: via: via-rhine: use %p to format void * address instead of %x net: ethernet: xilinx: Mark XILINX_LL_TEMAC broken on 64-bit myri10ge: Update MAINTAINERS net: sched: cbq: create block for q->link.block atm: suni: remove extraneous space to fix indentation atm: lanai: use %p to format kernel addresses instead of %x VSOCK: Don't set sk_state to TCP_CLOSE before testing it atm: fore200e: use %pK to format kernel addresses instead of %x ambassador: fix incorrect indentation of assignment statement vxlan: use __be32 type for the param vni in __vxlan_fdb_delete bonding: use nla_get_u64 to extract the value for IFLA_BOND_AD_ACTOR_SYSTEM sctp: use right member as the param of list_for_each_entry sch_sfq: fix null pointer dereference at timer expiration cls_bpf: don't decrement net's refcount when offload fails net/packet: fix a race in packet_bind() and packet_notifier() packet: fix crash in fanout_demux_rollover() sctp: remove extern from stream sched sctp: force the params with right types for sctp csum apis sctp: force SCTP_ERROR_INV_STRM with __u32 when calling sctp_chunk_fail ...
2017-11-27xen-netfront: remove warning when unloading moduleEduardo Otubo
v2: * Replace busy wait with wait_event()/wake_up_all() * Cannot garantee that at the time xennet_remove is called, the xen_netback state will not be XenbusStateClosed, so added a condition for that * There's a small chance for the xen_netback state is XenbusStateUnknown by the time the xen_netfront switches to Closed, so added a condition for that. When unloading module xen_netfront from guest, dmesg would output warning messages like below: [ 105.236836] xen:grant_table: WARNING: g.e. 0x903 still in use! [ 105.236839] deferring g.e. 0x903 (pfn 0x35805) This problem relies on netfront and netback being out of sync. By the time netfront revokes the g.e.'s netback didn't have enough time to free all of them, hence displaying the warnings on dmesg. The trick here is to make netfront to wait until netback frees all the g.e.'s and only then continue to cleanup for the module removal, and this is done by manipulating both device states. Signed-off-by: Eduardo Otubo <otubo@redhat.com> Acked-by: Juergen Gross <jgross@suse.com> Signed-off-by: David S. Miller <davem@davemloft.net>