summaryrefslogtreecommitdiff
path: root/net
AgeCommit message (Collapse)Author
2018-07-07ipv6: fold sockcm_cookie into ipcm6_cookieWillem de Bruijn
ipcm_cookie includes sockcm_cookie. Do the same for ipcm6_cookie. This reduces the number of arguments that need to be passed around, applies ipcm6_init to all cookie fields at once and reduces code differentiation between ipv4 and ipv6. Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-07sock: sockc cookie initializerWillem de Bruijn
Initialize the cookie in one location to reduce code duplication and avoid bugs from inconsistent initialization, such as that fixed in commit 9887cba19978 ("ip: limit use of gso_size to udp"). Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-07ipv6: ipcm6_cookie initializerWillem de Bruijn
Initialize the cookie in one location to reduce code duplication and avoid bugs from inconsistent initialization, such as that fixed in commit 9887cba19978 ("ip: limit use of gso_size to udp"). Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-07ipv4: ipcm_cookie initializersWillem de Bruijn
Initialize the cookie in one location to reduce code duplication and avoid bugs from inconsistent initialization, such as that fixed in commit 9887cba19978 ("ip: limit use of gso_size to udp"). Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-06Bluetooth: Use extended LE Connection if supportedJaganath Kanakkassery
This implements extended LE craete connection and enhanced LE conn complete event if the controller supports. For now it is as good as legacy LE connection and event as no new features in the extended connection is handled. < HCI Command: LE Extended Create Connection (0x08|0x0043) plen 26 Filter policy: White list is not used (0x00) Own address type: Public (0x00) Peer address type: Random (0x01) Peer address: DB:7E:2E:1D:85:E8 (Static) Initiating PHYs: 0x01 Entry 0: LE 1M Scan interval: 60.000 msec (0x0060) Scan window: 60.000 msec (0x0060) Min connection interval: 50.00 msec (0x0028) Max connection interval: 70.00 msec (0x0038) Connection latency: 0 (0x0000) Supervision timeout: 420 msec (0x002a) Min connection length: 0.000 msec (0x0000) Max connection length: 0.000 msec (0x0000) > HCI Event: Command Status (0x0f) plen 4 LE Extended Create Connection (0x08|0x0043) ncmd 2 Status: Success (0x00) > HCI Event: LE Meta Event (0x3e) plen 31 LE Enhanced Connection Complete (0x0a) Status: Success (0x00) Handle: 3585 Role: Master (0x00) Peer address type: Random (0x01) Peer address: DB:7E:2E:1D:85:E8 (Static) Local resolvable private address: 00:00:00:00:00:00 (Non-Resolvable) Peer resolvable private address: 00:00:00:00:00:00 (Non-Resolvable) Connection interval: 67.50 msec (0x0036) Connection latency: 0 (0x0000) Supervision timeout: 420 msec (0x002a) Master clock accuracy: 0x00 @ MGMT Event: Device Connected (0x000b) plen 40 LE Address: DB:7E:2E:1D:85:E8 (Static) Flags: 0x00000000 Data length: 27 Name (complete): Designer Mouse Appearance: Mouse (0x03c2) Flags: 0x05 LE Limited Discoverable Mode BR/EDR Not Supported 16-bit Service UUIDs (complete): 1 entry Human Interface Device (0x1812) Signed-off-by: Jaganath Kanakkassery <jaganathx.kanakkassery@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2018-07-06Bluetooth: Introduce helpers for le conn status and completeJaganath Kanakkassery
This is done so that the helpers can be used for extended conn implementation which will be done in subsequent patch. Signed-off-by: Jaganath Kanakkassery <jaganathx.kanakkassery@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2018-07-06Bluetooth: Process extended ADV report eventJaganath Kanakkassery
This patch enables Extended ADV report event if extended scanning is supported in the controller and process the same. The new features are not handled and for now its as good as legacy ADV report. > HCI Event: LE Meta Event (0x3e) plen 53 LE Extended Advertising Report (0x0d) Num reports: 1 Entry 0 Event type: 0x0013 Props: 0x0013 Connectable Scannable Use legacy advertising PDUs Data status: Complete Legacy PDU Type: ADV_IND (0x0013) Address type: Random (0x01) Address: DB:7E:2E:1A:85:E8 (Static) Primary PHY: LE 1M Secondary PHY: LE 1M SID: 0x00 TX power: 0 dBm RSSI: -90 dBm (0xa6) Periodic advertising invteral: 0.00 msec (0x0000) Direct address type: Public (0x00) Direct address: 00:00:00:00:00:00 (OUI 00-00-00) Data length: 0x1b 0f 09 44 65 73 69 67 6e 65 72 20 4d 6f 75 73 65 ..Designer Mouse 03 19 c2 03 02 01 05 03 03 12 18 ........... Signed-off-by: Jaganath Kanakkassery <jaganathx.kanakkassery@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2018-07-06Bluetooth: Use extended scanning if controller supportsJaganath Kanakkassery
This implements Set extended scan param and set extended scan enable commands and use it for start LE scan based on controller support. The new features added in these commands are setting of new PHY for scanning and setting of scan duration. Both features are disabled for now, meaning only 1M PHY is set and scan duration is set to 0 which means that scanning will be done untill scan disable is called. < HCI Command: LE Set Extended Scan Parameters (0x08|0x0041) plen 8 Own address type: Random (0x01) Filter policy: Accept all advertisement (0x00) PHYs: 0x01 Entry 0: LE 1M Type: Active (0x01) Interval: 11.250 msec (0x0012) Window: 11.250 msec (0x0012) > HCI Event: Command Complete (0x0e) plen 4 LE Set Extended Scan Parameters (0x08|0x0041) ncmd 1 Status: Success (0x00) < HCI Command: LE Set Extended Scan Enable (0x08|0x0042) plen 6 Extended scan: Enabled (0x01) Filter duplicates: Enabled (0x01) Duration: 0 msec (0x0000) Period: 0.00 sec (0x0000) > HCI Event: Command Complete (0x0e) plen 4 LE Set Extended Scan Enable (0x08|0x0042) ncmd 2 Status: Success (0x00) Signed-off-by: Jaganath Kanakkassery <jaganathx.kanakkassery@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2018-07-06Bluetooth: Introduce helpers for LE set scan start and completeJaganath Kanakkassery
Introduce a helper hci_req_start_scan() which starts an LE scan and call it from passive_Scan() and active_scan(). There is not functionality change in this patch. This is basically done to enable extended scanning if the controller supports which will be done in the subsequent patch Signed-off-by: Jaganath Kanakkassery <jaganathx.kanakkassery@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2018-07-06netfilter: nf_tables: place all set backends in one single modulePablo Neira Ayuso
This patch disallows rbtree with single elements, which is causing problems with the recent timeout support. Before this patch, you could opt out individual set representations per module, which is just adding extra complexity. Fixes: 8d8540c4f5e0("netfilter: nft_set_rbtree: add timeout support") Reported-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-07-06nl80211/mac80211: allow non-linear skb in rx_control_portDenis Kenzior
The current implementation of cfg80211_rx_control_port assumed that the caller could provide a contiguous region of memory for the control port frame to be sent up to userspace. Unfortunately, many drivers produce non-linear skbs, especially for data frames. This resulted in userspace getting notified of control port frames with correct metadata (from address, port, etc) yet garbage / nonsense contents, resulting in bad handshakes, disconnections, etc. mac80211 linearizes skbs containing management frames. But it didn't seem worthwhile to do this for control port frames. Thus the signature of cfg80211_rx_control_port was changed to take the skb directly. nl80211 then takes care of obtaining control port frame data directly from the (linear | non-linear) skb. The caller is still responsible for freeing the skb, cfg80211_rx_control_port does not take ownership of it. Fixes: 6a671a50f819 ("nl80211: Add CMD_CONTROL_PORT_FRAME API") Signed-off-by: Denis Kenzior <denkenz@gmail.com> [fix some kernel-doc formatting, add fixes tag] Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2018-07-06netfilter: nf_tproxy: fix possible non-linear access to transport headerMáté Eckl
This patch fixes a silent out-of-bound read possibility that was present because of the misuse of this function. Mostly it was called with a struct udphdr *hp which had only the udphdr part linearized by the skb_header_pointer, however nf_tproxy_get_sock_v{4,6} uses it as a tcphdr pointer, so some reads for tcp specific attributes may be invalid. Fixes: a583636a83ea ("inet: refactor inet[6]_lookup functions to take skb") Signed-off-by: Máté Eckl <ecklm94@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-07-06Bluetooth: Add HCI command for clear Resolv listAnkit Navik
Check for Resolv list supported by controller. So check the supported commmand first before issuing this command i.e.,HCI_OP_LE_CLEAR_RESOLV_LIST Before patch: < HCI Command: LE Read White List... (0x08|0x000f) plen 0 #55 [hci0] 13.338168 > HCI Event: Command Complete (0x0e) plen 5 #56 [hci0] 13.338842 LE Read White List Size (0x08|0x000f) ncmd 1 Status: Success (0x00) Size: 25 < HCI Command: LE Clear White List (0x08|0x0010) plen 0 #57 [hci0] 13.339029 > HCI Event: Command Complete (0x0e) plen 4 #58 [hci0] 13.339939 LE Clear White List (0x08|0x0010) ncmd 1 Status: Success (0x00) < HCI Command: LE Read Resolving L.. (0x08|0x002a) plen 0 #59 [hci0] 13.340152 > HCI Event: Command Complete (0x0e) plen 5 #60 [hci0] 13.340952 LE Read Resolving List Size (0x08|0x002a) ncmd 1 Status: Success (0x00) Size: 25 < HCI Command: LE Read Maximum Dat.. (0x08|0x002f) plen 0 #61 [hci0] 13.341180 > HCI Event: Command Complete (0x0e) plen 12 #62 [hci0] 13.341898 LE Read Maximum Data Length (0x08|0x002f) ncmd 1 Status: Success (0x00) Max TX octets: 251 Max TX time: 17040 Max RX octets: 251 Max RX time: 17040 After patch: < HCI Command: LE Read White List... (0x08|0x000f) plen 0 #55 [hci0] 28.919131 > HCI Event: Command Complete (0x0e) plen 5 #56 [hci0] 28.920016 LE Read White List Size (0x08|0x000f) ncmd 1 Status: Success (0x00) Size: 25 < HCI Command: LE Clear White List (0x08|0x0010) plen 0 #57 [hci0] 28.920164 > HCI Event: Command Complete (0x0e) plen 4 #58 [hci0] 28.920873 LE Clear White List (0x08|0x0010) ncmd 1 Status: Success (0x00) < HCI Command: LE Read Resolving L.. (0x08|0x002a) plen 0 #59 [hci0] 28.921109 > HCI Event: Command Complete (0x0e) plen 5 #60 [hci0] 28.922016 LE Read Resolving List Size (0x08|0x002a) ncmd 1 Status: Success (0x00) Size: 25 < HCI Command: LE Clear Resolving... (0x08|0x0029) plen 0 #61 [hci0] 28.922166 > HCI Event: Command Complete (0x0e) plen 4 #62 [hci0] 28.922872 LE Clear Resolving List (0x08|0x0029) ncmd 1 Status: Success (0x00) < HCI Command: LE Read Maximum Dat.. (0x08|0x002f) plen 0 #63 [hci0] 28.923117 > HCI Event: Command Complete (0x0e) plen 12 #64 [hci0] 28.924030 LE Read Maximum Data Length (0x08|0x002f) ncmd 1 Status: Success (0x00) Max TX octets: 251 Max TX time: 17040 Max RX octets: 251 Max RX time: 17040 Signed-off-by: Ankit Navik <ankit.p.navik@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2018-07-06Bluetooth: Store Resolv list sizeAnkit Navik
When the controller supports the Read LE Resolv List size feature, the maximum list size are read and now stored. Before patch: < HCI Command: LE Read White List... (0x08|0x000f) plen 0 #55 [hci0] 17.979791 > HCI Event: Command Complete (0x0e) plen 5 #56 [hci0] 17.980629 LE Read White List Size (0x08|0x000f) ncmd 1 Status: Success (0x00) Size: 25 < HCI Command: LE Clear White List (0x08|0x0010) plen 0 #57 [hci0] 17.980786 > HCI Event: Command Complete (0x0e) plen 4 #58 [hci0] 17.981627 LE Clear White List (0x08|0x0010) ncmd 1 Status: Success (0x00) < HCI Command: LE Read Maximum Dat.. (0x08|0x002f) plen 0 #59 [hci0] 17.981786 > HCI Event: Command Complete (0x0e) plen 12 #60 [hci0] 17.982636 LE Read Maximum Data Length (0x08|0x002f) ncmd 1 Status: Success (0x00) Max TX octets: 251 Max TX time: 17040 Max RX octets: 251 Max RX time: 17040 After patch: < HCI Command: LE Read White List... (0x08|0x000f) plen 0 #55 [hci0] 13.338168 > HCI Event: Command Complete (0x0e) plen 5 #56 [hci0] 13.338842 LE Read White List Size (0x08|0x000f) ncmd 1 Status: Success (0x00) Size: 25 < HCI Command: LE Clear White List (0x08|0x0010) plen 0 #57 [hci0] 13.339029 > HCI Event: Command Complete (0x0e) plen 4 #58 [hci0] 13.339939 LE Clear White List (0x08|0x0010) ncmd 1 Status: Success (0x00) < HCI Command: LE Read Resolving L.. (0x08|0x002a) plen 0 #59 [hci0] 13.340152 > HCI Event: Command Complete (0x0e) plen 5 #60 [hci0] 13.340952 LE Read Resolving List Size (0x08|0x002a) ncmd 1 Status: Success (0x00) Size: 25 < HCI Command: LE Read Maximum Dat.. (0x08|0x002f) plen 0 #61 [hci0] 13.341180 > HCI Event: Command Complete (0x0e) plen 12 #62 [hci0] 13.341898 LE Read Maximum Data Length (0x08|0x002f) ncmd 1 Status: Success (0x00) Max TX octets: 251 Max TX time: 17040 Max RX octets: 251 Max RX time: 17040 Signed-off-by: Ankit Navik <ankit.p.navik@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2018-07-066lowpan: iphc: reset mac_header after decompress to fix panicMichael Scott
After decompression of 6lowpan socket data, an IPv6 header is inserted before the existing socket payload. After this, we reset the network_header value of the skb to account for the difference in payload size from prior to decompression + the addition of the IPv6 header. However, we fail to reset the mac_header value. Leaving the mac_header value untouched here, can cause a calculation error in net/packet/af_packet.c packet_rcv() function when an AF_PACKET socket is opened in SOCK_RAW mode for use on a 6lowpan interface. On line 2088, the data pointer is moved backward by the value returned from skb_mac_header(). If skb->data is adjusted so that it is before the skb->head pointer (which can happen when an old value of mac_header is left in place) the kernel generates a panic in net/core/skbuff.c line 1717. This panic can be generated by BLE 6lowpan interfaces (such as bt0) and 802.15.4 interfaces (such as lowpan0) as they both use the same 6lowpan sources for compression and decompression. Signed-off-by: Michael Scott <michael@opensourcefoundries.com> Acked-by: Alexander Aring <aring@mojatatu.com> Acked-by: Jukka Rissanen <jukka.rissanen@linux.intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2018-07-06ipv4: Return EINVAL when ping_group_range sysctl doesn't map to user nsTyler Hicks
The low and high values of the net.ipv4.ping_group_range sysctl were being silently forced to the default disabled state when a write to the sysctl contained GIDs that didn't map to the associated user namespace. Confusingly, the sysctl's write operation would return success and then a subsequent read of the sysctl would indicate that the low and high values are the overflowgid. This patch changes the behavior by clearly returning an error when the sysctl write operation receives a GID range that doesn't map to the associated user namespace. In such a situation, the previous value of the sysctl is preserved and that range will be returned in a subsequent read of the sysctl. Signed-off-by: Tyler Hicks <tyhicks@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-06net: ipv6: listify ipv6_rcv() and ip6_rcv_finish()Edward Cree
Essentially the same as the ipv4 equivalents. Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-06net: ipv4: fix list processing on L3 slave devicesEdward Cree
If we have an L3 master device, l3mdev_ip_rcv() will steal the skb, but we were returning NET_RX_SUCCESS from ip_rcv_finish_core() which meant that ip_list_rcv_finish() would keep it on the list. Instead let's move the l3mdev_ip_rcv() call into the caller, so that our response to a steal can be different in the single packet path (return NET_RX_SUCCESS) and the list path (forget this packet and continue). Fixes: 5fa12739a53d ("net: ipv4: listify ip_rcv_finish") Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-05leds: triggers: let struct led_trigger::activate() return an error codeUwe Kleine-König
Given that activating a trigger can fail, let the callback return an indication. This prevents to have a trigger active according to the "trigger" sysfs attribute but not functional. All users are changed accordingly to return 0 for now. There is no intended change in behaviour. Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Acked-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Jacek Anaszewski <jacek.anaszewski@gmail.com>
2018-07-05batman-adv: fix checkpatch warning about misspelled "cache"Sven Eckelmann
commit a2d4df9b673c ("spelling.txt: add more spellings to spelling.txt") introduced the spellcheck of "cache" for checkpatch.pl. Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
2018-07-05net: core: filter: mark expected switch fall-throughGustavo A. R. Silva
In preparation to enabling -Wimplicit-fallthrough, mark switch cases where we are expecting to fall through. Warning level 2 was used: -Wimplicit-fallthrough=2 Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-05net: decnet: dn_nsp_in: mark expected switch fall-throughGustavo A. R. Silva
In preparation to enabling -Wimplicit-fallthrough, mark switch cases where we are expecting to fall through. Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-05tipc: mark expected switch fall-throughsGustavo A. R. Silva
In preparation to enabling -Wimplicit-fallthrough, mark switch cases where we are expecting to fall through. Warning level 2 was used: -Wimplicit-fallthrough=2 Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Acked-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-05net: qrtr: Reset the node and port ID of broadcast messagesArun Kumar Neelakantam
All the control messages broadcast to remote routers are using QRTR_NODE_BCAST instead of using local router NODE ID which cause the packets to be dropped on remote router due to invalid NODE ID. Signed-off-by: Arun Kumar Neelakantam <aneela@codeaurora.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-05net: qrtr: Broadcast messages only from control portArun Kumar Neelakantam
The broadcast node id should only be sent with the control port id. Signed-off-by: Arun Kumar Neelakantam <aneela@codeaurora.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-05ipv6: make ipv6_renew_options() interrupt/kernel safePaul Moore
At present the ipv6_renew_options_kern() function ends up calling into access_ok() which is problematic if done from inside an interrupt as access_ok() calls WARN_ON_IN_IRQ() on some (all?) architectures (x86-64 is affected). Example warning/backtrace is shown below: WARNING: CPU: 1 PID: 3144 at lib/usercopy.c:11 _copy_from_user+0x85/0x90 ... Call Trace: <IRQ> ipv6_renew_option+0xb2/0xf0 ipv6_renew_options+0x26a/0x340 ipv6_renew_options_kern+0x2c/0x40 calipso_req_setattr+0x72/0xe0 netlbl_req_setattr+0x126/0x1b0 selinux_netlbl_inet_conn_request+0x80/0x100 selinux_inet_conn_request+0x6d/0xb0 security_inet_conn_request+0x32/0x50 tcp_conn_request+0x35f/0xe00 ? __lock_acquire+0x250/0x16c0 ? selinux_socket_sock_rcv_skb+0x1ae/0x210 ? tcp_rcv_state_process+0x289/0x106b tcp_rcv_state_process+0x289/0x106b ? tcp_v6_do_rcv+0x1a7/0x3c0 tcp_v6_do_rcv+0x1a7/0x3c0 tcp_v6_rcv+0xc82/0xcf0 ip6_input_finish+0x10d/0x690 ip6_input+0x45/0x1e0 ? ip6_rcv_finish+0x1d0/0x1d0 ipv6_rcv+0x32b/0x880 ? ip6_make_skb+0x1e0/0x1e0 __netif_receive_skb_core+0x6f2/0xdf0 ? process_backlog+0x85/0x250 ? process_backlog+0x85/0x250 ? process_backlog+0xec/0x250 process_backlog+0xec/0x250 net_rx_action+0x153/0x480 __do_softirq+0xd9/0x4f7 do_softirq_own_stack+0x2a/0x40 </IRQ> ... While not present in the backtrace, ipv6_renew_option() ends up calling access_ok() via the following chain: access_ok() _copy_from_user() copy_from_user() ipv6_renew_option() The fix presented in this patch is to perform the userspace copy earlier in the call chain such that it is only called when the option data is actually coming from userspace; that place is do_ipv6_setsockopt(). Not only does this solve the problem seen in the backtrace above, it also allows us to simplify the code quite a bit by removing ipv6_renew_options_kern() completely. We also take this opportunity to cleanup ipv6_renew_options()/ipv6_renew_option() a small amount as well. This patch is heavily based on a rough patch by Al Viro. I've taken his original patch, converted a kmemdup() call in do_ipv6_setsockopt() to a memdup_user() call, made better use of the e_inval jump target in the same function, and cleaned up the use ipv6_renew_option() by ipv6_renew_options(). CC: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Paul Moore <paul@paul-moore.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-05devlink: Add enable_sriov boolean generic parameterVasundhara Volam
enable_sriov - Enables Single-Root Input/Output Virtualization(SR-IOV) characteristic of the device. Reviewed-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-05devlink: Add generic parameters internal_err_reset and max_macsMoshe Shemesh
Add 2 first generic parameters to devlink configuration parameters set: internal_err_reset - When set enables reset device on internal errors. max_macs - max number of MACs per ETH port. Signed-off-by: Moshe Shemesh <moshe@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-05devlink: Add devlink notifications support for paramsMoshe Shemesh
Add devlink_param_notify() function to support devlink param notifications. Add notification call to devlink param set, register and unregister functions. Add devlink_param_value_changed() function to enable the driver notify devlink on value change. Driver should use this function after value was changed on any configuration mode part to driverinit. Signed-off-by: Moshe Shemesh <moshe@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-05devlink: Add support for get/set driverinit valueMoshe Shemesh
"driverinit" configuration mode value is held by devlink to enable the driver query the value after reload. Two additional functions added to help the driver get/set the value from/to devlink: devlink_param_driverinit_value_set() and devlink_param_driverinit_value_get(). Signed-off-by: Moshe Shemesh <moshe@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-05devlink: Add param set commandMoshe Shemesh
Add param set command to set value for a parameter. Value can be set to any of the supported configuration modes. Signed-off-by: Moshe Shemesh <moshe@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-05devlink: Add param get commandMoshe Shemesh
Add param get command which gets data per parameter. Option to dump the parameters data per device. Signed-off-by: Moshe Shemesh <moshe@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-05devlink: Add devlink_param register and unregisterMoshe Shemesh
Define configuration parameters data structure. Add functions to register and unregister the driver supported configuration parameters table. For each parameter registered, the driver should fill all the parameter's fields. In case the only supported configuration mode is "driverinit" the parameter's get()/set() functions are not required and should be set to NULL, for any other configuration mode, these functions are required and should be set by the driver. Signed-off-by: Moshe Shemesh <moshe@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-05net: limit each hash list length to MAX_GRO_SKBSLi RongQing
After commit 07d78363dcff ("net: Convert NAPI gro list into a small hash table.")' there is 8 hash buckets, which allows more flows to be held for merging. but MAX_GRO_SKBS, the total held skb for merging, is 8 skb still, limit the hash table performance. keep MAX_GRO_SKBS as 8 skb, but limit each hash list length to 8 skb, not the total 8 skb Signed-off-by: Li RongQing <lirongqing@baidu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-05netfilter: x_tables: set module owner for icmp(6) matchesFlorian Westphal
nft_compat relies on xt_request_find_match to increment refcount of the module that provides the match/target. The (builtin) icmp matches did't set the module owner so it was possible to rmmod ip(6)tables while icmp extensions were still in use. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-07-05ieee802154: 6lowpan: set IFLA_LINKLubomir Rintel
Otherwise NetworkManager (and iproute alike) is not able to identify the parent IEEE 802.15.4 interface of a 6LoWPAN link. Signed-off-by: Lubomir Rintel <lkundrak@v3.sk> Acked-by: Alexander Aring <aring@mojatatu.com> Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org>
2018-07-05net: ipv4: fix drop handling in ip_list_rcv() and ip_list_rcv_finish()Edward Cree
Since callees (ip_rcv_core() and ip_rcv_finish_core()) might free or steal the skb, we can't use the list_cut_before() method; we can't even do a list_del(&skb->list) in the drop case, because skb might have already been freed and reused. So instead, take each skb off the source list before processing, and add it to the sublist afterwards if it wasn't freed or stolen. Fixes: 5fa12739a53d net: ipv4: listify ip_rcv_finish Fixes: 17266ee93984 net: ipv4: listified version of ip_rcv Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-04net/sched: Make etf report drops on error_queueJesus Sanchez-Palencia
Use the socket error queue for reporting dropped packets if the socket has enabled that feature through the SO_TXTIME API. Packets are dropped either on enqueue() if they aren't accepted by the qdisc or on dequeue() if the system misses their deadline. Those are reported as different errors so applications can react accordingly. Userspace can retrieve the errors through the socket error queue and the corresponding cmsg interfaces. A struct sock_extended_err* is used for returning the error data, and the packet's timestamp can be retrieved by adding both ee_data and ee_info fields as e.g.: ((__u64) serr->ee_data << 32) + serr->ee_info This feature is disabled by default and must be explicitly enabled by applications. Enabling it can bring some overhead for the Tx cycles of the application. Signed-off-by: Jesus Sanchez-Palencia <jesus.sanchez-palencia@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-04net/sched: Add HW offloading capability to ETFJesus Sanchez-Palencia
Add infra so etf qdisc supports HW offload of time-based transmission. For hw offload, the time sorted list is still used, so packets are dequeued always in order of txtime. Example: $ tc qdisc replace dev enp2s0 parent root handle 100 mqprio num_tc 3 \ map 2 2 1 0 2 2 2 2 2 2 2 2 2 2 2 2 queues 1@0 1@1 2@2 hw 0 $ tc qdisc add dev enp2s0 parent 100:1 etf offload delta 100000 \ clockid CLOCK_REALTIME In this example, the Qdisc will use HW offload for the control of the transmission time through the network adapter. The hrtimer used for packets scheduling inside the qdisc will use the clockid CLOCK_REALTIME as reference and packets leave the Qdisc "delta" (100000) nanoseconds before their transmission time. Because this will be using HW offload and since dynamic clocks are not supported by the hrtimer, the system clock and the PHC clock must be synchronized for this mode to behave as expected. Signed-off-by: Jesus Sanchez-Palencia <jesus.sanchez-palencia@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-04net/sched: Introduce the ETF QdiscVinicius Costa Gomes
The ETF (Earliest TxTime First) qdisc uses the information added earlier in this series (the socket option SO_TXTIME and the new role of sk_buff->tstamp) to schedule packets transmission based on absolute time. For some workloads, just bandwidth enforcement is not enough, and precise control of the transmission of packets is necessary. Example: $ tc qdisc replace dev enp2s0 parent root handle 100 mqprio num_tc 3 \ map 2 2 1 0 2 2 2 2 2 2 2 2 2 2 2 2 queues 1@0 1@1 2@2 hw 0 $ tc qdisc add dev enp2s0 parent 100:1 etf delta 100000 \ clockid CLOCK_TAI In this example, the Qdisc will provide SW best-effort for the control of the transmission time to the network adapter, the time stamp in the socket will be in reference to the clockid CLOCK_TAI and packets will leave the qdisc "delta" (100000) nanoseconds before its transmission time. The ETF qdisc will buffer packets sorted by their txtime. It will drop packets on enqueue() if their skbuff clockid does not match the clock reference of the Qdisc. Moreover, on dequeue(), a packet will be dropped if it expires while being enqueued. The qdisc also supports the SO_TXTIME deadline mode. For this mode, it will dequeue a packet as soon as possible and change the skb timestamp to 'now' during etf_dequeue(). Note that both the qdisc's and the SO_TXTIME ABIs allow for a clockid to be configured, but it's been decided that usage of CLOCK_TAI should be enforced until we decide to allow for other clockids to be used. The rationale here is that PTP times are usually in the TAI scale, thus no other clocks should be necessary. For now, the qdisc will return EINVAL if any clocks other than CLOCK_TAI are used. Signed-off-by: Jesus Sanchez-Palencia <jesus.sanchez-palencia@intel.com> Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-04net/sched: Allow creating a Qdisc watchdog with other clocksVinicius Costa Gomes
This adds 'qdisc_watchdog_init_clockid()' that allows a clockid to be passed, this allows other time references to be used when scheduling the Qdisc to run. Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-04net: packet: Hook into time based transmission.Richard Cochran
For raw layer-2 packets, copy the desired future transmit time from the CMSG cookie into the skb. Signed-off-by: Richard Cochran <rcochran@linutronix.de> Signed-off-by: Jesus Sanchez-Palencia <jesus.sanchez-palencia@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-04net: ipv6: Hook into time based transmissionJesus Sanchez-Palencia
Add a struct sockcm_cookie parameter to ip6_setup_cork() so we can easily re-use the transmit_time field from struct inet_cork for most paths, by copying the timestamp from the CMSG cookie. This is later copied into the skb during __ip6_make_skb(). For the raw fast path, also pass the sockcm_cookie as a parameter so we can just perform the copy at rawv6_send_hdrinc() directly. Signed-off-by: Jesus Sanchez-Palencia <jesus.sanchez-palencia@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-04net: ipv4: Hook into time based transmissionJesus Sanchez-Palencia
Add a transmit_time field to struct inet_cork, then copy the timestamp from the CMSG cookie at ip_setup_cork() so we can safely copy it into the skb later during __ip_make_skb(). For the raw fast path, just perform the copy at raw_send_hdrinc(). Signed-off-by: Richard Cochran <rcochran@linutronix.de> Signed-off-by: Jesus Sanchez-Palencia <jesus.sanchez-palencia@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-04net: Add a new socket option for a future transmit time.Richard Cochran
This patch introduces SO_TXTIME. User space enables this option in order to pass a desired future transmit time in a CMSG when calling sendmsg(2). The argument to this socket option is a 8-bytes long struct provided by the uapi header net_tstamp.h defined as: struct sock_txtime { clockid_t clockid; u32 flags; }; Note that new fields were added to struct sock by filling a 2-bytes hole found in the struct. For that reason, neither the struct size or number of cachelines were altered. Signed-off-by: Richard Cochran <rcochran@linutronix.de> Signed-off-by: Jesus Sanchez-Palencia <jesus.sanchez-palencia@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-04net: Clear skb->tstamp only on the forwarding pathJesus Sanchez-Palencia
This is done in preparation for the upcoming time based transmission patchset. Now that skb->tstamp will be used to hold packet's txtime, we must ensure that it is being cleared when traversing namespaces. Also, doing that from skb_scrub_packet() before the early return would break our feature when tunnels are used. Signed-off-by: Jesus Sanchez-Palencia <jesus.sanchez-palencia@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-04net: sched: act_pedit: fix possible memory leak in tcf_pedit_init()Wei Yongjun
'keys_ex' is malloced by tcf_pedit_keys_ex_parse() in tcf_pedit_init() but not all of the error handle path free it, this may cause memory leak. This patch fix it. Fixes: 71d0ed7079df ("net/act_pedit: Support using offset relative to the conventional network headers") Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Acked-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-04sctp: fix the issue that pathmtu may be set lower than MINSEGMENTXin Long
After commit b6c5734db070 ("sctp: fix the handling of ICMP Frag Needed for too small MTUs"), sctp_transport_update_pmtu would refetch pathmtu from the dst and set it to transport's pathmtu without any check. The new pathmtu may be lower than MINSEGMENT if the dst is obsolete and updated by .get_dst() in sctp_transport_update_pmtu. In this case, it could have a smaller MTU as well, and thus we should validate it against MINSEGMENT instead. Syzbot reported a warning in sctp_mtu_payload caused by this. This patch refetches the pathmtu by calling sctp_dst_mtu where it does the check against MINSEGMENT. v1->v2: - refetch the pathmtu by calling sctp_dst_mtu instead as Marcelo's suggestion. Fixes: b6c5734db070 ("sctp: fix the handling of ICMP Frag Needed for too small MTUs") Reported-by: syzbot+f0d9d7cba052f9344b03@syzkaller.appspotmail.com Suggested-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-04net:sched: add action inheritdsfield to skbeditQiaobin Fu
The new action inheritdsfield copies the field DS of IPv4 and IPv6 packets into skb->priority. This enables later classification of packets based on the DS field. v5: *Update the drop counter for TC_ACT_SHOT v4: *Not allow setting flags other than the expected ones. *Allow dumping the pure flags. v3: *Use optional flags, so that it won't break old versions of tc. *Allow users to set both SKBEDIT_F_PRIORITY and SKBEDIT_F_INHERITDSFIELD flags. v2: *Fix the style issue *Move the code from skbmod to skbedit Original idea by Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: Qiaobin Fu <qiaobinf@bu.edu> Reviewed-by: Michel Machado <michel@digirati.com.br> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Acked-by: Davide Caratti <dcaratti@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-04net/ipv6: Revert attempt to simplify route replace and appendDavid Ahern
NetworkManager likes to manage linklocal prefix routes and does so with the NLM_F_APPEND flag, breaking attempts to simplify the IPv6 route code and by extension enable multipath routes with device only nexthops. Revert f34436a43092 and these followup patches: 6eba08c3626b ("ipv6: Only emit append events for appended routes"). ce45bded6435 ("mlxsw: spectrum_router: Align with new route replace logic") 53b562df8c20 ("mlxsw: spectrum_router: Allow appending to dev-only routes") Update the fib_tests cases to reflect the old behavior. Fixes: f34436a43092 ("net/ipv6: Simplify route replace and appending into multipath route") Signed-off-by: David Ahern <dsahern@gmail.com>