summaryrefslogtreecommitdiff
path: root/net
AgeCommit message (Collapse)Author
2019-07-06SUNRPC: Replace the queue timer with a delayed work functionTrond Myklebust
The queue timer function, which walks the RPC queue in order to locate candidates for waking up is one of the current constraints against removing the bh-safe queue spin locks. Replace it with a delayed work queue, so that we can do the actual rpc task wake ups from an ordinary process context. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2019-07-06Bluetooth: Add SMP workaround Microsoft Surface Precision Mouse bugSzymon Janc
Microsoft Surface Precision Mouse provides bogus identity address when pairing. It connects with Static Random address but provides Public Address in SMP Identity Address Information PDU. Address has same value but type is different. Workaround this by dropping IRK if ID address discrepancy is detected. > HCI Event: LE Meta Event (0x3e) plen 19 LE Connection Complete (0x01) Status: Success (0x00) Handle: 75 Role: Master (0x00) Peer address type: Random (0x01) Peer address: E0:52:33:93:3B:21 (Static) Connection interval: 50.00 msec (0x0028) Connection latency: 0 (0x0000) Supervision timeout: 420 msec (0x002a) Master clock accuracy: 0x00 .... > ACL Data RX: Handle 75 flags 0x02 dlen 12 SMP: Identity Address Information (0x09) len 7 Address type: Public (0x00) Address: E0:52:33:93:3B:21 Signed-off-by: Szymon Janc <szymon.janc@codecoup.pl> Tested-by: Maarten Fonville <maarten.fonville@gmail.com> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=199461 Cc: stable@vger.kernel.org Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2019-07-06Bluetooth: L2CAP: Check bearer type on __l2cap_global_chan_by_addrLuiz Augusto von Dentz
The spec defines PSM and LE_PSM as different domains so a listen on the same PSM is valid if the address type points to a different bearer. Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2019-07-06Bluetooth: Use controller sets when availableLuiz Augusto von Dentz
This makes use of controller sets when using Extended Advertising feature thus offloading the scheduling to the controller. Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2019-07-06Bluetooth: validate BLE connection interval updatescsonsino
Problem: The Linux Bluetooth stack yields complete control over the BLE connection interval to the remote device. The Linux Bluetooth stack provides access to the BLE connection interval min and max values through /sys/kernel/debug/bluetooth/hci0/ conn_min_interval and /sys/kernel/debug/bluetooth/hci0/conn_max_interval. These values are used for initial BLE connections, but the remote device has the ability to request a connection parameter update. In the event that the remote side requests to change the connection interval, the Linux kernel currently only validates that the desired value is within the acceptable range in the Bluetooth specification (6 - 3200, corresponding to 7.5ms - 4000ms). There is currently no validation that the desired value requested by the remote device is within the min/max limits specified in the conn_min_interval/conn_max_interval configurations. This essentially leads to Linux yielding complete control over the connection interval to the remote device. The proposed patch adds a verification step to the connection parameter update mechanism, ensuring that the desired value is within the min/max bounds of the current connection. If the desired value is outside of the current connection min/max values, then the connection parameter update request is rejected and the negative response is returned to the remote device. Recall that the initial connection is established using the local conn_min_interval/conn_max_interval values, so this allows the Linux administrator to retain control over the BLE connection interval. The one downside that I see is that the current default Linux values for conn_min_interval and conn_max_interval typically correspond to 30ms and 50ms respectively. If this change were accepted, then it is feasible that some devices would no longer be able to negotiate to their desired connection interval values. This might be remedied by setting the default Linux conn_min_interval and conn_max_interval values to the widest supported range (6 - 3200 / 7.5ms - 4000ms). This could lead to the same behavior as the current implementation, where the remote device could request to change the connection interval value to any value that is permitted by the Bluetooth specification, and Linux would accept the desired value. Signed-off-by: Carey Sonsino <csonsino@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2019-07-06Bluetooth: Add support for LE ping featureSpoorthi Ravishankar Koppad
Changes made to add HCI Write Authenticated Payload timeout command for LE Ping feature. As per the Core Specification 5.0 Volume 2 Part E Section 7.3.94, the following code changes implements HCI Write Authenticated Payload timeout command for LE Ping feature. Signed-off-by: Spoorthi Ravishankar Koppad <spoorthix.k@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2019-07-06Bluetooth: Check state in l2cap_disconnect_rspMatias Karhumaa
Because of both sides doing L2CAP disconnection at the same time, it was possible to receive L2CAP Disconnection Response with CID that was already freed. That caused problems if CID was already reused and L2CAP Connection Request with same CID was sent out. Before this patch kernel deleted channel context regardless of the state of the channel. Example where leftover Disconnection Response (frame #402) causes local device to delete L2CAP channel which was not yet connected. This in turn confuses remote device's stack because same CID is re-used without properly disconnecting. Btmon capture before patch: ** snip ** > ACL Data RX: Handle 43 flags 0x02 dlen 8 #394 [hci1] 10.748949 Channel: 65 len 4 [PSM 3 mode 0] {chan 2} RFCOMM: Disconnect (DISC) (0x43) Address: 0x03 cr 1 dlci 0x00 Control: 0x53 poll/final 1 Length: 0 FCS: 0xfd < ACL Data TX: Handle 43 flags 0x00 dlen 8 #395 [hci1] 10.749062 Channel: 65 len 4 [PSM 3 mode 0] {chan 2} RFCOMM: Unnumbered Ack (UA) (0x63) Address: 0x03 cr 1 dlci 0x00 Control: 0x73 poll/final 1 Length: 0 FCS: 0xd7 < ACL Data TX: Handle 43 flags 0x00 dlen 12 #396 [hci1] 10.749073 L2CAP: Disconnection Request (0x06) ident 17 len 4 Destination CID: 65 Source CID: 65 > HCI Event: Number of Completed Packets (0x13) plen 5 #397 [hci1] 10.752391 Num handles: 1 Handle: 43 Count: 1 > HCI Event: Number of Completed Packets (0x13) plen 5 #398 [hci1] 10.753394 Num handles: 1 Handle: 43 Count: 1 > ACL Data RX: Handle 43 flags 0x02 dlen 12 #399 [hci1] 10.756499 L2CAP: Disconnection Request (0x06) ident 26 len 4 Destination CID: 65 Source CID: 65 < ACL Data TX: Handle 43 flags 0x00 dlen 12 #400 [hci1] 10.756548 L2CAP: Disconnection Response (0x07) ident 26 len 4 Destination CID: 65 Source CID: 65 < ACL Data TX: Handle 43 flags 0x00 dlen 12 #401 [hci1] 10.757459 L2CAP: Connection Request (0x02) ident 18 len 4 PSM: 1 (0x0001) Source CID: 65 > ACL Data RX: Handle 43 flags 0x02 dlen 12 #402 [hci1] 10.759148 L2CAP: Disconnection Response (0x07) ident 17 len 4 Destination CID: 65 Source CID: 65 = bluetoothd: 00:1E:AB:4C:56:54: error updating services: Input/o.. 10.759447 > HCI Event: Number of Completed Packets (0x13) plen 5 #403 [hci1] 10.759386 Num handles: 1 Handle: 43 Count: 1 > ACL Data RX: Handle 43 flags 0x02 dlen 12 #404 [hci1] 10.760397 L2CAP: Connection Request (0x02) ident 27 len 4 PSM: 3 (0x0003) Source CID: 65 < ACL Data TX: Handle 43 flags 0x00 dlen 16 #405 [hci1] 10.760441 L2CAP: Connection Response (0x03) ident 27 len 8 Destination CID: 65 Source CID: 65 Result: Connection successful (0x0000) Status: No further information available (0x0000) < ACL Data TX: Handle 43 flags 0x00 dlen 27 #406 [hci1] 10.760449 L2CAP: Configure Request (0x04) ident 19 len 19 Destination CID: 65 Flags: 0x0000 Option: Maximum Transmission Unit (0x01) [mandatory] MTU: 1013 Option: Retransmission and Flow Control (0x04) [mandatory] Mode: Basic (0x00) TX window size: 0 Max transmit: 0 Retransmission timeout: 0 Monitor timeout: 0 Maximum PDU size: 0 > HCI Event: Number of Completed Packets (0x13) plen 5 #407 [hci1] 10.761399 Num handles: 1 Handle: 43 Count: 1 > ACL Data RX: Handle 43 flags 0x02 dlen 16 #408 [hci1] 10.762942 L2CAP: Connection Response (0x03) ident 18 len 8 Destination CID: 66 Source CID: 65 Result: Connection successful (0x0000) Status: No further information available (0x0000) *snip* Similar case after the patch: *snip* > ACL Data RX: Handle 43 flags 0x02 dlen 8 #22702 [hci0] 1664.411056 Channel: 65 len 4 [PSM 3 mode 0] {chan 3} RFCOMM: Disconnect (DISC) (0x43) Address: 0x03 cr 1 dlci 0x00 Control: 0x53 poll/final 1 Length: 0 FCS: 0xfd < ACL Data TX: Handle 43 flags 0x00 dlen 8 #22703 [hci0] 1664.411136 Channel: 65 len 4 [PSM 3 mode 0] {chan 3} RFCOMM: Unnumbered Ack (UA) (0x63) Address: 0x03 cr 1 dlci 0x00 Control: 0x73 poll/final 1 Length: 0 FCS: 0xd7 < ACL Data TX: Handle 43 flags 0x00 dlen 12 #22704 [hci0] 1664.411143 L2CAP: Disconnection Request (0x06) ident 11 len 4 Destination CID: 65 Source CID: 65 > HCI Event: Number of Completed Pac.. (0x13) plen 5 #22705 [hci0] 1664.414009 Num handles: 1 Handle: 43 Count: 1 > HCI Event: Number of Completed Pac.. (0x13) plen 5 #22706 [hci0] 1664.415007 Num handles: 1 Handle: 43 Count: 1 > ACL Data RX: Handle 43 flags 0x02 dlen 12 #22707 [hci0] 1664.418674 L2CAP: Disconnection Request (0x06) ident 17 len 4 Destination CID: 65 Source CID: 65 < ACL Data TX: Handle 43 flags 0x00 dlen 12 #22708 [hci0] 1664.418762 L2CAP: Disconnection Response (0x07) ident 17 len 4 Destination CID: 65 Source CID: 65 < ACL Data TX: Handle 43 flags 0x00 dlen 12 #22709 [hci0] 1664.421073 L2CAP: Connection Request (0x02) ident 12 len 4 PSM: 1 (0x0001) Source CID: 65 > ACL Data RX: Handle 43 flags 0x02 dlen 12 #22710 [hci0] 1664.421371 L2CAP: Disconnection Response (0x07) ident 11 len 4 Destination CID: 65 Source CID: 65 > HCI Event: Number of Completed Pac.. (0x13) plen 5 #22711 [hci0] 1664.424082 Num handles: 1 Handle: 43 Count: 1 > HCI Event: Number of Completed Pac.. (0x13) plen 5 #22712 [hci0] 1664.425040 Num handles: 1 Handle: 43 Count: 1 > ACL Data RX: Handle 43 flags 0x02 dlen 12 #22713 [hci0] 1664.426103 L2CAP: Connection Request (0x02) ident 18 len 4 PSM: 3 (0x0003) Source CID: 65 < ACL Data TX: Handle 43 flags 0x00 dlen 16 #22714 [hci0] 1664.426186 L2CAP: Connection Response (0x03) ident 18 len 8 Destination CID: 66 Source CID: 65 Result: Connection successful (0x0000) Status: No further information available (0x0000) < ACL Data TX: Handle 43 flags 0x00 dlen 27 #22715 [hci0] 1664.426196 L2CAP: Configure Request (0x04) ident 13 len 19 Destination CID: 65 Flags: 0x0000 Option: Maximum Transmission Unit (0x01) [mandatory] MTU: 1013 Option: Retransmission and Flow Control (0x04) [mandatory] Mode: Basic (0x00) TX window size: 0 Max transmit: 0 Retransmission timeout: 0 Monitor timeout: 0 Maximum PDU size: 0 > ACL Data RX: Handle 43 flags 0x02 dlen 16 #22716 [hci0] 1664.428804 L2CAP: Connection Response (0x03) ident 12 len 8 Destination CID: 66 Source CID: 65 Result: Connection successful (0x0000) Status: No further information available (0x0000) *snip* Fix is to check that channel is in state BT_DISCONN before deleting the channel. This bug was found while fuzzing Bluez's OBEX implementation using Synopsys Defensics. Reported-by: Matti Kamunen <matti.kamunen@synopsys.com> Reported-by: Ari Timonen <ari.timonen@synopsys.com> Signed-off-by: Matias Karhumaa <matias.karhumaa@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2019-07-06Bluetooth: hidp: NUL terminate a string in the compat ioctlDan Carpenter
This change is similar to commit a1616a5ac99e ("Bluetooth: hidp: fix buffer overflow") but for the compat ioctl. We take a string from the user and forgot to ensure that it's NUL terminated. I have also changed the strncpy() in to strscpy() in hidp_setup_hid(). The difference is the strncpy() doesn't necessarily NUL terminate the destination string. Either change would fix the problem but it's nice to take a belt and suspenders approach and do both. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2019-07-066lowpan: no need to check return value of debugfs_create functionsGreg Kroah-Hartman
When calling debugfs functions, there is no need to ever check the return value. The function can work or not, but the code logic should never do something different based on this. Because we don't care if debugfs works or not, this trickles back a bit so we can clean things up by making some functions return void instead of an error value that is never going to fail. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Acked-by: Jukka Rissanen <jukka.rissanen@linux.intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2019-07-06netfilter: nf_tables: force module load in case select_ops() returns -EAGAINPablo Neira Ayuso
nft_meta needs to pull in the nft_meta_bridge module in case that this is a bridge family rule from the select_ops() path. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-07-05Merge tag 'nfsd-5.2-2' of git://linux-nfs.org/~bfields/linuxLinus Torvalds
Pull nfsd fixes from Bruce Fields: "Two more quick bugfixes for nfsd: fixing a regression causing mount failures on high-memory machines and fixing the DRC over RDMA" * tag 'nfsd-5.2-2' of git://linux-nfs.org/~bfields/linux: nfsd: Fix overflow causing non-working mounts on 1 TB machines svcrdma: Ignore source port when computing DRC hash
2019-07-05ipv4: Fix NULL pointer dereference in ipv4_neigh_lookup()Ido Schimmel
Both ip_neigh_gw4() and ip_neigh_gw6() can return either a valid pointer or an error pointer, but the code currently checks that the pointer is not NULL. Fix this by checking that the pointer is not an error pointer, as this can result in a NULL pointer dereference [1]. Specifically, I believe that what happened is that ip_neigh_gw4() returned '-EINVAL' (0xffffffffffffffea) to which the offset of 'refcnt' (0x70) was added, which resulted in the address 0x000000000000005a. [1] BUG: KASAN: null-ptr-deref in refcount_inc_not_zero_checked+0x6e/0x180 Read of size 4 at addr 000000000000005a by task swapper/2/0 CPU: 2 PID: 0 Comm: swapper/2 Not tainted 5.2.0-rc6-custom-reg-179657-gaa32d89 #396 Hardware name: Mellanox Technologies Ltd. MSN2010/SA002610, BIOS 5.6.5 08/24/2017 Call Trace: <IRQ> dump_stack+0x73/0xbb __kasan_report+0x188/0x1ea kasan_report+0xe/0x20 refcount_inc_not_zero_checked+0x6e/0x180 ipv4_neigh_lookup+0x365/0x12c0 __neigh_update+0x1467/0x22f0 arp_process.constprop.6+0x82e/0x1f00 __netif_receive_skb_one_core+0xee/0x170 process_backlog+0xe3/0x640 net_rx_action+0x755/0xd90 __do_softirq+0x29b/0xae7 irq_exit+0x177/0x1c0 smp_apic_timer_interrupt+0x164/0x5e0 apic_timer_interrupt+0xf/0x20 </IRQ> Fixes: 5c9f7c1dfc2e ("ipv4: Add helpers for neigh lookup for nexthop") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reported-by: Shalom Toledo <shalomt@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-05net: remove unused parameter from skb_checksum_try_convertLi RongQing
the check parameter is never used Signed-off-by: Li RongQing <lirongqing@baidu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-05hsr: fix a NULL pointer deref in hsr_dev_xmit()Cong Wang
hsr_port_get_hsr() could return NULL and kernel could crash: BUG: kernel NULL pointer dereference, address: 0000000000000010 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 8000000074b84067 P4D 8000000074b84067 PUD 7057d067 PMD 0 Oops: 0000 [#1] SMP PTI CPU: 0 PID: 754 Comm: a.out Not tainted 5.2.0-rc6+ #718 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-2.fc30 04/01/2014 RIP: 0010:hsr_dev_xmit+0x20/0x31 Code: 48 8b 1b eb e0 5b 5d 41 5c c3 66 66 66 66 90 55 48 89 fd 48 8d be 40 0b 00 00 be 04 00 00 00 e8 ee f2 ff ff 48 89 ef 48 89 c6 <48> 8b 40 10 48 89 45 10 e8 6c 1b 00 00 31 c0 5d c3 66 66 66 66 90 RSP: 0018:ffffb5b400003c48 EFLAGS: 00010246 RAX: 0000000000000000 RBX: ffff9821b4509a88 RCX: 0000000000000000 RDX: ffff9821b4509a88 RSI: 0000000000000000 RDI: ffff9821bc3fc7c0 RBP: ffff9821bc3fc7c0 R08: 0000000000000000 R09: 00000000000c2019 R10: 0000000000000000 R11: 0000000000000002 R12: ffff9821bc3fc7c0 R13: ffff9821b4509a88 R14: 0000000000000000 R15: 000000000000006e FS: 00007fee112a1800(0000) GS:ffff9821bd800000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000010 CR3: 000000006e9ce000 CR4: 00000000000406f0 Call Trace: <IRQ> netdev_start_xmit+0x1b/0x38 dev_hard_start_xmit+0x121/0x21e ? validate_xmit_skb.isra.0+0x19/0x1e3 __dev_queue_xmit+0x74c/0x823 ? lockdep_hardirqs_on+0x12b/0x17d ip6_finish_output2+0x3d3/0x42c ? ip6_mtu+0x55/0x5c ? mld_sendpack+0x191/0x229 mld_sendpack+0x191/0x229 mld_ifc_timer_expire+0x1f7/0x230 ? mld_dad_timer_expire+0x58/0x58 call_timer_fn+0x12e/0x273 __run_timers.part.0+0x174/0x1b5 ? mld_dad_timer_expire+0x58/0x58 ? sched_clock_cpu+0x10/0xad ? mark_lock+0x26/0x1f2 ? __lock_is_held+0x40/0x71 run_timer_softirq+0x26/0x48 __do_softirq+0x1af/0x392 irq_exit+0x53/0xa2 smp_apic_timer_interrupt+0x1c4/0x1d9 apic_timer_interrupt+0xf/0x20 </IRQ> Cc: Arvid Brodin <arvid.brodin@alten.se> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-05hsr: implement dellink to clean up resourcesCong Wang
hsr_link_ops implements ->newlink() but not ->dellink(), which leads that resources not released after removing the device, particularly the entries in self_node_db and node_db. So add ->dellink() implementation to replace the priv_destructor. This also makes the code slightly easier to understand. Reported-by: syzbot+c6167ec3de7def23d1e8@syzkaller.appspotmail.com Cc: Arvid Brodin <arvid.brodin@alten.se> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-05hsr: fix a memory leak in hsr_del_port()Cong Wang
hsr_del_port() should release all the resources allocated in hsr_add_port(). As a consequence of this change, hsr_for_each_port() is no longer safe to work with hsr_del_port(), switch to list_for_each_entry_safe() as we always hold RTNL lock. Cc: Arvid Brodin <arvid.brodin@alten.se> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-05Merge branch 'master' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next Steffen Klassert says: ==================== pull request (net-next): ipsec-next 2019-07-05 1) A lot of work to remove indirections from the xfrm code. From Florian Westphal. 2) Fix a WARN_ON with ipv6 that triggered because of a forgotten break statement. From Florian Westphal. 3) Remove xfrmi_init_net, it is not needed. From Li RongQing. Please pull or let me know if there are problems. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-05Merge branch 'master' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec Steffen Klassert says: ==================== pull request (net): ipsec 2019-07-05 1) Fix xfrm selector prefix length validation for inter address family tunneling. From Anirudh Gupta. 2) Fix a memleak in pfkey. From Jeremy Sowden. 3) Fix SA selector validation to allow empty selectors again. From Nicolas Dichtel. 4) Select crypto ciphers for xfrm_algo, this fixes some randconfig builds. From Arnd Bergmann. 5) Remove a duplicated assignment in xfrm_bydst_resize. From Cong Wang. 6) Fix a hlist corruption on hash rebuild. From Florian Westphal. 7) Fix a memory leak when creating xfrm interfaces. From Nicolas Dichtel. Please pull or let me know if there are problems. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-05netfilter: nf_tables: __nft_expr_type_get() selects specific family typePablo Neira Ayuso
In case that there are two types, prefer the family specify extension. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-07-05netfilter: nf_tables: add nft_expr_type_request_module()Pablo Neira Ayuso
This helper function makes sure the family specific extension is loaded. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-07-05netfilter: nft_meta_bridge: Add NFT_META_BRI_IIFVPROTO supportwenxu
This patch allows you to match on bridge vlan protocol, eg. nft add rule bridge firewall zones counter meta ibrvproto 0x8100 Signed-off-by: wenxu <wenxu@ucloud.cn> Reviewed-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-07-05bridge: add br_vlan_get_proto()wenxu
This new function allows you to fetch the bridge port vlan protocol. Signed-off-by: wenxu <wenxu@ucloud.cn> Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-07-05netfilter: nft_meta_bridge: add NFT_META_BRI_IIFPVID supportwenxu
This patch allows you to match on the bridge port pvid, eg. nft add rule bridge firewall zones counter meta ibrpvid 10 Signed-off-by: wenxu <wenxu@ucloud.cn> Reviewed-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-07-05bridge: add br_vlan_get_pvid_rcu()Pablo Neira Ayuso
This new function allows you to fetch bridge pvid from packet path. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
2019-07-05netfilter: nft_meta_bridge: Remove the br_private.h headerwenxu
nft_bridge_meta should not access the bridge internal API. Signed-off-by: wenxu <wenxu@ucloud.cn> Reviewed-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-07-05netfilter: nft_meta: move bridge meta keys into nft_meta_bridgewenxu
Separate bridge meta key from nft_meta to meta_bridge to avoid a dependency between the bridge module and nft_meta when using the bridge API available through include/linux/if_bridge.h Signed-off-by: wenxu <wenxu@ucloud.cn> Reviewed-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-07-05ipvs: strip gre tunnel headers from icmp errorsJulian Anastasov
Recognize GRE tunnels in received ICMP errors and properly strip the tunnel headers. Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-07-05netfilter: nf_tables: Add synproxy supportFernando Fernandez Mancera
Add synproxy support for nf_tables. This behaves like the iptables synproxy target but it is structured in a way that allows us to propose improvements in the future. Signed-off-by: Fernando Fernandez Mancera <ffmancera@riseup.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-07-04Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-nextDavid S. Miller
Daniel Borkmann says: ==================== pull-request: bpf-next 2019-07-03 The following pull-request contains BPF updates for your *net-next* tree. There is a minor merge conflict in mlx5 due to 8960b38932be ("linux/dim: Rename externally used net_dim members") which has been pulled into your tree in the meantime, but resolution seems not that bad ... getting current bpf-next out now before there's coming more on mlx5. ;) I'm Cc'ing Saeed just so he's aware of the resolution below: ** First conflict in drivers/net/ethernet/mellanox/mlx5/core/en_main.c: <<<<<<< HEAD static int mlx5e_open_cq(struct mlx5e_channel *c, struct dim_cq_moder moder, struct mlx5e_cq_param *param, struct mlx5e_cq *cq) ======= int mlx5e_open_cq(struct mlx5e_channel *c, struct net_dim_cq_moder moder, struct mlx5e_cq_param *param, struct mlx5e_cq *cq) >>>>>>> e5a3e259ef239f443951d401db10db7d426c9497 Resolution is to take the second chunk and rename net_dim_cq_moder into dim_cq_moder. Also the signature for mlx5e_open_cq() in ... drivers/net/ethernet/mellanox/mlx5/core/en.h +977 ... and in mlx5e_open_xsk() ... drivers/net/ethernet/mellanox/mlx5/core/en/xsk/setup.c +64 ... needs the same rename from net_dim_cq_moder into dim_cq_moder. ** Second conflict in drivers/net/ethernet/mellanox/mlx5/core/en_main.c: <<<<<<< HEAD int cpu = cpumask_first(mlx5_comp_irq_get_affinity_mask(priv->mdev, ix)); struct dim_cq_moder icocq_moder = {0, 0}; struct net_device *netdev = priv->netdev; struct mlx5e_channel *c; unsigned int irq; ======= struct net_dim_cq_moder icocq_moder = {0, 0}; >>>>>>> e5a3e259ef239f443951d401db10db7d426c9497 Take the second chunk and rename net_dim_cq_moder into dim_cq_moder as well. Let me know if you run into any issues. Anyway, the main changes are: 1) Long-awaited AF_XDP support for mlx5e driver, from Maxim. 2) Addition of two new per-cgroup BPF hooks for getsockopt and setsockopt along with a new sockopt program type which allows more fine-grained pass/reject settings for containers. Also add a sock_ops callback that can be selectively enabled on a per-socket basis and is executed for every RTT to help tracking TCP statistics, both features from Stanislav. 3) Follow-up fix from loops in precision tracking which was not propagating precision marks and as a result verifier assumed that some branches were not taken and therefore wrongly removed as dead code, from Alexei. 4) Fix BPF cgroup release synchronization race which could lead to a double-free if a leaf's cgroup_bpf object is released and a new BPF program is attached to the one of ancestor cgroups in parallel, from Roman. 5) Support for bulking XDP_TX on veth devices which improves performance in some cases by around 9%, from Toshiaki. 6) Allow for lookups into BPF devmap and improve feedback when calling into bpf_redirect_map() as lookup is now performed right away in the helper itself, from Toke. 7) Add support for fq's Earliest Departure Time to the Host Bandwidth Manager (HBM) sample BPF program, from Lawrence. 8) Various cleanups and minor fixes all over the place from many others. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-04ipvs: allow tunneling with gre encapsulationVadim Fedorenko
windows real servers can handle gre tunnels, this patch allows gre encapsulation with the tunneling method, thereby letting ipvs be load balancer for windows-based services Signed-off-by: Vadim Fedorenko <vfedorenko@yandex-team.ru> Acked-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-07-04netfilter: nf_queue: remove unused hook entries pointerFlorian Westphal
Its not used anywhere, so remove this. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-07-04netfilter: nf_log: Replace a seq_printf() call by seq_puts() in seq_show()Markus Elfring
A string which did not contain a data format specification should be put into a sequence. Thus use the corresponding function “seq_puts”. This issue was detected by using the Coccinelle software. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-07-04netfilter: rename nf_SYNPROXY.h to nf_synproxy.hPablo Neira Ayuso
Uppercase is a reminiscence from the iptables infrastructure, rename this header before this is included in stable kernels. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-07-03nfs: fix out-of-date connectathon talk URLJ. Bruce Fields
Reported-by: David Wysochanski <dwysocha@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2019-07-03ipv4: use indirect call wrappers for {tcp, udp}_{recv, send}msg()Paolo Abeni
This avoids an indirect call per syscall for common ipv4 transports v1 -> v2: - avoid unneeded reclaration for udp_sendmsg, as suggested by Willem Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-03ipv6: use indirect call wrappers for {tcp, udpv6}_{recv, send}msg()Paolo Abeni
This avoids an indirect call per syscall for common ipv6 transports Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-03net: adjust socket level ICW to cope with ipv6 variant of {recv, send}msgPaolo Abeni
After the previous patch we have ipv{6,4} variants for {recv,send}msg, we should use the generic _INET ICW variant to call into the proper build-in. This also allows dropping the now unused and rather ugly _INET4 ICW macro v1 -> v2: - use ICW macro to declare inet6_{recv,send}msg - fix a couple of checkpatch offender in the code context Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-03ipv6: provide and use ipv6 specific version for {recv, send}msgPaolo Abeni
This will simplify indirect call wrapper invocation in the following patch. No functional change intended, any - out-of-tree - IPv6 user of inet_{recv,send}msg can keep using the existing functions. SCTP code still uses the existing version even for ipv6: as this series will not add ICW for SCTP, moving to the new helper would not give any benefit. The only other in-kernel user of inet_{recv,send}msg is pvcalls_conn_back_read(), but psvcalls explicitly creates only IPv4 socket, so no need to update that code path, too. v1 -> v2: drop inet6_{recv,send}msg declaration from header file, prefer ICW macro instead Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-03inet: factor out inet_send_prepare()Paolo Abeni
The same code is replicated verbatim in multiple places, and the next patches will introduce an additional user for it. Factor out a helper and use it where appropriate. No functional change intended. Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-03Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfDavid S. Miller
Daniel Borkmann says: ==================== pull-request: bpf 2019-07-03 The following pull-request contains BPF updates for your *net* tree. The main changes are: 1) Fix the interpreter to properly handle BPF_ALU32 | BPF_ARSH on BE architectures, from Jiong. 2) Fix several bugs in the x32 BPF JIT for handling shifts by 0, from Luke and Xi. 3) Fix NULL pointer deref in btf_type_is_resolve_source_only(), from Stanislav. 4) Properly handle the check that forwarding is enabled on the device in bpf_ipv6_fib_lookup() helper code, from Anton. 5) Fix UAPI bpf_prog_info fields alignment for archs that have 16 bit alignment such as m68k, from Baruch. 6) Fix kernel hanging in unregister_netdevice loop while unregistering device bound to XDP socket, from Ilya. 7) Properly terminate tail update in xskq_produce_flush_desc(), from Nathan. 8) Fix broken always_inline handling in test_lwt_seg6local, from Jiri. 9) Fix bpftool to use correct argument in cgroup errors, from Jakub. 10) Fix detaching dummy prog in XDP redirect sample code, from Prashant. 11) Add Jonathan to AF_XDP reviewers, from Björn. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-03sctp: count data bundling sack chunk for outctrlchunksXin Long
Now all ctrl chunks are counted for asoc stats.octrlchunks and net SCTP_MIB_OUTCTRLCHUNKS either after queuing up or bundling, other than the chunk maked and bundled in sctp_packet_bundle_sack, which caused 'outctrlchunks' not consistent with 'inctrlchunks' in peer. This issue exists since very beginning, here to fix it by increasing both net SCTP_MIB_OUTCTRLCHUNKS and asoc stats.octrlchunks when sack chunk is maked and bundled in sctp_packet_bundle_sack. Reported-by: Ja Ram Jeon <jajeon@redhat.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-03net: don't warn in inet diag when IPV6 is disabledStephen Hemminger
If IPV6 was disabled, then ss command would cause a kernel warning because the command was attempting to dump IPV6 socket information. The fix is to just remove the warning. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=202249 Fixes: 432490f9d455 ("net: ip, diag -- Add diag interface for raw sockets") Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-03ceph: no need to check return value of debugfs_create functionsGreg Kroah-Hartman
When calling debugfs functions, there is no need to ever check the return value. The function can work or not, but the code logic should never do something different based on this. This cleanup allows the return value of the functions to be made void, as no logic should care if these files succeed or not. Cc: "Yan, Zheng" <zyan@redhat.com> Cc: Sage Weil <sage@redhat.com> Cc: Ilya Dryomov <idryomov@gmail.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: ceph-devel@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Link: https://lore.kernel.org/r/20190612145538.GA18772@kroah.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-07-03sunrpc: no need to check return value of debugfs_create functionsGreg Kroah-Hartman
When calling debugfs functions, there is no need to ever check the return value. The function can work or not, but the code logic should never do something different based on this. Cc: "J. Bruce Fields" <bfields@fieldses.org> Cc: Jeff Layton <jlayton@kernel.org> Cc: Trond Myklebust <trond.myklebust@hammerspace.com> Cc: Anna Schumaker <anna.schumaker@netapp.com> Cc: linux-nfs@vger.kernel.org Cc: netdev@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Link: https://lore.kernel.org/r/20190612145622.GA18839@kroah.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-07-03bpf: add icsk_retransmits to bpf_tcp_sockStanislav Fomichev
Add some inet_connection_sock fields to bpf_tcp_sock that might be useful for debugging congestion control issues. Cc: Eric Dumazet <edumazet@google.com> Cc: Priyaranjan Jha <priyarjha@google.com> Cc: Yuchung Cheng <ycheng@google.com> Cc: Soheil Hassas Yeganeh <soheil@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Acked-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-07-03bpf: add dsack_dups/delivered{, _ce} to bpf_tcp_sockStanislav Fomichev
Add more fields to bpf_tcp_sock that might be useful for debugging congestion control issues. Cc: Eric Dumazet <edumazet@google.com> Cc: Priyaranjan Jha <priyarjha@google.com> Cc: Yuchung Cheng <ycheng@google.com> Cc: Soheil Hassas Yeganeh <soheil@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Acked-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-07-03bpf: split shared bpf_tcp_sock and bpf_sock_ops implementationStanislav Fomichev
We've added bpf_tcp_sock member to bpf_sock_ops and don't expect any new tcp_sock fields in bpf_sock_ops. Let's remove CONVERT_COMMON_TCP_SOCK_FIELDS so bpf_tcp_sock can be independently extended. Cc: Eric Dumazet <edumazet@google.com> Cc: Priyaranjan Jha <priyarjha@google.com> Cc: Yuchung Cheng <ycheng@google.com> Cc: Soheil Hassas Yeganeh <soheil@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Acked-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-07-03bpf: add BPF_CGROUP_SOCK_OPS callback that is executed on every RTTStanislav Fomichev
Performance impact should be minimal because it's under a new BPF_SOCK_OPS_RTT_CB_FLAG flag that has to be explicitly enabled. Suggested-by: Eric Dumazet <edumazet@google.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Priyaranjan Jha <priyarjha@google.com> Cc: Yuchung Cheng <ycheng@google.com> Cc: Soheil Hassas Yeganeh <soheil@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Acked-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-07-03xdp: fix hang while unregistering device bound to xdp socketIlya Maximets
Device that bound to XDP socket will not have zero refcount until the userspace application will not close it. This leads to hang inside 'netdev_wait_allrefs()' if device unregistering requested: # ip link del p1 < hang on recvmsg on netlink socket > # ps -x | grep ip 5126 pts/0 D+ 0:00 ip link del p1 # journalctl -b Jun 05 07:19:16 kernel: unregister_netdevice: waiting for p1 to become free. Usage count = 1 Jun 05 07:19:27 kernel: unregister_netdevice: waiting for p1 to become free. Usage count = 1 ... Fix that by implementing NETDEV_UNREGISTER event notification handler to properly clean up all the resources and unref device. This should also allow socket killing via ss(8) utility. Fixes: 965a99098443 ("xsk: add support for bind for Rx") Signed-off-by: Ilya Maximets <i.maximets@samsung.com> Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-07-03xdp: hold device for umem regardless of zero-copy modeIlya Maximets
Device pointer stored in umem regardless of zero-copy mode, so we heed to hold the device in all cases. Fixes: c9b47cc1fabc ("xsk: fix bug when trying to use both copy and zero-copy on one queue id") Signed-off-by: Ilya Maximets <i.maximets@samsung.com> Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>