summaryrefslogtreecommitdiff
path: root/net
AgeCommit message (Collapse)Author
2016-10-15Merge tag 'gcc-plugins-v4.9-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull gcc plugins update from Kees Cook: "This adds a new gcc plugin named "latent_entropy". It is designed to extract as much possible uncertainty from a running system at boot time as possible, hoping to capitalize on any possible variation in CPU operation (due to runtime data differences, hardware differences, SMP ordering, thermal timing variation, cache behavior, etc). At the very least, this plugin is a much more comprehensive example for how to manipulate kernel code using the gcc plugin internals" * tag 'gcc-plugins-v4.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: latent_entropy: Mark functions with __latent_entropy gcc-plugins: Add latent_entropy plugin
2016-10-13Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds
Pull networking fixes from David Miller: 1) Fix various build warnings in tlan/qed/xen-netback drivers, from Arnd Bergmann. 2) Propagate proper error code in strparser's strp_recv(), from Geert Uytterhoeven. 3) Fix accidental broadcast of RTM_GETTFILTER responses, from Eric Dumazret. 4) Need to use list_for_each_entry_safe() in qed driver, from Wei Yongjun. 5) Openvswitch 802.1AD bug fixes from Jiri Benc. 6) Cure BUILD_BUG_ON() in mlx5 driver, from Tom Herbert. 7) Fix UDP ipv6 checksumming in netvsc driver, from Stephen Hemminger. 8) stmmac driver fixes from Giuseppe CAVALLARO. 9) Fix access to mangled IP6CB in tcp, from Eric Dumazet. 10) Fix info leaks in tipc and rtnetlink, from Dan Carpenter. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (27 commits) net: bridge: add the multicast_flood flag attribute to brport_attrs net: axienet: Remove unused parameter from __axienet_device_reset liquidio: CN23XX: fix a loop timeout net: rtnl: info leak in rtnl_fill_vfinfo() tipc: info leak in __tipc_nl_add_udp_addr() net: ipv4: Do not drop to make_route if oif is l3mdev net: phy: Trigger state machine on state change and not polling. ipv6: tcp: restore IP6CB for pktoptions skbs netvsc: Remove mistaken udp.h inclusion. xen-netback: fix type mismatch warning stmmac: fix error check when init ptp stmmac: fix ptp init for gmac4 qed: fix old-style function definition netvsc: fix checksum on UDP IPV6 net_sched: reorder pernet ops and act ops registrations xen-netback: fix guest Rx stall detection (after guest Rx refactor) drivers/ptp: Fix kernel memory disclosure net/mlx5: Add MLX5_ARRAY_SET64 to fix BUILD_BUG_ON qmi_wwan: add support for Quectel EC21 and EC25 openvswitch: add NETIF_F_HW_VLAN_STAG_TX to internal dev ...
2016-10-13Merge tag 'nfs-for-4.9-1' of git://git.linux-nfs.org/projects/anna/linux-nfsLinus Torvalds
Pull NFS client updates from Anna Schumaker: "Highlights include: Stable bugfixes: - sunrpc: fix writ espace race causing stalls - NFS: Fix inode corruption in nfs_prime_dcache() - NFSv4: Don't report revoked delegations as valid in nfs_have_delegation() - NFSv4: nfs4_copy_delegation_stateid() must fail if the delegation is invalid - NFSv4: Open state recovery must account for file permission changes - NFSv4.2: Fix a reference leak in nfs42_proc_layoutstats_generic Features: - Add support for tracking multiple layout types with an ordered list - Add support for using multiple backchannel threads on the client - Add support for pNFS file layout session trunking - Delay xprtrdma use of DMA API (for device driver removal) - Add support for xprtrdma remote invalidation - Add support for larger xprtrdma inline thresholds - Use a scatter/gather list for sending xprtrdma RPC calls - Add support for the CB_NOTIFY_LOCK callback - Improve hashing sunrpc auth_creds by using both uid and gid Bugfixes: - Fix xprtrdma use of DMA API - Validate filenames before adding to the dcache - Fix corruption of xdr->nwords in xdr_copy_to_scratch - Fix setting buffer length in xdr_set_next_buffer() - Don't deadlock the state manager on the SEQUENCE status flags - Various delegation and stateid related fixes - Retry operations if an interrupted slot receives EREMOTEIO - Make nfs boot time y2038 safe" * tag 'nfs-for-4.9-1' of git://git.linux-nfs.org/projects/anna/linux-nfs: (100 commits) NFSv4.2: Fix a reference leak in nfs42_proc_layoutstats_generic fs: nfs: Make nfs boot time y2038 safe sunrpc: replace generic auth_cred hash with auth-specific function sunrpc: add RPCSEC_GSS hash_cred() function sunrpc: add auth_unix hash_cred() function sunrpc: add generic_auth hash_cred() function sunrpc: add hash_cred() function to rpc_authops struct Retry operation on EREMOTEIO on an interrupted slot pNFS: Fix atime updates on pNFS clients sunrpc: queue work on system_power_efficient_wq NFSv4.1: Even if the stateid is OK, we may need to recover the open modes NFSv4: If recovery failed for a specific open stateid, then don't retry NFSv4: Fix retry issues with nfs41_test/free_stateid NFSv4: Open state recovery must account for file permission changes NFSv4: Mark the lock and open stateids as invalid after freeing them NFSv4: Don't test open_stateid unless it is set NFSv4: nfs4_do_handle_exception() handle revoke/expiry of a single stateid NFS: Always call nfs_inode_find_state_and_recover() when revoking a delegation NFSv4: Fix a race when updating an open_stateid NFSv4: Fix a race in nfs_inode_reclaim_delegation() ...
2016-10-13Merge tag 'nfsd-4.9' of git://linux-nfs.org/~bfields/linuxLinus Torvalds
Pull nfsd updates from Bruce Fields: "Some RDMA work and some good bugfixes, and two new features that could benefit from user testing: - Anna Schumacker contributed a simple NFSv4.2 COPY implementation. COPY is already supported on the client side, so a call to copy_file_range() on a recent client should now result in a server-side copy that doesn't require all the data to make a round trip to the client and back. - Jeff Layton implemented callbacks to notify clients when contended locks become available, which should reduce latency on workloads with contended locks" * tag 'nfsd-4.9' of git://linux-nfs.org/~bfields/linux: NFSD: Implement the COPY call nfsd: handle EUCLEAN nfsd: only WARN once on unmapped errors exportfs: be careful to only return expected errors. nfsd4: setclientid_confirm with unmatched verifier should fail nfsd: randomize SETCLIENTID reply to help distinguish servers nfsd: set the MAY_NOTIFY_LOCK flag in OPEN replies nfs: add a new NFS4_OPEN_RESULT_MAY_NOTIFY_LOCK constant nfsd: add a LRU list for blocked locks nfsd: have nfsd4_lock use blocking locks for v4.1+ locks nfsd: plumb in a CB_NOTIFY_LOCK operation NFSD: fix corruption in notifier registration svcrdma: support Remote Invalidation svcrdma: Server-side support for rpcrdma_connect_private rpcrdma: RDMA/CM private message data structure svcrdma: Skip put_page() when send_reply() fails svcrdma: Tail iovec leaves an orphaned DMA mapping nfsd: fix dprintk in nfsd4_encode_getdeviceinfo nfsd: eliminate cb_minorversion field nfsd: don't set a FL_LAYOUT lease for flexfiles layouts
2016-10-13net: bridge: add the multicast_flood flag attribute to brport_attrsNikolay Aleksandrov
When I added the multicast flood control flag, I also added an attribute for it for sysfs similar to other flags, but I forgot to add it to brport_attrs. Fixes: b6cb5ac8331b ("net: bridge: add per-port multicast flood flag") Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-13net: rtnl: info leak in rtnl_fill_vfinfo()Dan Carpenter
The "vf_vlan_info" struct ends with a 2 byte struct hole so we have to memset it to ensure that no stack information is revealed to user space. Fixes: 79aab093a0b5 ('net: Update API for VF vlan protocol 802.1ad support') Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-13tipc: info leak in __tipc_nl_add_udp_addr()Dan Carpenter
We should clear out the padding and unused struct members so that we don't expose stack information to userspace. Fixes: fdb3accc2c15 ('tipc: add the ability to get UDP options via netlink') Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-13net: ipv4: Do not drop to make_route if oif is l3mdevDavid Ahern
Commit e0d56fdd7342 was a bit aggressive removing l3mdev calls in the IPv4 stack. If the fib_lookup fails we do not want to drop to make_route if the oif is an l3mdev device. Also reverts 19664c6a0009 ("net: l3mdev: Remove netif_index_is_l3_master") which removed netif_index_is_l3_master. Fixes: e0d56fdd7342 ("net: l3mdev: remove redundant calls") Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-13ipv6: tcp: restore IP6CB for pktoptions skbsEric Dumazet
Baozeng Ding reported following KASAN splat : BUG: KASAN: use-after-free in ip6_datagram_recv_specific_ctl+0x13f1/0x15c0 at addr ffff880029c84ec8 Read of size 1 by task poc/25548 Call Trace: [<ffffffff82cf43c9>] dump_stack+0x12e/0x185 /lib/dump_stack.c:15 [< inline >] print_address_description /mm/kasan/report.c:204 [<ffffffff817ced3b>] kasan_report_error+0x48b/0x4b0 /mm/kasan/report.c:283 [< inline >] kasan_report /mm/kasan/report.c:303 [<ffffffff817ced9e>] __asan_report_load1_noabort+0x3e/0x40 /mm/kasan/report.c:321 [<ffffffff85c71da1>] ip6_datagram_recv_specific_ctl+0x13f1/0x15c0 /net/ipv6/datagram.c:687 [<ffffffff85c734c3>] ip6_datagram_recv_ctl+0x33/0x40 [<ffffffff85c0b07c>] do_ipv6_getsockopt.isra.4+0xaec/0x2150 [<ffffffff85c0c7f6>] ipv6_getsockopt+0x116/0x230 [<ffffffff859b5a12>] tcp_getsockopt+0x82/0xd0 /net/ipv4/tcp.c:3035 [<ffffffff855fb385>] sock_common_getsockopt+0x95/0xd0 /net/core/sock.c:2647 [< inline >] SYSC_getsockopt /net/socket.c:1776 [<ffffffff855f8ba2>] SyS_getsockopt+0x142/0x230 /net/socket.c:1758 [<ffffffff8685cdc5>] entry_SYSCALL_64_fastpath+0x23/0xc6 Memory state around the buggy address: ffff880029c84d80: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ffff880029c84e00: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff > ffff880029c84e80: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ^ ffff880029c84f00: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ffff880029c84f80: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff He also provided a syzkaller reproducer. Issue is that ip6_datagram_recv_specific_ctl() expects to find IP6CB data that was moved at a different place in tcp_v6_rcv() This patch moves tcp_v6_restore_cb() up and calls it from tcp_v6_do_rcv() when np->pktoptions is set. Fixes: 971f10eca186 ("tcp: better TCP_SKB_CB layout to reduce cache line misses") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Baozeng Ding <sploving1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-13net_sched: reorder pernet ops and act ops registrationsWANG Cong
Krister reported a kernel NULL pointer dereference after tcf_action_init_1() invokes a_o->init(), it is a race condition where one thread calling tcf_register_action() to initialize the netns data after putting act ops in the global list and the other thread searching the list and then calling a_o->init(net, ...). Fix this by moving the pernet ops registration before making the action ops visible. This is fine because: a) we don't rely on act_base in pernet ops->init(), b) in the worst case we have a fully initialized netns but ops is still not ready so new actions still can't be created. Reported-by: Krister Johansen <kjlx@templeofstupid.com> Tested-by: Krister Johansen <kjlx@templeofstupid.com> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-13openvswitch: add NETIF_F_HW_VLAN_STAG_TX to internal devJiri Benc
The internal device does support 802.1AD offloading since 018c1dda5ff1 ("openvswitch: 802.1AD Flow handling, actions, vlan parsing, netlink attributes"). Signed-off-by: Jiri Benc <jbenc@redhat.com> Acked-by: Pravin B Shelar <pshelar@ovn.org> Acked-by: Eric Garver <e@erig.me> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-13openvswitch: fix vlan subtraction from packet lengthJiri Benc
When the packet has its vlan tag in skb->vlan_tci, the length of the VLAN header is not counted in skb->len. It doesn't make sense to subtract it. Fixes: 018c1dda5ff1 ("openvswitch: 802.1AD Flow handling, actions, vlan parsing, netlink attributes") Signed-off-by: Jiri Benc <jbenc@redhat.com> Acked-by: Pravin B Shelar <pshelar@ovn.org> Acked-by: Eric Garver <e@erig.me> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-13openvswitch: vlan: remove wrong likely statementJiri Benc
This code is called whenever flow key is being extracted from the packet. The packet may be as likely vlan tagged as not. Fixes: 018c1dda5ff1 ("openvswitch: 802.1AD Flow handling, actions, vlan parsing, netlink attributes") Signed-off-by: Jiri Benc <jbenc@redhat.com> Acked-by: Pravin B Shelar <pshelar@ovn.org> Acked-by: Eric Garver <e@erig.me> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-13net_sched: do not broadcast RTM_GETTFILTER resultEric Dumazet
There are two ways to get tc filters from kernel to user space. 1) Full dump (tc_dump_tfilter()) 2) RTM_GETTFILTER to get one precise filter, reducing overhead. The second operation is unfortunately broadcasting its result, polluting "tc monitor" users. This patch makes sure only the requester gets the result, using netlink_unicast() instead of rtnetlink_send() Jamal cooked an iproute2 patch to implement "tc filter get" operation, but other user space libraries already use RTM_GETTFILTER when a single filter is queried, instead of dumping all filters. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-12strparser: Propagate correct error code in strp_recv()Geert Uytterhoeven
With m68k-linux-gnu-gcc-4.1: net/strparser/strparser.c: In function ‘strp_recv’: net/strparser/strparser.c:98: warning: ‘err’ may be used uninitialized in this function Pass "len" (which is an error code when negative) instead of the uninitialized "err" variable to fix this. Fixes: 43a0c6751a322847 ("strparser: Stream parser for messages") Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-11treewide: remove redundant #include <linux/kconfig.h>Masahiro Yamada
Kernel source files need not include <linux/kconfig.h> explicitly because the top Makefile forces to include it with: -include $(srctree)/include/linux/kconfig.h This commit removes explicit includes except the following: * arch/s390/include/asm/facilities_src.h * tools/testing/radix-tree/linux/kernel.h These two are used for host programs. Link: http://lkml.kernel.org/r/1473656164-11929-1-git-send-email-yamada.masahiro@socionext.com Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-11Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds
Pull networking fixes from David Miller: 1) Netfilter list handling fix, from Linus. 2) RXRPC/AFS bug fixes from David Howells (oops on call to serviceless endpoints, build warnings, missing notifications, etc.) From David Howells. 3) Kernel log message missing newlines, from Colin Ian King. 4) Don't enter direct reclaim in netlink dumps, the idea is to use a high order allocation first and fallback quickly to a 0-order allocation if such a high-order one cannot be done cheaply and without reclaim. From Eric Dumazet. 5) Fix firmware download errors in btusb bluetooth driver, from Ethan Hsieh. 6) Missing Kconfig deps for QCOM_EMAC, from Geert Uytterhoeven. 7) Fix MDIO_XGENE dup Kconfig entry. From Laura Abbott. 8) Constrain ipv6 rtr_solicits sysctl values properly, from Maciej Żenczykowski. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (65 commits) netfilter: Fix slab corruption. be2net: Enable VF link state setting for BE3 be2net: Fix TX stats for TSO packets be2net: Update Copyright string in be_hw.h be2net: NCSI FW section should be properly updated with ethtool for BE3 be2net: Provide an alternate way to read pf_num for BEx chips wan/fsl_ucc_hdlc: Fix size used in dma_free_coherent() net: macb: NULL out phydev after removing mdio bus xen-netback: make sure that hashes are not send to unaware frontends Fixing a bug in team driver due to incorrect 'unsigned int' to 'int' conversion MAINTAINERS: add myself as a maintainer of xen-netback ipv6 addrconf: disallow rtr_solicits < -1 Bluetooth: btusb: Fix atheros firmware download error drivers: net: phy: Correct duplicate MDIO_XGENE entry ethernet: qualcomm: QCOM_EMAC should depend on HAS_DMA and HAS_IOMEM net: ethernet: mediatek: remove hwlro property in the device tree net: ethernet: mediatek: get hw lro capability by the chip id instead of by the dtsi net: ethernet: mediatek: get the chip id by ETHDMASYS registers net: bgmac: Fix errant feature flag check netlink: do not enter direct reclaim from netlink_dump() ...
2016-10-11netfilter: Fix slab corruption.Linus Torvalds
Use the correct pattern for singly linked list insertion and deletion. We can also calculate the list head outside of the mutex. Fixes: e3b37f11e6e4 ("netfilter: replace list_head with single linked list") Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Reviewed-by: Aaron Conole <aconole@bytheb.org> Signed-off-by: David S. Miller <davem@davemloft.net> net/netfilter/core.c | 108 ++++++++++++++++----------------------------------- 1 file changed, 33 insertions(+), 75 deletions(-)
2016-10-10Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull more vfs updates from Al Viro: ">rename2() work from Miklos + current_time() from Deepa" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: fs: Replace current_fs_time() with current_time() fs: Replace CURRENT_TIME_SEC with current_time() for inode timestamps fs: Replace CURRENT_TIME with current_time() for inode timestamps fs: proc: Delete inode time initializations in proc_alloc_inode() vfs: Add current_time() api vfs: add note about i_op->rename changes to porting fs: rename "rename2" i_op to "rename" vfs: remove unused i_op->rename fs: make remaining filesystems use .rename2 libfs: support RENAME_NOREPLACE in simple_rename() fs: support RENAME_NOREPLACE for local filesystems ncpfs: fix unused variable warning
2016-10-10Merge remote-tracking branch 'ovl/rename2' into for-linusAl Viro
2016-10-10Merge branch 'work.xattr' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs xattr updates from Al Viro: "xattr stuff from Andreas This completes the switch to xattr_handler ->get()/->set() from ->getxattr/->setxattr/->removexattr" * 'work.xattr' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: vfs: Remove {get,set,remove}xattr inode operations xattr: Stop calling {get,set,remove}xattr inode operations vfs: Check for the IOP_XATTR flag in listxattr xattr: Add __vfs_{get,set,remove}xattr helpers libfs: Use IOP_XATTR flag for empty directory handling vfs: Use IOP_XATTR flag for bad-inode handling vfs: Add IOP_XATTR inode operations flag vfs: Move xattr_resolve_name to the front of fs/xattr.c ecryptfs: Switch to generic xattr handlers sockfs: Get rid of getxattr iop sockfs: getxattr: Fail with -EOPNOTSUPP for invalid attribute names kernfs: Switch to generic xattr handlers hfs: Switch to generic xattr handlers jffs2: Remove jffs2_{get,set,remove}xattr macros xattr: Remove unnecessary NULL attribute name check
2016-10-10latent_entropy: Mark functions with __latent_entropyEmese Revfy
The __latent_entropy gcc attribute can be used only on functions and variables. If it is on a function then the plugin will instrument it for gathering control-flow entropy. If the attribute is on a variable then the plugin will initialize it with random contents. The variable must be an integer, an integer array type or a structure with integer fields. These specific functions have been selected because they are init functions (to help gather boot-time entropy), are called at unpredictable times, or they have variable loops, each of which provide some level of latent entropy. Signed-off-by: Emese Revfy <re.emese@gmail.com> [kees: expanded commit message] Signed-off-by: Kees Cook <keescook@chromium.org>
2016-10-10Merge tag 'ceph-for-4.9-rc1' of git://github.com/ceph/ceph-clientLinus Torvalds
Pull Ceph updates from Ilya Dryomov: "The big ticket item here is support for rbd exclusive-lock feature, with maintenance operations offloaded to userspace (Douglas Fuller, Mike Christie and myself). Another block device bullet is a series fixing up layering error paths (myself). On the filesystem side, we've got patches that improve our handling of buffered vs dio write races (Neil Brown) and a few assorted fixes from Zheng. Also included a couple of random cleanups and a minor CRUSH update" * tag 'ceph-for-4.9-rc1' of git://github.com/ceph/ceph-client: (39 commits) crush: remove redundant local variable crush: don't normalize input of crush_ln iteratively libceph: ceph_build_auth() doesn't need ceph_auth_build_hello() libceph: use CEPH_AUTH_UNKNOWN in ceph_auth_build_hello() ceph: fix description for rsize and rasize mount options rbd: use kmalloc_array() in rbd_header_from_disk() ceph: use list_move instead of list_del/list_add ceph: handle CEPH_SESSION_REJECT message ceph: avoid accessing / when mounting a subpath ceph: fix mandatory flock check ceph: remove warning when ceph_releasepage() is called on dirty page ceph: ignore error from invalidate_inode_pages2_range() in direct write ceph: fix error handling of start_read() rbd: add rbd_obj_request_error() helper rbd: img_data requests don't own their page array rbd: don't call rbd_osd_req_format_read() for !img_data requests rbd: rework rbd_img_obj_exists_submit() error paths rbd: don't crash or leak on errors in rbd_img_obj_parent_read_full_callback() rbd: move bumping img_request refcount into rbd_obj_request_submit() rbd: mark the original request as done if stat request fails ...
2016-10-09Merge tag 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma Pull main rdma updates from Doug Ledford: "This is the main pull request for the rdma stack this release. The code has been through 0day and I had it tagged for linux-next testing for a couple days. Summary: - updates to mlx5 - updates to mlx4 (two conflicts, both minor and easily resolved) - updates to iw_cxgb4 (one conflict, not so obvious to resolve, proper resolution is to keep the code in cxgb4_main.c as it is in Linus' tree as attach_uld was refactored and moved into cxgb4_uld.c) - improvements to uAPI (moved vendor specific API elements to uAPI area) - add hns-roce driver and hns and hns-roce ACPI reset support - conversion of all rdma code away from deprecated create_singlethread_workqueue - security improvement: remove unsafe ib_get_dma_mr (breaks lustre in staging)" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: (75 commits) staging/lustre: Disable InfiniBand support iw_cxgb4: add fast-path for small REG_MR operations cxgb4: advertise support for FR_NSMR_TPTE_WR IB/core: correctly handle rdma_rw_init_mrs() failure IB/srp: Fix infinite loop when FMR sg[0].offset != 0 IB/srp: Remove an unused argument IB/core: Improve ib_map_mr_sg() documentation IB/mlx4: Fix possible vl/sl field mismatch in LRH header in QP1 packets IB/mthca: Move user vendor structures IB/nes: Move user vendor structures IB/ocrdma: Move user vendor structures IB/mlx4: Move user vendor structures IB/cxgb4: Move user vendor structures IB/cxgb3: Move user vendor structures IB/mlx5: Move and decouple user vendor structures IB/{core,hw}: Add constant for node_desc ipoib: Make ipoib_warn ratelimited IB/mlx4/alias_GUID: Remove deprecated create_singlethread_workqueue IB/ipoib_verbs: Remove deprecated create_singlethread_workqueue IB/ipoib: Remove deprecated create_singlethread_workqueue ...
2016-10-08Merge branch 'for-upstream' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth Johan Hedberg says: ==================== pull request: bluetooth 2016-10-08 Here are a couple of Bluetooth fixes for the 4.9 kernel: - Firmware download fix for Atheros controllers - Fixes to the content of LE scan response - New USB ID for a Marvell chipset Please let me know if there are any issues pulling. Thanks. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-07Merge branch 'akpm' (patches from Andrew)Linus Torvalds
Merge updates from Andrew Morton: - fsnotify updates - ocfs2 updates - all of MM * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (127 commits) console: don't prefer first registered if DT specifies stdout-path cred: simpler, 1D supplementary groups CREDITS: update Pavel's information, add GPG key, remove snail mail address mailmap: add Johan Hovold .gitattributes: set git diff driver for C source code files uprobes: remove function declarations from arch/{mips,s390} spelling.txt: "modeled" is spelt correctly nmi_backtrace: generate one-line reports for idle cpus arch/tile: adopt the new nmi_backtrace framework nmi_backtrace: do a local dump_stack() instead of a self-NMI nmi_backtrace: add more trigger_*_cpu_backtrace() methods min/max: remove sparse warnings when they're nested Documentation/filesystems/proc.txt: add more description for maps/smaps mm, proc: fix region lost in /proc/self/smaps proc: fix timerslack_ns CAP_SYS_NICE check when adjusting self proc: add LSM hook checks to /proc/<tid>/timerslack_ns proc: relax /proc/<tid>/timerslack_ns capability requirements meminfo: break apart a very long seq_printf with #ifdefs seq/proc: modify seq_put_decimal_[u]ll to take a const char *, not char proc: faster /proc/*/status ...
2016-10-07ipv6 addrconf: disallow rtr_solicits < -1Maciej Żenczykowski
This disallows setting /proc/sys/net/ipv6/conf/*/router_solicitations to values below -1. -1 continues to mean an unlimited number of retransmits. Note: this depends on 'ipv6 addrconf: remove addrconf_sysctl_hop_limit()' Signed-off-by: Maciej Żenczykowski <maze@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-07vfs: Remove {get,set,remove}xattr inode operationsAndreas Gruenbacher
These inode operations are no longer used; remove them. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-10-07cred: simpler, 1D supplementary groupsAlexey Dobriyan
Current supplementary groups code can massively overallocate memory and is implemented in a way so that access to individual gid is done via 2D array. If number of gids is <= 32, memory allocation is more or less tolerable (140/148 bytes). But if it is not, code allocates full page (!) regardless and, what's even more fun, doesn't reuse small 32-entry array. 2D array means dependent shifts, loads and LEAs without possibility to optimize them (gid is never known at compile time). All of the above is unnecessary. Switch to the usual trailing-zero-len-array scheme. Memory is allocated with kmalloc/vmalloc() and only as much as needed. Accesses become simpler (LEA 8(gi,idx,4) or even without displacement). Maximum number of gids is 65536 which translates to 256KB+8 bytes. I think kernel can handle such allocation. On my usual desktop system with whole 9 (nine) aux groups, struct group_info shrinks from 148 bytes to 44 bytes, yay! Nice side effects: - "gi->gid[i]" is shorter than "GROUP_AT(gi, i)", less typing, - fix little mess in net/ipv4/ping.c should have been using GROUP_AT macro but this point becomes moot, - aux group allocation is persistent and should be accounted as such. Link: http://lkml.kernel.org/r/20160817201927.GA2096@p183.telecom.by Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Cc: Vasily Kulikov <segoon@openwall.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-07mm: memcontrol: consolidate cgroup socket trackingJohannes Weiner
The cgroup core and the memory controller need to track socket ownership for different purposes, but the tracking sites being entirely different is kind of ugly. Be a better citizen and rename the memory controller callbacks to match the cgroup core callbacks, then move them to the same place. [akpm@linux-foundation.org: coding-style fixes] Link: http://lkml.kernel.org/r/20160914194846.11153-3-hannes@cmpxchg.org Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Tejun Heo <tj@kernel.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Michal Hocko <mhocko@suse.cz> Cc: Vladimir Davydov <vdavydov@virtuozzo.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-07Merge branch 'work.splice_read' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull VFS splice updates from Al Viro: "There's a bunch of branches this cycle, both mine and from other folks and I'd rather send pull requests separately. This one is the conversion of ->splice_read() to ITER_PIPE iov_iter (and introduction of such). Gets rid of a lot of code in fs/splice.c and elsewhere; there will be followups, but these are for the next cycle... Some pipe/splice-related cleanups from Miklos in the same branch as well" * 'work.splice_read' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: pipe: fix comment in pipe_buf_operations pipe: add pipe_buf_steal() helper pipe: add pipe_buf_confirm() helper pipe: add pipe_buf_release() helper pipe: add pipe_buf_get() helper relay: simplify relay_file_read() switch default_file_splice_read() to use of pipe-backed iov_iter switch generic_file_splice_read() to use of ->read_iter() new iov_iter flavour: pipe-backed fuse_dev_splice_read(): switch to add_to_pipe() skb_splice_bits(): get rid of callback new helper: add_to_pipe() splice: lift pipe_lock out of splice_to_pipe() splice: switch get_iovec_page_array() to iov_iter splice_to_pipe(): don't open-code wakeup_pipe_readers() consistent treatment of EFAULT on O_DIRECT read/write
2016-10-06sockfs: Get rid of getxattr iopAndreas Gruenbacher
If we allow pseudo-filesystems created with mount_pseudo to have xattr handlers, we can replace sockfs_getxattr with a sockfs_xattr_get handler to use the xattr handler name parsing. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-10-06sockfs: getxattr: Fail with -EOPNOTSUPP for invalid attribute namesAndreas Gruenbacher
The standard return value for unsupported attribute names is -EOPNOTSUPP, as opposed to undefined but supported attributes (-ENODATA). Also, fail for attribute names like "system.sockprotonameXXX" and simplify the code a bit. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-10-06Merge tag 'rxrpc-rewrite-20161004' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs David Howells says: ==================== rxrpc: Fixes This set of patches contains a bunch of fixes: (1) Fix an oops on incoming call to a local endpoint without a bound service. (2) Only ping for a lost reply in a client call (this is inapplicable to service calls). (3) Fix maybe uninitialised variable warnings in the ACK/ABORT sending function by splitting it. (4) Fix loss of PING RESPONSE ACKs due to them being subsumed by PING ACK generation. (5) OpenAFS improperly terminates calls it makes as a client under some circumstances by not fully hard-ACK'ing the last DATA packets. This is alleviated by a new call appearing on the same channel implicitly completing the previous call on that channel. Handle this implicit completion. (6) Properly handle expiry of service calls due to the aforementioned improper termination with no follow up call to implicitly complete it: (a) The call's background processor needs to be queued to complete the call, send an abort and notify the socket. (b) The call's background processor needs to notify the socket (or the kernel service) when it has completed the call. (c) A negative error code must thence be returned to the kernel service so that it knows the call died. (d) The AFS filesystem must detect the fatal error and end the call. (7) Must produce a DELAY ACK when the actual service operation takes a while to process and must cancel the ACK when the reply is ready. (8) Don't request an ACK on the last DATA packet of the Tx phase as this confuses OpenAFS. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-06netlink: do not enter direct reclaim from netlink_dump()Eric Dumazet
Since linux-3.15, netlink_dump() can use up to 16384 bytes skb allocations. Due to struct skb_shared_info ~320 bytes overhead, we end up using order-3 (on x86) page allocations, that might trigger direct reclaim and add stress. The intent was really to attempt a large allocation but immediately fallback to a smaller one (order-1 on x86) in case of memory stress. On recent kernels (linux-4.4), we can remove __GFP_DIRECT_RECLAIM to meet the goal. Old kernels would need to remove __GFP_WAIT While we are at it, since we do an order-3 allocation, allow to use all the allocated bytes instead of 16384 to reduce syscalls during large dumps. iproute2 already uses 32KB recvmsg() buffer sizes. Alexei provided an initial patch downsizing to SKB_WITH_OVERHEAD(16384) Fixes: 9063e21fb026 ("netlink: autosize skb lengthes") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Alexei Starovoitov <ast@kernel.org> Cc: Greg Thelen <gthelen@google.com> Reviewed-by: Greg Rose <grose@lightfleet.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-06packet: call fanout_release, while UNREGISTERING a netdevAnoob Soman
If a socket has FANOUT sockopt set, a new proto_hook is registered as part of fanout_add(). When processing a NETDEV_UNREGISTER event in af_packet, __fanout_unlink is called for all sockets, but prot_hook which was registered as part of fanout_add is not removed. Call fanout_release, on a NETDEV_UNREGISTER, which removes prot_hook and removes fanout from the fanout_list. This fixes BUG_ON(!list_empty(&dev->ptype_specific)) in netdev_run_todo() Signed-off-by: Anoob Soman <anoob.soman@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-06Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace Pull namespace updates from Eric Biederman: "This set of changes is a number of smaller things that have been overlooked in other development cycles focused on more fundamental change. The devpts changes are small things that were a distraction until we managed to kill off DEVPTS_MULTPLE_INSTANCES. There is an trivial regression fix to autofs for the unprivileged mount changes that went in last cycle. A pair of ioctls has been added by Andrey Vagin making it is possible to discover the relationships between namespaces when referring to them through file descriptors. The big user visible change is starting to add simple resource limits to catch programs that misbehave. With namespaces in general and user namespaces in particular allowing users to use more kinds of resources, it has become important to have something to limit errant programs. Because the purpose of these limits is to catch errant programs the code needs to be inexpensive to use as it always on, and the default limits need to be high enough that well behaved programs on well behaved systems don't encounter them. To this end, after some review I have implemented per user per user namespace limits, and use them to limit the number of namespaces. The limits being per user mean that one user can not exhause the limits of another user. The limits being per user namespace allow contexts where the limit is 0 and security conscious folks can remove from their threat anlysis the code used to manage namespaces (as they have historically done as it root only). At the same time the limits being per user namespace allow other parts of the system to use namespaces. Namespaces are increasingly being used in application sand boxing scenarios so an all or nothing disable for the entire system for the security conscious folks makes increasing use of these sandboxes impossible. There is also added a limit on the maximum number of mounts present in a single mount namespace. It is nontrivial to guess what a reasonable system wide limit on the number of mount structure in the kernel would be, especially as it various based on how a system is using containers. A limit on the number of mounts in a mount namespace however is much easier to understand and set. In most cases in practice only about 1000 mounts are used. Given that some autofs scenarious have the potential to be 30,000 to 50,000 mounts I have set the default limit for the number of mounts at 100,000 which is well above every known set of users but low enough that the mount hash tables don't degrade unreaonsably. These limits are a start. I expect this estabilishes a pattern that other limits for resources that namespaces use will follow. There has been interest in making inotify event limits per user per user namespace as well as interest expressed in making details about what is going on in the kernel more visible" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (28 commits) autofs: Fix automounts by using current_real_cred()->uid mnt: Add a per mount namespace limit on the number of mounts netns: move {inc,dec}_net_namespaces into #ifdef nsfs: Simplify __ns_get_path tools/testing: add a test to check nsfs ioctl-s nsfs: add ioctl to get a parent namespace nsfs: add ioctl to get an owning user namespace for ns file descriptor kernel: add a helper to get an owning user namespace for a namespace devpts: Change the owner of /dev/pts/ptmx to the mounter of /dev/pts devpts: Remove sync_filesystems devpts: Make devpts_kill_sb safe if fsi is NULL devpts: Simplify devpts_mount by using mount_nodev devpts: Move the creation of /dev/pts/ptmx into fill_super devpts: Move parse_mount_options into fill_super userns: When the per user per user namespace limit is reached return ENOSPC userns; Document per user per user namespace limits. mntns: Add a limit on the number of mount namespaces. netns: Add a limit on the number of net namespaces cgroupns: Add a limit on the number of cgroup namespaces ipcns: Add a limit on the number of ipc namespaces ...
2016-10-06Bluetooth: Refactor append name and appearanceMichał Narajowski
Use eir_append_data to remove code duplication. Signed-off-by: Michał Narajowski <michal.narajowski@codecoup.pl> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2016-10-06Bluetooth: Add appearance to default scan rsp dataMichał Narajowski
Add appearance value to beginning of scan rsp data for default advertising instance if the value is not 0. Signed-off-by: Michał Narajowski <michal.narajowski@codecoup.pl> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2016-10-06Bluetooth: Fix local name in scan rspMichał Narajowski
Use complete name if it fits. If not and there is short name check if it fits. If not then use shortened name as prefix of complete name. Signed-off-by: Michał Narajowski <michal.narajowski@codecoup.pl> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2016-10-06rxrpc: Don't request an ACK on the last DATA packet of a call's Tx phaseDavid Howells
Don't request an ACK on the last DATA packet of a call's Tx phase as for a client there will be a reply packet or some sort of ACK to shift phase. If the ACK is requested, OpenAFS sends a REQUESTED-ACK ACK with soft-ACKs in it and doesn't follow up with a hard-ACK. If we don't set the flag, OpenAFS will send a DELAY ACK that hard-ACKs the reply data, thereby allowing the call to terminate cleanly. Signed-off-by: David Howells <dhowells@redhat.com>
2016-10-06rxrpc: Need to produce an ACK for service op if op takes a long timeDavid Howells
We need to generate a DELAY ACK from the service end of an operation if we start doing the actual operation work and it takes longer than expected. This will hard-ACK the request data and allow the client to release its resources. To make this work: (1) We have to set the ack timer and propose an ACK when the call moves to the RXRPC_CALL_SERVER_ACK_REQUEST and clear the pending ACK and cancel the timer when we start transmitting the reply (the first DATA packet of the reply implicitly ACKs the request phase). (2) It must be possible to set the timer when the caller is holding call->state_lock, so split the lock-getting part of the timer function out. (3) Add trace notes for the ACK we're requesting and the timer we clear. Signed-off-by: David Howells <dhowells@redhat.com>
2016-10-06rxrpc: Return negative error code to kernel serviceDavid Howells
In rxrpc_kernel_recv_data(), when we return the error number incurred by a failed call, we must negate it before returning it as it's stored as positive (that's what we have to pass back to userspace). Signed-off-by: David Howells <dhowells@redhat.com>
2016-10-06rxrpc: Add missing notificationDavid Howells
The call's background processor work item needs to notify the socket when it completes a call so that recvmsg() or the AFS fs can deal with it. Without this, call expiry isn't handled. Signed-off-by: David Howells <dhowells@redhat.com>
2016-10-06rxrpc: Queue the call on expiryDavid Howells
When a call expires, it must be queued for the background processor to deal with otherwise a service call that is improperly terminated will just sit there awaiting an ACK and won't expire. Signed-off-by: David Howells <dhowells@redhat.com>
2016-10-06rxrpc: Partially handle OpenAFS's improper termination of callsDavid Howells
OpenAFS doesn't always correctly terminate client calls that it makes - this includes calls the OpenAFS servers make to the cache manager service. It should end the client call with either: (1) An ACK that has firstPacket set to one greater than the seq number of the reply DATA packet with the LAST_PACKET flag set (thereby hard-ACK'ing all packets). nAcks should be 0 and acks[] should be empty (ie. no soft-ACKs). (2) An ACKALL packet. OpenAFS, though, may send an ACK packet with firstPacket set to the last seq number or less and soft-ACKs listed for all packets up to and including the last DATA packet. The transmitter, however, is obliged to keep the call live and the soft-ACK'd DATA packets around until they're hard-ACK'd as the receiver is permitted to drop any merely soft-ACK'd packet and request retransmission by sending an ACK packet with a NACK in it. Further, OpenAFS will also terminate a client call by beginning the next client call on the same connection channel. This implicitly completes the previous call. This patch handles implicit ACK of a call on a channel by the reception of the first packet of the next call on that channel. If another call doesn't come along to implicitly ACK a call, then we have to time the call out. There are some bugs there that will be addressed in subsequent patches. Signed-off-by: David Howells <dhowells@redhat.com>
2016-10-06rxrpc: Fix loss of PING RESPONSE ACK production due to PING ACKsDavid Howells
Separate the output of PING ACKs from the output of other sorts of ACK so that if we receive a PING ACK and schedule transmission of a PING RESPONSE ACK, the response doesn't get cancelled by a PING ACK we happen to be scheduling transmission of at the same time. If a PING RESPONSE gets lost, the other side might just sit there waiting for it and refuse to proceed otherwise. Signed-off-by: David Howells <dhowells@redhat.com>
2016-10-06rxrpc: Fix warning by splitting rxrpc_send_call_packet()David Howells
Split rxrpc_send_data_packet() to separate ACK generation (which is more complicated) from ABORT generation. This simplifies the code a bit and fixes the following warning: In file included from ../net/rxrpc/output.c:20:0: net/rxrpc/output.c: In function 'rxrpc_send_call_packet': net/rxrpc/ar-internal.h:1187:27: error: 'top' may be used uninitialized in this function [-Werror=maybe-uninitialized] net/rxrpc/output.c:103:24: note: 'top' was declared here net/rxrpc/output.c:225:25: error: 'hard_ack' may be used uninitialized in this function [-Werror=maybe-uninitialized] Reported-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David Howells <dhowells@redhat.com>
2016-10-06rxrpc: Only ping for lost reply in client callDavid Howells
When a reply is deemed lost, we send a ping to find out the other end received all the request data packets we sent. This should be limited to client calls and we shouldn't do this on service calls. Signed-off-by: David Howells <dhowells@redhat.com>
2016-10-06rxrpc: Fix oops on incoming call to serviceless endpointDavid Howells
If an call comes in to a local endpoint that isn't listening for any incoming calls at the moment, an oops will happen. We need to check that the local endpoint's service pointer isn't NULL before we dereference it. Signed-off-by: David Howells <dhowells@redhat.com>