linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2015-08-27	Merge branch 'libnvdimm-fixes' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm Pull nvdimm fixlet from Dan Williams: "This is a libnvdimm ABI fixup. I pushed back on this change quite hard given the late date, that it appears to be purely cosmetic, sysfs is not necessarily meant to be a user friendly UI, and the kernel interprets the reversed polarity of the ACPI_NFIT_MEM_ARMED flag correctly. When this flag is set, the energy source of an NVDIMM is not armed and any new writes to the DIMM may not be preserved. However, Bob Moore warned me that it is important to get these things named correctly wherever they appear otherwise we run the risk of a less than cautious firmware engineer implementing the polarity the wrong way. Once a mistake like that escapes into production platforms the flag becomes useless and we need to move to a new bit position. Bob has agreed to take a change through ACPICA to rename ACPI_NFIT_MEM_ARMED to ACPI_NFIT_MEM_NOT_ARMED, and the patch below from Toshi brings the sysfs representation of these flags in line with their respective polarities. Please pull for 4.2 as this is the first kernel to expose the ACPI NFIT sysfs representation, and this is likely a kernel that firmware developers will be using for checking out their NVDIMM enabling" * 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: nfit: Clarify memory device state flags strings
2015-08-27	NFSv4.1/pnfs: Ensure layoutreturn reserves space for the opaque payload	Trond Myklebust
	The "FIXME" is outdated. Flexfiles does add a payload. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-08-27	NFSv4.1/flexfiles: Fix a protocol error in layoutreturn	Trond Myklebust
	According to the flexfiles protocol, the layoutreturn should specify an array of errors in the following format: struct ff_ioerr4 { offset4 ffie_offset; length4 ffie_length; stateid4 ffie_stateid; device_error4 ffie_errors<>; }; This patch fixes up the code to ensure that our ffie_errors is indeed encoded as an array (albeit with only a single entry). Reported-by: Tom Haynes <thomas.haynes@primarydata.com> Cc: stable@vger.kernel.org Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-08-27	Merge branch 'iff_no_queue_fixups'	David S. Miller
	Phil Sutter says: ==================== fixup IFF_NO_QUEUE conversion This series serves two purposes: On one hand it fixes a quite embarrassing bug around the warning I added for drivers still setting tx_queue_len = 0 to achieve noqueue operation. It turned out to be quite useless as due to using alloc_netdev(), many in-kernel drivers fell into the trap by accident, as well. Instead this place serves pretty well as a sanitizing point to set IFF_NO_QUEUE for drivers not initializing tx_queue_len, which in turn allows to drop all special treatment of the latter being zero since that can not happen anymore without IFF_NO_QUEUE being set. On the other hand, it provides a better solution for Eric Dumazet's concern regarding how to assign noqueue to an interface which does not default to it already. In order to make this possible, noqueue is being registered so users can 'tc qd add dev eth0 root noqueue'. In addition, it resolves the ugly situation of 'tc qd show' not showing noqueue. Finally, the former changes allow for some code cleanup. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-27	net: sched: simplify attach_one_default_qdisc()	Phil Sutter
	Now that noqueue qdisc can be attached just like any other qdisc, no special treatment is necessary anymore when attaching it as default qdisc. This change has the added benefit that 'tc qdisc show' prints noqueue instead of nothing for devices defaulting to noqueue. Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-27	net: sched: register noqueue qdisc	Phil Sutter
	This way users can attach noqueue just like any other qdisc using tc without having to mess with tx_queue_len first. Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-27	net: sched: ignore tx_queue_len when assigning default qdisc	Phil Sutter
	Since alloc_netdev_mqs() sets IFF_NO_QUEUE for drivers not initializing tx_queue_len, it is safe to assume that if tx_queue_len is zero, dev->priv flags always contains IFF_NO_QUEUE. Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-27	net: fix IFF_NO_QUEUE for drivers using alloc_netdev	Phil Sutter
	Printing a warning in alloc_netdev_mqs() if tx_queue_len is zero and IFF_NO_QUEUE not set is not appropriate since drivers may use one of the alloc_netdev* macros instead of alloc_etherdev*, thereby not intentionally leaving tx_queue_len uninitialized. Instead check here if tx_queue_len is zero and set IFF_NO_QUEUE, so the value of tx_queue_len can be ignored in net/sched_generic.c. Fixes: 906470c ("net: warn if drivers set tx_queue_len = 0") Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-27	sock: fix kernel doc error	Jean Sacren
	The symbol '__sk_reclaim' is not present in the current tree. Apparently '__sk_reclaim' was meant to be '__sk_mem_reclaim', so fix it with the right symbol name for the kernel doc. Signed-off-by: Jean Sacren <sakiwit@gmail.com> Cc: Hideo Aoki <haoki@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-27	sctp: donot reset the overall_error_count in SHUTDOWN_RECEIVE state	lucien
	Commit f8d960524328 ("sctp: Enforce retransmission limit during shutdown") fixed a problem with excessive retransmissions in the SHUTDOWN_PENDING by not resetting the association overall_error_count. This allowed the association to better enforce assoc.max_retrans limit. However, the same issue still exists when the association is in SHUTDOWN_RECEIVED state. In this state, HB-ACKs will continue to reset the overall_error_count for the association would extend the lifetime of association unnecessarily. This patch solves this by resetting the overall_error_count whenever the current state is small then SCTP_STATE_SHUTDOWN_PENDING. As a small side-effect, we end up also handling SCTP_STATE_SHUTDOWN_ACK_SENT and SCTP_STATE_SHUTDOWN_SENT states, but they are not really impacted because we disable Heartbeats in those states. Fixes: Commit f8d960524328 ("sctp: Enforce retransmission limit during shutdown") Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-27	NFS: Send attributes in OPEN request for NFS4_CREATE_EXCLUSIVE4_1	Kinglong Mee
	Client sends a SETATTR request after OPEN for updating attributes. For create file with S_ISGID is set, the S_ISGID in SETATTR will be ignored at nfs server as chmod of no PERMISSION. v3, same as v2. Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-08-27	NFS: Get suppattr_exclcreat when getting server capabilities	Kinglong Mee
	Create file with attributs as NFS4_CREATE_EXCLUSIVE4_1 mode depends on suppattr_exclcreat attribut. v3, same as v2. Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-08-27	NFS: Update NFS4_BITMAP_SIZE	Kinglong Mee
	v4.1/v4.2 have define attributes at word2, nfs client also support security label now. v3, same as v2. Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-08-27	NFS: Make opened as optional argument in _nfs4_do_open	Kinglong Mee
	Check opened, only update it when non-NULL. It's not needs define an unused value for the opened when calling _nfs4_do_open. v3, same as v2. Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-08-27	NFS: Check size by inode_newsize_ok in nfs_setattr	Kinglong Mee
	Set rlimit for NFS's files is useless right now. For local process's rlimit, it should be checked by nfs client. The same, CIFS also call inode_change_ok checking rlimit at its client in cifs_setattr_nounix() and cifs_setattr_unix(). v3, fix bad using of error Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-08-27	x86, pmem: clarify that ARCH_HAS_PMEM_API implies PMEM mapped WB	Dan Williams
	Given that a write-back (WB) mapping plus non-temporal stores is expected to be the most efficient way to access PMEM, update the definition of ARCH_HAS_PMEM_API to imply arch support for WB-mapped-PMEM. This is needed as a pre-requisite for adding PMEM to the direct map and mapping it with struct page. The above clarification for X86_64 means that memcpy_to_pmem() is permitted to use the non-temporal arch_memcpy_to_pmem() rather than needlessly fall back to default_memcpy_to_pmem() when the pcommit instruction is not available. When arch_memcpy_to_pmem() is not guaranteed to flush writes out of cache, i.e. on older X86_32 implementations where non-temporal stores may just dirty cache, ARCH_HAS_PMEM_API is simply disabled. The default fall back for persistent memory handling remains. Namely, map it with the WT (write-through) cache-type and hope for the best. arch_has_pmem_api() is updated to only indicate whether the arch provides the proper helpers to meet the minimum "writes are visible outside the cache hierarchy after memcpy_to_pmem() + wmb_pmem()". Code that cares whether wmb_pmem() actually flushes writes to pmem must now call arch_has_wmb_pmem() directly. Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com> [hch: set ARCH_HAS_PMEM_API=n on x86_32] Reviewed-by: Christoph Hellwig <hch@lst.de> [toshi: x86_32 compile fixes] Signed-off-by: Toshi Kani <toshi.kani@hp.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2015-08-27	add devm_memremap_pages	Christoph Hellwig
	This behaves like devm_memremap except that it ensures we have page structures available that can back the region. Signed-off-by: Christoph Hellwig <hch@lst.de> [djbw: catch attempts to remap RAM, drop flags] Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2015-08-27	mm: ZONE_DEVICE for "device memory"	Dan Williams
	While pmem is usable as a block device or via DAX mappings to userspace there are several usage scenarios that can not target pmem due to its lack of struct page coverage. In preparation for "hot plugging" pmem into the vmemmap add ZONE_DEVICE as a new zone to tag these pages separately from the ones that are subject to standard page allocations. Importantly "device memory" can be removed at will by userspace unbinding the driver of the device. Having a separate zone prevents allocation and otherwise marks these pages that are distinct from typical uniform memory. Device memory has different lifetime and performance characteristics than RAM. However, since we have run out of ZONES_SHIFT bits this functionality currently depends on sacrificing ZONE_DMA. Cc: H. Peter Anvin <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Rik van Riel <riel@redhat.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Jerome Glisse <j.glisse@gmail.com> [hch: various simplifications in the arch interface] Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2015-08-27	mm: move __phys_to_pfn and __pfn_to_phys to asm/generic/memory_model.h	Christoph Hellwig
	Three architectures already define these, and we'll need them genericly soon. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2015-08-27	dax: drop size parameter to ->direct_access()	Dan Williams
	None of the implementations currently use it. The common bdev_direct_access() entry point handles all the size checks before calling ->direct_access(). Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2015-08-27	net/mlx4_core: Fix unintialized variable used in error path	Carol L Soto
	The uninitialized value name in mlx4_en_activate_cq was used in order to print an error message. Fixing it by replacing it with cq->vector. Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Carol L Soto <clsoto@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-27	Merge branch 'pmem-api' into libnvdimm-for-next	Dan Williams

2015-08-27	net/mlx4_core: Capping number of requested MSIXs to MAX_MSIX	Carol L Soto
	We currently manage IRQs in pool_bm which is a bit field of MAX_MSIX bits. Thus, allocating more than MAX_MSIX interrupts can't be managed in pool_bm. Fixing this by capping number of requested MSIXs to MAX_MSIX. Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Carol L Soto <clsoto@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-27	bridge: fdb: rearrange net_bridge_fdb_entry	Nikolay Aleksandrov
	While looking into fixing the local entries scalability issue I noticed that the structure is badly arranged because vlan_id would fall in a second cache line while keeping rcu which is used only when deleting in the first, so re-arrange the structure and push rcu to the end so we can get 16 bytes which can be used for other fields (by pushing rcu fully in the second 64 byte chunk). With this change all the core necessary information when doing fdb lookups will be available in a single cache line. pahole before (note vlan_id): struct net_bridge_fdb_entry { struct hlist_node hlist; /* 0 16 / struct net_bridge_port dst; /* 16 8 / struct callback_head rcu; / 24 16 / long unsigned int updated; / 40 8 / long unsigned int used; / 48 8 / mac_addr addr; / 56 6 / unsigned char is_local:1; / 62: 7 1 / unsigned char is_static:1; / 62: 6 1 / unsigned char added_by_user:1; / 62: 5 1 / unsigned char added_by_external_learn:1; / 62: 4 1 / / XXX 4 bits hole, try to pack / / XXX 1 byte hole, try to pack / / --- cacheline 1 boundary (64 bytes) --- / __u16 vlan_id; / 64 2 / / size: 72, cachelines: 2, members: 11 / / sum members: 65, holes: 1, sum holes: 1 / / bit holes: 1, sum bit holes: 4 bits / / padding: 6 / / last cacheline: 8 bytes / } pahole after (note vlan_id): struct net_bridge_fdb_entry { struct hlist_node hlist; / 0 16 / struct net_bridge_port dst; /* 16 8 / long unsigned int updated; / 24 8 / long unsigned int used; / 32 8 / mac_addr addr; / 40 6 / __u16 vlan_id; / 46 2 / unsigned char is_local:1; / 48: 7 1 / unsigned char is_static:1; / 48: 6 1 / unsigned char added_by_user:1; / 48: 5 1 / unsigned char added_by_external_learn:1; / 48: 4 1 / / XXX 4 bits hole, try to pack / / XXX 7 bytes hole, try to pack / struct callback_head rcu; / 56 16 / / --- cacheline 1 boundary (64 bytes) was 8 bytes ago --- / / size: 72, cachelines: 2, members: 11 / / sum members: 65, holes: 1, sum holes: 7 / / bit holes: 1, sum bit holes: 4 bits / / last cacheline: 8 bytes */ } Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-27	nd_blk: change aperture mapping from WC to WB	Ross Zwisler
	This should result in a pretty sizeable performance gain for reads. For rough comparison I did some simple read testing using PMEM to compare reads of write combining (WC) mappings vs write-back (WB). This was done on a random lab machine. PMEM reads from a write combining mapping: # dd of=/dev/null if=/dev/pmem0 bs=4096 count=100000 100000+0 records in 100000+0 records out 409600000 bytes (410 MB) copied, 9.2855 s, 44.1 MB/s PMEM reads from a write-back mapping: # dd of=/dev/null if=/dev/pmem0 bs=4096 count=1000000 1000000+0 records in 1000000+0 records out 4096000000 bytes (4.1 GB) copied, 3.44034 s, 1.2 GB/s To be able to safely support a write-back aperture I needed to add support for the "read flush" _DSM flag, as outlined in the DSM spec: http://pmem.io/documents/NVDIMM_DSM_Interface_Example.pdf This flag tells the ND BLK driver that it needs to flush the cache lines associated with the aperture after the aperture is moved but before any new data is read. This ensures that any stale cache lines from the previous contents of the aperture will be discarded from the processor cache, and the new data will be read properly from the DIMM. We know that the cache lines are clean and will be discarded without any writeback because either a) the previous aperture operation was a read, and we never modified the contents of the aperture, or b) the previous aperture operation was a write and we must have written back the dirtied contents of the aperture to the DIMM before the I/O was completed. In order to add support for the "read flush" flag I needed to add a generic routine to invalidate cache lines, mmio_flush_range(). This is protected by the ARCH_HAS_MMIO_FLUSH Kconfig variable, and is currently only supported on x86. Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2015-08-27	Merge branch 'ovs-v6-build-err'	David S. Miller
	Joe Stringer says: ==================== OPENVSWITCH && !NETFILTER build fix. Fix issues reported by kbuild test robot: All error/warnings (new ones prefixed by >>): net/openvswitch/actions.c: In function 'ovs_fragment': >> net/openvswitch/actions.c:705:16: error: implicit declaration of function 'nf_get_ipv6_ops' [-Werror=implicit-function-declaration] const struct nf_ipv6_ops v6ops = nf_get_ipv6_ops(); ^ >> net/openvswitch/actions.c:705:37: warning: initialization makes pointer from integer without a cast const struct nf_ipv6_ops v6ops = nf_get_ipv6_ops(); ^ >> net/openvswitch/actions.c:707:19: error: storage size of 'ovs_rt' isn't known struct rt6_info ovs_rt; ^ >> net/openvswitch/actions.c:724:8: error: dereferencing pointer to incomplete type v6ops->fragment(skb->sk, skb, ovs_vport_output); ^ >> net/openvswitch/actions.c:707:19: warning: unused variable 'ovs_rt' [-Wunused-variable] struct rt6_info ovs_rt; ^ cc1: some warnings being treated as errors ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-27	openvswitch: Include ip6_fib.h.	Joe Stringer
	kbuild test robot reports that certain configurations will not automatically pick up on the "struct rt6_info" definition, so explicitly include the header for this structure. Fixes: 7f8a436 "openvswitch: Add conntrack action" Signed-off-by: Joe Stringer <joestringer@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-27	netfilter: Define v6ops in !CONFIG_NETFILTER case.	Joe Stringer
	When CONFIG_OPENVSWITCH is set, and CONFIG_NETFILTER is not set, the openvswitch IPv6 fragmentation handling cannot refer to ipv6_ops because it isn't defined. Add a dummy version to avoid #ifdefs in source files. Fixes: 7f8a436 "openvswitch: Add conntrack action" Signed-off-by: Joe Stringer <joestringer@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-27	nvdimm: change to use generic kvfree()	yalin wang
	Signed-off-by: yalin wang <yalin.wang2010@gmail.com> Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2015-08-27	Merge branch 'mlxsw-small-updates'	David S. Miller
	Jiri Pirko says: ==================== mlxsw: small driver update Ido Schimmel (2): mlxsw: Remove duplicate included header mlxsw: Make mailboxes 4KB aligned Jiri Pirko (1): mlxsw: adjust transmit fail log message level in __mlxsw_emad_transmit ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-27	mlxsw: Make mailboxes 4KB aligned	Ido Schimmel
	The HW-SW contract requires mailboxes passed to the firmware to be 4KB aligned. Previously, these mailboxes were mapped using streaming DMA routines, which do not guarantee the bus addresses to be 4KB aligned. Under certain conditions this constraint was indeed violated and errors were observed. By using consistent DMA mapping routines together with a mailbox size of 4KB we are guaranteed not to violate the constraint. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-27	mlxsw: adjust transmit fail log message level in __mlxsw_emad_transmit	Jiri Pirko
	When transmit fails, it is an error, not a warning. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Elad Raz <eladr@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-27	mlxsw: Remove duplicate included header	Ido Schimmel
	Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Elad Raz <eladr@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-27	mtd: mtd_oobtest: Fix the address offset with vary_offset case	Roger Quadros
	When vary_offset is set (e.g. test case 3), the offset is not always zero so memcmpshow() will show the wrong offset in the print message. To fix this we introduce a new function memcmpshowoffset() which takes offset as a parameter and displays the right offset and use it in the case where offset is non zero. The old memcmpshow() functionality is preserved by converting it into a macro with offset preset to 0. Signed-off-by: Roger Quadros <rogerq@ti.com> Signed-off-by: Brian Norris <computersforpeace@gmail.com>
2015-08-27	Merge branch 'rocker-master-change'	David S. Miller
	Jiri Pirko says: ==================== rocker: make master change handling nicer Jiri Pirko (6): net: introduce change upper device notifier change info net: add netif_is_bridge_master helper net: add netif_is_ovs_master helper with IFF_OPENVSWITCH private flag net: kill long time unused bonding private flags rocker: use new helper to figure out master kind rocker: use change upper info ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-27	rocker: use change upper info	Jiri Pirko
	Since now information about changed upper is passed along, benefit from that and use this info directly. This also fixes possible issues that could happen when non-master device is added (current code does not distinguish between master and non-master upper device). Signed-off-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Scott Feldman <sfeldma@gmail.com Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-27	rocker: use new helper to figure out master kind	Jiri Pirko
	Looking at rtnl kind string is kind of ugly. So use new helpers to do this in nicer way. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Scott Feldman <sfeldma@gmail.com Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-27	net: kill long time unused bonding private flags	Jiri Pirko
	We don't use them for years, just kill them now. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-27	net: add netif_is_ovs_master helper with IFF_OPENVSWITCH private flag	Jiri Pirko
	Add this helper so code can easily figure out if netdev is openswitch. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-27	net: add netif_is_bridge_master helper	Jiri Pirko
	Add this helper so code can easily figure out if netdev is a bridge. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-27	net: introduce change upper device notifier change info	Jiri Pirko
	Add info that is passed along with NETDEV_CHANGEUPPER event. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-27	NFSv4.1/pNFS: pnfs_mark_matching_lsegs_return must notify of layout return	Trond Myklebust
	It's not sufficient to just mark the layout segment for layout return. We also need to set the NFS_LAYOUT_RETURN_BEFORE_CLOSE flag in the layout header. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-08-27	virtio-net: avoid unnecessary sg initialzation	Jason Wang
	Usually an skb does not have up to MAX_SKB_FRAGS frags. So no need to initialize the unuse part of sg. This patch initialize the sg based on the real number it will used: - during xmit, it could be inferred from nr_frags and can_push. - for small receive buffer, it will also be 2. Cc: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-27	Merge branch 'geneve-consolidation'	David S. Miller
	Pravin B Shelar says: ==================== Geneve: Add support for tunnel metadata mode Following patches adds support for Geneve tunnel metadata mode. OVS can make use of Geneve net-device with tunnel metadata API from kernel. This also allows us to consolidate Geneve implementation from two kernel modules geneve_core and geneve to single geneve module. geneve_core module was targeted to share Geneve encap and decap code between Geneve netdevice and OVS Geneve tunnel implementation, Since OVS no longer needs these API, Geneve code can be consolidated into single geneve module. v3-v4: - Drop NETIF_F_NETNS_LOCAL feature. - Fix geneve device newlink check v2-v3: - make tunnel medata device and regular device mutually exclusive. - Fix Kconfig dependency for Geneve. - Fix dst-port netlink encoding. - drop changelink patch. v1-v2: - Replaced per hash table tunnel pointer (metadata enabled) with flag. - Added support for changelink. - Improve geneve device route lookup with more parameters. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-27	geneve: Move device hash table to geneve socket.	Pravin B Shelar
	This change simplifies Geneve Tunnel hash table management. Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Reviewed-by: Jesse Gross <jesse@nicira.com> Reviewed-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-27	geneve: Consolidate Geneve functionality in single module.	Pravin B Shelar
	geneve_core module handles send and receive functionality. This way OVS could use the Geneve API. Now with use of tunnel meatadata mode OVS can directly use Geneve netdevice. So there is no need for separate module for Geneve. Following patch consolidates Geneve protocol processing in single module. Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Reviewed-by: Jesse Gross <jesse@nicira.com> Acked-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-27	openvswitch: Use Geneve device.	Pravin B Shelar
	With help of tunnel metadata mode OVS can directly use Geneve devices to implement Geneve tunnels. This patch removes all of the OVS specific Geneve code and make OVS use a Geneve net_device. Basic geneve vport is still there to handle compatibility with current userspace application. Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Reviewed-by: Jesse Gross <jesse@nicira.com> Acked-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-27	geneve: Add support to collect tunnel metadata.	Pravin B Shelar
	Following patch create new tunnel flag which enable tunnel metadata collection on given device. These devices can be used by tunnel metadata based routing or by OVS. Geneve Consolidation patch get rid of collect_md_tun to simplify tunnel lookup further. Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Reviewed-by: Jesse Gross <jesse@nicira.com> Acked-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-27	geneve: Make dst-port configurable.	Pravin B Shelar
	Add netlink interface to configure Geneve UDP port number. So that user can configure it for a Gevene device. Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Reviewed-by: Jesse Gross <jesse@nicira.com> Acked-by: Thomas Graf <tgraf@suug.ch> Acked-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-27	tunnel: introduce udp_tun_rx_dst()	Pravin B Shelar
	Introduce function udp_tun_rx_dst() to initialize tunnel dst on receive path. Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Reviewed-by: Jesse Gross <jesse@nicira.com> Acked-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>