linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2016-09-01	btrfs: fix one bug that process may endlessly wait for ticket in ↵	Wang Xiaoguang
	wait_reserve_ticket() If can_overcommit() in btrfs_calc_reclaim_metadata_size() returns true, btrfs_async_reclaim_metadata_space() will not reclaim metadata space, just return directly and also forget to wake up process which are waiting for their tickets, so these processes will wait endlessly. Fstests case generic/172 with mount option "-o compress=lzo" have revealed this bug in my test machine. Here if we have tickets to handle, we must handle them first. Signed-off-by: Wang Xiaoguang <wangxg.fnst@cn.fujitsu.com> Reviewed-by: Josef Bacik <jbacik@fb.com> Signed-off-by: David Sterba <dsterba@suse.com>
2016-09-01	Btrfs: fix endless loop in balancing block groups	Liu Bo
	Qgroup function may overwrite the saved error 'err' with 0 in case quota is not enabled, and this ends up with a endless loop in balance because we keep going back to balance the same block group. It really should use 'ret' instead. Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Reviewed-by: Qu Wenruo <quwenruo@cn.fujitsu.com> Signed-off-by: David Sterba <dsterba@suse.com>
2016-09-01	Btrfs: kill invalid ASSERT() in process_all_refs()	Josef Bacik
	Suppose you have the following tree in snap1 on a file system mounted with -o inode_cache so that inode numbers are recycled └── [ 258] a └── [ 257] b and then you remove b, rename a to c, and then re-create b in c so you have the following tree └── [ 258] c └── [ 257] b and then you try to do an incremental send you will hit ASSERT(pending_move == 0); in process_all_refs(). This is because we assume that any recycling of inodes will not have a pending change in our path, which isn't the case. This is the case for the DELETE side, since we want to remove the old file using the old path, but on the create side we could have a pending move and need to do the normal pending rename dance. So remove this ASSERT() and put a comment about why we ignore pending_move. Thanks, Signed-off-by: Josef Bacik <jbacik@fb.com> Signed-off-by: David Sterba <dsterba@suse.com>
2016-09-01	Merge tag 'iwlwifi-next-for-kalle-2016-08-30-2' of ↵	Kalle Valo
	git://git.kernel.org/pub/scm/linux/kernel/git/iwlwifi/iwlwifi-next * Preparation for new HW continues; * Some DQA improvements; * Support for GMAC;
2016-09-01	Merge tag 'iwlwifi-for-kalle-2016-08-29' of ↵	Kalle Valo
	git://git.kernel.org/pub/scm/linux/kernel/git/iwlwifi/iwlwifi-fixes * Fix P2P dump trigger * Prevent a potential null dereference in iwlmvm * Prevent an uninitialized value from being returned in iwlmvm * Advertise support for channel width change in AP mode
2016-09-01	ovl: update doc	Miklos Szeredi
	Some of the documented quirks no longer apply. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2016-09-01	ovl: listxattr: use strnlen()	Miklos Szeredi
	Be defensive about what underlying fs provides us in the returned xattr list buffer. If it's not properly null terminated, bail out with a warning insead of BUG. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Cc: <stable@vger.kernel.org>
2016-09-01	ovl: Switch to generic_getxattr	Andreas Gruenbacher
	Now that overlayfs has xattr handlers for iop->{set,remove}xattr, use those same handlers for iop->getxattr as well. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2016-09-01	ovl: copyattr after setting POSIX ACL	Miklos Szeredi
	Setting POSIX acl may also modify the file mode, so need to copy that up to the overlay inode. Reported-by: Eryu Guan <eguan@redhat.com> Fixes: d837a49bd57f ("ovl: fix POSIX ACL setting") Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2016-09-01	ovl: Switch to generic_removexattr	Andreas Gruenbacher
	Commit d837a49bd57f ("ovl: fix POSIX ACL setting") switches from iop->setxattr from ovl_setxattr to generic_setxattr, so switch from ovl_removexattr to generic_removexattr as well. As far as permission checking goes, the same rules should apply in either case. While doing that, rename ovl_setxattr to ovl_xattr_set to indicate that this is not an iop->setxattr implementation and remove the unused inode argument. Move ovl_other_xattr_set above ovl_own_xattr_set so that they match the order of handlers in ovl_xattr_handlers. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Fixes: d837a49bd57f ("ovl: fix POSIX ACL setting") Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2016-09-01	ovl: Get rid of ovl_xattr_noacl_handlers array	Andreas Gruenbacher
	Use an ordinary #ifdef to conditionally include the POSIX ACL handlers in ovl_xattr_handlers, like the other filesystems do. Flag the code that is now only used conditionally with __maybe_unused. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2016-09-01	ovl: Fix OVL_XATTR_PREFIX	Andreas Gruenbacher
	Make sure ovl_own_xattr_handler only matches attribute names starting with "overlay.", not "overlayXXX". Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Fixes: d837a49bd57f ("ovl: fix POSIX ACL setting") Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2016-09-01	ovl: fix spelling mistake: "directries" -> "directories"	Colin Ian King
	Trivial fix to spelling mistake in pr_err message. Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2016-09-01	ovl: don't cache acl on overlay layer	Miklos Szeredi
	Some operations (setxattr/chmod) can make the cached acl stale. We either need to clear overlay's acl cache for the affected inode or prevent acl caching on the overlay altogether. Preventing caching has the following advantages: - no double caching, less memory used - overlay cache doesn't go stale when fs clears it's own cache Possible disadvantage is performance loss. If that becomes a problem get_acl() can be optimized for overlayfs. This patch disables caching by pre setting i_*acl to a value that - has bit 0 set, so is_uncached_acl() will return true - is not equal to ACL_NOT_CACHED, so get_acl() will not overwrite it The constant -3 was chosen for this purpose. Fixes: 39a25b2b3762 ("ovl: define ->get_acl() for overlay inodes") Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2016-09-01	ovl: use cached acl on underlying layer	Miklos Szeredi
	Instead of calling ->get_acl() directly, use get_acl() to get the cached value. We will have the acl cached on the underlying inode anyway, because we do permission checking on the both the overlay and the underlying fs. So, since we already have double caching, this improves performance without any cost. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2016-09-01	ovl: proper cleanup of workdir	Miklos Szeredi
	When mounting overlayfs it needs a clean "work" directory under the supplied workdir. Previously the mount code removed this directory if it already existed and created a new one. If the removal failed (e.g. directory was not empty) then it fell back to a read-only mount not using the workdir. While this has never been reported, it is possible to get a non-empty "work" dir from a previous mount of overlayfs in case of crash in the middle of an operation using the work directory. In this case the left over state should be discarded and the overlay filesystem will be consistent, guaranteed by the atomicity of operations on moving to/from the workdir to the upper layer. This patch implements cleaning out any files left in workdir. It is implemented using real recursion for simplicity, but the depth is limited to 2, because the worst case is that of a directory containing whiteouts under "work". Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Cc: <stable@vger.kernel.org>
2016-09-01	ovl: remove posix_acl_default from workdir	Miklos Szeredi
	Clear out posix acl xattrs on workdir and also reset the mode after creation so that an inherited sgid bit is cleared. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Cc: <stable@vger.kernel.org>
2016-09-01	ovl: handle umask and posix_acl_default correctly on creation	Miklos Szeredi
	Setting MS_POSIXACL in sb->s_flags has the side effect of passing mode to create functions without masking against umask. Another problem when creating over a whiteout is that the default posix acl is not inherited from the parent dir (because the real parent dir at the time of creation is the work directory). Fix these problems by: a) If upper fs does not have MS_POSIXACL, then mask mode with umask. b) If creating over a whiteout, call posix_acl_create() to get the inherited acls. After creation (but before moving to the final destination) set these acls on the created file. posix_acl_create() also updates the file creation mode as appropriate. Fixes: 39a25b2b3762 ("ovl: define ->get_acl() for overlay inodes") Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2016-08-31	Merge branch 'asix-pm-improvements'	David S. Miller
	Robert Foss says: ==================== net/usb: asix driver improvements This is a resubmission of v3, since the netdev mailinlist was not sent the previous submission. This series improves power management of the asix driver. - Suspend/resume support is improved to save needed registers. - Device disconnection is improved. - Fixes AX88772x resume failures - Implementes IEEE 802.3 spec section "22.2.4.1.1 Reset" correctly - Fixes AX_CMD_WRITE_MEDIUM_MODE being set incorrectly Changes since v1: - Added proper metadata tags to series. - Added two more patches to series. Changes since v2: - Added coverletter - Tested patches on AX88772A/AX88772B/AX88178/AX88179 hardware ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-31	net: asix: autoneg will set WRITE_MEDIUM reg	Robert Foss
	From: Grant Grundler <grundler@chromium.org> The miii_nway_restart() causes a PHY link change activity and ax88772_link_reset will be called. link_reset will set AX_CMD_WRITE_MEDIUM_MODE register correctly. The asix_write_medium_mode in reset() fills in a default value to the register which may be different from the negotiation result. So do this first. Ignore the ret value since it's ignored in XXX_link_reset() functions. Signed-off-by: Grant Grundler <grundler@google.com> Signed-off-by: Robert Foss <robert.foss@collabora.com> Tested-by: Robert Foss <robert.foss@collabora.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-31	net: asix: see 802.3 spec for phy reset	Robert Foss
	From: Grant Grundler <grundler@chromium.org> https://lkml.org/lkml/2014/11/11/947 Ben Hutchings is correct. IEEE 802.3 spec section "22.2.4.1.1 Reset" requires up to 500ms delay. Mitigate the "max" delay by polling the phy until BCM_RESET bit is clear. Signed-off-by: Grant Grundler <grundler@chromium.org> Signed-off-by: Robert Foss <robert.foss@collabora.com> Tested-by: Robert Foss <robert.foss@collabora.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-31	net: asix: Fix AX88772x resume failures	Robert Foss
	From: Allan Chou <allan@asix.com.tw> The change fixes AX88772x resume failure by - Restore incorrect AX88772A PHY registers when resetting - Need to stop MAC operation when suspending - Need to restart MII when restoring PHY Signed-off-by: Allan Chou <allan@asix.com.tw> Signed-off-by: Robert Foss <robert.foss@collabora.com> Tested-by: Robert Foss <robert.foss@collabora.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-31	net: asix: Avoid looping when the device is disconnected	Robert Foss
	From: Vincent Palatin <vpalatin@chromium.org> Check the answers from the USB stack and avoid re-sending multiple times the request if the device has disappeared. Signed-off-by: Vincent Palatin <vpalatin@chromium.org> Signed-off-by: Robert Foss <robert.foss@collabora.com> Tested-by: Robert Foss <robert.foss@collabora.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-31	net: asix: Add in_pm parameter	Robert Foss
	From: Freddy Xin <freddy@asix.com.tw> In order to R/W registers in suspend/resume functions, in_pm flags are added to some functions to determine whether the nopm version of usb functions is called. Save BMCR and ANAR PHY registers in suspend function and restore them in resume function. Reset HW in resume function to ensure the PHY works correctly. Signed-off-by: Freddy Xin <freddy@asix.com.tw> Signed-off-by: Robert Foss <robert.foss@collabora.com> Tested-by: Robert Foss <robert.foss@collabora.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-31	Merge branch 'qed-fixes'	David S. Miller
	Sudarsana Reddy Kalluru says: ==================== qed: dcbx fix series. The series contains several small fixes for qed dcbx module. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-31	qed: Clear dcbx memory buffers before the usage.	Sudarsana Reddy Kalluru
	This patch takes care of clearing the uninitialized buffer before using it. 1. pfc pri-enable bitmap need to be cleared before setting the requested enable bits. Without this, the un-touched values will be merged with requested values and sent to MFW. 2. The data in app-entry field need to be cleared before using it. 3. Clear the output data buffer used in qed_dcbx_query_params(). Signed-off-by: Sudarsana Reddy Kalluru <sudarsana.kalluru@qlogic.com> Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-31	qed: Set selection-field while configuring the app entry in ieee mode.	Sudarsana Reddy Kalluru
	Management firmware requires the selection-field (SF) to be set for configuring the application/protocol entry in IEEE mode. Without this setting, the app entry will be configured incorrectly in MFW. Signed-off-by: Sudarsana Reddy Kalluru <sudarsana.kalluru@qlogic.com> Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-31	qed*: Disallow dcbx configuration for VF interfaces.	Sudarsana Reddy Kalluru
	Dcbx configuration is not supported for VF interfaces. Hence don't populate the callbacks for VFs and also fail the dcbx-query for VFs. Signed-off-by: Sudarsana Reddy Kalluru <sudarsana.kalluru@qlogic.com> Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-31	kcm: fix a socket double free	WANG Cong
	Dmitry reported a double free on kcm socket, which could be easily reproduced by: #include <unistd.h> #include <sys/syscall.h> int main() { int fd = syscall(SYS_socket, 0x29ul, 0x5ul, 0x0ul, 0, 0, 0); syscall(SYS_ioctl, fd, 0x89e2ul, 0x20a98000ul, 0, 0, 0); return 0; } This is because on the error path, after we install the new socket file, we call sock_release() to clean up the socket, which leaves the fd pointing to a freed socket. Fix this by calling sys_close() on that fd directly. Fixes: ab7ac4eb9832 ("kcm: Kernel Connection Multiplexor module") Reported-by: Dmitry Vyukov <dvyukov@google.com> Cc: Tom Herbert <tom@herbertland.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-31	Merge branch 'mediatek-fixes'	David S. Miller
	Sean Wang says: ==================== net: ethernet: mediatek: a couple of fixes a couple of fixes come out from integrating with linux-4.8 rc1 they all are verified and workable on linux-4.8 rc1 Changes since v1: - usage of loops to work out if all required clock are ready instead of tedious coding - remove redundant pinctrl setup that is already done by core driver thanks for careful and patient reviewing by Andrew Lunn - splitting distinct changes into the separate patches - change variable naming from err to ret for readable coding Changes since v2: - restore to original clock disabling sequence that is changed accidentally in the last version - refine the commit log that would cause misunderstanding what has been done in the changes - refine the commit log that would cause footnote losing due to improper delimiter use Changes since v3: - fix git rejects caused by mixing a change from net-next, so remake the patch set based on the current net branch again. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-31	net: ethernet: mediatek: fix error handling inside mtk_mdio_init	Sean Wang
	Return -ENODEV if the MDIO bus is disabled in the device tree. Signed-off-by: Sean Wang <sean.wang@mediatek.com> Acked-by: John Crispin <john@phrozen.org> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-31	net: ethernet: mediatek: use devm_mdiobus_alloc instead of mdiobus_alloc ↵	Sean Wang
	inside mtk_mdio_init a lot of parts in the driver uses devm_* APIs to gain benefits from the device resource management, so devm_mdiobus_alloc is also used instead of mdiobus_alloc to have more elegant code flow. Using common code provided by the devm_* helps to 1) have simplified the code flow as [1] says 2) decrease the risk of incorrect error handling by human 3) only a few drivers used it since it was proposed on linux 3.16, so just hope to promote for this. Ref: [1] https://patchwork.ozlabs.org/patch/344093/ Signed-off-by: Sean Wang <sean.wang@mediatek.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-31	net: ethernet: mediatek: fix the missing of_node_put() after node is used ↵	Sean Wang
	done inside mtk_mdio_init This patch adds the missing of_node_put() after finishing the usage of of_get_child_by_name. Signed-off-by: Sean Wang <sean.wang@mediatek.com> Acked-by: John Crispin <john@phrozen.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-31	net: ethernet: mediatek: fix issue of driver removal with interface is up	Sean Wang
	mtk_stop() must be called to stop for freeing DMA resources acquired and restoring state changed by mtk_open() firstly when module removal. Signed-off-by: Sean Wang <sean.wang@mediatek.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-31	net: ethernet: mediatek: fix logic unbalance between probe and remove	Sean Wang
	original mdio_cleanup is not in the symmetric place against where mdio_init is, so relocate mdio_cleanup to the right one. Signed-off-by: Sean Wang <sean.wang@mediatek.com> Acked-by: John Crispin <john@phrozen.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-31	net: ethernet: mediatek: remove redundant free_irq for devm_request_irq ↵	Sean Wang
	allocated irq these irqs are not used for shared irq and disabled during ethernet stops. irq requested by devm_request_irq is safe to be freed automatically on driver detach. Signed-off-by: Sean Wang <sean.wang@mediatek.com> Acked-by: John Crispin <john@phrozen.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-31	net: ethernet: mediatek: fix API usage with skb_free_frag	Sean Wang
	use skb_free_frag() instead of legacy put_page() Signed-off-by: Sean Wang <sean.wang@mediatek.com> Acked-by: John Crispin <john@phrozen.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-31	net: ethernet: mediatek: fix incorrect return value of devm_clk_get with ↵	Sean Wang
	EPROBE_DEFER 1) If the return value of devm_clk_get is EPROBE_DEFER, we should defer probing the driver. The change is verified and works based on 4.8-rc1 staying with the latest clk-next code for MT7623. 2) Changing with the usage of loops to work out if all clocks required are fine Signed-off-by: Sean Wang <sean.wang@mediatek.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-31	net: ethernet: mediatek: fix fails from TX housekeeping due to incorrect ↵	Sean Wang
	port setup which net device the SKB is complete for depends on the forward port on txd4 on the corresponding TX descriptor, but the information isn't set up well in case of SKB fragments that would lead to watchdog timeout from the upper layer, so fix it up. Signed-off-by: Sean Wang <sean.wang@mediatek.com> Acked-by: John Crispin <john@phrozen.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-31	net: axienet: constify ethtool_ops structures	Julia Lawall
	Check for ethtool_ops structures that are only stored in the ethtool_ops field of a net_device structure or passed as the second argument to netdev_set_default_ethtool_ops. These contexts are declared const, so ethtool_ops structures that have these properties can be declared as const also. The semantic patch that makes this change is as follows: (http://coccinelle.lip6.fr/) // <smpl> @r disable optional_qualifier@ identifier i; position p; @@ static struct ethtool_ops i@p = { ... }; @ok1@ identifier r.i; struct net_device e; position p; @@ e.ethtool_ops = &i@p; @ok2@ identifier r.i; expression e; position p; @@ netdev_set_default_ethtool_ops(e, &i@p) @bad@ position p != {r.p,ok1.p,ok2.p}; identifier r.i; @@ i@p @depends on !bad disable optional_qualifier@ identifier r.i; @@ static +const struct ethtool_ops i = { ... }; // </smpl> Suggested-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-31	r8152: constify ethtool_ops structures	Julia Lawall
	Check for ethtool_ops structures that are only stored in the ethtool_ops field of a net_device structure or passed as the second argument to netdev_set_default_ethtool_ops. These contexts are declared const, so ethtool_ops structures that have these properties can be declared as const also. The semantic patch that makes this change is as follows: (http://coccinelle.lip6.fr/) // <smpl> @r disable optional_qualifier@ identifier i; position p; @@ static struct ethtool_ops i@p = { ... }; @ok1@ identifier r.i; struct net_device e; position p; @@ e.ethtool_ops = &i@p; @ok2@ identifier r.i; expression e; position p; @@ netdev_set_default_ethtool_ops(e, &i@p) @bad@ position p != {r.p,ok1.p,ok2.p}; identifier r.i; @@ i@p @depends on !bad disable optional_qualifier@ identifier r.i; @@ static +const struct ethtool_ops i = { ... }; // </smpl> Suggested-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-31	net: mediatek: constify ethtool_ops structures	Julia Lawall
	Check for ethtool_ops structures that are only stored in the ethtool_ops field of a net_device structure or passed as the second argument to netdev_set_default_ethtool_ops. These contexts are declared const, so ethtool_ops structures that have these properties can be declared as const also. The semantic patch that makes this change is as follows: (http://coccinelle.lip6.fr/) // <smpl> @r disable optional_qualifier@ identifier i; position p; @@ static struct ethtool_ops i@p = { ... }; @ok1@ identifier r.i; struct net_device e; position p; @@ e.ethtool_ops = &i@p; @ok2@ identifier r.i; expression e; position p; @@ netdev_set_default_ethtool_ops(e, &i@p) @bad@ position p != {r.p,ok1.p,ok2.p}; identifier r.i; @@ i@p @depends on !bad disable optional_qualifier@ identifier r.i; @@ static +const struct ethtool_ops i = { ... }; // </smpl> Suggested-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-31	Merge branch 'ppp-recursion'	David S. Miller
	Guillaume Nault says: ==================== ppp: fix deadlock upon recursive xmit This series fixes the issue reported by Feng where packets looping through a ppp device makes the module deadlock: https://marc.info/?l=linux-netdev&m=147134567319038&w=2 The problem can occur on virtual interfaces (e.g. PPP over L2TP, or PPPoE on vxlan devices), when a PPP packet is routed back to the PPP interface. PPP's xmit path isn't reentrant, so patch #1 uses a per-cpu variable to detect and break recursion. Patch #2 sets the NETIF_F_LLTX flag to avoid lock inversion issues between ppp and txqueue locks. There are multiple entry points to the PPP xmit path. This series has been tested with lockdep and should address recursion issues no matter how the packet entered the path. A similar issue in L2TP is not covered by this series: l2tp_xmit_skb() also isn't reentrant, and it can be called as part of PPP's xmit path (pppol2tp_xmit()), or directly from the L2TP socket (l2tp_ppp_sendmsg()). If a packet is sent by l2tp_ppp_sendmsg() and routed to the parent PPP interface, then it's going to hit l2tp_xmit_skb() again. Breaking recursion as done in ppp_generic is not enough, because we'd still have a lock inversion issue (locking in l2tp_xmit_skb() can happen before or after locking in ppp_generic). The best approach would be to use the ip_tunnel functions and remove the socket locking in l2tp_xmit_skb(). But that'd be something for net-next. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-31	ppp: declare PPP devices as LLTX	Guillaume Nault
	ppp_xmit_process() already locks the xmit path. If HARD_TX_LOCK() tries to hold the _xmit_lock we can get lock inversion. [ 973.726130] ====================================================== [ 973.727311] [ INFO: possible circular locking dependency detected ] [ 973.728546] 4.8.0-rc2 #1 Tainted: G O [ 973.728986] ------------------------------------------------------- [ 973.728986] accel-pppd/1806 is trying to acquire lock: [ 973.728986] (&qdisc_xmit_lock_key){+.-...}, at: [<ffffffff8146f6fe>] sch_direct_xmit+0x8d/0x221 [ 973.728986] [ 973.728986] but task is already holding lock: [ 973.728986] (l2tp_sock){+.-...}, at: [<ffffffffa0202c4a>] l2tp_xmit_skb+0x1e8/0x5d7 [l2tp_core] [ 973.728986] [ 973.728986] which lock already depends on the new lock. [ 973.728986] [ 973.728986] [ 973.728986] the existing dependency chain (in reverse order) is: [ 973.728986] -> #3 (l2tp_sock){+.-...}: [ 973.728986] [<ffffffff810b3130>] lock_acquire+0x150/0x217 [ 973.728986] [<ffffffff815752f4>] _raw_spin_lock+0x2d/0x3c [ 973.728986] [<ffffffffa0202c4a>] l2tp_xmit_skb+0x1e8/0x5d7 [l2tp_core] [ 973.728986] [<ffffffffa01b2466>] pppol2tp_xmit+0x1f2/0x25e [l2tp_ppp] [ 973.728986] [<ffffffffa0184f59>] ppp_channel_push+0xb5/0x14a [ppp_generic] [ 973.728986] [<ffffffffa01853ed>] ppp_write+0x104/0x11c [ppp_generic] [ 973.728986] [<ffffffff811b2ec6>] __vfs_write+0x56/0x120 [ 973.728986] [<ffffffff811b3f4c>] vfs_write+0xbd/0x11b [ 973.728986] [<ffffffff811b4cb2>] SyS_write+0x5e/0x96 [ 973.728986] [<ffffffff81575ba5>] entry_SYSCALL_64_fastpath+0x18/0xa8 [ 973.728986] -> #2 (&(&pch->downl)->rlock){+.-...}: [ 973.728986] [<ffffffff810b3130>] lock_acquire+0x150/0x217 [ 973.728986] [<ffffffff81575334>] _raw_spin_lock_bh+0x31/0x40 [ 973.728986] [<ffffffffa01808e2>] ppp_push+0xa7/0x82d [ppp_generic] [ 973.728986] [<ffffffffa0184675>] __ppp_xmit_process+0x48/0x877 [ppp_generic] [ 973.728986] [<ffffffffa018505b>] ppp_xmit_process+0x4b/0xaf [ppp_generic] [ 973.728986] [<ffffffffa01853f7>] ppp_write+0x10e/0x11c [ppp_generic] [ 973.728986] [<ffffffff811b2ec6>] __vfs_write+0x56/0x120 [ 973.728986] [<ffffffff811b3f4c>] vfs_write+0xbd/0x11b [ 973.728986] [<ffffffff811b4cb2>] SyS_write+0x5e/0x96 [ 973.728986] [<ffffffff81575ba5>] entry_SYSCALL_64_fastpath+0x18/0xa8 [ 973.728986] -> #1 (&(&ppp->wlock)->rlock){+.-...}: [ 973.728986] [<ffffffff810b3130>] lock_acquire+0x150/0x217 [ 973.728986] [<ffffffff81575334>] _raw_spin_lock_bh+0x31/0x40 [ 973.728986] [<ffffffffa0184654>] __ppp_xmit_process+0x27/0x877 [ppp_generic] [ 973.728986] [<ffffffffa018505b>] ppp_xmit_process+0x4b/0xaf [ppp_generic] [ 973.728986] [<ffffffffa01852da>] ppp_start_xmit+0x21b/0x22a [ppp_generic] [ 973.728986] [<ffffffff8143f767>] dev_hard_start_xmit+0x1a9/0x43d [ 973.728986] [<ffffffff8146f747>] sch_direct_xmit+0xd6/0x221 [ 973.728986] [<ffffffff814401e4>] __dev_queue_xmit+0x62a/0x912 [ 973.728986] [<ffffffff814404d7>] dev_queue_xmit+0xb/0xd [ 973.728986] [<ffffffff81449978>] neigh_direct_output+0xc/0xe [ 973.728986] [<ffffffff8150e62b>] ip6_finish_output2+0x5a9/0x623 [ 973.728986] [<ffffffff81512128>] ip6_output+0x15e/0x16a [ 973.728986] [<ffffffff8153ef86>] dst_output+0x76/0x7f [ 973.728986] [<ffffffff8153f737>] mld_sendpack+0x335/0x404 [ 973.728986] [<ffffffff81541c61>] mld_send_initial_cr.part.21+0x99/0xa2 [ 973.728986] [<ffffffff8154441d>] ipv6_mc_dad_complete+0x42/0x71 [ 973.728986] [<ffffffff8151c4bd>] addrconf_dad_completed+0x1cf/0x2ea [ 973.728986] [<ffffffff8151e4fa>] addrconf_dad_work+0x453/0x520 [ 973.728986] [<ffffffff8107a393>] process_one_work+0x365/0x6f0 [ 973.728986] [<ffffffff8107aecd>] worker_thread+0x2de/0x421 [ 973.728986] [<ffffffff810816fb>] kthread+0x121/0x130 [ 973.728986] [<ffffffff81575dbf>] ret_from_fork+0x1f/0x40 [ 973.728986] -> #0 (&qdisc_xmit_lock_key){+.-...}: [ 973.728986] [<ffffffff810b28d6>] __lock_acquire+0x1118/0x1483 [ 973.728986] [<ffffffff810b3130>] lock_acquire+0x150/0x217 [ 973.728986] [<ffffffff815752f4>] _raw_spin_lock+0x2d/0x3c [ 973.728986] [<ffffffff8146f6fe>] sch_direct_xmit+0x8d/0x221 [ 973.728986] [<ffffffff814401e4>] __dev_queue_xmit+0x62a/0x912 [ 973.728986] [<ffffffff814404d7>] dev_queue_xmit+0xb/0xd [ 973.728986] [<ffffffff81449978>] neigh_direct_output+0xc/0xe [ 973.728986] [<ffffffff81487811>] ip_finish_output2+0x5db/0x609 [ 973.728986] [<ffffffff81489590>] ip_finish_output+0x152/0x15e [ 973.728986] [<ffffffff8148a0d4>] ip_output+0x8c/0x96 [ 973.728986] [<ffffffff81489652>] ip_local_out+0x41/0x4a [ 973.728986] [<ffffffff81489e7d>] ip_queue_xmit+0x5a5/0x609 [ 973.728986] [<ffffffffa0202fe4>] l2tp_xmit_skb+0x582/0x5d7 [l2tp_core] [ 973.728986] [<ffffffffa01b2466>] pppol2tp_xmit+0x1f2/0x25e [l2tp_ppp] [ 973.728986] [<ffffffffa0184f59>] ppp_channel_push+0xb5/0x14a [ppp_generic] [ 973.728986] [<ffffffffa01853ed>] ppp_write+0x104/0x11c [ppp_generic] [ 973.728986] [<ffffffff811b2ec6>] __vfs_write+0x56/0x120 [ 973.728986] [<ffffffff811b3f4c>] vfs_write+0xbd/0x11b [ 973.728986] [<ffffffff811b4cb2>] SyS_write+0x5e/0x96 [ 973.728986] [<ffffffff81575ba5>] entry_SYSCALL_64_fastpath+0x18/0xa8 [ 973.728986] [ 973.728986] other info that might help us debug this: [ 973.728986] [ 973.728986] Chain exists of: &qdisc_xmit_lock_key --> &(&pch->downl)->rlock --> l2tp_sock [ 973.728986] Possible unsafe locking scenario: [ 973.728986] [ 973.728986] CPU0 CPU1 [ 973.728986] ---- ---- [ 973.728986] lock(l2tp_sock); [ 973.728986] lock(&(&pch->downl)->rlock); [ 973.728986] lock(l2tp_sock); [ 973.728986] lock(&qdisc_xmit_lock_key); [ 973.728986] [ 973.728986] * DEADLOCK * [ 973.728986] [ 973.728986] 6 locks held by accel-pppd/1806: [ 973.728986] #0: (&(&pch->downl)->rlock){+.-...}, at: [<ffffffffa0184efa>] ppp_channel_push+0x56/0x14a [ppp_generic] [ 973.728986] #1: (l2tp_sock){+.-...}, at: [<ffffffffa0202c4a>] l2tp_xmit_skb+0x1e8/0x5d7 [l2tp_core] [ 973.728986] #2: (rcu_read_lock){......}, at: [<ffffffff81486981>] rcu_lock_acquire+0x0/0x20 [ 973.728986] #3: (rcu_read_lock_bh){......}, at: [<ffffffff81486981>] rcu_lock_acquire+0x0/0x20 [ 973.728986] #4: (rcu_read_lock_bh){......}, at: [<ffffffff814340e3>] rcu_lock_acquire+0x0/0x20 [ 973.728986] #5: (dev->qdisc_running_key ?: &qdisc_running_key#2){+.....}, at: [<ffffffff8144011e>] __dev_queue_xmit+0x564/0x912 [ 973.728986] [ 973.728986] stack backtrace: [ 973.728986] CPU: 2 PID: 1806 Comm: accel-pppd Tainted: G O 4.8.0-rc2 #1 [ 973.728986] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Debian-1.8.2-1 04/01/2014 [ 973.728986] ffff7fffffffffff ffff88003436f850 ffffffff812a20f4 ffffffff82156e30 [ 973.728986] ffffffff82156920 ffff88003436f890 ffffffff8115c759 ffff88003344ae00 [ 973.728986] ffff88003344b5c0 0000000000000002 0000000000000006 ffff88003344b5e8 [ 973.728986] Call Trace: [ 973.728986] [<ffffffff812a20f4>] dump_stack+0x67/0x90 [ 973.728986] [<ffffffff8115c759>] print_circular_bug+0x22e/0x23c [ 973.728986] [<ffffffff810b28d6>] __lock_acquire+0x1118/0x1483 [ 973.728986] [<ffffffff810b3130>] lock_acquire+0x150/0x217 [ 973.728986] [<ffffffff810b3130>] ? lock_acquire+0x150/0x217 [ 973.728986] [<ffffffff8146f6fe>] ? sch_direct_xmit+0x8d/0x221 [ 973.728986] [<ffffffff815752f4>] _raw_spin_lock+0x2d/0x3c [ 973.728986] [<ffffffff8146f6fe>] ? sch_direct_xmit+0x8d/0x221 [ 973.728986] [<ffffffff8146f6fe>] sch_direct_xmit+0x8d/0x221 [ 973.728986] [<ffffffff814401e4>] __dev_queue_xmit+0x62a/0x912 [ 973.728986] [<ffffffff814404d7>] dev_queue_xmit+0xb/0xd [ 973.728986] [<ffffffff81449978>] neigh_direct_output+0xc/0xe [ 973.728986] [<ffffffff81487811>] ip_finish_output2+0x5db/0x609 [ 973.728986] [<ffffffff81486853>] ? dst_mtu+0x29/0x2e [ 973.728986] [<ffffffff81489590>] ip_finish_output+0x152/0x15e [ 973.728986] [<ffffffff8148a0bc>] ? ip_output+0x74/0x96 [ 973.728986] [<ffffffff8148a0d4>] ip_output+0x8c/0x96 [ 973.728986] [<ffffffff81489652>] ip_local_out+0x41/0x4a [ 973.728986] [<ffffffff81489e7d>] ip_queue_xmit+0x5a5/0x609 [ 973.728986] [<ffffffff814c559e>] ? udp_set_csum+0x207/0x21e [ 973.728986] [<ffffffffa0202fe4>] l2tp_xmit_skb+0x582/0x5d7 [l2tp_core] [ 973.728986] [<ffffffffa01b2466>] pppol2tp_xmit+0x1f2/0x25e [l2tp_ppp] [ 973.728986] [<ffffffffa0184f59>] ppp_channel_push+0xb5/0x14a [ppp_generic] [ 973.728986] [<ffffffffa01853ed>] ppp_write+0x104/0x11c [ppp_generic] [ 973.728986] [<ffffffff811b2ec6>] __vfs_write+0x56/0x120 [ 973.728986] [<ffffffff8124c11d>] ? fsnotify_perm+0x27/0x95 [ 973.728986] [<ffffffff8124d41d>] ? security_file_permission+0x4d/0x54 [ 973.728986] [<ffffffff811b3f4c>] vfs_write+0xbd/0x11b [ 973.728986] [<ffffffff811b4cb2>] SyS_write+0x5e/0x96 [ 973.728986] [<ffffffff81575ba5>] entry_SYSCALL_64_fastpath+0x18/0xa8 [ 973.728986] [<ffffffff810ae0fa>] ? trace_hardirqs_off_caller+0x121/0x12f Signed-off-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-31	ppp: avoid dealock on recursive xmit	Guillaume Nault
	In case of misconfiguration, a virtual PPP channel might send packets back to their parent PPP interface. This typically happens in misconfigured L2TP setups, where PPP's peer IP address is set with the IP of the L2TP peer. When that happens the system hangs due to PPP trying to recursively lock its xmit path. [ 243.332155] BUG: spinlock recursion on CPU#1, accel-pppd/926 [ 243.333272] lock: 0xffff880033d90f18, .magic: dead4ead, .owner: accel-pppd/926, .owner_cpu: 1 [ 243.334859] CPU: 1 PID: 926 Comm: accel-pppd Not tainted 4.8.0-rc2 #1 [ 243.336010] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Debian-1.8.2-1 04/01/2014 [ 243.336018] ffff7fffffffffff ffff8800319a77a0 ffffffff8128de85 ffff880033d90f18 [ 243.336018] ffff880033ad8000 ffff8800319a77d8 ffffffff810ad7c0 ffffffff0000039e [ 243.336018] ffff880033d90f18 ffff880033d90f60 ffff880033d90f18 ffff880033d90f28 [ 243.336018] Call Trace: [ 243.336018] [<ffffffff8128de85>] dump_stack+0x4f/0x65 [ 243.336018] [<ffffffff810ad7c0>] spin_dump+0xe1/0xeb [ 243.336018] [<ffffffff810ad7f0>] spin_bug+0x26/0x28 [ 243.336018] [<ffffffff810ad8b9>] do_raw_spin_lock+0x5c/0x160 [ 243.336018] [<ffffffff815522aa>] _raw_spin_lock_bh+0x35/0x3c [ 243.336018] [<ffffffffa01a88e2>] ? ppp_push+0xa7/0x82d [ppp_generic] [ 243.336018] [<ffffffffa01a88e2>] ppp_push+0xa7/0x82d [ppp_generic] [ 243.336018] [<ffffffff810adada>] ? do_raw_spin_unlock+0xc2/0xcc [ 243.336018] [<ffffffff81084962>] ? preempt_count_sub+0x13/0xc7 [ 243.336018] [<ffffffff81552438>] ? _raw_spin_unlock_irqrestore+0x34/0x49 [ 243.336018] [<ffffffffa01ac657>] ppp_xmit_process+0x48/0x877 [ppp_generic] [ 243.336018] [<ffffffff81084962>] ? preempt_count_sub+0x13/0xc7 [ 243.336018] [<ffffffff81408cd3>] ? skb_queue_tail+0x71/0x7c [ 243.336018] [<ffffffffa01ad1c5>] ppp_start_xmit+0x21b/0x22a [ppp_generic] [ 243.336018] [<ffffffff81426af1>] dev_hard_start_xmit+0x15e/0x32c [ 243.336018] [<ffffffff81454ed7>] sch_direct_xmit+0xd6/0x221 [ 243.336018] [<ffffffff814273a8>] __dev_queue_xmit+0x52a/0x820 [ 243.336018] [<ffffffff814276a9>] dev_queue_xmit+0xb/0xd [ 243.336018] [<ffffffff81430a3c>] neigh_direct_output+0xc/0xe [ 243.336018] [<ffffffff8146b5d7>] ip_finish_output2+0x4d2/0x548 [ 243.336018] [<ffffffff8146a8e6>] ? dst_mtu+0x29/0x2e [ 243.336018] [<ffffffff8146d49c>] ip_finish_output+0x152/0x15e [ 243.336018] [<ffffffff8146df84>] ? ip_output+0x74/0x96 [ 243.336018] [<ffffffff8146df9c>] ip_output+0x8c/0x96 [ 243.336018] [<ffffffff8146d55e>] ip_local_out+0x41/0x4a [ 243.336018] [<ffffffff8146dd15>] ip_queue_xmit+0x531/0x5c5 [ 243.336018] [<ffffffff814a82cd>] ? udp_set_csum+0x207/0x21e [ 243.336018] [<ffffffffa01f2f04>] l2tp_xmit_skb+0x582/0x5d7 [l2tp_core] [ 243.336018] [<ffffffffa01ea458>] pppol2tp_xmit+0x1eb/0x257 [l2tp_ppp] [ 243.336018] [<ffffffffa01acf17>] ppp_channel_push+0x91/0x102 [ppp_generic] [ 243.336018] [<ffffffffa01ad2d8>] ppp_write+0x104/0x11c [ppp_generic] [ 243.336018] [<ffffffff811a3c1e>] __vfs_write+0x56/0x120 [ 243.336018] [<ffffffff81239801>] ? fsnotify_perm+0x27/0x95 [ 243.336018] [<ffffffff8123ab01>] ? security_file_permission+0x4d/0x54 [ 243.336018] [<ffffffff811a4ca4>] vfs_write+0xbd/0x11b [ 243.336018] [<ffffffff811a5a0a>] SyS_write+0x5e/0x96 [ 243.336018] [<ffffffff81552a1b>] entry_SYSCALL_64_fastpath+0x13/0x94 The main entry points for sending packets over a PPP unit are the .write() and .ndo_start_xmit() callbacks (simplified view): .write(unit fd) or .ndo_start_xmit() \ CALL ppp_xmit_process() \ LOCK unit's xmit path (ppp->wlock) \| CALL ppp_push() \ LOCK channel's xmit path (chan->downl) \| CALL lower layer's .start_xmit() callback \ ... might recursively call .ndo_start_xmit() ... / RETURN from .start_xmit() \| UNLOCK channel's xmit path / RETURN from ppp_push() \| UNLOCK unit's xmit path / RETURN from ppp_xmit_process() Packets can also be directly sent on channels (e.g. LCP packets): .write(channel fd) or ppp_output_wakeup() \ CALL ppp_channel_push() \ LOCK channel's xmit path (chan->downl) \| CALL lower layer's .start_xmit() callback \ ... might call .ndo_start_xmit() ... / RETURN from .start_xmit() \| UNLOCK channel's xmit path / RETURN from ppp_channel_push() Key points about the lower layer's .start_xmit() callback: * It can be called directly by a channel fd .write() or by ppp_output_wakeup() or indirectly by a unit fd .write() or by .ndo_start_xmit(). * In any case, it's always called with chan->downl held. * It might route the packet back to its parent unit using .ndo_start_xmit() as entry point. This patch detects and breaks recursion in ppp_xmit_process(). This function is a good candidate for the task because it's called early enough after .ndo_start_xmit(), it's always part of the recursion loop and it's on the path of whatever entry point is used to send a packet on a PPP unit. Recursion detection is done using the per-cpu ppp_xmit_recursion variable. Since ppp_channel_push() too locks the channel's xmit path and calls the lower layer's .start_xmit() callback, we need to also increment ppp_xmit_recursion there. However there's no need to check for recursion, as it's out of the recursion loop. Reported-by: Feng Gao <gfree.wind@gmail.com> Signed-off-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-31	xgbe: constify get_netdev_ops and get_ethtool_ops	stephen hemminger
	Casting away const is bad practice. Since this is ARM specific driver don't have hardware actually test this. Having getter functions for ops is really unnecessary code bloat, but not going to touch that. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-31	Merge branch 'dsa-mdb-support'	David S. Miller
	Vivien Didelot says: ==================== net: dsa: add MDB support This patchset adds the switchdev MDB object support to the DSA layer. The MDB support for the mv88e6xxx driver is very similar to the FDB support. The FDB operations care about unicast addresses while the MDB operations care about multicast addresses. Both operation set load/purge/dump the Address Translation Table (ATU), thus common code is used. Changes in v2 based on Andrew's comments: - drop "group" in multicast database related doc and comment - change _one for more relevant _fid in mv88e6xxx_port_db_dump_one - return -EOPNOTSUPP if switchdev obj ID is neither _FDB nor _MDB ==================== Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-31	net: dsa: mv88e6xxx: add MDB support	Vivien Didelot
	Add support for the MDB operations. This consists of loading/purging/dumping multicast addresses for a given port in the ATU. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-31	net: dsa: mv88e6xxx: make switchdev DB ops generic	Vivien Didelot
	The MDB support for the mv88e6xxx driver will be very similar to the FDB support, since it consists of loading/purging/dumping address to/from the Address Translation Unit (ATU). Prepare the support for MDB by making the FDB code accessing the ATU generic. The FDB operations now provide access to the unicast addresses while the MDB operations will provide access to the multicast addresses. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-31	net: dsa: add MDB support	Vivien Didelot
	Add SWITCHDEV_OBJ_ID_PORT_MDB support to the DSA layer. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>