linux-arm.git - Russell King's ARM Linux kernel tree

Age	Commit message (Collapse)	Author
2016-11-28	Documentation: net: phy: remove description of function pointers	Florian Fainelli
	Remove the function pointers documentation which duplicates information found in include/linux/phy.h. Maintaining documentation about two different locations just does not work, but the code is less likely to be outdated. Reviewed-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-28	net: dsa: mv88e6xxx: Fix mv88e6xxx_g1_irq_free() interrupt count	Andreas Färber
	mv88e6xxx_g1_irq_setup() sets up chip->g1_irq.nirqs interrupt mappings, so free the same amount. This will be 8 or 9 in practice, less than 16. Fixes: dc30c35be720 ("net: dsa: mv88e6xxx: Implement interrupt support.") Cc: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Andreas Färber <afaerber@suse.de> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-28	dbri: Fix compiler warning	Tushar Dave
	dbri uses 'u32' for dma handle while invoking kernel DMA APIs, instead of using dma_addr_t. This hasn't caused any 'incompatible pointer type' warning on SPARC because until now dma_addr_t is of type u32. However, recent changes in SPARC ATU (iommu) enabled 64bit DMA and therefore dma_addr_t became of type u64. This makes 'incompatible pointer type' warnings inevitable. e.g. sound/sparc/dbri.c: In function ‘snd_dbri_create’: sound/sparc/dbri.c:2538: warning: passing argument 3 of ‘dma_zalloc_coherent’ from incompatible pointer type ./include/linux/dma-mapping.h:608: note: expected ‘dma_addr_t ’ but argument is of type ‘u32 ’ For the record, dbri(sbus) driver never executes on sun4v. Therefore even though 64bit DMA is enabled on SPARC, dbri continues to use legacy iommu that guarantees DMA address is always in 32bit range. This patch resolves above compiler warning. Signed-off-by: Tushar Dave <tushar.n.dave@oracle.com> Reviewed-by: thomas tai <thomas.tai@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-28	qlogicpti: Fix compiler warnings	Tushar Dave
	qlogicpti uses '__u32' for dma handle while invoking kernel DMA APIs, instead of using dma_addr_t. This hasn't caused any 'incompatible pointer type' warning on SPARC because until now dma_addr_t is of type u32. However, recent changes in SPARC ATU (iommu) enabled 64bit DMA and therefore dma_addr_t became of type u64. This makes 'incompatible pointer type' warnings inevitable. e.g. drivers/scsi/qlogicpti.c: In function ‘qpti_map_queues’: drivers/scsi/qlogicpti.c:813: warning: passing argument 3 of ‘dma_alloc_coherent’ from incompatible pointer type ./include/linux/dma-mapping.h:445: note: expected ‘dma_addr_t ’ but argument is of type ‘__u32 ’ drivers/scsi/qlogicpti.c:822: warning: passing argument 3 of ‘dma_alloc_coherent’ from incompatible pointer type ./include/linux/dma-mapping.h:445: note: expected ‘dma_addr_t ’ but argument is of type ‘__u32 ’ For the record, qlogicpti never executes on sun4v. Therefore even though 64bit DMA is enabled on SPARC, qlogicpti continues to use legacy iommu that guarantees DMA address is always in 32bit range. This patch resolves aforementioned compiler warnings. Signed-off-by: Tushar Dave <tushar.n.dave@oracle.com> Reviewed-by: thomas tai <thomas.tai@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-28	Merge branch 'mlx4-fixes'	David S. Miller
	Tariq Toukan says: ==================== mlx4 bug fixes for 4.9 This patchset includes 2 bug fixes: * In patch 1 we revert the commit that avoids invoking unregister_netdev in shutdown flow, as it introduces netdev presence issues where it can be accessed unsafely by ndo operations during the flow. * Patch 2 is a simple fix for a variable uninitialization issue. Series generated against net commit: 6998cc6ec237 tipc: resolve connection flow control compatibility problem ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-28	net/mlx4: Fix uninitialized fields in rule when adding promiscuous mode to ↵	Jack Morgenstein
	device managed flow steering In procedure mlx4_flow_steer_promisc_add(), several fields were left uninitialized in the rule structure. Correctly initialize these fields. Fixes: 592e49dda812 ("net/mlx4: Implement promiscuous mode with device managed flow-steering") Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-28	Revert "net/mlx4_en: Avoid unregister_netdev at shutdown flow"	Tariq Toukan
	This reverts commit 9d76931180557270796f9631e2c79b9c7bb3c9fb. Using unregister_netdev at shutdown flow prevents calling the netdev's ndos or trying to access its freed resources. This fixes crashes like the following: Call Trace: [<ffffffff81587a6e>] dev_get_phys_port_id+0x1e/0x30 [<ffffffff815a36ce>] rtnl_fill_ifinfo+0x4be/0xff0 [<ffffffff815a53f3>] rtmsg_ifinfo_build_skb+0x73/0xe0 [<ffffffff815a5476>] rtmsg_ifinfo.part.27+0x16/0x50 [<ffffffff815a54c8>] rtmsg_ifinfo+0x18/0x20 [<ffffffff8158a6c6>] netdev_state_change+0x46/0x50 [<ffffffff815a5e78>] linkwatch_do_dev+0x38/0x50 [<ffffffff815a6165>] __linkwatch_run_queue+0xf5/0x170 [<ffffffff815a6205>] linkwatch_event+0x25/0x30 [<ffffffff81099a82>] process_one_work+0x152/0x400 [<ffffffff8109a325>] worker_thread+0x125/0x4b0 [<ffffffff8109a200>] ? rescuer_thread+0x350/0x350 [<ffffffff8109fc6a>] kthread+0xca/0xe0 [<ffffffff8109fba0>] ? kthread_park+0x60/0x60 [<ffffffff816a1285>] ret_from_fork+0x25/0x30 Fixes: 9d7693118055 ("net/mlx4_en: Avoid unregister_netdev at shutdown flow") Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Reported-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Reported-by: Steve Wise <swise@opengridcomputing.com> Cc: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-28	Merge branch 'mlx5-DCBX-and-ethtool-updates'	David S. Miller
	Saeed Mahameed says: ==================== Mellanox 100G mlx5 DCBX and ethtool updates This series provides the following mlx5 updates: From Huy: DCBX CEE API and DCBX firmware/host modes support. - 1st patch ensures the dcbnl_rtnl_ops is published only when the qos capability bits is on. - 2nd patch adds the support for CEE interfaces into mlx5 dcbnl_rtnl_ops - 3rd patch refactors ETS query to read ETS configuration directly from firmware rather than having a software shadow to it. The existing IEEE interfaces stays the same. - 4th patch adds the support for MLX5_REG_DCBX_PARAM and MLX5_REG_DCBX_APP firmware commands to manipulate mlx5 DCBX mode. - 5th patch adds the driver support for the new DCBX firmware. This ensures the backward compatibility versus the old and new firmware. With the new DCBX firmware, qos settings can be controlled by either firmware or software depending on the DCBX mode. From Kamal and Saeed: - mlx5 self-test support. From Shaker: - Private flag to give the user the ability to enable/disable mlx5 CQE compression. V1->V2: - Check ETS capability where needed in: ("net/mlx5e: Read ETS settings directly from firmware") - Fix return value of mlx5e_dcbnl_switch_to_host_mode in: ("net/mlx5e: ConnectX-4 firmware support for DCBX") - Update commit message of: ("net/mlx5e: ConnectX-4 firmware support for DCBX") - Fix two sparse static check warnings in en_selftest.c This series was generated against commit: e5f12b3f5ebb ("Merge branch 'mlxsw-trap-groups-and-policers'") ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-28	net/mlx5e: Add CQE compression user control	Shaker Daibes
	The user can now override the automatic driver decision using the rx_cqe_compress flag, which is the preference for CQE compression. The flag is initialized with the automatic driver decision. Signed-off-by: Shaker Daibes <shakerd@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-28	net/mlx5e: Moves pflags to priv->params	Shaker Daibes
	pflags is a configuration parameter for the netdev, naturally it belongs to priv->params. Also introduce MLX5E_GET_PFLAG Signed-off-by: Shaker Daibes <shakerd@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-28	net/mlx5e: Add support for loopback selftest	Saeed Mahameed
	Extend the self diagnostic tests to support loopback test. The loopback test doesn't require the offline flag, it will use the generic dev_queue_xmit and a dedicated packet_type to capture and verify mlx5e selftest loopback packets. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Kamal Heib <kamalh@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-28	net/mlx5e: Add support for ethtool self diagnostics test	Kamal Heib
	The self diagnostics test implementaion include the following features: 1. Link Test: Check that link is in up state. 2. Speed Test: Check that link was negotiated correctly. 3. Health Test: Check the device health. Signed-off-by: Kamal Heib <kamalh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-28	net/mlx5e: Add DCBX control interface	Huy Nguyen
	Use setdcbx interface to set the DCBX mode to firmware or os. If setdcbx is called with mode value of zero, the DCBX mode is set to firmware. Signed-off-by: Huy Nguyen <huyn@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-28	net/mlx5e: ConnectX-4 firmware support for DCBX	Huy Nguyen
	DBCX by default is controlled by firmware where dcbx capability bit is set. In this mode, firmware is responsible for reading/sending the TLV packets from/to the remote partner. This patch sets up the infrastructure to move between HOST/FW DCBX control mode. Signed-off-by: Huy Nguyen <huyn@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-28	net/mlx5: Add DCBX firmware commands support	Huy Nguyen
	Add set/query commands for DCBX_PARAM register Signed-off-by: Huy Nguyen <huyn@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-28	net/mlx5e: Read ETS settings directly from firmware	Huy Nguyen
	Issue description: Current implementation saves the ETS settings from user in a temporal soft copy and returns this settings when user queries the ETS settings. With the new DCBX firmware, the ETS settings can be changed by firmware when the DCBX is in firmware controlled mode. Therefore, user will obtain wrong values from the temporal soft copy. Solution: 1. Read the ETS settings directly from firmware. 2. For tc_tsa: a. Initialize tc_tsa to vendor IEEE_8021QAZ_TSA_VENDOR at netdev creation. b. When reading ETS setting from FW, if the traffic class bandwidth is less than 100, set tc_tsa to IEEE_8021QAZ_TSA_ETS. This implementation solves the scenarios when the DCBX is in FW control and willing bit is on which means the ETS setting is dictated by remote switch. Also check ETS capability where needed. Signed-off-by: Huy Nguyen <huyn@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-28	net/mlx5e: Support DCBX CEE API	Huy Nguyen
	Add DCBX CEE API interface for ConnectX-4. Configurations are stored in a temporary structure and are applied to the card's firmware when the CEE's setall callback function is called. Note: priority group in CEE is equivalent to traffic class in ConnectX-4 hardware spec. bw allocation per priority in CEE is not supported because ConnectX-4 only supports bw allocation per traffic class. user priority in CEE does not have an equivalent term in ConnectX-4. Therefore, user priority to priority mapping in CEE is not supported. Signed-off-by: Huy Nguyen <huyn@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-28	net/mlx5e: Add qos capability check	Huy Nguyen
	Make sure firmware supports qos before exposing the DCB API. Signed-off-by: Huy Nguyen <huyn@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-28	net/sched: Export tc_tunnel_key so its UAPI accessible	Roi Dayan
	Export tc_tunnel_key so it can be used from user space. Signed-off-by: Roi Dayan <roid@mellanox.com> Reviewed-by: Amir Vadai <amir@vadai.me> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-28	virtio-net: enable multiqueue by default	Jason Wang
	We use single queue even if multiqueue is enabled and let admin to enable it through ethtool later. This is used to avoid possible regression (small packet TCP stream transmission). But looks like an overkill since: - single queue user can disable multiqueue when launching qemu - brings extra troubles for the management since it needs extra admin tool in guest to enable multiqueue - multiqueue performs much better than single queue in most of the cases So this patch enables multiqueue by default: if #queues is less than or equal to #vcpu, enable as much as queue pairs; if #queues is greater than #vcpu, enable #vcpu queue pairs. Cc: Hannes Frederic Sowa <hannes@redhat.com> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Neil Horman <nhorman@redhat.com> Cc: Jeremy Eder <jeder@redhat.com> Cc: Marko Myllynen <myllynen@redhat.com> Cc: Maxime Coquelin <maxime.coquelin@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-28	amd-xgbe: Fix unused suspend handlers build warning	Borislav Petkov
	Fix: drivers/net/ethernet/amd/xgbe/xgbe-main.c:835:12: warning: ‘xgbe_suspend’ defined but not used [-Wunused-function] drivers/net/ethernet/amd/xgbe/xgbe-main.c:855:12: warning: ‘xgbe_resume’ defined but not used [-Wunused-function] I see it during randconfig builds here. Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: netdev@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-28	ARC: mm: IOC: Don't enable IOC by default	Vineet Gupta
	Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2016-11-28	ARC: Don't use "+l" inline asm constraint	Vineet Gupta
	Apparenty this is coming in the way of gcc fix which inhibits the usage of LP_COUNT as a gpr. Cc: stable@vger.kernel.org Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2016-11-28	tcp: Set DEFAULT_TCP_CONG to bbr if DEFAULT_BBR is set	Julian Wollrath
	Signed-off-by: Julian Wollrath <jwollrath@web.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-28	Merge branch 'fix-RTL8211F-TX-delay-handling'	David S. Miller
	Martin Blumenstingl says: ==================== net: phy: realtek: fix RTL8211F TX-delay handling The RTL8211F PHY driver currently enables the TX-delay only when the phy-mode is PHY_INTERFACE_MODE_RGMII. This is incorrect, because there are three RGMII variations of the phy-mode which explicitly request the PHY to enable the RX and/or TX delay, while PHY_INTERFACE_MODE_RGMII specifies that the PHY should disable the RX and/or TX delays. Additionally to the RTL8211F PHY driver change this contains a small update to the phy-mode documentation to clarify the purpose of the RGMII phy-modes. While this may not be perfect yet it's at least a start. Please feel free to drop this patch from this series and send an improved version yourself. These patches are the results of recent discussions, see [0] [0] http://lists.infradead.org/pipermail/linux-amlogic/2016-November/001688.html ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-28	net: phy: realtek: fix enabling of the TX-delay for RTL8211F	Martin Blumenstingl
	The old logic always enabled the TX-delay when the phy-mode was set to PHY_INTERFACE_MODE_RGMII. There are dedicated phy-modes which tell the PHY driver to enable the RX and/or TX delays: - PHY_INTERFACE_MODE_RGMII should disable the RX and TX delay in the PHY (if required, the MAC should add the delays in this case) - PHY_INTERFACE_MODE_RGMII_ID should enable RX and TX delay in the PHY - PHY_INTERFACE_MODE_RGMII_TXID should enable the TX delay in the PHY - PHY_INTERFACE_MODE_RGMII_RXID should enable the RX delay in the PHY (currently not supported by RTL8211F) With this patch we enable the TX delay for PHY_INTERFACE_MODE_RGMII_ID and PHY_INTERFACE_MODE_RGMII_TXID. Additionally we now explicity disable the TX-delay, which seems to be enabled automatically after a hard-reset of the PHY (by triggering it's reset pin) to get a consistent state (as defined by the phy-mode). This fixes a compatibility problem with some SoCs where the TX-delay was also added by the MAC. With the TX-delay being applied twice the TX clock was off and TX traffic was broken or very slow (<10Mbit/s) on 1000Mbit/s links. Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-28	Documentation: devicetree: clarify usage of the RGMII phy-modes	Martin Blumenstingl
	RGMII requires special RX and/or TX delays depending on the actual hardware circuit/wiring. These delays can be added by the MAC, the PHY or the designer of the circuit (the latter means that no delay has to be added by PHY or MAC). There are 4 RGMII phy-modes used describe where a delay should be applied: - rgmii: the RX and TX delays are either added by the MAC (where the exact delay is typically configurable, and can be turned off when no extra delay is needed) or not needed at all (because the hardware wiring adds the delay already). The PHY should neither add the RX nor TX delay in this case. - rgmii-rxid: configures the PHY to enable the RX delay. The MAC should not add the RX delay in this case. - rgmii-txid: configures the PHY to enable the TX delay. The MAC should not add the TX delay in this case. - rgmii-id: combines rgmii-rxid and rgmii-txid and thus configures the PHY to enable the RX and TX delays. The MAC should neither add the RX nor TX delay in this case. Document these cases in the ethernet.txt documentation to make it clear when to use each mode. If applied incorrectly one might end up with MAC and PHY both enabling for example the TX delay, which breaks ethernet TX traffic on 1000Mbit/s links. Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-28	Merge branch 'MV88E6097-fixes'	David S. Miller
	Stefan Eichenberger says: ==================== Fix support for the MV88E6097 This patchset fixes the following two issues for the MV88E6097: - Add missing definition of g1_irqs - Add missing comment ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-28	net: dsa: mv88e6xxx: add missing comment for MV88E6097	Stefan Eichenberger
	Add a missing comment for the MV88E6097 because of unification. Signed-off-by: Stefan Eichenberger <stefan.eichenberger@netmodule.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-28	net: dsa: mv88e6xxx: add g1_irqs definition for MV88E6097	Stefan Eichenberger
	Add the missing definition of g1_irqs for MV88E6097. Signed-off-by: Stefan Eichenberger <stefan.eichenberger@netmodule.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-28	net/iucv: Use explicit clean up labels in iucv_init()	Sebastian Andrzej Siewior
	Ursula suggested to use explicit labels for clean up in the error path instead of one `out_free' label, which handles multiple exits, introduced in commit 38b482929e8f ("net/iucv: Convert to hotplug state machine"). Suggested-by: Ursula Braun <ubraun@linux.vnet.ibm.com> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: linux-s390@vger.kernel.org Cc: netdev@vger.kernel.org Cc: rt@linutronix.de Cc: "David S. Miller" <davem@davemloft.net> Link: http://lkml.kernel.org/r/20161124161013.dukr42y2nwscosk6@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-11-28	net, sched: respect rcu grace period on cls destruction	Daniel Borkmann
	Roi reported a crash in flower where tp->root was NULL in ->classify() callbacks. Reason is that in ->destroy() tp->root is set to NULL via RCU_INIT_POINTER(). It's problematic for some of the classifiers, because this doesn't respect RCU grace period for them, and as a result, still outstanding readers from tc_classify() will try to blindly dereference a NULL tp->root. The tp->root object is strictly private to the classifier implementation and holds internal data the core such as tc_ctl_tfilter() doesn't know about. Within some classifiers, such as cls_bpf, cls_basic, etc, tp->root is only checked for NULL in ->get() callback, but nowhere else. This is misleading and seemed to be copied from old classifier code that was not cleaned up properly. For example, d3fa76ee6b4a ("[NET_SCHED]: cls_basic: fix NULL pointer dereference") moved tp->root initialization into ->init() routine, where before it was part of ->change(), so ->get() had to deal with tp->root being NULL back then, so that was indeed a valid case, after d3fa76ee6b4a, not really anymore. We used to set tp->root to NULL long ago in ->destroy(), see 47a1a1d4be29 ("pkt_sched: remove unnecessary xchg() in packet classifiers"); but the NULLifying was reintroduced with the RCUification, but it's not correct for every classifier implementation. In the cases that are fixed here with one exception of cls_cgroup, tp->root object is allocated and initialized inside ->init() callback, which is always performed at a point in time after we allocate a new tp, which means tp and thus tp->root was not globally visible in the tp chain yet (see tc_ctl_tfilter()). Also, on destruction tp->root is strictly kfree_rcu()'ed in ->destroy() handler, same for the tp which is kfree_rcu()'ed right when we return from ->destroy() in tcf_destroy(). This means, the head object's lifetime for such classifiers is always tied to the tp lifetime. The RCU callback invocation for the two kfree_rcu() could be out of order, but that's fine since both are independent. Dropping the RCU_INIT_POINTER(tp->root, NULL) for these classifiers here means that 1) we don't need a useless NULL check in fast-path and, 2) that outstanding readers of that tp in tc_classify() can still execute under respect with RCU grace period as it is actually expected. Things that haven't been touched here: cls_fw and cls_route. They each handle tp->root being NULL in ->classify() path for historic reasons, so their ->destroy() implementation can stay as is. If someone actually cares, they could get cleaned up at some point to avoid the test in fast path. cls_u32 doesn't set tp->root to NULL. For cls_rsvp, I just added a !head should anyone actually be using/testing it, so it at least aligns with cls_fw and cls_route. For cls_flower we additionally need to defer rhashtable destruction (to a sleepable context) after RCU grace period as concurrent readers might still access it. (Note that in this case we need to hold module reference to keep work callback address intact, since we only wait on module unload for all call_rcu()s to finish.) This fixes one race to bring RCU grace period guarantees back. Next step as worked on by Cong however is to fix 1e052be69d04 ("net_sched: destroy proto tp when all filters are gone") to get the order of unlinking the tp in tc_ctl_tfilter() for the RTM_DELTFILTER case right by moving RCU_INIT_POINTER() before tcf_destroy() and let the notification for removal be done through the prior ->delete() callback. Both are independant issues. Once we have that right, we can then clean tp->root up for a number of classifiers by not making them RCU pointers, which requires a new callback (->uninit) that is triggered from tp's RCU callback, where we just kfree() tp->root from there. Fixes: 1f947bf151e9 ("net: sched: rcu'ify cls_bpf") Fixes: 9888faefe132 ("net: sched: cls_basic use RCU") Fixes: 70da9f0bf999 ("net: sched: cls_flow use RCU") Fixes: 77b9900ef53a ("tc: introduce Flower classifier") Fixes: bf3994d2ed31 ("net/sched: introduce Match-all classifier") Fixes: 952313bd6258 ("net: sched: cls_cgroup use RCU") Reported-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Cc: Cong Wang <xiyou.wangcong@gmail.com> Cc: John Fastabend <john.fastabend@gmail.com> Cc: Roi Dayan <roid@mellanox.com> Cc: Jiri Pirko <jiri@mellanox.com> Acked-by: John Fastabend <john.r.fastabend@intel.com> Acked-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-28	rtl8xxxu: tx rate reported before set	Barry Day
	Move the dev_info call that attempts to show the rate used before it is set. Signed-off-by: Barry Day <briselec@gmail.com> Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Barry Day <briselec@gmail.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2016-11-28	powerpc/boot: Fix build failure in 32-bit boot wrapper	Ben Hutchings
	OPAL is not callable from 32-bit mode and the assembly code for it may not even build (depending on how binutils was configured). References: https://buildd.debian.org/status/fetch.php?pkg=linux&arch=powerpcspe&ver=4.8.7-1&stamp=1479203712 Fixes: 656ad58ef19e ("powerpc/boot: Add OPAL console to epapr wrappers") Cc: stable@vger.kernel.org # v4.8+ Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-11-28	x86/sched: Use #include <linux/mutex.h> instead of #include <asm/mutex.h>	Ingo Molnar
	asm/mutex.h is gone from the locking tree, which makes sched/core break the build. Use linux/mutex.h instead, which is the canonical method. Cc: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: peterz@infradead.org Cc: jolsa@redhat.com Cc: rjw@rjwysocki.net Cc: bp@suse.de Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-11-28	x86/build: Remove three unneeded genhdr-y entries	Paul Bolle
	In x86's include/asm/Kbuild three entries are appended to the genhdr-y make variable: genhdr-y += unistd_32.h genhdr-y += unistd_64.h genhdr-y += unistd_x32.h The same entries are also appended to that variable in include/uapi/asm/Kbuild. So commit: 10b63956fce7 ("UAPI: Plumb the UAPI Kbuilds into the user header installation and checking") ... removed these three entries from include/asm/Kbuild. But, apparently, some merge conflict resolution re-added them. The net effect is, in short, that the genhdr-y make variable contains these file names twice and, as a consequence, that the corresponding headers get installed twice. And so the build prints: INSTALL usr/include/asm/ (65 files) ... while in reality only 62 files are installed in that directory. Nothing breaks because of all that, but it's a good idea to finally remove these unneeded entries nevertheless. Signed-off-by: Paul Bolle <pebolle@tiscali.nl> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1480077707-2837-1-git-send-email-pebolle@tiscali.nl Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-11-28	x86/build: Don't use $(LINUXINCLUDE) twice	Paul Bolle
	The make variable KBUILD_CFLAGS contains $(LINUXINCLUDE). But the build already picks up $(LINUXINCLUDE) from scripts/Makefile.lib. The net effect is that the (long) list of include directories is used twice. This is harmless but pointless. So stop using $(LINUXINCLUDE) twice. Signed-off-by: Paul Bolle <pebolle@tiscali.nl> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Masahiro Yamada <yamada.masahiro@socionext.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1480077514-2586-1-git-send-email-pebolle@tiscali.nl Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-11-28	x86/unwind: Fix guess-unwinder regression	Josh Poimboeuf
	My attempt at fixing some KASAN false positive warnings was rather brain dead, and it broke the guess unwinder. With frame pointers disabled, /proc/<pid>/stack is broken: # cat /proc/1/stack [<ffffffffffffffff>] 0xffffffffffffffff Restore the code flow to more closely resemble its previous state, while still using READ_ONCE_NOCHECK() macros to silence KASAN false positives. Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: c2d75e03d630 ("x86/unwind: Prevent KASAN false positive warnings in guess unwinder") Link: http://lkml.kernel.org/r/b824f92c2c22eca5ec95ac56bd2a7c84cf0b9df9.1480309971.git.jpoimboe@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-11-28	x86/build: Annotate die() with noreturn to fix build warning on clang	Peter Foley
	Fixes below warning with clang: In file included from ../arch/x86/tools/relocs_64.c:17: ../arch/x86/tools/relocs.c:977:6: warning: variable 'do_reloc' is used uninitialized whenever 'if' condition is false [-Wsometimes-uninitialized] Signed-off-by: Peter Foley <pefoley2@pefoley.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20161126222229.673-1-pefoley2@pefoley.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-11-28	x86/platform/olpc: Fix resume handler build warning	Borislav Petkov
	Fix: arch/x86/platform/olpc/olpc-xo15-sci.c:199:12: warning: ‘xo15_sci_resume’ defined but not used [-Wunused-function] static int xo15_sci_resume(struct device *dev) ^ which I see in randconfig builds here. Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20161126142706.13602-1-bp@alien8.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-11-28	x86/boot/64: Optimize fixmap page fixup	Borislav Petkov
	Single-stepping through head_64.S made me look at the fixmap page PTEs fixup loop: So we're going through the whole level2_fixmap_pgt 4K page, looking at whether PAGE_PRESENT is set in those PTEs and add the delta between where we're compiled to run and where we actually end up running. However, if that delta is 0 (most cases) we go through all those 512 PTEs for no reason at all. Oh well, we add 0 but that's no reason to me. Skipping that useless fixup gives us a boot speedup of 0.004 seconds in my guest. Not a lot but considering how cheap it is, I'll take it. Here is the printk time difference: before: ... [ 0.000000] tsc: Marking TSC unstable due to TSCs unsynchronized [ 0.013590] Calibrating delay loop (skipped), value calculated using timer frequency.. 8027.17 BogoMIPS (lpj=16054348) [ 0.017094] pid_max: default: 32768 minimum: 301 ... after: ... [ 0.000000] tsc: Marking TSC unstable due to TSCs unsynchronized [ 0.009587] Calibrating delay loop (skipped), value calculated using timer frequency.. 8026.86 BogoMIPS (lpj=16053724) [ 0.013090] pid_max: default: 32768 minimum: 301 ... For the other two changes converting naked numbers to defines: # arch/x86/kernel/head_64.o: text data bss dec hex filename 1124 290864 4096 296084 48494 head_64.o.before 1124 290864 4096 296084 48494 head_64.o.after md5: 87086e202588939296f66e892414ffe2 head_64.o.before.asm 87086e202588939296f66e892414ffe2 head_64.o.after.asm Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20161125111448.23623-1-bp@alien8.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-11-27	Merge branch 'bpf-misc-next'	David S. Miller
	Daniel Borkmann says: ==================== BPF cleanups and misc updates This patch set adds couple of cleanups in first few patches, exposes owner_prog_type for array maps as well as mlocked mem for maps in fdinfo, allows for mount permissions in fs and fixes various outstanding issues in selftests and samples. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-27	bpf: fix multiple issues in selftest suite and samples	Daniel Borkmann
	1) The test_lru_map and test_lru_dist fails building on my machine since the sys/resource.h header is not included. 2) test_verifier fails in one test case where we try to call an invalid function, since the verifier log output changed wrt printing function names. 3) Current selftest suite code relies on sysconf(_SC_NPROCESSORS_CONF) for retrieving the number of possible CPUs. This is broken at least in our scenario and really just doesn't work. glibc tries a number of things for retrieving _SC_NPROCESSORS_CONF. First it tries equivalent of /sys/devices/system/cpu/cpu[0-9]* \| wc -l, if that fails, depending on the config, it either tries to count CPUs in /proc/cpuinfo, or returns the _SC_NPROCESSORS_ONLN value instead. If /proc/cpuinfo has some issue, it returns just 1 worst case. This oddity is nothing new [1], but semantics/behaviour seems to be settled. _SC_NPROCESSORS_ONLN will parse /sys/devices/system/cpu/online, if that fails it looks into /proc/stat for cpuX entries, and if also that fails for some reason, /proc/cpuinfo is consulted (and returning 1 if unlikely all breaks down). While that might match num_possible_cpus() from the kernel in some cases, it's really not guaranteed with CPU hotplugging, and can result in a buffer overflow since the array in user space could have too few number of slots, and on perpcu map lookup, the kernel will write beyond that memory of the value buffer. William Tu reported such mismatches: [...] The fact that sysconf(_SC_NPROCESSORS_CONF) != num_possible_cpu() happens when CPU hotadd is enabled. For example, in Fusion when setting vcpu.hotadd = "TRUE" or in KVM, setting ./qemu-system-x86_64 -smp 2, maxcpus=4 ... the num_possible_cpu() will be 4 and sysconf() will be 2 [2]. [...] Documentation/cputopology.txt says /sys/devices/system/cpu/possible outputs cpu_possible_mask. That is the same as in num_possible_cpus(), so first step would be to fix the _SC_NPROCESSORS_CONF calls with our own implementation. Later, we could add support to bpf(2) for passing a mask via CPU_SET(3), for example, to just select a subset of CPUs. BPF samples code needs this fix as well (at least so that people stop copying this). Thus, define bpf_num_possible_cpus() once in selftests and import it from there for the sample code to avoid duplicating it. The remaining sysconf(_SC_NPROCESSORS_CONF) in samples are unrelated. After all three issues are fixed, the test suite runs fine again: # make run_tests \| grep self selftests: test_verifier [PASS] selftests: test_maps [PASS] selftests: test_lru_map [PASS] selftests: test_kmod.sh [PASS] [1] https://www.sourceware.org/ml/libc-alpha/2011-06/msg00079.html [2] https://www.mail-archive.com/netdev@vger.kernel.org/msg121183.html Fixes: 3059303f59cf ("samples/bpf: update tracex[23] examples to use per-cpu maps") Fixes: 86af8b4191d2 ("Add sample for adding simple drop program to link") Fixes: df570f577231 ("samples/bpf: unit test for BPF_MAP_TYPE_PERCPU_ARRAY") Fixes: e15596717948 ("samples/bpf: unit test for BPF_MAP_TYPE_PERCPU_HASH") Fixes: ebb676daa1a3 ("bpf: Print function name in addition to function id") Fixes: 5db58faf989f ("bpf: Add tests for the LRU bpf_htab") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Cc: William Tu <u9012063@gmail.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-27	bpf: allow for mount options to specify permissions	Daniel Borkmann
	Since we recently converted the BPF filesystem over to use mount_nodev(), we now have the possibility to also hold mount options in sb's s_fs_info. This work implements mount options support for specifying permissions on the sb's inode, which will be used by tc when it manually needs to mount the fs. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-27	bpf: add owner_prog_type and accounted mem to array map's fdinfo	Daniel Borkmann
	Allow for checking the owner_prog_type of a program array map. In some cases bpf(2) can return -EINVAL /after/ the verifier passed and did all the rewrites of the bpf program. The reason that lets us fail at this late stage is that program array maps are incompatible. Allow users to inspect this earlier after they got the map fd through BPF_OBJ_GET command. tc will get support for this. Also, display how much we charged the map with regards to RLIMIT_MEMLOCK. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-27	bpf: reuse dev_is_mac_header_xmit for redirect	Daniel Borkmann
	Commit dcf800344a91 ("net/sched: act_mirred: Refactor detection whether dev needs xmit at mac header") added dev_is_mac_header_xmit(); since it's also useful elsewhere, move it to if_arp.h and reuse it for BPF. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-27	bpf: drop useless bpf_fd member from cls/act	Daniel Borkmann
	After setup we don't need to keep user space fd number around anymore, as it also has no useful meaning for anyone, just remove it. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-27	bpf: drop unnecessary context cast from BPF_PROG_RUN	Daniel Borkmann
	Since long already bpf_func is not only about struct sk_buff * as input anymore. Make it generic as void *, so that callers don't need to cast for it each time they call BPF_PROG_RUN(). Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-27	tipc: fix link statistics counter errors	Jon Paul Maloy
	In commit e4bf4f76962b ("tipc: simplify packet sequence number handling") we changed the internal representation of the packet sequence number counters from u32 to u16, reflecting what is really sent over the wire. Since then some link statistics counters have been displaying incorrect values, partially because the counters meant to be used as sequence number snapshots are now used as direct counters, stored as u32, and partially because some counter updates are just missing in the code. In this commit we correct this in two ways. First, we base the displayed packet sent/received values on direct counters instead of as previously a calculated difference between current sequence number and a snapshot. Second, we add the missing updates of the counters. This change is compatible with the current netlink API, and requires no changes to the user space tools. Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-27	sfc: remove unneeded variable	Dan Carpenter
	We don't use ->heap_buf after commit 46d1efd852cc ("sfc: remove Software TSO") so let's remove the last traces. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>