linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2017-11-09	irqchip/mips-gic: Add pr_fmt and reword pr_* messages	Matt Redfearn
	Several messages from the MIPS GIC driver include the text "GIC", but the format is not standard. Add a pr_fmt of "irq-mips-gic: " and reword the messages now that they will be prefixed with the driver name. Signed-off-by: Matt Redfearn <matt.redfearn@mips.com> Reviewed-by: Paul Burton <paul.burton@mips.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2017-11-09	Merge tag 'timers-conversion-next5' of ↵	Thomas Gleixner
	git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux into timers/core Pull the 5th batch of timer conversions from Kees Cook - qla2xxx patches have passed testing - ipvs patches have been Acked - prepare for more tree-wide DEFINE_TIMER() changes
2017-11-09	soc: amlogic: meson-gx-pwrc-vpu: fix power-off when powered by bootloader	Neil Armstrong
	In the case the VPU power domain has been powered on by the bootloader and no driver are attached to this power domain, the genpd will power it off after a certain amount of time, but the clocks hasn't been enabled by the kernel itself and the power-off will trigger some faults. This patch enable the clocks to have a coherent state for an eventual poweroff and switches to the pm_domain_always_on_gov governor. Fixes: 75fcb5ca4b46 ("soc: amlogic: add Meson GX VPU Domains driver") Signed-off-by: Neil Armstrong <narmstrong@baylibre.com> Tested-by: Kevin Hilman <khilman@baylibre.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2017-11-09	dt-bindings: mfd: mc13xxx: Remove obsolete property	Fabio Estevam
	The 'fsl,spi-num-chipselects' property is obsolete according to Documentation/devicetree/bindings/spi/fsl-imx-cspi.txt, so remove it from the example. Signed-off-by: Fabio Estevam <fabio.estevam@nxp.com> Acked-by: Rob Herring <robh@kernel.org> Signed-off-by: Lee Jones <lee.jones@linaro.org>
2017-11-09	KVM: s390: clear_io_irq() requests are not expected for adapter interrupts	Michael Mueller
	There is a chance to delete not yet delivered I/O interrupts if an exploiter uses the subsystem identification word 0x0000 while processing a KVM_DEV_FLIC_CLEAR_IO_IRQ ioctl. -EINVAL will be returned now instead in that case. Classic interrupts will always have bit 0x10000 set in the schid while adapter interrupts have a zero schid. The clear_io_irq interface is only useful for classic interrupts (as adapter interrupts belong to many devices). Let's make this interface more strict and forbid a schid of 0. Signed-off-by: Michael Mueller <mimu@linux.vnet.ibm.com> Reviewed-by: Halil Pasic <pasic@linux.vnet.ibm.com> Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2017-11-09	KVM: s390: abstract conversion between isc and enum irq_types	Michael Mueller
	The abstraction of the conversion between an isc value and an irq_type by means of functions isc_to_irq_type() and irq_type_to_isc() allows to clarify the respective operations where used. Signed-off-by: Michael Mueller <mimu@linux.vnet.ibm.com> Reviewed-by: Halil Pasic <pasic@linux.vnet.ibm.com> Reviewed-by: Pierre Morel <pmorel@linux.vnet.ibm.com> Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: David Hildenbrand <david@redhat.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2017-11-09	KVM: s390: vsie: use common code functions for pinning	David Hildenbrand
	We will not see -ENOMEM (gfn_to_hva() will return KVM_ERR_PTR_BAD_PAGE for all errors). So we can also get rid of special handling in the callers of pin_guest_page() and always assume that it is a g2 error. As also kvm_s390_inject_program_int() should never fail, we can simplify pin_scb(), too. Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20170901151143.22714-1-david@redhat.com> Acked-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2017-11-09	KVM: s390: SIE considerations for AP Queue virtualization	Tony Krowiak
	The Crypto Control Block (CRYCB) is referenced by the SIE state description and controls KVM guest access to the Adjunct Processor (AP) adapters, usage domains and control domains. This patch defines the AP control blocks to be used for controlling guest access to the AP adapters and domains. Signed-off-by: Tony Krowiak <akrowiak@linux.vnet.ibm.com> Message-Id: <1507916344-3896-2-git-send-email-akrowiak@linux.vnet.ibm.com> Acked-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2017-11-09	Merge branch 'gpio-irqchip-rework' of /home/linus/linux-gpio into devel	Linus Walleij

2017-11-09	pinctrl: bcm2835: Fix some merge fallout	Linus Walleij
	Fixing a small merge problem in BCM2835 related to the new irqchip code. Cc: Thierry Reding <treding@nvidia.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2017-11-09	KVM: s390: document memory ordering for kvm_s390_vcpu_wakeup	Christian Borntraeger
	swait_active does not enforce any ordering and it can therefore trigger some subtle races when the CPU moves the read for the check before a previous store and that store is then used on another CPU that is preparing the swait. On s390 there is a call to swait_active in kvm_s390_vcpu_wakeup. The good thing is, on s390 all potential races cannot happen because all callers of kvm_s390_vcpu_wakeup do not store (no race) or use an atomic operation, which handles memory ordering. Since this is not guaranteed by the Linux semantics (but by the implementation on s390) let's add smp_mb_after_atomic to make this obvious and document the ordering. Suggested-by: Paolo Bonzini <pbonzini@redhat.com> Acked-by: Halil Pasic <pasic@linux.vnet.ibm.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: David Hildenbrand <david@redhat.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2017-11-09	rbd: use GFP_NOIO for parent stat and data requests	Ilya Dryomov
	rbd_img_obj_exists_submit() and rbd_img_obj_parent_read_full() are on the writeback path for cloned images -- we attempt a stat on the parent object to see if it exists and potentially read it in to call copyup. GFP_NOIO should be used instead of GFP_KERNEL here. Cc: stable@vger.kernel.org Link: http://tracker.ceph.com/issues/22014 Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: David Disseldorp <ddiss@suse.de>
2017-11-09	ALSA: hda - fix headset mic problem for Dell machines with alc274	Hui Wang
	Confirmed with Kailang of Realtek, the pin 0x19 is for Headset Mic, and the pin 0x1a is for Headphone Mic, he suggested to apply ALC269_FIXUP_DELL1_MIC_NO_PRESENCE to fix this problem. And we verified applying this FIXUP can fix this problem. Cc: <stable@vger.kernel.org> Cc: Kailang Yang <kailang@realtek.com> Signed-off-by: Hui Wang <hui.wang@canonical.com> Signed-off-by: Takashi Iwai <tiwai@suse.de>
2017-11-09	sched/core: Optimize sched_feat() for !CONFIG_SCHED_DEBUG builds	Patrick Bellasi
	When the kernel is compiled with !CONFIG_SCHED_DEBUG support, we expect that all SCHED_FEAT are turned into compile time constants being propagated to support compiler optimizations. Specifically, we expect that code blocks like this: if (sched_feat(FEATURE_NAME) [&& <other_conditions>]) { /* FEATURE CODE / } are turned into dead-code in case FEATURE_NAME defaults to FALSE, and thus being removed by the compiler from the finale image. For this mechanism to properly work it's required for the compiler to have full access, from each translation unit, to whatever is the value defined by the sched_feat macro. This macro is defined as: #define sched_feat(x) (sysctl_sched_features & (1UL << __SCHED_FEAT_##x)) and thus, the compiler can optimize that code only if the value of sysctl_sched_features is visible within each translation unit. Since: 029632fbb ("sched: Make separate sched.c translation units") the scheduler code has been split into separate translation units however the definition of sysctl_sched_features is part of kernel/sched/core.c while, for all the other scheduler modules, it is visible only via kernel/sched/sched.h as an: extern const_debug unsigned int sysctl_sched_features Unfortunately, an extern reference does not allow the compiler to apply constants propagation. Thus, on !CONFIG_SCHED_DEBUG kernel we still end up with code to load a memory reference and (eventually) doing an unconditional jump of a chunk of code. This mechanism is unavoidable when sched_features can be turned on and off at run-time. However, this is not the case for "production" kernels compiled with !CONFIG_SCHED_DEBUG. In this case, sysctl_sched_features is just a constant value which cannot be changed at run-time and thus memory loads and jumps can be avoided altogether. This patch fixes the case of !CONFIG_SCHED_DEBUG kernel by declaring a local version of the sysctl_sched_features constant for each translation unit. This will ultimately allow the compiler to perform constants propagation and dead-code pruning. Tests have been done, with !CONFIG_SCHED_DEBUG on a v4.14-rc8 with and without the patch, by running 30 iterations of: perf bench sched messaging --pipe --thread --group 4 --loop 50000 on a 40 cores Intel(R) Xeon(R) CPU E5-2690 v2 @ 3.00GHz using the powersave governor to rule out variations due to frequency scaling. Statistics on the reported completion time: count mean std min 99% max v4.14-rc8 30.0 15.7831 0.176032 15.442 16.01226 16.014 v4.14-rc8+patch 30.0 15.5033 0.189681 15.232 15.93938 15.962 ... show a 1.8% speedup on average completion time and 0.5% speedup in the 99 percentile. Signed-off-by: Patrick Bellasi <patrick.bellasi@arm.com> Signed-off-by: Chris Redpath <chris.redpath@arm.com> Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com> Reviewed-by: Brendan Jackman <brendan.jackman@arm.com> Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Juri Lelli <juri.lelli@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Morten Rasmussen <morten.rasmussen@arm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vincent Guittot <vincent.guittot@linaro.org> Link: http://lkml.kernel.org/r/20171108184101.16006-1-patrick.bellasi@arm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-11-09	x86/build: Make the boot image generation less verbose	Changbin Du
	This change suppresses the 'dd' output and adds the '-quiet' parameter to mkisofs tool. It also removes the 'Using ...' messages, as none of the messages matter to the user normally. "make V=1" can still be used for a more verbose build. The new build messages are now a streamlined set of: $ make isoimage ... Kernel: arch/x86/boot/bzImage is ready (#75) GENIMAGE arch/x86/boot/image.iso Kernel: arch/x86/boot/image.iso is ready Signed-off-by: Changbin Du <changbin.du@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1510207751-22166-1-git-send-email-changbin.du@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-11-09	selftests/powerpc: Check FP/VEC on exception in TM	Gustavo Romero
	Add a self test to check if FP/VEC/VSX registers are sane (restored correctly) after a FP/VEC/VSX unavailable exception is caught during a transaction. This test checks all possibilities in a thread regarding the combination of MSR.[FP\|VEC] states in a thread and for each scenario raises a FP/VEC/VSX unavailable exception in transactional state, verifying if vs0 and vs32 registers, which are representatives of FP/VEC/VSX reg sets, are not corrupted. Signed-off-by: Gustavo Romero <gromero@linux.vnet.ibm.com> Signed-off-by: Breno Leitao <leitao@debian.org> Signed-off-by: Cyril Bur <cyrilbur@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-11-09	KVM: PPC: Book3S HV: Cosmetic post-merge cleanups	Paul Mackerras
	This rearranges the code in kvmppc_run_vcpu() and kvmppc_run_vcpu_hv() to be neater and clearer. Deeply indented code in kvmppc_run_vcpu() is moved out to a helper function, kvmhv_setup_mmu(). In kvmppc_vcpu_run_hv(), make use of the existing variable 'kvm' in place of 'vcpu->kvm'. No functional change. Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2017-11-09	net/mlx5e: CHECKSUM_COMPLETE offload for VLAN/QinQ packets	Gal Pressman
	When the VLAN tag is present in the packet buffer (i.e VLAN stripping disabled, QinQ) the driver will currently report CHECKSUM_UNNECESSARY. Instead of using CHECKSUM_COMPLETE offload for packets with first ethertype of IPv4/6, use it for packets with last ethertype of IPv4/6 to cover the former cases as well. The checksum field present in the CQE is calculated from the IP header until the end of the packet. When the first ethertype is different than IPv4/6 (for ex. 802.1Q VLAN) a checksum of the VLAN header/s should be added. The small header/s checksum calculation will allow us to use CHECKSUM_COMPLETE instead of CHECKSUM_UNNECESSARY. Testing bandwidth of one and 8 TCP streams to a single RQ, LRO and VLAN stripping offloads disabled: CPU: Intel(R) Xeon(R) CPU E5-2660 v2 @ 2.20GHz NIC: Mellanox Technologies MT27710 Family [ConnectX-4 Lx] Before: +--------------+--------------------+---------------------+----------------------+ \| Traffic type \| 1 Stream BW [Mbps] \| 8 Streams BW [Mbps] \| Checksum offload \| +--------------+--------------------+---------------------+----------------------+ \| Untagged \| 28,247.35 \| 24,716.88 \| CHECKSUM_COMPLETE \| \| VLAN \| 27,516.69 \| 23,752.26 \| CHECKSUM_UNNECESSARY \| \| QinQ \| 6,961.30 \| 20,667.04 \| CHECKSUM_UNNECESSARY \| +--------------+--------------------+---------------------+----------------------+ Now: +--------------+--------------------+---------------------+-------------------+ \| Traffic type \| 1 Stream BW [Mbps] \| 8 Streams BW [Mbps] \| Checksum offload \| +--------------+--------------------+---------------------+-------------------+ \| Untagged \| 28,521.28 \| 24,926.32 \| CHECKSUM_COMPLETE \| \| VLAN \| 27,389.37 \| 23,715.34 \| CHECKSUM_COMPLETE \| \| QinQ \| 6,901.77 \| 20,845.73 \| CHECKSUM_COMPLETE \| +--------------+--------------------+---------------------+-------------------+ No performance degradation observed. Signed-off-by: Gal Pressman <galp@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-11-09	net/mlx5e: Add VLAN offloads statistics	Gal Pressman
	The following counters are now exposed through ethtool -S: rx[i]_removed_vlan_packets (per channel) rx_removed_vlan_packets tx[i]_added_vlan_packets (per channel) tx_added_vlan_packets rx_removed_vlan_packets: The number of packets that had their outer VLAN header stripped to the CQE by the hardware. tx_added_vlan_packets: The number of packets that had their outer VLAN header inserted by the hardware. Signed-off-by: Gal Pressman <galp@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-11-09	net/mlx5e: Add 802.1ad VLAN insertion support	Gal Pressman
	Report VLAN insertion support for S-tagged packets and add support by choosing the correct VLAN type in the WQE. Signed-off-by: Gal Pressman <galp@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-11-09	net/mlx5e: Add 802.1ad VLAN filter steering rules	Gal Pressman
	When a user chooses to use 802.1ad VLAN the proper steering rules will be added to the VLAN flow table (matching the specific S-tag VID). Due to current hardware limitation, when using 802.1ad, we must disable C-tag VLAN stripping on the RQs. Signed-off-by: Gal Pressman <galp@mellanox.com> Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-11-09	net/mlx5e: Declare bitmap using kernel macro	Gal Pressman
	Replace explicit declaration of bitmap with DECLARE_BITMAP kernel macro. Signed-off-by: Gal Pressman <galp@mellanox.com> Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-11-09	net: Introduce netdev_*_once functions	Gal Pressman
	Extend the net device error logging with netdev__once macros. netdev__once are the equivalents of the dev_*_once macros which are useful for messages that should only be logged once. Also add netdev_WARN_ONCE, which is the "once" extension for the already existing netdev_WARN macro. Signed-off-by: Gal Pressman <galp@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-11-09	net/mlx5e: Add rollback on add VLAN failure	Gal Pressman
	When add VLAN rule fails the active vlan bit should be cleared. Fixes: afb736e9330a ("net/mlx5: Ethernet resource handling files") Signed-off-by: Gal Pressman <galp@mellanox.com> Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-11-09	net/mlx5e: Rename VLAN related variables and functions	Gal Pressman
	Rename VLAN related symbols to better reflect the fact that they are associated to C-tag VLAN. Signed-off-by: Gal Pressman <galp@mellanox.com> Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-11-09	Merge branch 'kvm-ppc-fixes' into kvm-ppc-next	Paul Mackerras
	This merges in a couple of fixes from the kvm-ppc-fixes branch that modify the same areas of code as some commits from the kvm-ppc-next branch, in order to resolve the conflicts. Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2017-11-08	ext4: improve smp scalability for inode generation	Theodore Ts'o
	->s_next_generation is protected by s_next_gen_lock but its usage pattern is very primitive. We don't actually need sequentially increasing new generation numbers, so let's use prandom_u32() instead. Reported-by: Dmitry Monakhov <dmonakhov@openvz.org> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2017-11-09	Merge tag 'drm-misc-next-fixes-2017-11-08' of ↵	Dave Airlie
	git://anongit.freedesktop.org/drm/drm-misc into drm-next 4.15 merge window fixes, round 2: randconfig fix from Arnd, plus the vblank WARN_ON fix from Ville. * tag 'drm-misc-next-fixes-2017-11-08' of git://anongit.freedesktop.org/drm/drm-misc: drm/vblank: Tune drm_crtc_accurate_vblank_count() WARN down to a debug drm/rockchip: add CONFIG_OF dependency for lvds
2017-11-09	Merge branch 'master' of ↵	David S. Miller
	git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec Steffen Klassert says: ==================== pull request (net): ipsec 2017-11-09 1) Fix a use after free due to a reallocated skb head. From Florian Westphal. 2) Fix sporadic lookup failures on labeled IPSEC. From Florian Westphal. 3) Fix a stack out of bounds when a socket policy is applied to an IPv6 socket that sends IPv4 packets. Please pull or let me know if there are problems. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-09	Merge tag 'drm-intel-fixes-2017-11-08' of ↵	Dave Airlie
	git://anongit.freedesktop.org/drm/drm-intel into drm-fixes - Fix possible NULL dereference (Chris). - Avoid miss usage of syncobj by rejecting unknown flags (Tvrtko). * tag 'drm-intel-fixes-2017-11-08' of git://anongit.freedesktop.org/drm/drm-intel: drm/i915: Deconstruct struct sgt_dma initialiser drm/i915: Reject unknown syncobj flags
2017-11-09	Merge branch 'drm-next-4.15' of git://people.freedesktop.org/~agd5f/linux ↵	Dave Airlie
	into drm-next A few more fixes for 4.15. * 'drm-next-4.15' of git://people.freedesktop.org/~agd5f/linux: drm/amdgpu: use irq-safe lock for kiq->ring_lock drm/amdgpu: bypass lru touch for KIQ ring submission drm/amdgpu: Potential uninitialized variable in amdgpu_vm_update_directories() drm/amdgpu: potential uninitialized variable in amdgpu_vce_ring_parse_cs() drm/amd/powerplay: initialize a variable before using it drm/amd/powerplay: suppress KASAN out of bounds warning in vega10_populate_all_memory_levels drm/amd/amdgpu: fix evicted VRAM bo adjudgement condition
2017-11-09	Merge branch 'net-sched-race-fix'	David S. Miller
	Cong Wang says: ==================== net_sched: close the race between call_rcu() and cleanup_net() This patchset tries to fix the race between call_rcu() and cleanup_net() again. Without holding the netns refcnt the tc_action_net_exit() in netns workqueue could be called before filter destroy works in tc filter workqueue. This patchset moves the netns refcnt from tc actions to tcf_exts, without breaking per-netns tc actions. Patch 1 reverts the previous fix, patch 2 introduces two new API's to help to address the bug and the rest patches switch to the new API's. Please see each patch for details. I was not able to reproduce this bug, but now after adding some delay in filter destroy work I manage to trigger the crash. After this patchset, the crash is not reproducible any more and the debugging printk's show the order is expected too. ==================== Fixes: ddf97ccdd7cb ("net_sched: add network namespace support for tc actions") Reported-by: Lucas Bates <lucasb@mojatatu.com> Cc: Lucas Bates <lucasb@mojatatu.com> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: Jiri Pirko <jiri@resnulli.us> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-09	cls_u32: use tcf_exts_get_net() before call_rcu()	Cong Wang
	Hold netns refcnt before call_rcu() and release it after the tcf_exts_destroy() is done. Note, on ->destroy() path we have to respect the return value of tcf_exts_get_net(), on other paths it should always return true, so we don't need to care. Cc: Lucas Bates <lucasb@mojatatu.com> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: Jiri Pirko <jiri@resnulli.us> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-09	cls_tcindex: use tcf_exts_get_net() before call_rcu()	Cong Wang
	Hold netns refcnt before call_rcu() and release it after the tcf_exts_destroy() is done. Note, on ->destroy() path we have to respect the return value of tcf_exts_get_net(), on other paths it should always return true, so we don't need to care. Cc: Lucas Bates <lucasb@mojatatu.com> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: Jiri Pirko <jiri@resnulli.us> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-09	cls_rsvp: use tcf_exts_get_net() before call_rcu()	Cong Wang
	Hold netns refcnt before call_rcu() and release it after the tcf_exts_destroy() is done. Note, on ->destroy() path we have to respect the return value of tcf_exts_get_net(), on other paths it should always return true, so we don't need to care. Cc: Lucas Bates <lucasb@mojatatu.com> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: Jiri Pirko <jiri@resnulli.us> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-09	cls_route: use tcf_exts_get_net() before call_rcu()	Cong Wang
	Hold netns refcnt before call_rcu() and release it after the tcf_exts_destroy() is done. Note, on ->destroy() path we have to respect the return value of tcf_exts_get_net(), on other paths it should always return true, so we don't need to care. Cc: Lucas Bates <lucasb@mojatatu.com> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: Jiri Pirko <jiri@resnulli.us> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-09	cls_matchall: use tcf_exts_get_net() before call_rcu()	Cong Wang
	Hold netns refcnt before call_rcu() and release it after the tcf_exts_destroy() is done. Cc: Lucas Bates <lucasb@mojatatu.com> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: Jiri Pirko <jiri@resnulli.us> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-09	cls_fw: use tcf_exts_get_net() before call_rcu()	Cong Wang
	Hold netns refcnt before call_rcu() and release it after the tcf_exts_destroy() is done. Note, on ->destroy() path we have to respect the return value of tcf_exts_get_net(), on other paths it should always return true, so we don't need to care. Cc: Lucas Bates <lucasb@mojatatu.com> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: Jiri Pirko <jiri@resnulli.us> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-09	cls_flower: use tcf_exts_get_net() before call_rcu()	Cong Wang
	Hold netns refcnt before call_rcu() and release it after the tcf_exts_destroy() is done. Note, on ->destroy() path we have to respect the return value of tcf_exts_get_net(), on other paths it should always return true, so we don't need to care. Cc: Lucas Bates <lucasb@mojatatu.com> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: Jiri Pirko <jiri@resnulli.us> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-09	cls_flow: use tcf_exts_get_net() before call_rcu()	Cong Wang
	Hold netns refcnt before call_rcu() and release it after the tcf_exts_destroy() is done. Note, on ->destroy() path we have to respect the return value of tcf_exts_get_net(), on other paths it should always return true, so we don't need to care. Cc: Lucas Bates <lucasb@mojatatu.com> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: Jiri Pirko <jiri@resnulli.us> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-09	cls_cgroup: use tcf_exts_get_net() before call_rcu()	Cong Wang
	Hold netns refcnt before call_rcu() and release it after the tcf_exts_destroy() is done. Note, on ->destroy() path we have to respect the return value of tcf_exts_get_net(), on other paths it should always return true, so we don't need to care. Cc: Lucas Bates <lucasb@mojatatu.com> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: Jiri Pirko <jiri@resnulli.us> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-09	cls_bpf: use tcf_exts_get_net() before call_rcu()	Cong Wang
	Hold netns refcnt before call_rcu() and release it after the tcf_exts_destroy() is done. Note, on ->destroy() path we have to respect the return value of tcf_exts_get_net(), on other paths it should always return true, so we don't need to care. Cc: Lucas Bates <lucasb@mojatatu.com> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: Jiri Pirko <jiri@resnulli.us> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-09	cls_basic: use tcf_exts_get_net() before call_rcu()	Cong Wang
	Hold netns refcnt before call_rcu() and release it after the tcf_exts_destroy() is done. Note, on ->destroy() path we have to respect the return value of tcf_exts_get_net(), on other paths it should always return true, so we don't need to care. Cc: Lucas Bates <lucasb@mojatatu.com> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: Jiri Pirko <jiri@resnulli.us> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-09	net_sched: introduce tcf_exts_get_net() and tcf_exts_put_net()	Cong Wang
	Instead of holding netns refcnt in tc actions, we can minimize the holding time by saving it in struct tcf_exts instead. This means we can just hold netns refcnt right before call_rcu() and release it after tcf_exts_destroy() is done. However, because on netns cleanup path we call tcf_proto_destroy() too, obviously we can not hold netns for a zero refcnt, in this case we have to do cleanup synchronously. It is fine for RCU too, the caller cleanup_net() already waits for a grace period. For other cases, refcnt is non-zero and we can safely grab it as normal and release it after we are done. This patch provides two new API for each filter to use: tcf_exts_get_net() and tcf_exts_put_net(). And all filters now can use the following pattern: void __destroy_filter() { tcf_exts_destroy(); tcf_exts_put_net(); // <== release netns refcnt kfree(); } void some_work() { rtnl_lock(); __destroy_filter(); rtnl_unlock(); } void some_rcu_callback() { tcf_queue_work(some_work); } if (tcf_exts_get_net()) // <== hold netns refcnt call_rcu(some_rcu_callback); else __destroy_filter(); Cc: Lucas Bates <lucasb@mojatatu.com> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: Jiri Pirko <jiri@resnulli.us> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-09	Revert "net_sched: hold netns refcnt for each action"	Cong Wang
	This reverts commit ceffcc5e254b450e6159f173e4538215cebf1b59. If we hold that refcnt, the netns can never be destroyed until all actions are destroyed by user, this breaks our netns design which we expect all actions are destroyed when we destroy the whole netns. Cc: Lucas Bates <lucasb@mojatatu.com> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: Jiri Pirko <jiri@resnulli.us> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-09	Merge branch 'dsa-setup-stage'	David S. Miller
	Vivien Didelot says: ==================== net: dsa: setup stage When probing a DSA switch, there is basically two stages. The first stage is the parsing of the switch device, from either device tree or platform data. It fetches the DSA tree to which it belongs, and validates its ports. The switch device is then added to the tree, and the second stage is called if this was the last switch of the tree. The second stage is the setup of the tree, which validates that the tree is complete, sets up the routing tables, the default CPU port for user ports, sets up the switch drivers and finally the master interfaces, which makes the whole switch fabric functional. This patch series covers the second setup stage. The setup and teardown of a switch tree have been separated into logical steps, and the probing of a switch now simply parses and adds a switch to a tree. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-09	net: dsa: rename probe and remove switch functions	Vivien Didelot
	This commit brings no functional changes. It gets rid of the underscore prefixed _dsa_register_switch and _dsa_unregister_switch functions in favor of dsa_switch_probe() which parses and adds a switch to a tree and dsa_switch_remove() which removes a switch from a tree. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-09	net: dsa: setup a tree when adding a switch to it	Vivien Didelot
	Now that the tree setup is centralized, we can simplify the code a bit more by setting up or tearing down the tree directly when adding or removing a switch to/from it. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-09	net: dsa: setup routing table	Vivien Didelot
	The *_complete() functions take too much arguments to do only one thing: they try to fetch the dsa_port structures corresponding to device nodes under the "link" list property of DSA ports, and use them to setup the routing table of switches. This patch simplifies them by providing instead simpler dsa_{port,switch,tree}_setup_routing_table functions which return a boolean value, true if the tree is complete. dsa_tree_setup_routing_table is called inside dsa_tree_setup which simplifies the switch registering function as well. A switch's routing table is now initialized before its setup. This also makes dsa_port_is_valid obsolete, remove it. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-09	net: dsa: use of_for_each_phandle	Vivien Didelot
	The OF code provides a of_for_each_phandle() helper to iterate over phandles. Use it instead of arbitrary iterating ourselves over the list of phandles hanging to the "link" property of the port's device node. The of_phandle_iterator_next() helper calls of_node_put() itself on it.node. Thus We must only do it ourselves if we break the loop. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>