summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2017-10-26drm/i915: Add information needed to track engine preempt stateMichał Winiarski
We shouldn't inspect ELSP context status (or any other bits depending on specific submission backend) when using GuC submission. Let's use another piece of HWSP for preempt context, to write its bit of information, meaning that preemption has finished, and hardware is now idle. Signed-off-by: Michał Winiarski <michal.winiarski@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Jeff McGee <jeff.mcgee@intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Oscar Mateo <oscar.mateo@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Jeff McGee <jeff.mcgee@intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20171025200020.16636-9-michal.winiarski@intel.com
2017-10-26drm/i915: Extract "emit write" part of emit breadcrumb functionsMichał Winiarski
Let's separate the "emit" part from touching any internal structures, this way we can have a generic "emit coherent GGTT write" function. We would like to reuse this functionality for emitting HWSP write, to confirm that preempt-to-idle has finished. v2: Reorder args to match emit_pipe_control, s/render/rcs (Chris) Signed-off-by: Michał Winiarski <michal.winiarski@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20171025200020.16636-8-michal.winiarski@intel.com
2017-10-26drm/i915/guc: Split guc_wq_item_appendMichał Winiarski
We're using a special preempt context for HW to preempt into. We don't want to emit any requests there, but we still need to wrap this context into a valid GuC work item. Let's cleanup the functions operating on GuC work items. We can extract guc_request_add - responsible for adding GuC work item and ringing the doorbell, and guc_wq_item_append - used by the function above, not tied to the concept of gem request. Signed-off-by: Michał Winiarski <michal.winiarski@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Jeff McGee <jeff.mcgee@intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Oscar Mateo <oscar.mateo@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Jeff McGee <jeff.mcgee@intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20171025200020.16636-7-michal.winiarski@intel.com
2017-10-26drm/i915/guc: Add a second client, to be used for preemptionDave Gordon
This second client is created with priority KMD_HIGH, and marked as preemptive. This will allow us to request preemption using GuC actions. v2: Extract clients creation into a helper, debugfs fixups. (Michał) Recreate doorbell on init. (Daniele) Move clients into an array. v3: And move clients back from an array, to get rid of the enum (Michał) v4: Use is_high_priority, move DRM_ERROR into __create_doorbell, move GEM_BUG_ON inside guc_clients_create (Michał) v5: Split the BUG_ON (Michał) v6: Cleanup after error during doorbell reinit (Michał) Signed-off-by: Dave Gordon <david.s.gordon@intel.com> Signed-off-by: Michał Winiarski <michal.winiarski@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Jeff McGee <jeff.mcgee@intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Oscar Mateo <oscar.mateo@intel.com> Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20171026141737.31656-1-michal.winiarski@intel.com
2017-10-26drm/i915/guc: Add preemption action to GuC firmware interfaceMichał Winiarski
We're using GuC action to request preemption. However, after requesting preemption we need to wait for GuC to finish its own post-processing before we start submitting our requests. Firmware is using shared context to report its status. Let's update GuC firmware interface with those new definitions. v2: Drop unused INTEL_GUC_PREEMPT_OPTION_IMMEDIATE Signed-off-by: Michał Winiarski <michal.winiarski@intel.com> Cc: Jeff McGee <jeff.mcgee@intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Oscar Mateo <oscar.mateo@intel.com> Reviewed-by: Jeff McGee <jeff.mcgee@intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20171025200020.16636-5-michal.winiarski@intel.com
2017-10-26drm/i915/guc: Allocate separate shared data object for GuC communicationMichał Winiarski
We were using first page of kernel context render state for sharing data with GuC. While it's justified by the fact that those pages are not used (note, GuC still enforces this layout and refuses to work if we remove the extra page in front), it's also confusing (why are we using this particular page?). Let's allocate a separate object instead. v2: Drop kernel_context from GuC suspend/resume action handlers (Michel) Suggested-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: Michał Winiarski <michal.winiarski@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Jeff McGee <jeff.mcgee@intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Oscar Mateo <oscar.mateo@intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20171025200020.16636-4-michal.winiarski@intel.com
2017-10-26drm/i915/guc: Extract GuC stage desc pool creation into a helperMichał Winiarski
Since it's a two-step process, we can have a cleaner error handling in the caller if we do the allocations in a helper. Signed-off-by: Michał Winiarski <michal.winiarski@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Jeff McGee <jeff.mcgee@intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Oscar Mateo <oscar.mateo@intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20171025200020.16636-3-michal.winiarski@intel.com
2017-10-26drm/i915/guc: Do not use 0 for GuC doorbell cookieMichał Winiarski
Apparently, this value is reserved and may be interpreted as changing doorbell ownership. Even though we're not observing any side effects now, let's skip over it to be consistent with the spec. v2: Apply checkpatch (Sagar) Suggested-by: Sagar Arun Kamble <sagar.a.kamble@intel.com> Signed-off-by: Michał Winiarski <michal.winiarski@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Sagar Arun Kamble <sagar.a.kamble@intel.com> Reviewed-by: Sagar Arun Kamble <sagar.a.kamble@intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20171025200020.16636-2-michal.winiarski@intel.com
2017-10-26drm/i915/cnl: Fix SSEU Device Status.Rodrigo Vivi
CNL adds an extra register for slice/subslice information. Although no SKU is planed with an extra slice let's already handle this extra piece of information so we don't have the risk in future of getting a part that might have chosen this part of the die instead of other slices or anything like that. Also if subslice is disabled the information of eu ack for that is garbage, so let's skip checks for eu if subslice is disabled as we skip the subslice if slice is disabled. The rest is pretty much like gen9. v2: Remove IS_CANNONLAKE from gen9 status function. v3: Consider s_max = 6 and ss_max=4 to run over all possible slices and subslices possible by spec. Although no real hardware will have that many slices/subslices. To match with sseu info init. v4: Fix offset calculation for slices 4 and 5. Removed Oscar's rv-b since this change also needs review. v5: Let's consider only valid bits for SLICE*_PGCTL_ACK. This looks like wrong in Spec, but seems to be enough for now. Whenever Spec gets updated and fixed we come back and properly update the masks. Also add a FIXME, so we can revisit this later when we find some strange info on debugfs or when we noitce spec got updated. Cc: Oscar Mateo <oscar.mateo@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171026001546.28203-1-rodrigo.vivi@intel.com
2017-10-26drm/i915/gvt: Adding ACTHD mmio read handlerXiong Zhang
When a workload is too heavy to finish it in gpu hang check timer intervals(1.5), gpu hang check function will check ACTHD register value to decide whether gpu is real dead or not. On real hw, ACTHD is updated by HW when workload is running, then host kernel won't think it is gpu hang. while guest kernel always read a constant ACTHD value as GVT doesn't supply ACTHD emulate handler, then guest kernel detects a fake gpu hang. To remove such guest fake gpu hang, this patch supply ACTHD mmio read handler which read real HW ACTHD register directly. Signed-off-by: Xiong Zhang <xiong.y.zhang@intel.com> Signed-off-by: Zhi Wang <zhi.a.wang@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/b4c9a097-3e62-124e-6856-b0c37764df7b@intel.com
2017-10-27drm/i915/gvt: Extract mmio_read_from_hw() common functionXiong Zhang
The mmio read handler for ring timestmap / instdone register are same as reading hw value directly. Extract it as common function to reduce code duplications. Signed-off-by: Xiong Zhang <xiong.y.zhang@intel.com> Signed-off-by: Zhi Wang <zhi.a.wang@intel.com>
2017-10-27drm/i915/gvt: Refine MMIO_RING_F()Zhi Wang
Inspect if the host has VCS2 ring by host i915 macro in MMIO_RING_F(). Also this helps on reducing some LOCs. Signed-off-by: Zhi Wang <zhi.a.wang@intel.com>
2017-10-27drm/i915/gvt: properly check per_ctx bb valid stateZhenyu Wang
Need to check valid state for per_ctx bb and bypass batch buffer combine for scan if necessary. Otherwise adding invalid MI batch buffer start cmd for per_ctx bb will cause scan failure, which is taken as -EFAULT now so vGPU would be put in failsafe. This trys to fix that by checking per_ctx bb valid state. Also remove old invalid WARNING that indirect ctx bb shouldn't depend on valid per_ctx bb. Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com> Signed-off-by: Zhi Wang <zhi.a.wang@intel.com>
2017-10-26Revert "apparmor: add base infastructure for socket mediation"Linus Torvalds
This reverts commit 651e28c5537abb39076d3949fb7618536f1d242e. This caused a regression: "The specific problem is that dnsmasq refuses to start on openSUSE Leap 42.2. The specific cause is that and attempt to open a PF_LOCAL socket gets EACCES. This means that networking doesn't function on a system with a 4.14-rc2 system." Sadly, the developers involved seemed to be in denial for several weeks about this, delaying the revert. This has not been a good release for the security subsystem, and this area needs to change development practices. Reported-and-bisected-by: James Bottomley <James.Bottomley@hansenpartnership.com> Tracked-by: Thorsten Leemhuis <regressions@leemhuis.info> Cc: John Johansen <john.johansen@canonical.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Seth Arnold <seth.arnold@canonical.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-10-26SMB3: Validate negotiate request must always be signedSteve French
According to MS-SMB2 3.2.55 validate_negotiate request must always be signed. Some Windows can fail the request if you send it unsigned See kernel bugzilla bug 197311 CC: Stable <stable@vger.kernel.org> Acked-by: Ronnie Sahlberg <lsahlber.redhat.com> Signed-off-by: Steve French <smfrench@gmail.com>
2017-10-26Merge tag 'pm-4.14-rc7' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management fix from Rafael Wysocki: "This fixes a device power management quality of service (PM QoS) framework implementation issue causing 'no restriction' requests for device resume latency, including 'no restriction' set by user space, to effectively override requests with specific device resume latency requirements. It is late in the cycle, but the bug in question is in the 'user space can trigger unexpected behavior' category and the fix is stable-candidate, so here it goes" * tag 'pm-4.14-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: PM / QoS: Fix device resume latency PM QoS
2017-10-26alpha/PCI: Move pci_map_irq()/pci_swizzle() out of initdataLorenzo Pieralisi
The introduction of {map/swizzle}_irq() hooks in the struct pci_host_bridge allowed to replace the pci_fixup_irqs() PCI IRQ allocation in alpha arch PCI code with per-bridge map/swizzle functions with commit 0e4c2eeb758a ("alpha/PCI: Replace pci_fixup_irqs() call with host bridge IRQ mapping hooks"). As a side effect of converting PCI IRQ allocation to the struct pci_host_bridge {map/swizzle}_irq() hooks mechanism, the actual PCI IRQ allocation function (ie pci_assign_irq()) is carried out per-device in pci_device_probe() that is called when a PCI device driver is about to be probed. This means that, for drivers compiled as loadable modules, the actual PCI device IRQ allocation can now happen after the system has booted so the struct pci_host_bridge {map/swizzle}_irq() hooks pci_assign_irq() relies on must stay valid after the system has booted so that PCI core can carry out PCI IRQ allocation correctly. Most of the alpha board structures pci_map_irq() and pci_swizzle() hooks (that are used to initialize their struct pci_host_bridge equivalent through the alpha_mv global variable - that represents the struct alpha_machine_vector of the running kernel) are marked as __init/__initdata; this causes freed memory dereferences when PCI IRQ allocation is carried out after the kernel has booted (ie when loading PCI drivers as loadable module) because when the kernel tries to bind the PCI device to its (module) driver, the function pci_assign_irq() is called, that in turn retrieves the struct pci_host_bridge {map/swizzle}_irq() hooks to carry out PCI IRQ allocation; if those hooks are marked as __init code/__initdata they point at freed/invalid memory. Fix the issue by removing the __init/__initdata markers from all subarch struct alpha_machine_vector.pci_map_irq()/pci_swizzle() functions (and data). Fixes: 0e4c2eeb758a ("alpha/PCI: Replace pci_fixup_irqs() call with host bridge IRQ mapping hooks") Link: http://lkml.kernel.org/r/alpine.LRH.2.21.1710251043170.7098@math.ut.ee Reported-by: Meelis Roos <mroos@linux.ee> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: Richard Henderson <rth@twiddle.net> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Meelis Roos <mroos@linux.ee> Cc: Matt Turner <mattst88@gmail.com>
2017-10-26Merge branch 'for-linus' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull block fixes from Jens Axboe: "A few select fixes that should go into this series. Mainly for NVMe, but also a single stable fix for nbd from Josef" * 'for-linus' of git://git.kernel.dk/linux-block: nbd: handle interrupted sendmsg with a sndtimeo set nvme-rdma: Fix error status return in tagset allocation failure nvme-rdma: Fix possible double free in reconnect flow nvmet: synchronize sqhd update nvme-fc: retry initial controller connections 3 times nvme-fc: fix iowait hang
2017-10-26Merge tag 'spi-fix-v4.14-rc5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi Pull spi fixes from Mark Brown: "There are a bunch of device specific fixes (more than I'd like, I've been lax sending these) plus one important core fix for the conversion to use an IDR for bus number allocation which avoids issues with collisions when some but not all of the buses in the system have a fixed bus number specified. The Armada changes are rather large, specificially "spi: armada-3700: Fix padding when sending not 4-byte aligned data", but it's a storage corruption issue and there's things like indentation changes which make it look bigger than it really is. It's been cooking in -next for quite a while now and is part of the reason for the delay" * tag 'spi-fix-v4.14-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi: spi: fix IDR collision on systems with both fixed and dynamic SPI bus numbers spi: bcm-qspi: Fix use after free in bcm_qspi_probe() in error path spi: a3700: Return correct value on timeout detection spi: uapi: spidev: add missing ioctl header spi: stm32: Fix logical error in stm32_spi_prepare_mbr() spi: armada-3700: Fix padding when sending not 4-byte aligned data spi: armada-3700: Fix failing commands with quad-SPI
2017-10-26Merge tag 'ceph-for-4.14-rc7' of git://github.com/ceph/ceph-clientLinus Torvalds
Pull ceph fix from Ilya Dryomov: "A small lock imbalance fix, marked for stable" * tag 'ceph-for-4.14-rc7' of git://github.com/ceph/ceph-client: ceph: unlock dangling spinlock in try_flush_caps()
2017-10-26i40e: Add programming descriptors to cleaned_countAlexander Duyck
This patch updates the i40e driver to include programming descriptors in the cleaned_count. Without this change it becomes possible for us to leak memory as we don't trigger a large enough allocation when the time comes to allocate new buffers and we end up overwriting a number of rx_buffers equal to the number of programming descriptors we encountered. Fixes: 0e626ff7ccbf ("i40e: Fix support for flow director programming status") Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Anders K. Pedersen <akp@cohaesio.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-10-26i40e: Fix incorrect use of tx_itr_setting when checking for Rx ITR setupAlexander Duyck
It looks like there was either a copy/paste error or just a typo that resulted in the Tx ITR setting being used to determine if we were using adaptive Rx interrupt moderation or not. This patch fixes the typo. Fixes: 65e87c0398f5 ("i40evf: support queue-specific settings for interrupt moderation") Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-10-26ixgbe: Fix Tx map failure pathAlexander Duyck
This patch is a partial revert of "ixgbe: Don't bother clearing buffer memory for descriptor rings". Specifically I messed up the exception handling path a bit and this resulted in us incorrectly adding the count back in when we didn't need to. In order to make this simpler I am reverting most of the exception handling path change and instead just replacing the bit that was handled by the unmap_and_free call. Fixes: ffed21bcee7a ("ixgbe: Don't bother clearing buffer memory for descriptor rings") Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-10-26igb: Fix TX map failure pathJean-Philippe Brucker
When the driver cannot map a TX buffer, instead of rolling back gracefully and retrying later, we currently get a panic: [ 159.885994] igb 0000:00:00.0: TX DMA map failed [ 159.886588] Unable to handle kernel paging request at virtual address ffff00000a08c7a8 ... [ 159.897031] PC is at igb_xmit_frame_ring+0x9c8/0xcb8 Fix the erroneous test that leads to this situation. Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-10-26e1000: avoid null pointer dereference on invalid stat typeColin Ian King
Currently if the stat type is invalid then data[i] is being set either by dereferencing a null pointer p, or it is reading from an incorrect previous location if we had a valid stat type previously. Fix this by skipping over the read of p on an invalid stat type. Detected by CoverityScan, CID#113385 ("Explicit null dereferenced") Signed-off-by: Colin Ian King <colin.king@canonical.com> Reviewed-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-10-26e1000: fix race condition between e1000_down() and e1000_watchdogVincenzo Maffione
This patch fixes a race condition that can result into the interface being up and carrier on, but with transmits disabled in the hardware. The bug may show up by repeatedly IFF_DOWN+IFF_UP the interface, which allows e1000_watchdog() interleave with e1000_down(). CPU x CPU y -------------------------------------------------------------------- e1000_down(): netif_carrier_off() e1000_watchdog(): if (carrier == off) { netif_carrier_on(); enable_hw_transmit(); } disable_hw_transmit(); e1000_watchdog(): /* carrier on, do nothing */ Signed-off-by: Vincenzo Maffione <v.maffione@gmail.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-10-26xen: fix booting ballooned down hvm guestJuergen Gross
Commit 96edd61dcf44362d3ef0bed1a5361e0ac7886a63 ("xen/balloon: don't online new memory initially") introduced a regression when booting a HVM domain with memory less than mem-max: instead of ballooning down immediately the system would try to use the memory up to mem-max resulting in Xen crashing the domain. For HVM domains the current size will be reflected in Xenstore node memory/static-max instead of memory/target. Additionally we have to trigger the ballooning process at once. Cc: <stable@vger.kernel.org> # 4.13 Fixes: 96edd61dcf44362d3ef0bed1a5361e0ac7886a63 ("xen/balloon: don't online new memory initially") Reported-by: Simon Gaiser <hw42@ipsumj.de> Suggested-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2017-10-26tap: double-free in error path in tap_open()Girish Moodalbail
Double free of skb_array in tap module is causing kernel panic. When tap_set_queue() fails we free skb_array right away by calling skb_array_cleanup(). However, later on skb_array_cleanup() is called again by tap_sock_destruct through sock_put(). This patch fixes that issue. Fixes: 362899b8725b35e3 (macvtap: switch to use skb array) Signed-off-by: Girish Moodalbail <girish.moodalbail@oracle.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-26tcp: call tcp_rate_skb_sent() when retransmit with unaligned skb->dataYousuk Seung
Current implementation calls tcp_rate_skb_sent() when tcp_transmit_skb() is called when it clones skb only. Not calling tcp_rate_skb_sent() is OK for all such code paths except from __tcp_retransmit_skb() which happens when skb->data address is not aligned. This may rarely happen e.g. when small amount of data is sent initially and the receiver partially acks odd number of bytes for some reason, possibly malicious. Signed-off-by: Yousuk Seung <ysseung@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-26tcp/dccp: fix other lockdep splats accessing ireq_optEric Dumazet
In my first attempt to fix the lockdep splat, I forgot we could enter inet_csk_route_req() with a freshly allocated request socket, for which refcount has not yet been elevated, due to complex SLAB_TYPESAFE_BY_RCU rules. We either are in rcu_read_lock() section _or_ we own a refcount on the request. Correct RCU verb to use here is rcu_dereference_check(), although it is not possible to prove we actually own a reference on a shared refcount :/ In v2, I added ireq_opt_deref() helper and use in three places, to fix other possible splats. [ 49.844590] lockdep_rcu_suspicious+0xea/0xf3 [ 49.846487] inet_csk_route_req+0x53/0x14d [ 49.848334] tcp_v4_route_req+0xe/0x10 [ 49.850174] tcp_conn_request+0x31c/0x6a0 [ 49.851992] ? __lock_acquire+0x614/0x822 [ 49.854015] tcp_v4_conn_request+0x5a/0x79 [ 49.855957] ? tcp_v4_conn_request+0x5a/0x79 [ 49.858052] tcp_rcv_state_process+0x98/0xdcc [ 49.859990] ? sk_filter_trim_cap+0x2f6/0x307 [ 49.862085] tcp_v4_do_rcv+0xfc/0x145 [ 49.864055] ? tcp_v4_do_rcv+0xfc/0x145 [ 49.866173] tcp_v4_rcv+0x5ab/0xaf9 [ 49.868029] ip_local_deliver_finish+0x1af/0x2e7 [ 49.870064] ip_local_deliver+0x1b2/0x1c5 [ 49.871775] ? inet_del_offload+0x45/0x45 [ 49.873916] ip_rcv_finish+0x3f7/0x471 [ 49.875476] ip_rcv+0x3f1/0x42f [ 49.876991] ? ip_local_deliver_finish+0x2e7/0x2e7 [ 49.878791] __netif_receive_skb_core+0x6d3/0x950 [ 49.880701] ? process_backlog+0x7e/0x216 [ 49.882589] __netif_receive_skb+0x1d/0x5e [ 49.884122] process_backlog+0x10c/0x216 [ 49.885812] net_rx_action+0x147/0x3df Fixes: a6ca7abe53633 ("tcp/dccp: fix lockdep splat in inet_csk_route_req()") Fixes: c92e8c02fe66 ("tcp/dccp: fix ireq->opt races") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: kernel test robot <fengguang.wu@intel.com> Reported-by: Maciej Żenczykowski <maze@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-26rds: Fix inaccurate accounting of unsignaled wrsHåkon Bugge
The number of unsignaled work-requests posted to the IB send queue is tracked by a counter in the rds_ib_connection struct. When it reaches zero, or the caller explicitly asks for it, the send-signaled bit is set in send_flags and the counter is reset. This is performed by the rds_ib_set_wr_signal_state() function. However, this function is not always used which yields inaccurate accounting. This commit fixes this, re-factors a code bloat related to the matter, and makes the actual parameter type to the function consistent. Signed-off-by: Håkon Bugge <haakon.bugge@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-26rds: ib: Fix uninitialized variableHåkon Bugge
send_flags needs to be initialized before calling rds_ib_set_wr_signal_state(). Signed-off-by: Håkon Bugge <haakon.bugge@oracle.com> Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-26Merge tag 'linux-can-fixes-for-4.14-20171024' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can Marc Kleine-Budde says: ==================== pull-request: can 2017-10-24 here's another pull request for net/master. The patch by Gerhard Bertelsmann fixes the CAN_CTRLMODE_LOOPBACK in the sun4i driver. Two patches by Jimmy Assarsson for the kvaser_usb driver fix a print in the error path of the kvaser_usb_close() and remove a wrong warning message with the Leaf v2 firmware version v4.1.844. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-26net: mvpp2: do not sleep in set_rx_modeAntoine Tenart
This patch replaces GFP_KERNEL by GFP_ATOMIC to avoid sleeping in the ndo_set_rx_mode() call which is called with BH disabled. Fixes: 3f518509dedc ("ethernet: Add new driver for Marvell Armada 375 network unit") Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-26net: mvpp2: fix invalid parameters order when calling the tcam initAntoine Tenart
When calling mvpp2_prs_mac_multi_set() from mvpp2_prs_mac_init(), two parameters (the port index and the table index) are inverted. Fixes this. Fixes: 3f518509dedc ("ethernet: Add new driver for Marvell Armada 375 network unit") Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-26net: mvpp2: fix typo in the tcam setupAntoine Tenart
This patch fixes a typo in the mvpp2_prs_tcam_data_cmp() function, as the shift value is inverted with the data. Fixes: 3f518509dedc ("ethernet: Add new driver for Marvell Armada 375 network unit") Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-26net/mlx5e: DCBNL, Implement tc with ets type and zero bandwidthHuy Nguyen
Previously, tc with ets type and zero bandwidth is not accepted by driver. This behavior does not follow the IEEE802.1qaz spec. If there are tcs with ets type and zero bandwidth, these tcs are assigned to the lowest priority tc_group #0. We equally distribute 100% bw of the tc_group #0 to these zero bandwidth ets tcs. Also, the non zero bandwidth ets tcs are assigned to tc_group #1. If there is no zero bandwidth ets tc, the non zero bandwidth ets tcs are assigned to tc_group #0. Fixes: cdcf11212b22 ("net/mlx5e: Validate BW weight values of ETS") Signed-off-by: Huy Nguyen <huyn@mellanox.com> Reviewed-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-10-26net/mlx5e: Properly deal with encap flows add/del under neigh updateOr Gerlitz
Currently, the encap action offload is handled in the actions parse function and not in mlx5e_tc_add_fdb_flow() where we deal with all the other aspects of offloading actions (vlan, modify header) and the rule itself. When the neigh update code (mlx5e_tc_encap_flows_add()) recreates the encap entry and offloads the related flows, we wrongly call again into mlx5e_tc_add_fdb_flow(), this for itself would cause us to handle again the offloading of vlans and header re-write which puts things in non consistent state and step on freed memory (e.g the modify header parse buffer which is already freed). Since on error, mlx5e_tc_add_fdb_flow() detaches and may release the encap entry, it causes a corruption at the neigh update code which goes over the list of flows associated with this encap entry, or double free when the tc flow is later deleted by user-space. When neigh update (mlx5e_tc_encap_flows_del()) unoffloads the flows related to an encap entry which is now invalid, we do a partial repeat of the eswitch flow removal code which is wrong too. To fix things up we do the following: (1) handle the encap action offload in the eswitch flow add function mlx5e_tc_add_fdb_flow() as done for the other actions and the rule itself. (2) modify the neigh update code (mlx5e_tc_encap_flows_add/del) to only deal with the encap entry and rules delete/add and not with any of the other offloaded actions. Fixes: 232c001398ae ('net/mlx5e: Add support to neighbour update flow') Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Paul Blakey <paulb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-10-26net/mlx5: Delay events till mlx5 interface's add complete for pci resumeHuy Nguyen
mlx5_ib_add is called during mlx5_pci_resume after a pci error. Before mlx5_ib_add completes, there are multiple events which trigger function mlx5_ib_event. This cause kernel panic because mlx5_ib_event accesses unitialized resources. The fix is to extend Erez Shitrit's patch <97834eba7c19> ("net/mlx5: Delay events till ib registration ends") to cover the pci resume code path. Trace: mlx5_core 0001:01:00.6: mlx5_pci_resume was called mlx5_core 0001:01:00.6: firmware version: 16.20.1011 mlx5_core 0001:01:00.6: mlx5_attach_interface:164:(pid 779): mlx5_ib_event:2996:(pid 34777): warning: event on port 1 mlx5_ib_event:2996:(pid 34782): warning: event on port 1 Unable to handle kernel paging request for data at address 0x0001c104 Faulting instruction address: 0xd000000008f411fc Oops: Kernel access of bad area, sig: 11 [#1] ... ... Call Trace: [c000000fff77bb70] [d000000008f4119c] mlx5_ib_event+0x64/0x470 [mlx5_ib] (unreliable) [c000000fff77bc60] [d000000008e67130] mlx5_core_event+0xb8/0x210 [mlx5_core] [c000000fff77bd10] [d000000008e4bd00] mlx5_eq_int+0x528/0x860[mlx5_core] Fixes: 97834eba7c19 ("net/mlx5: Delay events till ib registration ends") Signed-off-by: Huy Nguyen <huyn@mellanox.com> Reviewed-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-10-26net/mlx5: Fix health work queue spin lock to IRQ safeMoshe Shemesh
spin_lock/unlock of health->wq_lock should be IRQ safe. It was changed to spin_lock_irqsave since adding commit 0179720d6be2 ("net/mlx5: Introduce trigger_health_work function") which uses spin_lock from asynchronous event (IRQ) context. Thus, all spin_lock/unlock of health->wq_lock should have been moved to IRQ safe mode. However, one occurrence on new code using this lock missed that change, resulting in possible deadlock: kernel: Possible unsafe locking scenario: kernel: CPU0 kernel: ---- kernel: lock(&(&health->wq_lock)->rlock); kernel: <Interrupt> kernel: lock(&(&health->wq_lock)->rlock); kernel: #012 *** DEADLOCK *** Fixes: 2a0165a034ac ("net/mlx5: Cancel delayed recovery work when unloading the driver") Signed-off-by: Moshe Shemesh <moshe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-10-26Merge tag 'xfs-4.14-fixes-7' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linuxLinus Torvalds
Pull xfs fix from Darrick Wong: "Here's (hopefully) the last bugfix for 4.14: - Rework nowait locking code to reduce locking overhead penalty" * tag 'xfs-4.14-fixes-7' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: xfs: fix AIM7 regression
2017-10-26Merge tag 'hwmon-for-linus-v4.14-rc7' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging Pull hwmon fixes from Guenter Roeck: - Fix initial temperature readings for TMP102 - Fix timeouts in DA9052 driver by increasing its sampling rate * tag 'hwmon-for-linus-v4.14-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging: hwmon: (tmp102) Fix first temperature reading hwmon: (da9052) Increase sample rate when using TSI
2017-10-26Merge tag 'sound-4.14-rc7' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound Pull sound fixes from Takashi Iwai: "Just two HD-audio fixups for a recent Realtek codec model. It's pretty safe to apply (and unsurprisingly boring)" * tag 'sound-4.14-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: ALSA: hda - fix headset mic problem for Dell machines with alc236 ALSA: hda/realtek - Add support for ALC236/ALC3204
2017-10-26Merge branch 'drm-next-4.15' of git://people.freedesktop.org/~agd5f/linux ↵Dave Airlie
into drm-next Just a few fixes for 4.15. * 'drm-next-4.15' of git://people.freedesktop.org/~agd5f/linux: drm/amd/amdgpu: Remove workaround for suspend/resume in uvd7 drm/amdgpu: don't flush the TLB before initializing GART drm/amdgpu: minor cleanup for amdgpu_ttm_bind drm/amdgpu/psp: prevent page fault by checking write_frame address(v4) drm/amd/powerplay: retrieve the real-time coreClock values drm/amd/powerplay: fix performance drop on Vega10 drm/amd/powerplay: add one smc message for Vega10 drm/amd/powerplay: fix amd_powerplay_reset() amdgpu: add padding to the fence to handle ioctl. drm/amdgpu:fix wb_clear drm/amdgpu:fix vf_error_put drm/amdgpu/sriov:now must reinit psp drm/amdgpu: merge bios post checking functions
2017-10-25drm/amd/amdgpu: Remove workaround for suspend/resume in uvd7Tom St Denis
The workaround is not required anymor and would result in hangs during suspend/resume cycles if the uvd block were busy. Signed-off-by: Tom St Denis <tom.stdenis@amd.com> Acked-by: Leo Liu <leo.liu@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-10-25drm/amdgpu: don't flush the TLB before initializing GARTChristian König
No point in doing this. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-10-25drm/amdgpu: minor cleanup for amdgpu_ttm_bindChristian König
Filter the placement mask before using it. In theory it could be that we have other flags set here as well. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-10-25drm/amdgpu/psp: prevent page fault by checking write_frame address(v4)Evan Quan
- Prevent a possible buffer overflow when updating the ring buffer by bounds checking the command frame against the available space in the ring buffer. v2: update the ring_buffer_end address v3: update the commit log v4: squash in print fix (Michel) Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-10-25drm/amd/powerplay: retrieve the real-time coreClock valuesEvan Quan
- Currently, the coreClock value for min/max performance level on raven is hard-coded. Use the real-time value retrieved by GetGfxMinFreqLimit and GetGfxMaxFreqLimit PPSMC messages Signed-off-by: Evan Quan <evan.quan@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-10-25drm/amd/powerplay: fix performance drop on Vega10Eric Huang
Setting package power PID to 1 fixes performance drop caused by updated SMU FW, before DPM is enabled. Signed-off-by: Eric Huang <JinHuiEric.Huang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>