path: root/drivers
Age | Commit message | Author
2015-12-09drm/nouveau/pmu: remove whitelist for PGOB-exit WAR, enable by defaultBen Skeggs
NVIDIA have indicated that the workaround is required on all GK10[467] boards that have the PGOB fuse set. I've left the commandline option in place for now, as paranoia. Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2015-12-08IB/mlx5: Postpone remove_keys under knowledge of coming preemptionLeon Romanovsky
The remove_keys() logic is performed as a garbage collection task. Such a task is intended to run when no other active processes are running. need_resched() returns TRUE if there are user tasks to be activated in the near future. In that case we don't execute remove_keys() and postpone the garbage collection work to the next cycle, in order to free CPU resources for other tasks. Possible pseudo-code to trigger such a scenario: 1. Allocate a lot of MRs to fill the cache above the limit. 2. Wait a small amount of time "to calm" the system. 3. Start CPU-intensive operations on a multi-node cluster. 4. Expect performance degradation during the MR cache shrink operation. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Eli Cohen <eli@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
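[Editor's sketch] A minimal C sketch of the deferral pattern described above, assuming a hypothetical delayed-work item named cache_work; this is not the mlx5 code itself.

#include <linux/sched.h>
#include <linux/workqueue.h>
#include <linux/jiffies.h>

static void cache_shrink_fn(struct work_struct *work);
static DECLARE_DELAYED_WORK(cache_work, cache_shrink_fn);   /* hypothetical GC work item */

static void cache_shrink_fn(struct work_struct *work)
{
        if (need_resched()) {
                /* User tasks want the CPU: postpone the shrink to the next cycle. */
                queue_delayed_work(system_wq, &cache_work, msecs_to_jiffies(1000));
                return;
        }
        /* ... perform the remove_keys()-style cache cleanup here ... */
}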
2015-12-08IB/mlx4: Use vmalloc for WR buffers when neededWengang Wang
There have been several reports of WR buffer allocation (kmalloc) failing. It failed on order 3 and/or 4 contiguous page allocations, while there was actually 100MB+ of free memory that was just badly fragmented. So try vmalloc when kmalloc fails. Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com> Acked-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
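[Editor's sketch] A minimal sketch of the fallback pattern described above, using an illustrative helper name rather than the mlx4 code; kvfree() can later release either kind of allocation.

#include <linux/slab.h>
#include <linux/vmalloc.h>

static void *alloc_wr_buf(size_t size)
{
        /* Try physically contiguous memory first, quietly. */
        void *buf = kmalloc(size, GFP_KERNEL | __GFP_NOWARN);

        if (!buf)
                buf = vmalloc(size);    /* plenty of memory, just fragmented */
        return buf;
}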
2015-12-08IB/mlx4: Use correct order of variables in log messageWengang Wang
The variables in an mlx4 log message are in the wrong order. Fix it. Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com> Acked-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-08Merge branch 'for-4.4-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata Pull libata fixes from Tejun Heo: "Nothing too interesting. All are device specific additions and workarounds" * 'for-4.4-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata: ata/sata_fsl.c: add ATA_FLAG_NO_LOG_PAGE to blacklist the controller for log page reads libata-eh.c: Introduce new ata port flag for controller which lockup on read log page sata_sil: disable trim AHCI: Fix softreset failed issue of Port Multiplier sata/mvebu: use #ifdef around suspend/resume code ahci: Order SATA device IDs for codename Lewisburg ahci: Add Device ID for Intel Sunrise Point PCH
2015-12-08null_blk: Fix error path in module initializationMinfei Huang
The module didn't release its resources properly when initialization failed. To fix this issue, clean up the already-acquired resources before returning. Signed-off-by: Minfei Huang <mnfhuang@gmail.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2015-12-08iser-target: Remove explicit mlx4 work-aroundSagi Grimberg
The driver now exposes sufficient limits, so we can avoid having an mlx4-specific work-around. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-08mlx4: Expose correct max_sge_rd limitSagi Grimberg
mlx4 devices (ConnectX-2, ConnectX-3) have a limitation where RDMA read work queue entries cannot exceed 512 bytes. An RDMA read WQE needs to fit in 512 bytes: - WQE control segment (16 bytes) - RDMA segment (16 bytes) - scatter elements (16 bytes each) So max_sge_rd should be (512 - 16 - 16) / 16 = 30. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-08IB/mad: Require CM send method for everything except ClassPortInfoHal Rosenstock
Receipt of a CM MAD with a method other than Send for an attribute other than ClassPortInfo is invalid; CM attributes other than ClassPortInfo only use the Send method. The SRP initiator does not maintain a timeout policy for CM connect requests and relies on the CM layer to do that. The result was that the SRP initiator hung, as the connect request never completed. A new SRP target has been observed to respond to a Send of CM REQ with a GetResp of CM REQ with bad status. This is non-conformant with the IBA spec but exposes a vulnerability in the current MAD/CM code, which will respond to the incoming GetResp of CM REQ as if it were a valid incoming Send of CM REQ rather than tossing it on the floor. It also causes the MAD layer not to retransmit the original REQ even though it has not received a REP. Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Hal Rosenstock <hal@mellanox.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-08IB/cma: Add a missing rcu_read_unlock()Bart Van Assche
Ensure that validate_ipv4_net_dev() calls rcu_read_unlock() if fib_lookup() fails. Detected by sparse. Compile-tested only. Fixes: "IB/cma: Validate routing of incoming requests" (commit f887f2ac87c2). Cc: Haggai Eran <haggaie@mellanox.com> Cc: stable <stable@vger.kernel.org> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Reviewed-by: Haggai Eran <haggaie@mellanox.com> Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
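[Editor's sketch] A minimal sketch of the rule this fix enforces, with an illustrative stand-in for fib_lookup(): every return path inside an rcu_read_lock() section must unlock, including the error path.

#include <linux/rcupdate.h>
#include <linux/types.h>

static int lookup_route(int key);       /* hypothetical stand-in for fib_lookup() */

static bool validate_net_dev(int key)
{
        int err;

        rcu_read_lock();
        err = lookup_route(key);
        if (err) {
                rcu_read_unlock();      /* the unlock that was missing on this path */
                return false;
        }
        /* ... inspect the lookup result while still under RCU ... */
        rcu_read_unlock();
        return true;
}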
2015-12-08of/fdt: Add mutex protection for calls to __unflatten_device_tree()Guenter Roeck
__unflatten_device_tree() calls unflatten_dt_node(), which declares a static variable. It is therefore not reentrant. One of the callers of __unflatten_device_tree(), unflatten_device_tree(), is only called once during early initialization and does not need to be protected. The other caller, of_fdt_unflatten_tree(), can be called at any time, possibly multiple times in parallel. This can happen, for example, if multiple devicetree overlays have to be loaded and installed. Without this protection, errors such as the following may be seen. kernel: End of tree marker overwritten: e6a3a458 kernel: find_target_node: Failed to find target-indirect node at /fragment@0 kernel: __of_overlay_create: of_build_overlay_info() failed for tree@/ Add a mutex to of_fdt_unflatten_tree() to make the call reentrant. Cc: Pantelis Antoniou <pantelis.antoniou@konsulko.com> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Cc: stable@vger.kernel.org # v4.1+ Signed-off-by: Rob Herring <robh@kernel.org>
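[Editor's sketch] A minimal sketch of the serialization described above, assuming a simplified signature; the real of_fdt_unflatten_tree() takes different parameters and the worker name here is illustrative.

#include <linux/mutex.h>
#include <linux/of.h>

/* non-reentrant worker (uses a static variable internally) */
static void __unflatten_tree_worker(const void *blob, struct device_node **mynodes);

static DEFINE_MUTEX(of_fdt_unflatten_mutex);

static void unflatten_tree_locked(const void *blob, struct device_node **mynodes)
{
        mutex_lock(&of_fdt_unflatten_mutex);    /* serialize concurrent overlay loads */
        __unflatten_tree_worker(blob, mynodes);
        mutex_unlock(&of_fdt_unflatten_mutex);
}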
2015-12-08usb: gadget: uvc: fix permissions of configfs attributesMian Yousaf Kaukab
76e0da3 "usb-gadget/uvc: use per-attribute show and store methods" removed write permission for writeable attributes. Correct attribute permissions. Fixes: 76e0da3 "usb-gadget/uvc: use per-attribute show and store methods" Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Signed-off-by: Mian Yousaf Kaukab <yousaf.kaukab@intel.com> Signed-off-by: Felipe Balbi <balbi@ti.com>
2015-12-08usb: musb: core: Fix pm runtime for deferred probeTony Lindgren
If musb_init_controller fails at musb_platform_init, we have already called pm_runtime_irq_safe for musb, and that causes the PM runtime count to be enabled for the parent before the parent has completed initialization. This causes PM to stop working, as nothing gets idled on unload. This issue can be reproduced at least with: # modprobe omap2430 HS USB OTG: no transceiver configured musb-hdrc musb-hdrc.0.auto: musb_init_controller failed with status -517 # modprobe phy-twl4030-usb # rmmod omap2430 And after the steps above omap2430 will block deeper idle states on omap3. To fix this, let's not enable PM runtime until we need to and the parent has been initialized. Note that this does not fix the issue of PM being broken for musb during runtime. Signed-off-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Felipe Balbi <balbi@ti.com>
2015-12-08drm/i915/skl: Double RC6 WRL always onMika Kuoppala
WaRsDoubleRc6WrlWithCoarsePowerGating should be enabled for all Skylakes. Make it so. Cc: Sagar Arun Kamble <sagar.a.kamble@intel.com> Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Sagar Arun Kamble <sagar.a.kamble@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1449505785-20812-2-git-send-email-mika.kuoppala@intel.com (cherry picked from commit e7674b8c31717dd0c58b3a9493d43249722071eb) Cc: stable@vger.kernel.org # v4.3+ Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2015-12-08drm/i915/skl: Disable coarse power gating up until F0Mika Kuoppala
There is conflicting info between the E0 and F0 steppings for this workaround. Trust the more authoritative source, be conservative, and extend it also to F0. This prevents numerous (>50) GPU hangs with SKL GT4e during a piglit run. References: HSD: gen9lp/2134184 Cc: Sagar Arun Kamble <sagar.a.kamble@intel.com> Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Sagar Arun Kamble <sagar.a.kamble@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1449505785-20812-1-git-send-email-mika.kuoppala@intel.com (cherry picked from commit 6686ece19f7446f0e29c77d9e0402e1d0ce10c48) Cc: stable@vger.kernel.org # v4.3+ Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2015-12-08usb: phy: msm: fix a possible NULL dereferenceLABBE Corentin
of_match_device() could return NULL and so cause a NULL pointer dereference later. Even if the probability of this case is very low, fixing it makes static analyzers happy. Solving this with of_device_get_match_data() also made the code simpler. Reported-by: coverity (CID 1324133) Signed-off-by: LABBE Corentin <clabbe.montjoie@gmail.com> Signed-off-by: Felipe Balbi <balbi@ti.com>
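[Editor's sketch] A minimal sketch of the simpler pattern mentioned above, with an illustrative probe name: of_device_get_match_data() folds the table lookup and the data access into one call, so there is no of_device_id pointer to dereference.

#include <linux/errno.h>
#include <linux/of_device.h>
#include <linux/platform_device.h>

static int my_phy_probe(struct platform_device *pdev)  /* illustrative probe */
{
        const void *data = of_device_get_match_data(&pdev->dev);

        if (!data)      /* no match: bail out instead of dereferencing NULL */
                return -EINVAL;

        /* ... use the per-compatible match data ... */
        return 0;
}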
2015-12-08drm/vmwgfx: Implement the cursor_set2 callback v2Thomas Hellstrom
Fixes native drm clients like Fedora 23 Wayland which now appears to be able to use cursor hotspots without strange cursor offsets. Also fixes a couple of ignored error paths. Since the core drm cursor hotspot is incompatible with the legacy vmwgfx hotspot (the core drm hotspot is reset when the drm_mode_cursor ioctl is used), we need to keep track of both and add them when the device hotspot is set. We assume that either is always zero. Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Sinclair Yeh <syeh@vmware.com>
2015-12-08drm/i915: Remove incorrect warning in context cleanupTvrtko Ursulin
Commit e9f24d5fb7cf3628b195b18ff3ac4e37937ceeae Author: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Date: Mon Oct 5 13:26:36 2015 +0100 drm/i915: Clean up associated VMAs on context destruction Added a warning based on an incorrect assumption that all VMAs in a VM will be on the inactive list at the point last reference to a context and VM is dropped. This is not true because i915_gem_object_retire__read will not put VMA on the inactive list until all activities on the object in question (in all VMs) have been retired. As a consequence, whether or not a context/VM will be destroyed with its VMAs still on the active list, can depend on completely unrelated activities using the same object from a different context or engine. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92638 Testcase: igt/gem_request_retire/retire-vma-not-inactive Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Michel Thierry <michel.thierry@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1448025816-25584-1-git-send-email-tvrtko.ursulin@linux.intel.com Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> (cherry picked from commit 408952d43b27a54437244c56c0e0d8efa5607926) Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2015-12-08cxl: Set endianess of kernel contextsFrederic Barrat
A process element (defined in CAIA) keeps track of the endianness of contexts through the Little Endian (LE) bit of the State Register. It is currently set for user contexts, but was somehow forgotten for kernel contexts, so this patch fixes it. It could lead to erratic behavior from an AFU when the context is attached through the kernel API. Fixes: 2f663527bd6a ("cxl: Configure PSL for kernel contexts and merge code") Cc: stable@vger.kernel.org # 4.2+ Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Suggested-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-07IB core: Fix ib_sg_to_pages()Bart Van Assche
On 12/03/2015 01:18 AM, Christoph Hellwig wrote: > The patch looks good to me, but while we touch this area, how about > throwing in a few cosmetic fixes as well? How about the patch below? In that version of the ib_sg_to_pages() fix these concerns have been addressed and additionally two more bugs have been fixed. ------------ [PATCH] IB core: Fix ib_sg_to_pages() Fix the code for detecting gaps. A gap occurs not only if the second or later scatterlist element is not aligned but also if any scatterlist element other than the last does not end at a page boundary. In the code for coalescing contiguous elements, ensure that mr->length is correct and that last_page_addr is up-to-date. Ensure that this function returns a negative error code instead of zero if the first set_page() call fails. Fixes: commit 4c67e2bfc8b7 ("IB/core: Introduce new fast registration API") Reported-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-07IB/srp: Fix srp_map_sg_fr()Bart Van Assche
After dma_map_sg() has been called the return value of that function must be used as the number of elements in the scatterlist instead of scsi_sg_count(). Fixes: commit f7f7aab1a5c0 ("IB/srp: Convert to new registration API") Reported-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: stable <stable@vger.kernel.org> # v4.4+ Cc: Sagi Grimberg <sagig@mellanox.com> Cc: Sebastian Parschauer <sebastian.riemer@profitbricks.com> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
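[Editor's sketch] A minimal sketch of the rule this fix applies, using illustrative names rather than the srp code: the count returned by dma_map_sg(), which can be smaller than the original element count, is what must drive the registration loop.

#include <linux/dma-mapping.h>
#include <linux/errno.h>
#include <linux/scatterlist.h>

static int map_for_registration(struct device *dev, struct scatterlist *sg,
                                int nents)
{
        int mapped = dma_map_sg(dev, sg, nents, DMA_TO_DEVICE);

        if (!mapped)
                return -EIO;    /* mapping failed */

        /* Register 'mapped' elements, not 'nents': an IOMMU may have
         * coalesced contiguous entries into fewer mappings. */
        return mapped;
}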
2015-12-07IB/srp: Fix indirect data buffer rkey endiannessBart Van Assche
Detected by sparse. Fixes: commit 330179f2fa93 ("IB/srp: Register the indirect data buffer descriptor") Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: stable <stable@vger.kernel.org> # v4.3+ Cc: Sagi Grimberg <sagig@mellanox.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Sebastian Parschauer <sebastian.riemer@profitbricks.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-07IB/srp: Initialize dma_length in srp_map_idbChristoph Hellwig
Without this sg_dma_len will return 0 on architectures that have the dma_length field. Fixes: commit f7f7aab1a5c0 ("IB/srp: Convert to new registration API") Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-07IB/srp: Fix possible send queue overflowSagi Grimberg
When using work request based memory registration (fast_reg) we must reserve SQ entries for registration and invalidation in addition to send operations. Each IO consumes 3 SQ entries (registration, send, invalidation) so we need to allocate 3x larger send-queue instead of 2x. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> CC: Stable <stable@vger.kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-07IB/srp: Fix a memory leakBart Van Assche
If srp_connect_ch() returns a positive value then that is considered by its caller as a connection failure but this does not result in a scsi_host_put() call and additionally causes the srp_create_target() function to return a positive value while it should return a negative value. Avoid all this confusion and additionally fix a memory leak by ensuring that srp_connect_ch() always returns a value that is <= 0. This patch avoids that a rejected login triggers the following memory leak: unreferenced object 0xffff88021b24a220 (size 8): comm "srp_daemon", pid 56421, jiffies 4295006762 (age 4240.750s) hex dump (first 8 bytes): 68 6f 73 74 35 38 00 a5 host58.. backtrace: [<ffffffff8151014a>] kmemleak_alloc+0x7a/0xc0 [<ffffffff81165c1e>] __kmalloc_track_caller+0xfe/0x160 [<ffffffff81260d2b>] kvasprintf+0x5b/0x90 [<ffffffff81260e2d>] kvasprintf_const+0x8d/0xb0 [<ffffffff81254b0c>] kobject_set_name_vargs+0x3c/0xa0 [<ffffffff81337e3c>] dev_set_name+0x3c/0x40 [<ffffffff81355757>] scsi_host_alloc+0x327/0x4b0 [<ffffffffa03edc8e>] srp_create_target+0x4e/0x8a0 [ib_srp] [<ffffffff8133778b>] dev_attr_store+0x1b/0x20 [<ffffffff811f27fa>] sysfs_kf_write+0x4a/0x60 [<ffffffff811f1e8e>] kernfs_fop_write+0x14e/0x180 [<ffffffff81176eef>] __vfs_write+0x2f/0xf0 [<ffffffff811771e4>] vfs_write+0xa4/0x100 [<ffffffff81177c64>] SyS_write+0x54/0xc0 [<ffffffff8151b257>] entry_SYSCALL_64_fastpath+0x12/0x6f Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Cc: Sebastian Parschauer <sebastian.riemer@profitbricks.com> Cc: stable <stable@vger.kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-07vxlan: interpret IP headers for ECN correctlyJiri Benc
When looking for outer IP header, use the actual socket address family, not the address family of the default destination which is not set for metadata based interfaces (and doesn't have to match the address family of the received packet even if it was set). Fix also the misleading comment. Signed-off-by: Jiri Benc <jbenc@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-07IB/sa: Put netlink request into the request list before sendingKaike Wan
It was found by Saurabh Sengar that the netlink code tried to allocate memory with GFP_KERNEL while holding a spinlock. While it is possible to fix the issue by replacing GFP_KERNEL with GFP_ATOMIC, it is better to get rid of the spinlock while sending the packet. However, in order to protect against a race condition that a quick response may be received before the request is put on the request list, we need to put the request on the list first. Signed-off-by: Kaike Wan <kaike.wan@intel.com> Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reported-by: Saurabh Sengar <saurabh.truth@gmail.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-07IB/iser: use sector_div instead of do_divArnd Bergmann
do_div is the wrong way to divide a sector_t, as it is less efficient when sector_t is 32-bit wide. With the upcoming do_div optimizations, the kernel starts warning about this: drivers/infiniband/ulp/iser/iser_verbs.c:1296:4: note: in expansion of macro 'do_div' include/asm-generic/div64.h:224:22: warning: passing argument 1 of '__div64_32' from incompatible pointer type This changes the code to use sector_div instead, which always produces optimal code. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
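[Editor's sketch] A minimal sketch of the replacement, as an illustrative helper (header placement is an assumption): sector_div() divides a sector_t in place and returns the remainder, and stays efficient when sector_t is 32 bits wide.

#include <linux/types.h>
#include <linux/blkdev.h>

static sector_t sectors_to_blocks(sector_t sectors, unsigned int sectors_per_block)
{
        /* sector_div() modifies 'sectors' in place; the remainder it
         * returns is ignored here. */
        sector_div(sectors, sectors_per_block);
        return sectors;
}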
2015-12-07IB/core: use RCU for uverbs id lookupMike Marciniszyn
The current implementation gets a spin_lock, and at any scale with qib and hfi1 post send, the lock contention grows exponentially with the number of QPs. idr_find() is RCU compatible, so reads don't need the lock. Change to use rcu_read_lock() and rcu_read_unlock() in __idr_get_uobj(). kfree_rcu() is used to ensure a grace period between the idr removal and the actual free. Reviewed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Reviewed-By: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
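[Editor's sketch] A minimal sketch of this scheme with illustrative types (not the uverbs code): the read path relies on RCU, while removal pairs idr_remove() (done elsewhere under the writer's lock) with kfree_rcu() so in-flight readers never see freed memory.

#include <linux/idr.h>
#include <linux/rcupdate.h>
#include <linux/slab.h>

struct uobj {
        int id;
        struct rcu_head rcu;    /* lets kfree_rcu() defer the free */
};

static DEFINE_IDR(uobj_idr);

static struct uobj *uobj_lookup(int id)
{
        struct uobj *obj;

        rcu_read_lock();
        obj = idr_find(&uobj_idr, id);  /* RCU-safe lookup, no spinlock */
        /* ... take a reference on obj before leaving the RCU section ... */
        rcu_read_unlock();
        return obj;
}

static void uobj_free(struct uobj *obj)
{
        kfree_rcu(obj, rcu);    /* wait a grace period before the memory is reused */
}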
2015-12-07IB/qib: Minor fixes to qib per SFF 8636Easwar Hariharan
Minor errors found via code inspection during future development. SFF 8636 defines bit position 2 to hold the status indication of QSFP memory paging. The mask used to test for the value was incorrect and is fixed in this patch. Additionally, the dump function had a mismatch between the field being printed out and the field used to source the data which was fixed. Reviewed-by: Mitko Haralanov <mitko.haralanov@intel.com> Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Reported-by: Easwar Hariharan <easwar.hariharan@intel.com> Signed-off-by: Easwar Hariharan <easwar.hariharan@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-07vxlan: support ndo_fill_metadata_dst also for IPv6Jiri Benc
Fill the metadata correctly even when tunneling over IPv6. Also, check that the provided metadata is of an address family that is supported by the tunnel. Signed-off-by: Jiri Benc <jbenc@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-07vxlan: move IPv6 outpute route calculation to a functionJiri Benc
Will be used also for ndo_fill_metadata_dst. Signed-off-by: Jiri Benc <jbenc@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-07IB/core: Fix user mode post wr corruptionMike Marciniszyn
Commit e622f2f4ad21 ("IB: split struct ib_send_wr") introduced a regression for HCAs whose user mode post sends go through ib_uverbs_post_send(). The code didn't account for the fact that the first sge is offset by an operation dependent length. The allocation did, but the pointer to the destination sge list is computed without that knowledge. The sge list copy_from_user() then corrupts fields in the work request. Store the operation dependent length in a local variable and compute the sge list copy_from_user() destination using that length. Reviewed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-07IB/qib: Fix qib_mr structureIra Weiny
struct qib_mr requires that the mr member be last because struct qib_mregion contains a dynamic array at the end. Added members should have been placed before this member, as the comment noted. Failure to do so was causing random memory corruption. Reproducing this bug was easy to do by running the client and server of ib_write_bw -s 8 -n 5 on the same node. This BUG() was tripped in a slab debug kernel: kernel BUG at mm/slab.c:2572! Fixes: 38071a461f0a ("IB/qib: Support the new memory registration API") Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
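[Editor's sketch] A minimal sketch of the layout rule behind the fix, with illustrative structure names: a member whose type ends in a flexible array must stay the last member of the embedding structure, so new fields go above it.

#include <linux/list.h>
#include <linux/types.h>

struct mregion_like {                   /* stands in for struct qib_mregion */
        u32 lkey;
        size_t nmaps;
        void *maps[];                   /* dynamically sized tail */
};

struct mr_like {                        /* stands in for struct qib_mr */
        struct list_head new_member;    /* additions belong here ... */
        struct mregion_like mr;         /* ... because this must remain last */
};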
2015-12-07qed: Correct slowpath interrupt schemeSudarsana Kalluru
When using INTa, ISR might be called before device is configured for INTa [E.g., due to other device asserting the shared interrupt line], in which case the ISR would read the SISR registers that shouldn't be read unless HW is already configured for INTa. This might break interrupts later on. There's also an MSI-X issue due to this difference, although it's mostly theoretical. This patch changes the initialization order, calling request_irq() for the slowpath interrupt only after the chip is configured for working in the preferred interrupt mode. Signed-off-by: Sudarsana Kalluru <Sudarsana.Kalluru@qlogic.com> Signed-off-by: Manish Chopra <manish.chopra@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-07qed: Fix BAR size split for some serversAriel Elior
We can't rely on PCI config space to discover the BAR size, as in some environments this returns a wrong, too-large value. Instead, rely on a device register, which contains the value provided by the MFW at preboot. Signed-off-by: Ariel Elior <Ariel.Elior@qlogic.com> Signed-off-by: Manish Chopra <manish.chopra@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-07qed: fix handling of concurrent ramrods.Tomer Tayar
Concurrent non-blocking slowpath ramrods can be completed out-of-order on the completion chain. Recycling completed elements, while previously sent elements are still completion pending, can lead to overriding of active elements on the chain. Furthermore, sending pending slowpath ramrods currently lacks the update of the chain element physical pointer. This patch: * Ensures that ramrods are sent to the FW with consecutive echo values. * Handles out-of-order completions by freeing only first successive completed entries. * Updates the chain element physical pointer when copying a pending element into a free element for sending. Signed-off-by: Tomer Tayar <Tomer.Tayar@qlogic.com> Signed-off-by: Manish Chopra <manish.chopra@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-07ethernet: aurora: AURORA_NB8800 should depend on HAS_DMAGeert Uytterhoeven
If NO_DMA=y: ERROR: "dma_map_single" [drivers/net/ethernet/aurora/nb8800.ko] undefined! ERROR: "dma_unmap_page" [drivers/net/ethernet/aurora/nb8800.ko] undefined! ERROR: "dma_sync_single_for_cpu" [drivers/net/ethernet/aurora/nb8800.ko] undefined! ERROR: "dma_unmap_single" [drivers/net/ethernet/aurora/nb8800.ko] undefined! ERROR: "dma_alloc_coherent" [drivers/net/ethernet/aurora/nb8800.ko] undefined! ERROR: "dma_mapping_error" [drivers/net/ethernet/aurora/nb8800.ko] undefined! ERROR: "dma_map_page" [drivers/net/ethernet/aurora/nb8800.ko] undefined! ERROR: "dma_free_coherent" [drivers/net/ethernet/aurora/nb8800.ko] undefined! Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Acked-by: Mans Rullgard <mans@mansr.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-07lightnvm: do not compile in debugging by defaultMatias Bjørling
The LightNVM module exposes a debug interface when CONFIG_NVM_DEBUG is set. This interface takes a string to configure media managers and targets. Make sure this interface is only exposed when chosen deliberately. Signed-off-by: Matias Bjørling <m@bjorling.me> Signed-off-by: Jens Axboe <axboe@fb.com>
2015-12-07lightnvm: prevent gennvm module unload on useMatias Bjørling
After the gennvm module has been initialized, it might be attached to one or several devices. In that case, the module is in use. Make sure that it can not be unloaded. Signed-off-by: Matias Bjørling <m@bjorling.me> Signed-off-by: Jens Axboe <axboe@fb.com>
2015-12-07lightnvm: fix media mgr registrationMatias Bjørling
This patch fixes two issues during media manager registration. 1. The ppa pool can be used at media manager registration. Allocate the ppa pool before that. 2. If a media manager can't be found, this should not lead to the device being unallocated. A media manager can be registered later, that can manage the device. Only warn if a media manager fails initialization. Signed-off-by: Matias Bjørling <m@bjorling.me> Signed-off-by: Jens Axboe <axboe@fb.com>
2015-12-07lightnvm: replace req queue with nvmdev for lldMatias Bjørling
In the case where a request queue is passed to the low-level lightnvm device driver integration, the device driver might pass its admin commands through another queue. Instead pass nvm_dev, and let the low-level driver use the appropriate queue. Reported-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Matias Bjørling <m@bjorling.me> Signed-off-by: Jens Axboe <axboe@fb.com>
2015-12-07lightnvm: check mm before useMatias Bjørling
The core may issue I/Os before a media manager is registered with the lightnvm subsystem. Make sure that we don't call the media manager ->end_io prematurely with a null pointer. Signed-off-by: Matias Bjørling <m@bjorling.me> Signed-off-by: Jens Axboe <axboe@fb.com>
2015-12-07lightnvm: refactor spin_unlock in gennvm_get_blkWenwei Tao
The spin_unlock is duplicated multiple times. Jump to a single unlock to improve the code flow. Signed-off-by: Wenwei Tao <ww.tao0320@gmail.com> Signed-off-by: Matias Bjørling <m@bjorling.me> Signed-off-by: Jens Axboe <axboe@fb.com>
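[Editor's sketch] A minimal sketch of the refactor with illustrative types (not the gennvm code): a single unlock label replaces an unlock before every early return.

#include <linux/list.h>
#include <linux/spinlock.h>

struct blk { struct list_head list; };          /* illustrative */
struct lun {
        spinlock_t lock;
        struct list_head free_list;
        struct list_head used_list;
};

static struct blk *get_blk(struct lun *lun)
{
        struct blk *blk = NULL;

        spin_lock(&lun->lock);
        if (list_empty(&lun->free_list))
                goto out;                       /* fall through to the single unlock */

        blk = list_first_entry(&lun->free_list, struct blk, list);
        list_move_tail(&blk->list, &lun->used_list);
out:
        spin_unlock(&lun->lock);
        return blk;
}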
2015-12-07lightnvm: put blks when luns configure failedWenwei Tao
Put the allocated blocks back on the free list when configuring the LUNs fails, to make these blocks usable by others. Signed-off-by: Wenwei Tao <ww.tao0320@gmail.com> Signed-off-by: Matias Bjørling <m@bjorling.me> Signed-off-by: Jens Axboe <axboe@fb.com>
2015-12-07lightnvm: use flags in rrpc_get_blkWenwei Tao
rrpc_get_blk() uses the constant 0 as the flags parameter of nvm_get_blk(); this may cause getting a GC block to fail unexpectedly. Signed-off-by: Wenwei Tao <ww.tao0320@gmail.com> Signed-off-by: Matias Bjørling <m@bjorling.me> Signed-off-by: Jens Axboe <axboe@fb.com>
2015-12-07virtio_ring: shadow available ring flags & indexVenkatesh Srinivas
Improves cacheline transfer flow of available ring header. Virtqueues are implemented as a pair of rings, one producer->consumer avail ring and one consumer->producer used ring; preceding the avail ring in memory are two contiguous u16 fields -- avail->flags and avail->idx. A producer posts work by writing to avail->idx and a consumer reads avail->idx. The flags and idx fields only need to be written by a producer CPU and only read by a consumer CPU; when the producer and consumer are running on different CPUs and the virtio_ring code is structured to only have source writes/sink reads, we can continuously transfer the avail header cacheline between 'M' states between cores. This flow optimizes core -> core bandwidth on certain CPUs. (see: "Software Optimization Guide for AMD Family 15h Processors", Section 11.6; similar language appears in the 10h guide and should apply to CPUs w/ exclusive caches, using LLC as a transfer cache) Unfortunately the existing virtio_ring code issued reads to the avail->idx and read-modify-writes to avail->flags on the producer. This change shadows the flags and index fields in producer memory; the vring code now reads from the shadows and only ever writes to avail->flags and avail->idx, allowing the cacheline to transfer core -> core optimally. In a concurrent version of vring_bench, the time required for 10,000,000 buffer checkout/returns was reduced by ~2% (average across many runs) on an AMD Piledriver (15h) CPU: (w/o shadowing): Performance counter stats for './vring_bench': 5,451,082,016 L1-dcache-loads ... 2.221477739 seconds time elapsed (w/ shadowing): Performance counter stats for './vring_bench': 5,405,701,361 L1-dcache-loads ... 2.168405376 seconds time elapsed The further away (in a NUMA sense) virtio producers and consumers are from each other, the more we expect to benefit. Physical implementations of virtio devices and implementations of virtio where the consumer polls vring avail indexes (vhost) should also benefit. Signed-off-by: Venkatesh Srinivas <venkateshs@google.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
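[Editor's sketch] A minimal sketch of the shadowing idea described above, with illustrative types rather than the vring structures: the producer does its read-modify-write on a private copy and issues only stores to the shared cacheline.

#include <linux/compiler.h>
#include <linux/types.h>

struct avail_shadow {
        u16 shadow_idx;         /* producer-private copy of avail->idx */
        u16 *avail_idx;         /* field in the ring shared with the consumer */
};

static void publish_one(struct avail_shadow *s)
{
        s->shadow_idx++;        /* read-modify-write stays in private memory */
        /* (the real code also orders the descriptor writes before this store) */
        WRITE_ONCE(*s->avail_idx, s->shadow_idx);       /* single store to the shared line */
}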
2015-12-07virtio: Do not drop __GFP_HIGH in alloc_indirectMichal Hocko
b92b1b89a33c ("virtio: force vring descriptors to be allocated from lowmem") tried to exclude highmem pages for descriptors so it cleared __GFP_HIGHMEM from a given gfp mask. The patch also cleared __GFP_HIGH which doesn't make much sense for this fix because __GFP_HIGH only controls access to memory reserves and it doesn't have any influence on the zone selection. Some of the call paths use GFP_ATOMIC and dropping __GFP_HIGH will reduce their changes for success because the lack of access to memory reserves. Signed-off-by: Michal Hocko <mhocko@suse.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Will Deacon <will.deacon@arm.com> Reviewed-by: Mel Gorman <mgorman@techsingularity.net>
2015-12-07vhost: replace % with & on data pathMichael S. Tsirkin
We know vring num is a power of 2, so use & to mask the high bits. Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
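[Editor's sketch] A minimal illustrative helper for the substitution: when the ring size is a power of two, masking with (num - 1) gives the same result as the modulo without a division on the data path.

static inline unsigned int ring_wrap(unsigned int idx, unsigned int num)
{
        /* equivalent to idx % num when num is a power of 2 */
        return idx & (num - 1);
}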
2015-12-07virtio: fix memory leak of virtio ida cache layersSuman Anna
The virtio core uses a static ida named virtio_index_ida for assigning index numbers to virtio devices during registration. The ida core may allocate some internal idr cache layers and an ida bitmap upon any ida allocation, and all these layers are truly freed only upon the ida destruction. The virtio_index_ida is not destroyed at present, leading to a memory leak when using the virtio core as a module and at least one virtio device is registered and unregistered. Fix this by invoking ida_destroy() in the virtio core module exit. Cc: stable@vger.kernel.org Signed-off-by: Suman Anna <s-anna@ti.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
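[Editor's sketch] A minimal sketch of the pattern, with an illustrative module name standing in for the virtio core: ida_destroy() in the module exit path releases the idr cache layers and bitmap that otherwise leak.

#include <linux/idr.h>
#include <linux/module.h>

static DEFINE_IDA(example_index_ida);   /* stands in for virtio_index_ida */

static void __exit example_core_exit(void)
{
        /* free the ida's internal idr layers and bitmap */
        ida_destroy(&example_index_ida);
}
module_exit(example_core_exit);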