summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2016-08-25btrfs: relocation: Fix leaking qgroups numbers on data extentsQu Wenruo
This patch fixes a REGRESSION introduced in 4.2, caused by the big quota rework. When balancing data extents, qgroup will leak all its numbers for relocated data extents. The relocation is done in the following steps for data extents: 1) Create data reloc tree and inode 2) Copy all data extents to data reloc tree And commit transaction 3) Create tree reloc tree(special snapshot) for any related subvolumes 4) Replace file extent in tree reloc tree with new extents in data reloc tree And commit transaction 5) Merge tree reloc tree with original fs, by swapping tree blocks For 1)~4), since tree reloc tree and data reloc tree doesn't count to qgroup, everything is OK. But for 5), the swapping of tree blocks will only info qgroup to track metadata extents. If metadata extents contain file extents, qgroup number for file extents will get lost, leading to corrupted qgroup accounting. The fix is, before commit transaction of step 5), manually info qgroup to track all file extents in data reloc tree. Since at commit transaction time, the tree swapping is done, and qgroup will account these data extents correctly. Cc: Mark Fasheh <mfasheh@suse.de> Reported-by: Mark Fasheh <mfasheh@suse.de> Reported-by: Filipe Manana <fdmanana@gmail.com> Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com> Tested-by: Goldwyn Rodrigues <rgoldwyn@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Chris Mason <clm@fb.com>
2016-08-25btrfs: qgroup: Refactor btrfs_qgroup_insert_dirty_extent()Qu Wenruo
Refactor btrfs_qgroup_insert_dirty_extent() function, to two functions: 1. btrfs_qgroup_insert_dirty_extent_nolock() Almost the same with original code. For delayed_ref usage, which has delayed refs locked. Change the return value type to int, since caller never needs the pointer, but only needs to know if they need to free the allocated memory. 2. btrfs_qgroup_insert_dirty_extent() The more encapsulated version. Will do the delayed_refs lock, memory allocation, quota enabled check and other things. The original design is to keep exported functions to minimal, but since more btrfs hacks exposed, like replacing path in balance, we need to record dirty extents manually, so we have to add such functions. Also, add comment for both functions, to info developers how to keep qgroup correct when doing hacks. Cc: Mark Fasheh <mfasheh@suse.de> Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com> Reviewed-and-Tested-by: Goldwyn Rodrigues <rgoldwyn@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Chris Mason <clm@fb.com>
2016-08-25btrfs: waiting on qgroup rescan should not always be interruptibleJeff Mahoney
We wait on qgroup rescan completion in three places: file system shutdown, the quota disable ioctl, and the rescan wait ioctl. If the user sends a signal while we're waiting, we continue happily along. This is expected behavior for the rescan wait ioctl. It's racy in the shutdown path but mostly works due to other unrelated synchronization points. In the quota disable path, it Oopses the kernel pretty much immediately. Cc: <stable@vger.kernel.org> # v4.4+ Signed-off-by: Jeff Mahoney <jeffm@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Chris Mason <clm@fb.com>
2016-08-25btrfs: properly track when rescan worker is runningJeff Mahoney
The qgroup_flags field is overloaded such that it reflects the on-disk status of qgroups and the runtime state. The BTRFS_QGROUP_STATUS_FLAG_RESCAN flag is used to indicate that a rescan operation is in progress, but if the file system is unmounted while a rescan is running, the rescan operation is paused. If the file system is then mounted read-only, the flag will still be present but the rescan operation will not have been resumed. When we go to umount, btrfs_qgroup_wait_for_completion will see the flag and interpret it to mean that the rescan worker is still running and will wait for a completion that will never come. This patch uses a separate flag to indicate when the worker is running. The locking and state surrounding the qgroup rescan worker needs a lot of attention beyond this patch but this is enough to avoid a hung umount. Cc: <stable@vger.kernel.org> # v4.4+ Signed-off-by; Jeff Mahoney <jeffm@suse.com> Reviewed-by: Qu Wenruo <quwenruo@cn.fujitsu.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Chris Mason <clm@fb.com>
2016-08-25btrfs: flush_space: treat return value of do_chunk_alloc properlyAlex Lyakas
do_chunk_alloc returns 1 when it succeeds to allocate a new chunk. But flush_space will not convert this to 0, and will also return 1. As a result, reserve_metadata_bytes will think that flush_space failed, and may potentially return this value "1" to the caller (depends how reserve_metadata_bytes was called). The caller will also treat this as an error. For example, btrfs_block_rsv_refill does: int ret = -ENOSPC; ... ret = reserve_metadata_bytes(root, block_rsv, num_bytes, flush); if (!ret) { block_rsv_add_bytes(block_rsv, num_bytes, 0); return 0; } return ret; So it will return -ENOSPC. Signed-off-by: Alex Lyakas <alex@zadarastorage.com> Reviewed-by: Josef Bacik <jbacik@fb.com> Reviewed-by: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Chris Mason <clm@fb.com>
2016-08-25Btrfs: add ASSERT for block group's memory leakLiu Bo
This adds several ASSERT()' s to report memory leak of block group cache. Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Chris Mason <clm@fb.com>
2016-08-25btrfs: backref: Fix soft lockup in __merge_refs functionQu Wenruo
When over 1000 file extents refers to one extent, find_parent_nodes() will be obviously slow, due to the O(n^2)~O(n^3) loops inside __merge_refs(). The following ftrace shows the cubic growth of execution time: 256 refs 5) + 91.768 us | __add_keyed_refs.isra.12 [btrfs](); 5) 1.447 us | __add_missing_keys.isra.13 [btrfs](); 5) ! 114.544 us | __merge_refs [btrfs](); 5) ! 136.399 us | __merge_refs [btrfs](); 512 refs 6) ! 279.859 us | __add_keyed_refs.isra.12 [btrfs](); 6) 3.164 us | __add_missing_keys.isra.13 [btrfs](); 6) ! 442.498 us | __merge_refs [btrfs](); 6) # 2091.073 us | __merge_refs [btrfs](); and 1024 refs 7) ! 368.683 us | __add_keyed_refs.isra.12 [btrfs](); 7) 4.810 us | __add_missing_keys.isra.13 [btrfs](); 7) # 2043.428 us | __merge_refs [btrfs](); 7) * 18964.23 us | __merge_refs [btrfs](); And sort them into the following char: (Unit: us) ------------------------------------------------------------------------ Trace function | 256 ref | 512 refs | 1024 refs | ------------------------------------------------------------------------ __add_keyed_refs | 91 | 249 | 368 | __add_missing_keys | 1 | 3 | 4 | __merge_refs 1st call | 114 | 442 | 2043 | __merge_refs 2nd call | 136 | 2091 | 18964 | ------------------------------------------------------------------------ We can see the that __add_keyed_refs() grows almost in linear behavior. And __add_missing_keys() in this case doesn't change much or takes much time. While for the 1st __merge_refs() it's square growth for the 2nd __merge_refs() call it's cubic growth. It's no doubt that merge_refs() will take a long long time to execute if the number of refs continues its grows. So add a cond_resced() into the loop of __merge_refs(). Although this will solve the problem of soft lockup, we need to use the new rb_tree based structure introduced by Lu Fengqi to really solve the long execution time. Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Chris Mason <clm@fb.com>
2016-08-25Btrfs: fix memory leak of reloc_rootLiu Bo
When some critical errors occur and FS would be flipped into RO, if we have an on-going balance, we can end up with a memory leak of root->reloc_root since btrfs_drop_snapshots() bails out without freeing reloc_root at the very early start. However, we're not able to free reloc_root in btrfs_drop_snapshots() because its caller, merge_reloc_roots(), still needs to access it to cleanup reloc_root's rbtree. This makes us free reloc_root when we're going to free fs/file roots. Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Chris Mason <clm@fb.com>
2016-08-25arm64: avoid TLB conflict with CONFIG_RANDOMIZE_BASEMark Rutland
When CONFIG_RANDOMIZE_BASE is selected, we modify the page tables to remap the kernel at a newly-chosen VA range. We do this with the MMU disabled, but do not invalidate TLBs prior to re-enabling the MMU with the new tables. Thus the old mappings entries may still live in TLBs, and we risk violating Break-Before-Make requirements, leading to TLB conflicts and/or other issues. We invalidate TLBs when we uninsall the idmap in early setup code, but prior to this we are subject to issues relating to the Break-Before-Make violation. Avoid these issues by invalidating the TLBs before the new mappings can be used by the hardware. Fixes: f80fb3a3d508 ("arm64: add support for kernel ASLR") Cc: <stable@vger.kernel.org> # 4.6+ Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2016-08-25Merge branch 'for-rc' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux Pull thermal fixes from Zhang Rui: - Fix cpu_cooling to have separate thermal_cooling_device_ops structures for cpus with and without power model, to avoid NULL dereference in cpufreq_state2power. From Brendan Jackman. - Fix a possible NULL dereference in imx_thermal driver. From Corentin LABBE. - Another two trivial fixes, one typo fix and one deleting module owner. From Caesar Wang and Markus Elfring. * 'for-rc' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux: thermal: imx: fix a possible NULL dereference thermal: trivial: fix the typo Thermal-INT3406: Delete owner assignment thermal: cpu_cooling: Fix NULL dereference in cpufreq_state2power
2016-08-25usb: dwc3: gadget: always decrement by 1Felipe Balbi
We need to decrement in both cases (enq > deq and enq < deq) Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
2016-08-25usb: dwc3: debug: fix ep name on trace outputFelipe Balbi
There was a typo when generating endpoint name which would be very confusing when debugging. Fix it. Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
2016-08-25usb: gadget: udc: core: don't starve DMA resourcesFelipe Balbi
Always unmap all SG entries as required by DMA API Fixes: a698908d3b3b ("usb: gadget: add generic map/unmap request utilities") Cc: <stable@vger.kernel.org> # v3.4+ Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
2016-08-25Merge branch 'drm-fixes-4.8' of git://people.freedesktop.org/~agd5f/linux ↵Dave Airlie
into drm-fixes radeon and amdgpu fixes for 4.8. Nothing major: - fix a performance regression due to the LRU changes in 4.7 - 32 bit fixes - fix a PLL regression - misc bug fixes * 'drm-fixes-4.8' of git://people.freedesktop.org/~agd5f/linux: drm/amdgpu: skip TV/CV in display parsing drm/amdgpu: avoid a possible array overflow drm/amdgpu: fix lru size grouping v2 drm/amdgpu: fix timeout value check in amd_sched_job_recovery drm/amdgpu: fix sdma_v2_4_ring_test_ib drm/amdgpu: fix amdgpu_move_blit on 32bit systems drm/radeon: fix radeon_move_blit on 32bit systems drm/radeon: only apply the SS fractional workaround to RS[78]80
2016-08-25Merge tag 'drm/tegra/for-4.8-rc4' of ↵Dave Airlie
git://anongit.freedesktop.org/tegra/linux into drm-fixes drm/tegra: Fixes for v4.8-rc4 This contains one fix for DSI runtime power management support that was introduced in v4.8-rc1. This is slightly more elaborate than I would've wished, but there are a few corner cases that needed fixing. * tag 'drm/tegra/for-4.8-rc4' of git://anongit.freedesktop.org/tegra/linux: drm/tegra: dsi: Enhance runtime power management
2016-08-24SUNRPC: Silence WARN_ON when NFSv4.1 over RDMA is in useChuck Lever
Using NFSv4.1 on RDMA should be safe, so broaden the new checks in rpc_create(). WARN_ON_ONCE is used, matching most other WARN call sites in clnt.c. Fixes: 39a9beab5acb ("rpc: share one xps between all backchannels") Fixes: d50039ea5ee6 ("nfsd4/rpc: move backchannel create logic...") Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: J. Bruce Fields <bfields@fieldses.org> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-08-24dm log: fix unitialized bio operation flagsHeinz Mauelshagen
Commit e6047149db ("dm: use bio op accessors") switched DM over to using bio_set_op_attrs() but didn't take care to initialize lc->io_req.bi_op_flags in dm-log.c:rw_header(). This caused rw_header()'s call to dm_io() to make bio->bi_op_flags be uninitialized in dm-io.c:do_region(), which ultimately resulted in a SCSI BUG() in sd_init_command(). Also, adjust rw_header() and its callers to use REQ_OP_{READ|WRITE}. Fixes: e6047149db ("dm: use bio op accessors") Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Reviewed-by: Shaun Tancheff <shaun.tancheff@seagate.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2016-08-24dm flakey: fix reads to be issued if drop_writes configuredMike Snitzer
v4.8-rc3 commit 99f3c90d0d ("dm flakey: error READ bios during the down_interval") overlooked the 'drop_writes' feature, which is meant to allow reads to be issued rather than errored, during the down_interval. Fixes: 99f3c90d0d ("dm flakey: error READ bios during the down_interval") Reported-by: Qu Wenruo <quwenruo@cn.fujitsu.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Cc: stable@vger.kernel.org
2016-08-24clk: rockchip: mark aclk_emmc_noc as a critical clock on rk3399Xing Zheng
We don't have code to handle any of the noc clocks in rk3399 and they're all just listed as critical clocks. Let's do the same for aclk_emmc_noc. Without this clock being marked as critical we have problems around suspend/resume after commit 20c389e656a8 ("clk: rockchip: fix incorrect aclk_emmc source gate bits on rk3399"). Before that change we were presumably not actually gating any of these clocks because we were setting the wrong gate. Fixes: 20c389e656a8 ("clk: rockchip: fix incorrect aclk_emmc source gate bits on rk3399") Signed-off-by: Xing Zheng <zhengxing@rock-chips.com> Signed-off-by: Douglas Anderson <dianders@chromium.org> Signed-off-by: Heiko Stuebner <heiko@sntech.de>
2016-08-24blk-mq: improve warning for running a queue on the wrong CPUJens Axboe
__blk_mq_run_hw_queue() currently warns if we are running the queue on a CPU that isn't set in its mask. However, this can happen if a CPU is being offlined, and the workqueue handling will place the work on CPU0 instead. Improve the warning so that it only triggers if the batch cpu in the hardware queue is currently online. If it triggers for that case, then it's indicative of a flow problem in blk-mq, so we want to retain it for that case. Signed-off-by: Jens Axboe <axboe@fb.com>
2016-08-24blk-mq: don't overwrite rq->mq_ctxJens Axboe
We do this in a few places, if the CPU is offline. This isn't allowed, though, since on multi queue hardware, we can't just move a request from one software queue to another, if they map to different hardware queues. The request and tag isn't valid on another hardware queue. This can happen if plugging races with CPU offlining. But it does no harm, since it can only happen in the window where we are currently busy freezing the queue and flushing IO, in preparation for redoing the software <-> hardware queue mappings. Signed-off-by: Jens Axboe <axboe@fb.com>
2016-08-24IB/srpt: Update sport->port_guid with each port refreshDoug Ledford
If port_guid is set with the default subnet_prefix, then we get a change event and run a port refresh, we don't update the port_guid. As a result, attempts to create a target device that uses the new subnet_prefix in the wwn will fail to find a match and be rejected by the ib_srpt driver. This makes it impossible to configure a port if it was initialized with a default subnet_prefix and later changed to any non-default subnet-prefix. Updating the port refresh task to always update the wwn based upon the current subnext_prefix solves this problem. Cc: Bart Van Assche <bart.vanassche@sandisk.com> Cc: nab@linux-iscsi.org Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-08-24Merge branch 'for-linus-4.8-rc4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml Pull UML fix from Richard Weinberger: "This contains a fix for a build regression introduced during the merge window" * 'for-linus-4.8-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml: um: Don't discard .text.exit section
2016-08-24Merge tag 'upstream-4.8-rc4' of git://git.infradead.org/linux-ubifsLinus Torvalds
Pull UBIFS fixes from Richard Weinberger: "This pull requests contains fixes for two issues in UBI and UBIFS: - wrong UBIFS assertion. - a UBIFS xattr regression" * tag 'upstream-4.8-rc4' of git://git.infradead.org/linux-ubifs: ubifs: Fix xattr generic handler usage ubifs: Fix assertion in layout_in_gaps()
2016-08-24Merge remote-tracking branches 'asoc/fix/max98371', 'asoc/fix/nau8825', ↵Mark Brown
'asoc/fix/omap', 'asoc/fix/samsung', 'asoc/fix/simple' and 'asoc/fix/wm2000' into asoc-linus
2016-08-24Merge remote-tracking branches 'asoc/fix/atmel', 'asoc/fix/compress', ↵Mark Brown
'asoc/fix/da7213' and 'asoc/fix/debugfs' into asoc-linus
2016-08-24Merge remote-tracking branch 'asoc/fix/rcar' into asoc-linusMark Brown
2016-08-24Merge remote-tracking branch 'asoc/fix/intel' into asoc-linusMark Brown
2016-08-24Merge remote-tracking branch 'asoc/fix/dapm' into asoc-linusMark Brown
2016-08-24Merge remote-tracking branch 'asoc/fix/core' into asoc-linusMark Brown
2016-08-24drm/amdgpu: skip TV/CV in display parsingAlex Deucher
No asics supported by amdgpu support analog TV. Workaround for bug: https://bugs.freedesktop.org/show_bug.cgi?id=97460 Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2016-08-24Merge tag 'for-linus-4.8b-rc3-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen regression fix from David Vrabel: "Fix a regression in the xenbus device preventing userspace tools from working" * tag 'for-linus-4.8b-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: xen: change the type of xen_vcpu_id to uint32_t xenbus: don't look up transaction IDs for ordinary writes
2016-08-24drm/amdgpu: avoid a possible array overflowAlex Deucher
When looking up the connector type make sure the index is valid. Avoids a later crash if we read past the end of the array. Workaround for bug: https://bugs.freedesktop.org/show_bug.cgi?id=97460 Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2016-08-24clk: tegra: remove TEGRA_PLL_USE_LOCK for PLLD/PLLD2Vince Hsu
Tegra114 has a HW bug that the PLLD/PLLD2 lock bit cannot be asserted when the DIS power domain is during up-powergating process but the clamp to this domain is not removed yet. That causes a timeout and aborts the power sequence, although the PLLD/PLLD2 has already locked. To remove the false alarm, we don't use the lock for PLLD/PLLD2. Just wait 1ms and treat the clocks as locked. Signed-off-by: Vince Hsu <vinceh@nvidia.com> Tested-by: Jonathan Hunter <jonathanh@nvidia.com> Signed-off-by: Thierry Reding <treding@nvidia.com> Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
2016-08-24raid5: avoid unnecessary bio data setShaohua Li
bio_reset doesn't change bi_io_vec and bi_max_vecs, so we don't need to set them every time. bi_private will be set before the bio is dispatched. Signed-off-by: Shaohua Li <shli@fb.com>
2016-08-24raid5: fix memory leak of bio integrity dataShaohua Li
Yi reported a memory leak of raid5 with DIF/DIX enabled disks. raid5 doesn't alloc/free bio, instead it reuses bios. There are two issues in current code: 1. the code calls bio_init (from init_stripe->raid5_build_block->bio_init) then bio_reset (ops_run_io). The bio is reused, so likely there is integrity data attached. bio_init will clear a pointer to integrity data and makes bio_reset can't release the data 2. bio_reset is called before dispatching bio. After bio is finished, it's possible we don't free bio's integrity data (eg, we don't call bio_reset again) Both issues will cause memory leak. The patch moves bio_init to stripe creation and bio_reset to bio end io. This will fix the two issues. Reported-by: Yi Zhang <yizhan@redhat.com> Signed-off-by: Shaohua Li <shli@fb.com>
2016-08-24raid10: record correct address of bad blockTomasz Majchrzak
For failed write request record block address on a device, not block address in an array. Signed-off-by: Tomasz Majchrzak <tomasz.majchrzak@intel.com> Signed-off-by: Shaohua Li <shli@fb.com>
2016-08-24md-cluster: fix error return code in join()Wei Yongjun
Fix to return error code -ENOMEM from the lockres_init() error handling case instead of 0, as done elsewhere in this function. Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Signed-off-by: Shaohua Li <shli@fb.com>
2016-08-24r5cache: set MD_JOURNAL_CLEAN correctlySong Liu
Currently, the code sets MD_JOURNAL_CLEAN when the array has MD_FEATURE_JOURNAL and the recovery_cp is MaxSector. The array will be MD_JOURNAL_CLEAN even if the journal device is missing. With this patch, the MD_JOURNAL_CLEAN is only set when the journal device presents. Signed-off-by: Song Liu <songliubraving@fb.com> Signed-off-by: Shaohua Li <shli@fb.com>
2016-08-24xen: change the type of xen_vcpu_id to uint32_tVitaly Kuznetsov
We pass xen_vcpu_id mapping information to hypercalls which require uint32_t type so it would be cleaner to have it as uint32_t. The initializer to -1 can be dropped as we always do the mapping before using it and we never check the 'not set' value anyway. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2016-08-24xenbus: don't look up transaction IDs for ordinary writesJan Beulich
This should really only be done for XS_TRANSACTION_END messages, or else at least some of the xenstore-* tools don't work anymore. Fixes: 0beef634b8 ("xenbus: don't BUG() on user mode induced condition") Reported-by: Richard Schütz <rschuetz@uni-koblenz.de> Cc: <stable@vger.kernel.org> Signed-off-by: Jan Beulich <jbeulich@suse.com> Tested-by: Richard Schütz <rschuetz@uni-koblenz.de> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2016-08-24mlxsw: router: Enable neighbors to be created on stacked devicesYotam Gigi
Make the function mlxsw_router_neigh_construct search the rif according to the neighbour dev other than the dev that was passed to the ndo, thus allowing creating neigbhours upon stacked devices. Fixes: 6cf3c971dc84 ("mlxsw: spectrum_router: Add private neigh table") Signed-off-by: Yotam Gigi <yotamg@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-24mlxsw: spectrum: Add missing flood to router portIdo Schimmel
In case we have a layer 3 interface on top of a bridge (VLAN / FID RIF), then we should flood the following packet types to the router: * Broadcast: If DIP is the broadcast address of the interface, then we need to be able to get it to CPU by trapping it following route lookup. * Reserved IP multicast (224.0.0.X): Some control packets (e.g. OSPF) use this range and are trapped in the router block. Fixes: 99f44bb3527b ("mlxsw: spectrum: Enable L3 interfaces on top of bridge devices") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-24RDMA/ocrdma: Fix the max_sge reported from FWSelvin Xavier
Current driver is reporting wrong values for max_sge and max_sge_rd in query_device. This breaks the nfs rdma and iser in some device profiles. Fixing the driver to report correct values from FW. Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-08-24i40iw: Avoid writing to freed memoryMustafa Ismail
iwpbl->iwmr points to the structure that contains iwpbl, which is iwmr. Setting this to NULL would result in writing to freed memory. So just free iwmr, and return. Fixes: d37498417947 ("i40iw: add files for iwarp interface") Reported-by: Stefan Assmann <sassmann@redhat.com> Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-08-24i40iw: Fix double free of allocated_bufferMustafa Ismail
Memory allocated for iwqp; iwqp->allocated_buffer is freed twice in the create_qp error path. Correct this by having it freed only once in i40iw_free_qp_resources(). Fixes: d37498417947 ("i40iw: add files for iwarp interface") Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-08-24IB/mlx5: Remove superfluous include of io-mapping.hChris Wilson
This file does not use any structs or functions defined by io-mapping.h (nor does it directly use iomap, ioremap, iounamp or friends). Remove it to simplify verification of changes to io-mapping.h The include existed since its inception in commit e126ba97dba9edeb6fafa3665b5f8497fc9cdf8c Author: Eli Cohen <eli@mellanox.com> Date: Sun Jul 7 17:25:49 2013 +0300 mlx5: Add driver for Mellanox Connect-IB adapters which looks like a copy across from the Mellanox ethernet driver. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Eli Cohen <eli@mellanox.com> Cc: Jack Morgenstein <jackm@dev.mellanox.co.il> Cc: Or Gerlitz <ogerlitz@mellanox.com> Cc: Matan Barak <matanb@mellanox.com> Cc: Leon Romanovsky <leonro@mellanox.com> Cc: Doug Ledford <dledford@redhat.com> Cc: Sean Hefty <sean.hefty@intel.com> Cc: Hal Rosenstock <hal.rosenstock@gmail.com> Cc: linux-rdma@vger.kernel.org Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Laurence Oberman <loberman@redhat.com> Tested-by: Laurence Oberman <loberman@redhat.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-08-24i40iw: Do not set self-referencing pointer to NULL after kfreeMustafa Ismail
In i40iw_free_virt_mem(), do not set mem->va to NULL after freeing it as mem->va is a self-referencing pointer to mem. Fixes: 4e9042e647ff ("i40iw: add hw and utils files") Reported-by: Stefan Assmann <sassmann@redhat.com> Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-08-24i40iw: Add missing NULL check for MPA private dataShiraz Saleem
Add NULL check for pdata and pdata->addr before the memcpy in i40iw_form_cm_frame(). This fixes a NULL pointer de-reference which occurs when the MPA private data pointer is NULL. Also only copy pdata->size bytes in the memcpy to prevent reading past the length of the private data buffer provided by upper layer. Fixes: f27b4746f378 ("i40iw: add connection management code") Reported-by: Stefan Assmann <sassmann@redhat.com> Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-08-24Bluetooth: split sk_filter in l2cap_sock_recv_cbDaniel Borkmann
During an audit for sk_filter(), we found that rx_busy_skb handling in l2cap_sock_recv_cb() and l2cap_sock_recvmsg() looks not quite as intended. The assumption from commit e328140fdacb ("Bluetooth: Use event-driven approach for handling ERTM receive buffer") is that errors returned from sock_queue_rcv_skb() are due to receive buffer shortage. However, nothing should prevent doing a setsockopt() with SO_ATTACH_FILTER on the socket, that could drop some of the incoming skbs when handled in sock_queue_rcv_skb(). In that case sock_queue_rcv_skb() will return with -EPERM, propagated from sk_filter() and if in L2CAP_MODE_ERTM mode, wrong assumption was that we failed due to receive buffer being full. From that point onwards, due to the to-be-dropped skb being held in rx_busy_skb, we cannot make any forward progress as rx_busy_skb is never cleared from l2cap_sock_recvmsg(), due to the filter drop verdict over and over coming from sk_filter(). Meanwhile, in l2cap_sock_recv_cb() all new incoming skbs are being dropped due to rx_busy_skb being occupied. Instead, just use __sock_queue_rcv_skb() where an error really tells that there's a receive buffer issue. Split the sk_filter() and enable it for non-segmented modes at queuing time since at this point in time the skb has already been through the ERTM state machine and it has been acked, so dropping is not allowed. Instead, for ERTM and streaming mode, call sk_filter() in l2cap_data_rcv() so the packet can be dropped before the state machine sees it. Fixes: e328140fdacb ("Bluetooth: Use event-driven approach for handling ERTM receive buffer") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Acked-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>