2013-11-10  bcache: Kill op->cl  (Kent Overstreet)
This isn't used for waiting asynchronously anymore - so this is a fairly trivial refactoring.
Signed-off-by: Kent Overstreet <kmo@daterainc.com>
2013-11-10  bcache: Prune struct btree_op  (Kent Overstreet)
Eventual goal is for struct btree_op to contain only what is necessary for traversing the btree.
Signed-off-by: Kent Overstreet <kmo@daterainc.com>
2013-11-10  bcache: Clean up cache_lookup_fn  (Kent Overstreet)
There was some looping in submit_partial_cache_hit() and submit_partial_cache_miss() that isn't needed anymore - originally, we wouldn't necessarily process the full hit or miss all at once because when splitting the bio, we took into account the restrictions of the device we were sending it to.

But device bio size restrictions are now handled elsewhere, with a wrapper around generic_make_request() - so that looping has been unnecessary for a while now and we can do quite a bit of cleanup.

And if we trim the key we're reading from to match the subset we're actually reading, we don't have to explicitly calculate bi_sector anymore. Neat.
Signed-off-by: Kent Overstreet <kmo@daterainc.com>
2013-11-10  bcache: Convert bch_btree_read_async() to bch_btree_map_keys()  (Kent Overstreet)
This is a fairly straightforward conversion, mostly reshuffling - op->lookup_done goes away, replaced by MAP_DONE/MAP_CONTINUE. And the code for handling cache hits and misses wasn't really btree code, so it gets moved to request.c.
Signed-off-by: Kent Overstreet <kmo@daterainc.com>
2013-11-10  bcache: Move some stuff to btree.c  (Kent Overstreet)
With the new btree_map() functions, we don't need to export the stuff needed for traversing the btree anymore.
Signed-off-by: Kent Overstreet <kmo@daterainc.com>
2013-11-10  bcache: Add btree_map() functions  (Kent Overstreet)
Lots of stuff has been open coding its own btree traversal - which is generally pretty simple code, but there are a few subtleties.

This adds two new functions, bch_btree_map_nodes() and bch_btree_map_keys(), which do the traversal for you. Everything that's open coding btree traversal now (with the exception of garbage collection) is slowly going to be converted to these two functions; being able to write other code at a higher level of abstraction is a big improvement w.r.t. overall code quality.
Signed-off-by: Kent Overstreet <kmo@daterainc.com>
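To make the abstraction concrete, here is a rough sketch of the callback shape such a helper implies; MAP_DONE/MAP_CONTINUE come from the bch_btree_read_async() conversion above, while the wrapper struct and helper names are purely illustrative:

    /* illustrative sketch, not the actual bcache code */
    struct dirty_counter {
            struct btree_op op;     /* traversal state lives here */
            u64             count;  /* per-caller state */
    };

    static int count_dirty_fn(struct btree_op *op, struct btree *b,
                              struct bkey *k)
    {
            struct dirty_counter *d = container_of(op, struct dirty_counter, op);

            if (KEY_DIRTY(k))
                    d->count++;

            return MAP_CONTINUE;    /* MAP_DONE would stop the walk early */
    }

    /* walk every key from the start of this inode's extents:
     * bch_btree_map_keys(&d.op, c, &KEY(inode, 0, 0), count_dirty_fn, 0);
     */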
2013-11-10  bcache: Convert writeback to a kthread  (Kent Overstreet)
This simplifies the writeback flow control quite a bit - previously, it was conceptually two coroutines, refill_dirty() and read_dirty(). This makes the code quite a bit more straightforward.
Signed-off-by: Kent Overstreet <kmo@daterainc.com>
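For reference, a minimal sketch of the standard kthread loop such a conversion moves to; the wakeup condition is a stand-in, not bcache's actual logic:

    #include <linux/kthread.h>
    #include <linux/sched.h>

    static int bch_writeback_thread(void *arg)
    {
            struct cached_dev *dc = arg;

            while (!kthread_should_stop()) {
                    if (!writeback_work_pending(dc)) {      /* hypothetical */
                            set_current_state(TASK_INTERRUPTIBLE);
                            if (kthread_should_stop()) {
                                    set_current_state(TASK_RUNNING);
                                    break;
                            }
                            schedule();
                            continue;
                    }
                    read_dirty(dc);     /* the old coroutine, now called inline */
            }
            return 0;
    }

    /* dc->writeback_thread = kthread_run(bch_writeback_thread, dc,
     *                                    "bcache_writeback");
     */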
2013-11-10  bcache: Convert gc to a kthread  (Kent Overstreet)
We needed a dedicated rescuer workqueue for gc anyways... and gc was conceptually a dedicated thread, just one that wasn't running all the time. Switch it to a dedicated thread to make the code a bit more straightforward.
Signed-off-by: Kent Overstreet <kmo@daterainc.com>
2013-11-10  bcache: Convert bucket_wait to wait_queue_head_t  (Kent Overstreet)
At one point we did do fancy asynchronous waiting stuff with bucket_wait, but that's all gone (and bucket_wait is used a lot less than it used to be). So use the standard primitives.
Signed-off-by: Kent Overstreet <kmo@daterainc.com>
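The standard primitives in question, sketched with illustrative names (the actual field and condition differ):

    #include <linux/wait.h>

    /* setup: init_waitqueue_head(&ca->bucket_wait); */

    static void wait_for_bucket(struct cache *ca)
    {
            /* allocator side: sleep until a bucket is available */
            wait_event(ca->bucket_wait, buckets_available(ca));
    }

    static void bucket_freed(struct cache *ca)
    {
            /* freeing side: kick any waiters */
            wake_up(&ca->bucket_wait);
    }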
2013-11-10  bcache: Convert try_wait to wait_queue_head_t  (Kent Overstreet)
We never waited on c->try_wait asynchronously, so just use the standard primitives.
Signed-off-by: Kent Overstreet <kmo@daterainc.com>
2013-11-10  bcache: Move keylist out of btree_op  (Kent Overstreet)
Slowly working on pruning struct btree_op - the aim is for it to only contain things that are actually necessary for traversing the btree.
Signed-off-by: Kent Overstreet <kmo@daterainc.com>
2013-11-10  bcache: Refactor journalling flow control  (Kent Overstreet)
Making things that don't need to be asynchronous less so - bch_journal() only has to block when the journal or journal entry is full, which is emphatically not a fast path. So make it a normal function that just returns when it finishes, to make the code and control flow easier to follow.
Signed-off-by: Kent Overstreet <kmo@daterainc.com>
2013-11-10  bcache: Refactor read request code a bit  (Kent Overstreet)
More refactoring, and renaming.
Signed-off-by: Kent Overstreet <kmo@daterainc.com>
2013-11-10  bcache: Refactor request_write()  (Kent Overstreet)
Try to improve some of the naming a bit to be more consistent, and also improve the flow of control in request_write() a bit.
Signed-off-by: Kent Overstreet <kmo@daterainc.com>
2013-11-10  bcache: Clean up keylist code  (Kent Overstreet)
More random refactoring.
Signed-off-by: Kent Overstreet <kmo@daterainc.com>
2013-11-10  bcache: Add explicit keylist arg to btree_insert()  (Kent Overstreet)
Some refactoring - better to explicitly pass stuff around instead of having it all in the "big bag of state", struct btree_op. Going to prune struct btree_op quite a bit over time.
Signed-off-by: Kent Overstreet <kmo@daterainc.com>
2013-11-10  bcache: Convert btree_insert_check_key() to btree_insert_node()  (Kent Overstreet)
This was the main point of all this refactoring - now, btree_insert_check_key() won't fail just because the leaf node happened to be full.
Signed-off-by: Kent Overstreet <kmo@daterainc.com>
2013-11-10  bcache: Insert multiple keys at a time  (Kent Overstreet)
We'll often end up with a list of adjacent keys to insert - because bch_data_insert() may have to fragment the data it writes.

Originally, to simplify things and avoid having to deal with corner cases, bch_btree_insert() would pass keys from this list one at a time to btree_insert_recurse() - mainly because the list of keys might span leaf nodes, so it was easier this way.

With the btree_insert_node() refactoring, it's now a lot easier to just pass down the whole list and have btree_insert_recurse() iterate over leaf nodes until it's done.
Signed-off-by: Kent Overstreet <kmo@daterainc.com>
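A rough sketch of the resulting shape - the recursion consumes the whole list leaf by leaf instead of the caller peeling off one key per descent. Helper names here are assumptions, not the actual patch:

    int bch_btree_insert(struct cache_set *c, struct keylist *keys)
    {
            int ret = 0;

            while (!ret && !bch_keylist_empty(keys)) {
                    /* descend to the leaf covering the list's first key,
                     * insert every key that belongs in that leaf, then
                     * come back up and repeat for whatever remains */
                    ret = btree_insert_recurse(c, keys);
            }

            return ret;
    }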
2013-11-10  bcache: Add btree_insert_node()  (Kent Overstreet)
The flow of control in the old btree insertion code was rather backwards: we'd recurse down the btree (in btree_insert_recurse()), and then if we needed to split, the keys to be inserted into the parent node would be effectively returned up to btree_insert_recurse(), which would notice there was more work to do and finish the insertion.

The main problem with this was that the full logic for btree insertion could only be used by calling btree_insert_recurse(); if you'd gotten to a btree leaf some other way and had a key to insert, and it turned out that node needed to be split, you were SOL.

This inverts the flow of control so btree_insert_node() does _full_ btree insertion, including splitting - and takes a (leaf) btree node to insert into as a parameter.

This means we can now _correctly_ handle cache misses - for cache misses, we need to insert a fake "check" key into the btree when we discover we have a cache miss, while we still have the btree locked. Previously, if the btree node was full, inserting a cache miss would just fail.
Signed-off-by: Kent Overstreet <kmo@daterainc.com>
2013-11-10  bcache: Explicitly track btree node's parent  (Kent Overstreet)
This is prep work for the reworked btree insertion code.

The way we set b->parent is ugly and hacky... the problem is, when btree_split() or garbage collection splits or rewrites a btree node, the parent changes for all its (potentially already cached) children.

I may change this later and add some code to look through the btree node cache and find all our cached child nodes and change the parent pointer then...
Signed-off-by: Kent Overstreet <kmo@daterainc.com>
2013-11-10  bcache: Remove unnecessary check in should_split()  (Kent Overstreet)
Checking i->seq was redundant, because since ages ago we always initialize the new bset when advancing b->written.
Signed-off-by: Kent Overstreet <kmo@daterainc.com>
2013-11-10  bcache: Stripe size isn't necessarily a power of two  (Kent Overstreet)
Originally I got this right... except that the divides didn't use do_div(), which broke 32 bit kernels. When I went to fix that, I forgot that the raid stripe size usually isn't a power of two... doh
Signed-off-by: Kent Overstreet <kmo@daterainc.com>
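The two pitfalls, illustrated (do_div() stores the quotient in place and returns the remainder):

    #include <asm/div64.h>

    static unsigned offset_in_stripe(uint64_t offset, unsigned stripe_size)
    {
            /* wrong unless stripe_size is a power of two:
             *      return offset & (stripe_size - 1);
             * and a plain '%' on a u64 doesn't link on 32-bit builds */
            return do_div(offset, stripe_size);
    }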
2013-11-10  bcache: Add on error panic/unregister setting  (Kent Overstreet)
Works kind of like the ext4 setting, to panic or remount read only on errors.
Signed-off-by: Kent Overstreet <kmo@daterainc.com>
2013-11-10  bcache: Use blkdev_issue_discard()  (Kent Overstreet)
The old asynchronous discard code was really a relic from when all the allocation code was asynchronous - now that allocation runs out of a dedicated thread there's no point in keeping around all that complicated machinery.
Signed-off-by: Kent Overstreet <kmo@daterainc.com>
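The stock helper being switched to; the bucket-to-sector math and error handling here are illustrative:

    #include <linux/blkdev.h>

    static void discard_bucket(struct cache *ca, sector_t start,
                               sector_t nr_sects)
    {
            /* blocks until the discard completes - fine now that
             * allocation runs out of a dedicated thread */
            if (blkdev_issue_discard(ca->bdev, start, nr_sects,
                                     GFP_KERNEL, 0))
                    pr_warn("bcache: discard failed\n");
    }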
2013-11-10  bcache: Fix a lockdep splat  (Kent Overstreet)
bch_keybuf_del() takes a spinlock that can't be taken in interrupt context - whoops. Fortunately, this code isn't enabled by default (you have to toggle a sysfs thing).
Signed-off-by: Kent Overstreet <kmo@daterainc.com>
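The general rule behind this class of splat, sketched (not the actual fix): a lock that can also be taken from completion/interrupt context must disable interrupts at every acquisition:

    #include <linux/spinlock.h>
    #include <linux/rbtree.h>

    static void keybuf_del_safe(struct keybuf *buf, struct keybuf_key *w)
    {
            unsigned long flags;

            spin_lock_irqsave(&buf->lock, flags);   /* not plain spin_lock() */
            rb_erase(&w->node, &buf->keys);         /* assumed layout */
            spin_unlock_irqrestore(&buf->lock, flags);
    }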
2013-11-10  bcache: Fix a journalling performance bug  (Kent Overstreet)
2013-11-10  bcache: Fix dirty_data accounting  (Kent Overstreet)
Dirty data accounting wasn't quite right - firstly, we were adding the key we're inserting after it could have merged with another dirty key already in the btree, and secondly we could sometimes pass the wrong offset to bcache_dev_sectors_dirty_add() for dirty data we were overwriting - which is important when tracking dirty data by stripe.
Signed-off-by: Kent Overstreet <kmo@daterainc.com>
Cc: linux-stable <stable@vger.kernel.org> # >= v3.10
2013-11-11  vlan: Implement vlan_dev_get_egress_qos_mask as an inline.  (David S. Miller)
This is to avoid very silly Kconfig dependencies for modules using this routine.
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-11  ixgbe: add warning when max_vfs is out of range.  (Jacob Keller)
The max_vfs parameter has a limit of 63 and silently fails (adding 0 vfs) when it is out of range. This patch adds a warning so that the user knows something went wrong. Also, this patch moves the warning in ixgbe_enable_sriov() to where max_vfs is checked, so that even an out of range value will show the deprecated warning. Previously, an out of range parameter didn't even warn the user to use the new sysfs interface instead.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-11  igb: Update link modes display in ethtool  (Carolyn Wyborny)
This patch fixes multiple problems in the link modes display in ethtool. Newer parts have more complicated methods to determine actual link capabilities. Older parts cannot communicate with their SFP modules. Finally, all the available defines are not displayed by ethtool. This updates the link modes to be as accurate as possible depending on what data is available to the driver at any given time.
Signed-off-by: Carolyn Wyborny <carolyn.wyborny@intel.com>
Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-11  netfilter: push reasm skb through instead of original frag skbs  (Jiri Pirko)
Pushing original fragments through causes several problems. For example, for matching, frags may not be matched correctly. Take the following example:

<example>
On HOSTA do:
ip6tables -I INPUT -p icmpv6 -j DROP
ip6tables -I INPUT -p icmpv6 -m icmp6 --icmpv6-type 128 -j ACCEPT

and on HOSTB you do:
ping6 HOSTA -s2000 (MTU is 1500)

Incoming echo requests will be filtered out on HOSTA. This issue does not occur with packets smaller than the MTU (where fragmentation does not happen).
</example>

As was discussed previously, the only correct solution seems to be to use the reassembled skb instead of separate frags. Doing this has positive side effects in reducing sk_buff by one pointer (nfct_reasm), and also the reasm dances in ipvs and conntrack can be removed.

Future plan is to remove net/ipv6/netfilter/nf_conntrack_reasm.c entirely and use code in net/ipv6/reassembly.c instead.
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Acked-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Marcelo Ricardo Leitner <mleitner@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-11  ip6_output: fragment outgoing reassembled skb properly  (Jiri Pirko)
If a reassembled packet would fit into the outdev MTU, it is not fragmented according to the original frag size and is sent as a single big packet. The second case is if the skb is gso; in that case, fragmentation does not happen according to the original frag size either. This patch fixes both cases.
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-11  virtio_scsi: verify if queue is broken after virtqueue_get_buf()  (Heinz Graalfs)
If virtqueue_get_buf() returns a NULL pointer, avoid a possibly endless loop by checking for a broken virtqueue.
Signed-off-by: Heinz Graalfs <graalfs@linux.vnet.ibm.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
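Shape of the resulting completion loop, simplified (complete_cmd() is a stand-in for the driver's per-buffer handling):

    #include <linux/virtio.h>

    static void vq_done(struct virtqueue *vq)
    {
            void *buf;
            unsigned int len;

            do {
                    virtqueue_disable_cb(vq);
                    while ((buf = virtqueue_get_buf(vq, &len)) != NULL)
                            complete_cmd(buf);
                    /* NULL plus a broken queue means no completion is
                     * ever coming - bail instead of looping forever */
                    if (unlikely(virtqueue_is_broken(vq)))
                            break;
            } while (!virtqueue_enable_cb(vq));
    }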
2013-11-11  f2fs: issue more large discard command  (Changman Lee)
o Changes from v1
  - Use find_next(_zero)_bit as suggested by jg.kim

When f2fs issues a discard command, if segments are contiguous, issue one larger discard command covering the adjacent segments.

** blktrace **
179,1    0     5859    42.619023770   971  C   D 131072 + 2097152 [0]
179,1    0    33665   108.840475468   971  C   D 2228224 + 2494464 [0]
179,1    0    33671   109.131616427   971  C   D 14909440 + 344064 [0]
179,1    0    33677   109.137100677   971  C   D 15261696 + 4096 [0]

Signed-off-by: Changman Lee <cm224.lee@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
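A sketch of the segment-gathering idea with the suggested helpers; the bitmap and issue_discard() names are assumptions:

    #include <linux/bitops.h>

    static void discard_dirty_ranges(unsigned long *dirty_map,
                                     unsigned long nsegs)
    {
            unsigned long start = 0, end;

            while (start < nsegs) {
                    start = find_next_bit(dirty_map, nsegs, start);
                    if (start >= nsegs)
                            break;
                    end = find_next_zero_bit(dirty_map, nsegs, start + 1);
                    /* one larger command instead of (end - start) small ones */
                    issue_discard(start, end - start);
                    start = end + 1;
            }
    }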
2013-11-11  Merge tag 'gfs2-merge-window' of git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-3.0-nmw  (Linus Torvalds)
Pull gfs2 updates from Steven Whitehouse:
 "The main feature of interest this time is quota updates. There are some clean ups and some patches to use the new generic lru list code.

  There is still plenty of scope for some further changes in due course - faster lookups of quota structures is very much on the todo list. Also, a start has been made towards the more tricky issue of using the generic lru code with glocks, but that will have to be completed in a subsequent merge window.

  The other, more minor feature, is that there have been a number of performance patches which relate to block allocation. In particular they will improve performance when the disk is nearly full"

* tag 'gfs2-merge-window' of git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-3.0-nmw:
  GFS2: Use generic list_lru for quota
  GFS2: Rename quota qd_lru_lock qd_lock
  GFS2: Use reflink for quota data cache
  GFS2: Use lockref for glocks
  GFS2: Protect quota sync generation
  GFS2: Inline qd_trylock into gfs2_quota_unlock
  GFS2: Make two similar quota code fragments into a function
  GFS2: Remove obsolete quota tunable
  GFS2: Move gfs2_icbit_munge into quota.c
  GFS2: Speed up starting point selection for block allocation
  GFS2: Add allocation parameters structure
  GFS2: Clean up reservation removal
  GFS2: fix dentry leaks
  GFS2: new function gfs2_rbm_incr
  GFS2: Introduce rbm field bii
  GFS2: Do not reset flags on active reservations
  GFS2: introduce bi_blocks for optimization
  GFS2: optimize rbm_from_block wrt bi_start
  GFS2: d_splice_alias() can't return error
2013-11-11  Merge branch 'gma500-next' of git://github.com/patjak/drm-gma500 into drm-next  (Dave Airlie)
SDVO support for minnowboard

* 'gma500-next' of git://github.com/patjak/drm-gma500:
  drm/gma500/mrst: Add SDVO to output init
  drm/gma500/mrst: Don't blindly guess a mode for LVDS
  drm/gma500/mrst: Setup GMBUS for oaktrail/mrst
  drm/gma500/mrst: Replace WMs and chickenbits with values from EMGD
  drm/gma500/mrst: Add aux register writes to SDVO
  drm/gma500/mrst: Properly route oaktrail hdmi hooks
  drm/gma500/mrst: Add aux register writes when programming pipe
  drm/gma500/mrst: Add SDVO clock calculation
  drm/gma500: Add aux device support for gmbus
  drm/gma500: Add support for aux pci vdc device
  drm/gma500: Add chip specific sdvo masks
  drm/gma500: Add Minnowboard to the IS_MRST() macro
2013-11-10  drm: shmob_drm: Convert to clk_prepare/unprepare  (Laurent Pinchart)
Turn clk_enable() and clk_disable() calls into clk_prepare_enable() and clk_disable_unprepare() to get ready for the migration to the common clock framework.
Cc: David Airlie <airlied@linux.ie>
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
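The conversion is mechanical - under the common clock framework a clock must be prepared (which may sleep) before it can be enabled, so the combined helpers are used. Sketch, with 'sdev->clock' standing in for the driver's clk:

    #include <linux/clk.h>

    static int shmob_clk_on(struct shmob_drm_device *sdev)
    {
            /* was: clk_enable(sdev->clock); */
            return clk_prepare_enable(sdev->clock);
    }

    static void shmob_clk_off(struct shmob_drm_device *sdev)
    {
            /* was: clk_disable(sdev->clock); */
            clk_disable_unprepare(sdev->clock);
    }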
2013-11-10  Merge tag 'bdw-stage1-2013-11-08-v2' of git://people.freedesktop.org/~danvet/drm-intel into drm-next  (Dave Airlie)
So here's the Broadwell pull request. From a kernel driver pov there's two areas with big changes in Broadwell:

- Completely new enumerated interrupt bits. On the plus side it now looks fairly uniform and sane.
- Completely new pagetable layout.

To ensure minimal impact on existing platforms we've refactored both the irq and low-level gtt handling code a lot in anticipation of the bdw push. So now bdw enabling in these areas just plugs in a bunch of vfuncs. Otherwise it's all fairly harmless adjusting of switch cases and if-ladders to shovel bdw into the right blocks. So minimized impact on existing platforms. I've also merged the bdw-stage1 branch into our -nightly integration branch for the past week to make sure we don't break anything.

Note that there's still quite a flurry of patches floating around, but I've figured I'll push this out. I plan to keep the bdw fixes separate from my usual -fixes stream so that you can reject them easily in case it still looks like too much churn. Also, bdw is for now hidden behind the preliminary hw enabling module option. So there's no real pressure to get follow-up patches all into 3.13.

* tag 'bdw-stage1-2013-11-08-v2' of git://people.freedesktop.org/~danvet/drm-intel: (75 commits)
  drm/i915: Mask the vblank interrupt on bdw by default
  drm/i915: Wire up cpu fifo underrun reporting support for bdw
  drm/i915: Optimize gen8_enable|disable_vblank functions
  drm/i915: Wire up pipe CRC support for bdw
  drm/i915: Wire up PCH interrupts for bdw
  drm/i915: Wire up port A aux channel
  drm/i915: Fix up the bdw pipe interrupt enable lists
  drm/i915: Optimize pipe irq handling on bdw
  drm/i915/bdw: Take render error interrupt out of the mask
  drm/i915/bdw: Add BDW PCH check first
  drm/i915: Use hsw_crt_get_config on BDW
  drm/i915/bdw: Change dp aux timeout to 600us on DDIA
  drm/i915/bdw: Enable trickle feed on Broadwell
  drm/i915/bdw: WaSingleSubspanDispatchOnAALinesAndPoints
  drm/i915/bdw: conservative SBE VUE cache mode
  drm/i915/bdw: Limit SDE poly depth FIFO to 2
  drm/i915/bdw: Sampler power bypass disable
  drm/i915/bdw: Disable centroid pixel perf optimization
  drm/i915/bdw: BWGTLB clock gate disable
  drm/i915/bdw: Implement edp PSR workarounds
  ...
2013-11-10  Merge branch 'drm-next-3.13' of git://people.freedesktop.org/~agd5f/linux into drm-next  (Dave Airlie)
A few more patches for 3.13. The big one here is Hawaii support. I wanted to get that out sooner, but was sick earlier this week. That said, it's mostly self contained, so it shouldn't impact other asics. The rest are just bug fixes and a merge fix.

* 'drm-next-3.13' of git://people.freedesktop.org/~agd5f/linux: (23 commits)
  Revert "drm/radeon/audio: don't set speaker allocation on DCE4+"
  drm/radeon/audio: improve ACR calculation
  drm/radeon/audio: correct ACR table
  drm/radeon: fix mismerge of drm-next with 3.12
  drm/radeon: add pci ids for hawaii
  drm/radeon: fill in radeon_asic_init for hawaii
  drm/radeon: modesetting updates for hawaii
  drm/radeon: atombios.h updates for hawaii
  drm/radeon: update cik_get_csb_buffer for hawaii
  drm/radeon: add hawaii dpm support
  drm/radeon/cik: add hawaii UVD support
  drm/radeon: update firmware loading for hawaii
  drm/radeon: update rb setup for hawaii
  drm/radeon: add golden register settings for hawaii
  drm/radeon: update cik_tiling_mode_table_init() for hawaii
  drm/radeon: minor updates to cik.c for hawaii
  drm/radeon: update cik_gpu_init() for hawaii
  drm/radeon: add Hawaii chip family
  drm/radeon: fix-up some float to fixed conversion thinkos
  drm/radeon: use HDP_MEM_COHERENCY_FLUSH_CNTL for sdma as well
  ...
2013-11-10  Merge branch 'msm-next' of git://people.freedesktop.org/~robclark/linux into drm-next  (Dave Airlie)
prime support, inactive rework, render nodes

* 'msm-next' of git://people.freedesktop.org/~robclark/linux:
  drm/msm/mdp4: page_flip cleanups/fixes
  drm/msm: EBUSY status handling in msm_gem_fault()
  drm/msm: rework inactive-work
  drm/msm: add plane support
  drm/msm: resync generated headers
  drm/msm: support render nodes
  drm/msm: prime support
2013-11-10  Merge tag 'fcoe-3.13' into for-linus  (James Bottomley)
Pull Request for 3.13 for FCOE tree.
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2013-11-09  ecryptfs: ->f_op is never NULL  (Al Viro)
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-11-10  drm/nouveau: fix 32-bit build  (Dave Airlie)
This uses the proper div macro.
Signed-off-by: Dave Airlie <airlied@redhat.com>
2013-11-09  dm cache: promotion optimisation for writes  (Joe Thornber)
If a write block triggers promotion and covers a whole block we can avoid a copy.

Introduce dm_{hook,unhook}_bio to simplify saving and restoring bio fields (bi_private is now used by overwrite). Switch writethrough support over to using these helpers too.
Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
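A plausible shape for the dm_{hook,unhook}_bio helpers described above - stash the bio's completion fields, redirect them, restore them later:

    #include <linux/bio.h>

    struct dm_hook_info {
            bio_end_io_t *bi_end_io;
            void *bi_private;
    };

    static void dm_hook_bio(struct dm_hook_info *h, struct bio *bio,
                            bio_end_io_t *end_io, void *private)
    {
            h->bi_end_io = bio->bi_end_io;
            h->bi_private = bio->bi_private;
            bio->bi_end_io = end_io;        /* e.g. an overwrite endio */
            bio->bi_private = private;
    }

    static void dm_unhook_bio(struct dm_hook_info *h, struct bio *bio)
    {
            bio->bi_end_io = h->bi_end_io;
            bio->bi_private = h->bi_private;
    }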
2013-11-09  dm cache: be much more aggressive about promoting writes to discarded blocks  (Joe Thornber)
Previously these promotions only got priority if there were unused cache blocks. Now we give them priority if there are any clean blocks in the cache.

The fio_soak_test in the device-mapper-test-suite now gives uniform performance across subvolumes (~16 seconds).
Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2013-11-09  dm cache policy mq: implement writeback_work() and mq_{set,clear}_dirty()  (Joe Thornber)
There are now two multiqueues for in-cache blocks: a clean one and a dirty one. writeback_work comes from the dirty one; demotions come from the clean one.

There are two benefits:
- Performance improvement, since demoting a clean block is a noop.
- The cache cleans itself when io load is light.
Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2013-11-09  dm cache: optimize commit_if_needed  (Heinz Mauelshagen)
Check the commit_requested flag _before_ calling dm_cache_changed_this_transaction(), to avoid the superfluous call. Also, be sure to set last_commit_jiffies _after_ dm_cache_commit() completes.
Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com>
Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2013-11-09  dm space map disk: optimise sm_disk_dec_block  (Joe Thornber)
Don't waste time spotting blocks that have been allocated and then freed in the same transaction. The extra lookup is expensive, and I don't think it really gives us much.
Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2013-11-09  MAINTAINERS: add reference to device-mapper's linux-dm.git tree  (Mike Snitzer)
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2013-11-09  dm: fix Kconfig menu indentation  (Mikulas Patocka)
The option DM_LOG_USERSPACE is a sub-option of DM_MIRROR, so place it right after DM_MIRROR. Doing so fixes various other Device mapper targets/features to be properly nested under "Device mapper support".
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>