linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2013-03-24	bnx2x: fix assignment of signed expression to unsigned variable	Kumar Amit Mehta
	fix for incorrect assignment of signed expression to unsigned variable. Signed-off-by: Kumar Amit Mehta <gmate.amit@gmail.com> Acked-by: Dmitry Kravkov <dmitry@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-24	bridge: fix crash when set mac address of br interface	Hong zhi guo
	When I tried to set mac address of a bridge interface to a mac address which already learned on this bridge, I got system hang. The cause is straight forward: function br_fdb_change_mac_address calls fdb_insert with NULL source nbp. Then an fdb lookup is performed. If an fdb entry is found and it's local, it's OK. But if it's not local, source is dereferenced for printk without NULL check. Signed-off-by: Hong Zhiguo <honkiko@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-24	8021q: fix a potential use-after-free	Cong Wang
	vlan_vid_del() could possibly free ->vlan_info after a RCU grace period, however, we may still refer to the freed memory area by 'grp' pointer. Found by code inspection. This patch moves vlan_vid_del() as behind as possible. Cc: Patrick McHardy <kaber@trash.net> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Cong Wang <amwang@redhat.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-24	net: remove a WARN_ON() in net_enable_timestamp()	Eric Dumazet
	The WARN_ON(in_interrupt()) in net_enable_timestamp() can get false positive, in socket clone path, run from softirq context : [ 3641.624425] WARNING: at net/core/dev.c:1532 net_enable_timestamp+0x7b/0x80() [ 3641.668811] Call Trace: [ 3641.671254] <IRQ> [<ffffffff80286817>] warn_slowpath_common+0x87/0xc0 [ 3641.677871] [<ffffffff8028686a>] warn_slowpath_null+0x1a/0x20 [ 3641.683683] [<ffffffff80742f8b>] net_enable_timestamp+0x7b/0x80 [ 3641.689668] [<ffffffff80732ce5>] sk_clone_lock+0x425/0x450 [ 3641.695222] [<ffffffff8078db36>] inet_csk_clone_lock+0x16/0x170 [ 3641.701213] [<ffffffff807ae449>] tcp_create_openreq_child+0x29/0x820 [ 3641.707663] [<ffffffff807d62e2>] ? ipt_do_table+0x222/0x670 [ 3641.713354] [<ffffffff807aaf5b>] tcp_v4_syn_recv_sock+0xab/0x3d0 [ 3641.719425] [<ffffffff807af63a>] tcp_check_req+0x3da/0x530 [ 3641.724979] [<ffffffff8078b400>] ? inet_hashinfo_init+0x60/0x80 [ 3641.730964] [<ffffffff807ade6f>] ? tcp_v4_rcv+0x79f/0xbe0 [ 3641.736430] [<ffffffff807ab9bd>] tcp_v4_do_rcv+0x38d/0x4f0 [ 3641.741985] [<ffffffff807ae14a>] tcp_v4_rcv+0xa7a/0xbe0 Its safe at this point because the parent socket owns a reference on the netstamp_needed, so we cant have a 0 -> 1 transition, which requires to lock a mutex. Instead of refining the check, lets remove it, as all known callers are safe. If it ever changes in the future, static_key_slow_inc() will complain anyway. Reported-by: Laurent Chavey <chavey@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-24	Merge tag 'pinctrl-fixes-for-v3.9' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl Pull pinctrl fixes from Linus Walleij: "Here are a few pinctrl fixes for the v3.9 rc series: - Usecount bounds checking so we do not go below zero on mux usecounts. - Loop range checking in GPIO ranges in the DT range parser. - Proper print in debugfs for pinconf state. - Fix compilation bug in generic pinconf code. - Minor bugfixes to abx500 and mvebu drivers." * tag 'pinctrl-fixes-for-v3.9' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl: pinmux: forbid mux_usecount to be set at UINT_MAX pinctrl: mvebu: fix checking for SoC specific controls pinctrl: generic: Fix compilation error pinctrl: Print the correct information in debugfs pinconf-state file pinctrl: abx500: Fix checking if pin use AlternateFunction register gpio: fix wrong checking condition for gpio range
2013-03-24	Merge branch 'x86/urgent' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Peter Anvin: "A collection of minor fixes, more EFI variables paranoia (anti-bricking) plus the ability to disable the pstore either as a runtime default or completely, due to bricking concerns." * 'x86/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: efivars: Fix check for CONFIG_EFI_VARS_PSTORE_DEFAULT_DISABLE x86, microcode_intel_early: Mark apply_microcode_early() as cpuinit efivars: Handle duplicate names from get_next_variable() efivars: explicitly calculate length of VariableName efivars: Add module parameter to disable use as a pstore backend efivars: Allow disabling use as a pstore backend x86-32, microcode_intel_early: Fix crash with CONFIG_DEBUG_VIRTUAL x86-64: Fix the failure case in copy_user_handle_tail()
2013-03-24	Revert "drm/i915: write backlight harder"	Daniel Vetter
	This reverts commit cf0a6584aa6d382f802f2c3cacac23ccbccde0cd. Turns out that cargo-culting breaks systems. Note that we can't revert further, since commit 770c12312ad617172b1a65b911d3e6564fc5aca8 Author: Takashi Iwai <tiwai@suse.de> Date: Sat Aug 11 08:56:42 2012 +0200 drm/i915: Fix blank panel at reopening lid fixed a regression in 3.6-rc kernels for which we've never figured out the exact root cause. But some further inspection of the backlight code reveals that it's seriously lacking locking. And especially the asle backlight update is know to get fired (through some smm magic) when writing specific backlight control registers. So the possibility of suffering from races is rather real. Until those races are fixed I don't think it makes sense to try further hacks. Which sucks a bit, but sometimes that's how it is :( References: http://www.mail-archive.com/intel-gfx@lists.freedesktop.org/msg18788.html Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=47941 Tested-by: Takashi Iwai <tiwai@suse.de> Cc: Jani Nikula <jani.nikula@intel.com> Cc: Takashi Iwai <tiwai@suse.de> Cc: stable@vger.kernel.org (the reverted commit was cc: stable, too) Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-03-24	drm/i915: don't disable the power well yet	Paulo Zanoni
	We're still not 100% ready to disable the power well, so don't disable it for now. When we disable it we break the audio driver (because some of the audio registers are on the power well) and machines with eDP on port D (because it doesn't use TRANSCODER_EDP). Also, instead of just reverting the code, add a Kernel option to let us disable it if we want. This will allow us to keep developing and testing the feature while it's not enabled. This fixes problems caused by the following commit: commit d6dd9eb1d96d2b7345fe4664066c2b7ed86da898 Author: Daniel Vetter <daniel.vetter@ffwll.ch> Date: Tue Jan 29 16:35:20 2013 -0200 drm/i915: dynamic Haswell display power well support References: http://www.mail-archive.com/intel-gfx@lists.freedesktop.org/msg18788.html Cc: Takashi Iwai <tiwai@suse.de> Cc: Mengdong Lin <mengdong.lin@intel.com> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-03-24	Revert "drm/i915: set TRANSCODER_EDP even earlier"	Daniel Vetter
	This reverts commit cc464b2a17c59adedbdc02cc54341d630354edc3. The reason is that Takashi Iwai reported a regression bisected to this commit: http://www.mail-archive.com/intel-gfx@lists.freedesktop.org/msg18788.html His machine has eDP on port D (usual desktop all-in-on setup), which intel_dp.c identifies as an eDP panel, but the hsw ddi code mishandles. Closer inspection of the code reveals that haswell_crtc_mode_set also checks intel_encoder_is_pch_edp when setting is_cpu_edp. On haswell that doesn't make much sense (since there's no edp on the pch), but what this function _really_ checks is whether that edp connector is on port A or port D. It's just that on ilk-ivb port D was on the pch ... So that explains why this seemingly innocent change killed eDP on port D. Furthermore it looks like everything else accidentally works, since we've never enabled eDP on port D support for hsw intentionally (e.g. we still register the HDMI output for port D in that case). But in retrospective I also don't like that this leaks highly platform specific details into common code, and the reason is that the drm vblank layer sucks. So instead I think we should: - move the cpu_transcoder into the dynamic pipe_config tracking (once that's merged). - fix up the drm vblank layer to finally deal with kms crtc objects instead of int pipes. v2: Pimp commit message with the better diagnosis as discussed with Paulo on irc. Cc: Paulo Zanoni <paulo.r.zanoni@intel.com> Cc: Takashi Iwai <tiwai@suse.de> Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-03-23	Merge tag 'efi-for-3.9-rc4' into x86/urgent	H. Peter Anvin
	Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-03-23	Linux 3.9-rc4v3.9-rc4	Linus Torvalds

2013-03-23	Merge git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending	Linus Torvalds
	Pull SCSI target fixes from Nicholas Bellinger: "These are mostly minor fixes this time around. The iscsi-target CHAP big-endian bugfix and bump FD_MAX_SECTORS=2048 default patch to allow 1MB sized I/Os for FILEIO backends on >= v3.5 code are both CC'ed to stable. Also, there is a persistent reservations regression that has recently been reported for >= v3.8.x code, that is currently being tracked down for v3.9." * git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: target/pscsi: Reject cross page boundary case in pscsi_map_sg target/file: Bump FD_MAX_SECTORS to 2048 to handle 1M sized I/Os tcm_vhost: Flush vhost_work in vhost_scsi_flush() tcm_vhost: Add missed lock in vhost_scsi_clear_endpoint() target: fix possible memory leak in core_tpg_register() target/iscsi: Fix mutual CHAP auth on big-endian arches target_core_sbc: use noop for SYNCHRONIZE_CACHE
2013-03-23	Merge tag 'md-3.9-fixes' of git://neil.brown.name/md	Linus Torvalds
	Pull md fixes from NeilBrown: "A few bugfixes for md - recent regressions in raid5 - recent regressions in dmraid - a few instances of CONFIG_MULTICORE_RAID456 linger Several tagged for -stable" * tag 'md-3.9-fixes' of git://neil.brown.name/md: md: remove CONFIG_MULTICORE_RAID456 entirely md/raid5: ensure sync and DISCARD don't happen at the same time. MD: Prevent sysfs operations on uninitialized kobjects MD RAID5: Avoid accessing gendisk or queue structs when not available md/raid5: schedule_construction should abort if nothing to do.
2013-03-23	bio-integrity: Add explicit field for owner of bip_buf	Kent Overstreet
	This was the only real user of BIO_CLONED, which didn't have very clear semantics. Convert to its own flag so we can get rid of BIO_CLONED. Signed-off-by: Kent Overstreet <koverstreet@google.com> CC: Jens Axboe <axboe@kernel.dk> CC: Martin K. Petersen <martin.petersen@oracle.com>
2013-03-23	block: Add an explicit bio flag for bios that own their bvec	Kent Overstreet
	This is for the new bio splitting code. When we split a bio, if the split occured on a bvec boundry we reuse the bvec for the new bio. But that means bio_free() can't free it, hence the explicit flag. Signed-off-by: Kent Overstreet <koverstreet@google.com> CC: Jens Axboe <axboe@kernel.dk> Acked-by: Tejun Heo <tj@kernel.org>
2013-03-23	block: Add bio_alloc_pages()	Kent Overstreet
	More utility code to replace stuff that's getting open coded. Signed-off-by: Kent Overstreet <koverstreet@google.com> CC: Jens Axboe <axboe@kernel.dk> CC: NeilBrown <neilb@suse.de>
2013-03-23	block: Convert some code to bio_for_each_segment_all()	Kent Overstreet
	More prep work for immutable bvecs: A few places in the code were either open coding or using the wrong version - fix. After we introduce the bvec iter, it'll no longer be possible to modify the biovec through bio_for_each_segment_all() - it doesn't increment a pointer to the current bvec, you pass in a struct bio_vec (not a pointer) which is updated with what the current biovec would be (taking into account bi_bvec_done and bi_size). So because of that it's more worthwhile to be consistent about bio_for_each_segment()/bio_for_each_segment_all() usage. Signed-off-by: Kent Overstreet <koverstreet@google.com> CC: Jens Axboe <axboe@kernel.dk> CC: NeilBrown <neilb@suse.de> CC: Alasdair Kergon <agk@redhat.com> CC: dm-devel@redhat.com CC: Alexander Viro <viro@zeniv.linux.org.uk>
2013-03-23	block: Add bio_for_each_segment_all()	Kent Overstreet
	__bio_for_each_segment() iterates bvecs from the specified index instead of bio->bv_idx. Currently, the only usage is to walk all the bvecs after the bio has been advanced by specifying 0 index. For immutable bvecs, we need to split these apart; bio_for_each_segment() is going to have a different implementation. This will also help document the intent of code that's using it - bio_for_each_segment_all() is only legal to use for code that owns the bio. Signed-off-by: Kent Overstreet <koverstreet@google.com> CC: Jens Axboe <axboe@kernel.dk> CC: Neil Brown <neilb@suse.de> CC: Boaz Harrosh <bharrosh@panasas.com>
2013-03-23	bounce: Refactor __blk_queue_bounce to not use bi_io_vec	Kent Overstreet
	A bunch of what __blk_queue_bounce() was doing was problematic for the immutable bvec work; this cleans that up and the code is quite a bit smaller, too. The __bio_for_each_segment() in copy_to_high_bio_irq() was changed because that one's looping over the original bio, not the bounce bio - a later patch renames __bio_for_each_segment() -> bio_for_each_segment_all(), and documents that bio_for_each_segment_all() is only for code that owns the bio. Signed-off-by: Kent Overstreet <koverstreet@google.com> CC: Jens Axboe <axboe@kernel.dk>
2013-03-23	raid1: use bio_copy_data()	Kent Overstreet
	This doesn't really delete any code _yet_, but once immutable bvecs are done we can just delete the rest of the code in that loop. Signed-off-by: Kent Overstreet <koverstreet@google.com> CC: Jens Axboe <axboe@kernel.dk> CC: NeilBrown <neilb@suse.de>
2013-03-23	pktcdvd: Use bio_reset() in disabled code to kill bi_idx usage	Kent Overstreet
	In the short term this'll help with code auditing, and if this code ever gets used now it's converted :) Signed-off-by: Kent Overstreet <koverstreet@google.com> CC: Jiri Kosina <jkosina@suse.cz>
2013-03-23	pktcdvd: use bio_copy_data()	Kent Overstreet
	Signed-off-by: Kent Overstreet <koverstreet@google.com> CC: Jens Axboe <axboe@kernel.dk> CC: Jiri Kosina <jkosina@suse.cz>
2013-03-23	block: Add bio_copy_data()	Kent Overstreet
	This gets open coded quite a bit and it's tricky to get right, so make a generic version and convert some existing users over to it instead. Signed-off-by: Kent Overstreet <koverstreet@google.com> CC: Jens Axboe <axboe@kernel.dk>
2013-03-23	raid1: Refactor narrow_write_error() to not use bi_idx	Kent Overstreet
	More bi_idx removal. This code was just open coding bio_clone(). This could probably be further improved by using bio_advance() instead of skipping over null pages, but that'd be a larger rework. Signed-off-by: Kent Overstreet <koverstreet@google.com> CC: Jens Axboe <axboe@kernel.dk> CC: NeilBrown <neilb@suse.de>
2013-03-23	raid5: use bio_reset()	Kent Overstreet
	Had to shuffle the code around a bit (where bi_rw and bi_end_io were set), but shouldn't really be anything tricky here Signed-off-by: Kent Overstreet <koverstreet@google.com> CC: Jens Axboe <axboe@kernel.dk> CC: NeilBrown <neilb@suse.de>
2013-03-23	raid1: use bio_reset()	Kent Overstreet
	Signed-off-by: Kent Overstreet <koverstreet@google.com> CC: Jens Axboe <axboe@kernel.dk> CC: NeilBrown <neilb@suse.de>
2013-03-23	raid10: Use bio_reset()	Kent Overstreet
	More prep work for immutable bio vecs, mainly getting rid of references to bi_idx. bio_reset was being open coded in a few places. The one in sync_request was a bit nontrivial to convert, so could use some extra eyeballs. Signed-off-by: Kent Overstreet <koverstreet@google.com> CC: Jens Axboe <axboe@kernel.dk> CC: NeilBrown <neilb@suse.de> Acked-by: NeilBrown <neilb@suse.de>
2013-03-23	block: Add submit_bio_wait(), remove from md	Kent Overstreet
	Random cleanup - this code was duplicated and it's not really specific to md. Also added the ability to return the actual error code. Signed-off-by: Kent Overstreet <koverstreet@google.com> CC: Jens Axboe <axboe@kernel.dk> CC: NeilBrown <neilb@suse.de> Acked-by: Tejun Heo <tj@kernel.org>
2013-03-23	block: Remove some unnecessary bi_vcnt usage	Kent Overstreet
	More prep work for immutable bvecs/effecient bio splitting - usage of bi_vcnt has to be auditing, so getting rid of all the unnecessary usage makes that easier. Plus, bio_segments() is really what this code wanted, as it respects the current value of bi_idx. Signed-off-by: Kent Overstreet <koverstreet@google.com> CC: Jens Axboe <axboe@kernel.dk> CC: Eric Moore <Eric.Moore@lsi.com> CC: "James E.J. Bottomley" <JBottomley@parallels.com> CC: linux-scsi@vger.kernel.org
2013-03-23	block: Remove bi_idx references	Kent Overstreet
	For immutable bvecs, all bi_idx usage needs to be audited - so here we're removing all the unnecessary uses. Most of these are places where it was being initialized on a bio that was just allocated, a few others are conversions to standard macros. Signed-off-by: Kent Overstreet <koverstreet@google.com> CC: Jens Axboe <axboe@kernel.dk>
2013-03-23	block: Change bio_split() to respect the current value of bi_idx	Kent Overstreet
	In the current code bio_split() won't be seeing partially completed bios so this doesn't change any behaviour, but this makes the code a bit clearer as to what bio_split() actually requires. The immediate purpose of the patch is removing unnecessary bi_idx references, but the end goal is to allow partial completed bios to be submitted, which along with immutable biovecs enables effecient bio splitting. Some of the callers were (double) checking that bios could be split, so update their checks too. Signed-off-by: Kent Overstreet <koverstreet@google.com> CC: Jens Axboe <axboe@kernel.dk> CC: Lars Ellenberg <drbd-dev@lists.linbit.com> CC: Neil Brown <neilb@suse.de> CC: Martin K. Petersen <martin.petersen@oracle.com>
2013-03-23	block: Use bio_sectors() more consistently	Kent Overstreet
	Bunch of places in the code weren't using it where they could be - this'll reduce the size of the patch that puts bi_sector/bi_size/bi_idx into a struct bvec_iter. Signed-off-by: Kent Overstreet <koverstreet@google.com> CC: Jens Axboe <axboe@kernel.dk> CC: "Ed L. Cashin" <ecashin@coraid.com> CC: Nick Piggin <npiggin@kernel.dk> CC: Jiri Kosina <jkosina@suse.cz> CC: Jim Paris <jim@jtan.com> CC: Geoff Levand <geoff@infradead.org> CC: Alasdair Kergon <agk@redhat.com> CC: dm-devel@redhat.com CC: Neil Brown <neilb@suse.de> CC: Steven Rostedt <rostedt@goodmis.org> Acked-by: Ed Cashin <ecashin@coraid.com>
2013-03-23	block: Add bio_end_sector()	Kent Overstreet
	Just a little convenience macro - main reason to add it now is preparing for immutable bio vecs, it'll reduce the size of the patch that puts bi_sector/bi_size/bi_idx into a struct bvec_iter. Signed-off-by: Kent Overstreet <koverstreet@google.com> CC: Jens Axboe <axboe@kernel.dk> CC: Lars Ellenberg <drbd-dev@lists.linbit.com> CC: Jiri Kosina <jkosina@suse.cz> CC: Alasdair Kergon <agk@redhat.com> CC: dm-devel@redhat.com CC: Neil Brown <neilb@suse.de> CC: Martin Schwidefsky <schwidefsky@de.ibm.com> CC: Heiko Carstens <heiko.carstens@de.ibm.com> CC: linux-s390@vger.kernel.org CC: Chris Mason <chris.mason@fusionio.com> CC: Steven Whitehouse <swhiteho@redhat.com> Acked-by: Steven Whitehouse <swhiteho@redhat.com>
2013-03-23	md: Convert md_trim_bio() to use bio_advance()	Kent Overstreet
	Signed-off-by: Kent Overstreet <koverstreet@google.com> CC: Jens Axboe <axboe@kernel.dk> CC: NeilBrown <neilb@suse.de> Acked-by: NeilBrown <neilb@suse.de>
2013-03-23	block: Refactor blk_update_request()	Kent Overstreet
	Converts it to use bio_advance(), simplifying it quite a bit in the process. Note that req_bio_endio() now always calls bio_advance() - which means it always loops over the biovec, not just on partial completions. Don't expect it to affect performance, but worth noting. Tested it by forcing partial updates, and dumping before and after on various bio/bvec fields when doing a partial update. Signed-off-by: Kent Overstreet <koverstreet@google.com> CC: Jens Axboe <axboe@kernel.dk>
2013-03-23	block: Add bio_advance()	Kent Overstreet
	This is prep work for immutable bio vecs; we first want to centralize where bvecs are modified. Next two patches convert some existing code to use this function. Signed-off-by: Kent Overstreet <koverstreet@google.com> CC: Jens Axboe <axboe@kernel.dk>
2013-03-23	block: Convert integrity to bvec_alloc_bs()	Kent Overstreet
	This adds a pointer to the bvec array to struct bio_integrity_payload, instead of the bvecs always being inline; then the bvecs are allocated with bvec_alloc_bs(). Changed bvec_alloc_bs() and bvec_free_bs() to take a pointer to a mempool instead of the bioset, so that bio integrity can use a different mempool for its bvecs, and thus avoid a potential deadlock. This is eventually for immutable bio vecs - immutable bvecs aren't useful if we still have to copy them, hence the need for the pointer. Less code is always nice too, though. Also, bio_integrity_alloc() was using fs_bio_set if no bio_set was specified. This was wrong - using the bio_set doesn't protect us from memory allocation failures, because we just used kmalloc for the bio_integrity_payload. But it does introduce the possibility of deadlock, if for some reason we weren't supposed to be using fs_bio_set. Signed-off-by: Kent Overstreet <koverstreet@google.com> CC: Jens Axboe <axboe@kernel.dk> CC: Martin K. Petersen <martin.petersen@oracle.com>
2013-03-23	block: Fix a buffer overrun in bio_integrity_split()	Kent Overstreet
	bio_integrity_split() seemed to be confusing pointers and arrays - bip_vec in bio_integrity_payload was an array appended to the end of the payload, so the bio_vecs in struct bio_pair should have come after the bio_integrity_payload they're for. Fix it by making bip_vec a pointer to the inline vecs - a later patch is going to make more use of this pointer. Signed-off-by: Kent Overstreet <koverstreet@google.com> CC: Jens Axboe <axboe@kernel.dk> CC: Martin K. Petersen <martin.petersen@oracle.com>
2013-03-23	block: Avoid deadlocks with bio allocation by stacking drivers	Kent Overstreet
	Previously, if we ever try to allocate more than once from the same bio set while running under generic_make_request() (i.e. a stacking block driver), we risk deadlock. This is because of the code in generic_make_request() that converts recursion to iteration; any bios we submit won't actually be submitted (so they can complete and eventually be freed) until after we return - this means if we allocate a second bio, we're blocking the first one from ever being freed. Thus if enough threads call into a stacking block driver at the same time with bios that need multiple splits, and the bio_set's reserve gets used up, we deadlock. This can be worked around in the driver code - we could check if we're running under generic_make_request(), then mask out __GFP_WAIT when we go to allocate a bio, and if the allocation fails punt to workqueue and retry the allocation. But this is tricky and not a generic solution. This patch solves it for all users by inverting the previously described technique. We allocate a rescuer workqueue for each bio_set, and then in the allocation code if there are bios on current->bio_list we would be blocking, we punt them to the rescuer workqueue to be submitted. This guarantees forward progress for bio allocations under generic_make_request() provided each bio is submitted before allocating the next, and provided the bios are freed after they complete. Note that this doesn't do anything for allocation from other mempools. Instead of allocating per bio data structures from a mempool, code should use bio_set's front_pad. Tested it by forcing the rescue codepath to be taken (by disabling the first GFP_NOWAIT) attempt, and then ran it with bcache (which does a lot of arbitrary bio splitting) and verified that the rescuer was being invoked. Signed-off-by: Kent Overstreet <koverstreet@google.com> CC: Jens Axboe <axboe@kernel.dk> Acked-by: Tejun Heo <tj@kernel.org> Reviewed-by: Muthukumar Ratty <muthur@gmail.com>
2013-03-23	block: Reorder struct bio_set	Kent Overstreet
	This is prep work for the next patch, which embeds a struct bio_list in struct bio_set. Signed-off-by: Kent Overstreet <koverstreet@google.com> CC: Jens Axboe <axboe@kernel.dk>
2013-03-23	Merge tag 'upstream-linus' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev Pull libata updates from Jeff Garzik: "Simple stuff. See one-line summaries." * tag 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev: pata_samsung_cf: use module_platform_driver_probe() [libata] Avoid specialized TLA's in ZPODD's Kconfig libata-acpi.c: fix copy and paste mistake in ata_acpi_register_power_resource sata_fsl: Remove redundant NULL check before kfree ahci: Add Device IDs for Intel Wellsburg PCH ata_piix: Add MODULE_PARM_DESC to prefer_ms_hyperv
2013-03-23	Merge branch 'i2c/for-current' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux Pull i2c fixes from Wolfram Sang: "One bugfix for the tegra driver. Two updates regarding email addresses and MAINTAINERS which I like to have up-to-date so people can be reached immediately. While we are here, there is on PCI_ID addition." * 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: MAINTAINERS: add maintainer entry for atmel i2c driver i2c: Fix my e-mail address in drivers and documentation i2c: iSMT: add Intel Avoton DeviceIDs i2c: tegra: check the clk_prepare_enable() return value
2013-03-23	Merge git://www.linux-watchdog.org/linux-watchdog	Linus Torvalds
	Pull watchdog fixes from Wim Van Sebroeck: "Fix a boot issues and correct the AcpiMmioSel bitmask in the sp5100_tco watchdog device driver" * git://www.linux-watchdog.org/linux-watchdog: watchdog: sp5100_tco: Set the AcpiMmioSel bitmask value to 1 instead of 2 watchdog: sp5100_tco: Remove code that may cause a boot failure
2013-03-23	KMS: fix EDID detailed timing frame rate	Torsten Duwe
	When KMS has parsed an EDID "detailed timing", it leaves the frame rate zeroed. Consecutive (debug-) output of that mode thus yields 0 for vsync. This simple fix also speeds up future invocations of drm_mode_vrefresh(). While it is debatable whether this qualifies as a -stable fix I'd apply it for consistency's sake; drm_helper_probe_single_connector_modes() does the same thing already for all probed modes. Cc: stable@vger.kernel.org Signed-off-by: Torsten Duwe <duwe@lst.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-03-23	KMS: fix EDID detailed timing vsync parsing	Torsten Duwe
	EDID spreads some values across multiple bytes; bit-fiddling is needed to retrieve these. The current code to parse "detailed timings" has a cut&paste error that results in a vsync offset of at most 15 lines instead of 63. See http://en.wikipedia.org/wiki/EDID and in the "EDID Detailed Timing Descriptor" see bytes 10+11 show why that needs to be a left shift. Cc: stable@vger.kernel.org Signed-off-by: Torsten Duwe <duwe@lst.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-03-22	block: implement runtime pm strategy	Lin Ming
	When a request is added: If device is suspended or is suspending and the request is not a PM request, resume the device. When the last request finishes: Call pm_runtime_mark_last_busy(). When pick a request: If device is resuming/suspending, then only PM request is allowed to go. The idea and API is designed by Alan Stern and described here: http://marc.info/?l=linux-scsi&m=133727953625963&w=2 Signed-off-by: Lin Ming <ming.m.lin@intel.com> Signed-off-by: Aaron Lu <aaron.lu@intel.com> Acked-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2013-03-22	block: add runtime pm helpers	Lin Ming
	Add runtime pm helper functions: void blk_pm_runtime_init(struct request_queue q, struct device dev) - Initialization function for drivers to call. int blk_pre_runtime_suspend(struct request_queue q) - If any requests are in the queue, mark last busy and return -EBUSY. Otherwise set q->rpm_status to RPM_SUSPENDING and return 0. void blk_post_runtime_suspend(struct request_queue q, int err) - If the suspend succeeded then set q->rpm_status to RPM_SUSPENDED. Otherwise set it to RPM_ACTIVE and mark last busy. void blk_pre_runtime_resume(struct request_queue q) - Set q->rpm_status to RPM_RESUMING. void blk_post_runtime_resume(struct request_queue q, int err) - If the resume succeeded then set q->rpm_status to RPM_ACTIVE and call __blk_run_queue, then mark last busy and autosuspend. Otherwise set q->rpm_status to RPM_SUSPENDED. The idea and API is designed by Alan Stern and described here: http://marc.info/?l=linux-scsi&m=133727953625963&w=2 Signed-off-by: Lin Ming <ming.m.lin@intel.com> Signed-off-by: Aaron Lu <aaron.lu@intel.com> Acked-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2013-03-22	block: add a flag to identify PM request	Lin Ming
	Add a flag REQ_PM to identify the request is PM related, such requests will not change the device request queue's runtime status. It is intended to be used in driver's runtime PM callback, so that driver can perform some IO to the device there with the queue's runtime status unaffected. e.g. in SCSI disk's runtime suspend callback, the disk will be put into stopped power state, and this require sending a command to the device. Such command processing should not change the disk's runtime status. Signed-off-by: Lin Ming <ming.m.lin@intel.com> Signed-off-by: Aaron Lu <aaron.lu@intel.com> Acked-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2013-03-22	Merge branches 'cxgb4', 'ipoib' and 'qib' into for-next	Roland Dreier

2013-03-22	IB/qib: change QLogic to Intel	Vinit Agnihotri
	These changes modify the qib driver as part of acquiring the InfiniBand assets of QLogic. Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Reviewed-by: Dean Luick <dean.luick@intel.com> Signed-off-by: Vinit Agnihotri <vinit.abhay.agnihotri@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>