linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2015-03-03	drm: rcar-du: Switch plane update to atomic helpers	Laurent Pinchart
	This removes the legacy plane update code. Wire up the default atomic check and atomic commit mode config helpers as needed by the plane update atomic helpers. Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
2015-03-03	drm: rcar-du: Rework CRTC enable/disable for atomic updates	Laurent Pinchart
	When using atomic updates the CRTC .enable() and .disable() helper operations are preferred over the (then legacy) .prepare() and .commit() operations. Implement .enable() and rework .disable() to not depend on DPMS, easing DPMS removal later on. Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
2015-03-03	drm: rcar-du: Rework HDMI encoder enable/disable for atomic updates	Laurent Pinchart
	When using atomic updates the encoder .enable() and .disable() helper operations are preferred over the (then legacy) .prepare() and .commit() operations. Implement .enable() and .disable() and rework .prepare(), .commit() and .dpms() as wrappers around .enable() and .disable(), easing their future removal. Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
2015-03-03	drm: rcar-du: Rework encoder enable/disable for atomic updates	Laurent Pinchart
	When using atomic updates the encoder .enable() and .disable() helper operations are preferred over the (then legacy) .prepare() and .commit() operations. Implement .enable() and .disable() and rework .prepare(), .commit() and .dpms() as wrappers around .enable() and .disable(), easing their future removal. Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
2015-03-03	drm: rcar-du: Replace LVDS encoder DPMS by enable/disable	Laurent Pinchart
	The LVDS encoder doesn't support DPMS states, replace the DPMS operation by enable/disable to avoid propagating DPMS states down to the encoder code. Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
2015-03-03	drm: rcar-du: Remove private copy of plane size and position	Laurent Pinchart
	The plane source and destination size and positions are stored in the plane state, and a private copy is kept in the rcar_du_plane objects. Remove the private copy as it just duplicates the state. Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
2015-03-03	drm: rcar-du: Wire up atomic state object scaffolding	Laurent Pinchart
	Hook up the default .reset(), .atomic_duplicate_state() and .atomic_free_state() helpers to ensure that state objects are properly created and destroyed, and call drm_mode_config_reset() at init time to create the initial state objects. Framebuffer reference count also gets maintained automatically by the transitional helpers except for the legacy page flip operation. Maintain it explicitly there. Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
2015-03-03	drm: rcar-du: Handle primary plane config through atomic plane ops	Laurent Pinchart
	Use the new CRTC atomic transitional helpers drm_helper_crtc_mode_set() and drm_helper_crtc_mode_set_base() to implement the CRTC .mode_set and .mode_set_base operations. This delegates primary plane configuration to the plane .atomic_update and .atomic_disable operations, removing duplicate code from the CRTC implementation. There is now no code path available to the driver in which to drop the reference to the CRTC acquired in the .prepare() operation if an error then occurs. The driver thus now leaks a reference if an error occurs during mode set. So be it, this will be fixed in a further step of the atomic update transition. Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
2015-03-03	drm: rcar-du: Implement planes atomic operations	Laurent Pinchart
	Implement the CRTC .atomic_begin() and .atomic_flush() operations, the plane .atomic_check(), .atomic_update() and operations, and use the transitional atomic helpers to implement the plane update and disable operations on top of the new atomic operations. The plane setup code can't be moved out of the CRTC start function completely yet, as the atomic code paths are not taken every time the CRTC needs to be started. This results in some code duplication that will be fixed after switching to atomic updates completely. Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
2015-03-03	drm: rcar-du: Fix hardware plane allocation	Laurent Pinchart
	The hardware plane allocator loops over all planes to find free candidates. However, instead of looping over the number of hardware planes, it loops over the number of software planes, which happens to be larger by one unit. This has no effect in practise as the extra plane is always cleared in the mask of free planes, but it should still be fixed for correctness. Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
2015-03-03	drm: rcar-du: Implement universal plane support	Laurent Pinchart
	Explicitly create the CRTC primary plane instead of relying on the core helpers to do so. This simplifies the plane logic by merging the KMS and software planes. Reject plane API operations on the primary planes for now, as that code will anyway be refactored when implementing support for atomic updates. Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
2015-03-03	drm: rcar-du: Define macros for the max number of groups, CRTCs and LVDS	Laurent Pinchart
	Let's avoid magic constants. Beside increasing code readability, it will also ensure that no location will be forgotten when raising the maximum number of groups, CRTCs or LVDS encoders Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
2015-03-03	drm: rcar-du: Disable fbdev emulation when no connector is present	Laurent Pinchart
	fbdev emulation requires at least one connector, and will fail to initialize if no connector has been successfully instantiated. Disable it in that case and print an informational message instead of failing probe with a confusing fbdev emulation error message. It could be argued that probe should fail when no connector is present, but the DU could still be useful in that case with the to-be-implemented memory write-back support. Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
2015-03-03	drm: rcar-du: Turn vblank on/off when enabling/disabling CRTC	Laurent Pinchart
	The DRM core vblank handling mechanism requires drivers to forcefully turn vblank reporting off when disabling the CRTC, and to restore the vblank reporting status when enabling the CRTC. Implement this using the drm_crtc_vblank_on/off helpers. When disabling vblank we must first wait for page flips to complete, so implement page flip completion wait as well. Finally, drm_crtc_vblank_off() must be called at startup to synchronize the state of the vblank core code with the hardware, which is initially disabled. This is performed at CRTC creation time, requiring vertical blanking to be initialized before creating CRTCs. Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
2015-03-03	drm: rcar-du: Wait for page flip completion when turning the CRTC off	Laurent Pinchart
	Turning a CRTC off will prevent a queued page flip from ever completing, potentially confusing userspace. Wait for queued page flips to complete before turning the CRTC off to avoid this. Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
2015-03-03	drm: rcar-du: Reorder CRTC functions	Laurent Pinchart
	The next commit will need functions to be reordered to avoid forward declarations. Do it separately to help review. This only moves functions without any change to the code. Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
2015-03-03	drm: rcar-du: Don't set connector->encoder at init time	Laurent Pinchart
	The drm_connector encoder field points to the encoder driving the connector. No such association exists at init time, as all pipelines are disabled. Don't set the field. Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
2015-03-03	drm: rcar-du: Remove drm_fbdev_cma_restore_mode() call at init time	Laurent Pinchart
	The function is meant to restore the fbdev mode in the lastclose handler, not to be called at init time. Remove the call. Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
2015-03-03	drm: rcar-du: Don't disable unused functions at init time	Laurent Pinchart
	All encoders and CRTCs start disabled, re-disabling them is a no-op. Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
2015-03-03	Revert "drm/rockchip: Flip select/depends in Kconfig"	Dave Airlie
	This reverts commit 9c58e8dbd3bfe7197323c88a784617afeffa9f87. This doesn't seem to fully fix this, Kbuild who knows. Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-03-03	drm/rockchip: Flip select/depends in Kconfig	Daniel Vetter
	Otherwise Kconfig gets confused and somehow ends up creating a 2nd drm submenu. I couldn't find i915 because of this any more at first. Cc: Andy Yan <andy.yan@rock-chips.com> Cc: Russell King <rmk+kernel@arm.linux.org.uk> Cc: Philipp Zabel <p.zabel@pengutronix.de> Cc: "Yann E. MORIN" <yann.morin.1998@free.fr> Cc: linux-kbuild@vger.kernel.or Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Signed-off-by: Dave Airlie <airlied@gmail.com>
2015-03-03	Merge tag 'imx-drm-fixes-2015-02-24' of ↵	Dave Airlie
	git://git.pengutronix.de/git/pza/linux into drm-fixes imx-drm fixes for mode fixup, dw_hdmi/imx, and parallel-display - A clock fix for too large pixel clocks depending on the DI clock flag simplification patch - Pruning of unsupported modes and a missing end of array element for dw_hdmi-imx - LVDS modeset fix for mode fixup - Fix parallel-display deferred probing if drm_panel is used * tag 'imx-drm-fixes-2015-02-24' of git://git.pengutronix.de/git/pza/linux: DRM: i.MX: parallel display: Support probe deferral for finding DRM panel drm/imx: imx-ldb: enable DI clock in encoder_mode_set drm/imx: dw_hdmi-imx: add end of array element to current control array drm/imx: dw_hdmi-imx: add mode_valid callback prune unsupported modes gpu: ipu-v3: do not divide by zero if the pixel clock is too large
2015-03-03	eCryptfs: don't pass fs-specific ioctl commands through	Tyler Hicks
	eCryptfs can't be aware of what to expect when after passing an arbitrary ioctl command through to the lower filesystem. The ioctl command may trigger an action in the lower filesystem that is incompatible with eCryptfs. One specific example is when one attempts to use the Btrfs clone ioctl command when the source file is in the Btrfs filesystem that eCryptfs is mounted on top of and the destination fd is from a new file created in the eCryptfs mount. The ioctl syscall incorrectly returns success because the command is passed down to Btrfs which thinks that it was able to do the clone operation. However, the result is an empty eCryptfs file. This patch allows the trim, {g,s}etflags, and {g,s}etversion ioctl commands through and then copies up the inode metadata from the lower inode to the eCryptfs inode to catch any changes made to the lower inode's metadata. Those five ioctl commands are mostly common across all filesystems but the whitelist may need to be further pruned in the future. https://bugzilla.kernel.org/show_bug.cgi?id=93691 https://launchpad.net/bugs/1305335 Signed-off-by: Tyler Hicks <tyhicks@canonical.com> Cc: Rocko <rockorequin@hotmail.com> Cc: Colin Ian King <colin.king@canonical.com> Cc: stable@vger.kernel.org # v2.6.36+: c43f7b8 eCryptfs: Handle ioctl calls with unlocked and compat functions
2015-03-03	usb: ftdi_sio: Add jtag quirk support for Cyber Cortex AV boards	Max Mansfield
	This patch integrates Cyber Cortex AV boards with the existing ftdi_jtag_quirk in order to use serial port 0 with JTAG which is required by the manufacturers' software. Steps: 2 [ftdi_sio_ids.h] 1. Defined the device PID [ftdi_sio.c] 2. Added a macro declaration to the ids array, in order to enable the jtag quirk for the device. Signed-off-by: Max Mansfield <max.m.mansfield@gmail.com> Cc: stable <stable@vger.kernel.org> Signed-off-by: Johan Hovold <johan@kernel.org>
2015-03-02	udp: only allow UFO for packets from SOCK_DGRAM sockets	Michal Kubeček
	If an over-MTU UDP datagram is sent through a SOCK_RAW socket to a UFO-capable device, ip_ufo_append_data() sets skb->ip_summed to CHECKSUM_PARTIAL unconditionally as all GSO code assumes transport layer checksum is to be computed on segmentation. However, in this case, skb->csum_start and skb->csum_offset are never set as raw socket transmit path bypasses udp_send_skb() where they are usually set. As a result, driver may access invalid memory when trying to calculate the checksum and store the result (as observed in virtio_net driver). Moreover, the very idea of modifying the userspace provided UDP header is IMHO against raw socket semantics (I wasn't able to find a document clearly stating this or the opposite, though). And while allowing CHECKSUM_NONE in the UFO case would be more efficient, it would be a bit too intrusive change just to handle a corner case like this. Therefore disallowing UFO for packets from SOCK_DGRAM seems to be the best option. Signed-off-by: Michal Kubecek <mkubecek@suse.cz> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-02	Merge branch 'sh_eth'	David S. Miller
	Ben Hutchings says: ==================== Fixes for sh_eth #4 v2 I'm continuing review and testing of Ethernet support on the R-Car H2 chip, with help from a colleague. This series fixes a few more issues. These are not tested on any of the other supported chips. v2: Add note that the revert is not a pure revert. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-02	sh_eth: Really fix padding of short frames on TX	Ben Hutchings
	My previous fix to clear padding of short frames used skb->len as the DMA length, assuming that skb_padto() extended skb->len to include the padding. That isn't the case; we need to use skb_put_padto() instead. (This wasn't immediately obvious because software padding isn't actually needed on the R-Car H2. We could make it conditional on which chip is being driven, but it's probably not worth the effort.) Reported-by: "Violeta Menéndez González" <violeta.menendez@codethink.co.uk> Fixes: 612a17a54b50 ("sh_eth: Fix padding of short frames on TX") Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-02	Revert "sh_eth: Enable Rx descriptor word 0 shift for r8a7790"	Ben Hutchings
	This reverts commit fd9af07c3404ac9ecbd0d859563360f51ce1ffde. The hardware manual states that the frame error and multicast bits are copied to bits 9:0 of RD0, not bits 25:16. I've tested that this is true for RFS1 (CRC error), RFS3 (frame too short), RFS4 (frame too long) and RFS8 (multicast). Also adjust a comment to agree with this. Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-02	sh_eth: Fix RX recovery on R-Car in case of RX ring underrun	Ben Hutchings
	In case of RX ring underrun (RDE), we attempt to reset the software descriptor pointers (dirty_rx and cur_rx) to match where the hardware will read the next descriptor from, as that might not be the first dirty descriptor. This relies on reading RDFAR, but that register doesn't exist on all supported chips - specifically, not on the R-Car chips. This will result in unpredictable behaviour on those chips after an RDE. Make this pointer reset conditional and assume that it isn't needed on the R-Car chips. This fix also assumes that RDFAR is never exposed at offset 0 in the memory map - this is currently true, and a subsequent commit will fix the ambiguity between offset 0 and no-offset in the register offset maps. Fixes: 79fba9f51755 ("net: sh_eth: fix the rxdesc pointer when rx ...") Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-02	sh_eth: Ensure proper ordering of descriptor active bit write/read	Ben Hutchings
	When submitting a DMA descriptor, the active bit must be written last. When reading a completed DMA descriptor, the active bit must be read first. Add memory barriers to ensure that this ordering is maintained. Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-02	Merge branch 'fixes' of github.com:lmajewski/linux-samsung-thermal into ↵	Eduardo Valentin
	work-fixes Pull samsung thermal fixes from Lukasz Majewski: "Changes: - Exynos7 power down detection mode fix - Fix for cpufreq cooling device regression - Updating MAINTAINER's entry for Samsung Exynos Thermal" Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
2015-03-03	livepatch: fix RCU usage in klp_find_external_symbol()	Peter Zijlstra
	While one must hold RCU-sched (aka. preempt_disable) for find_symbol() one must equally hold it over the use of the object returned. The moment you release the RCU-sched read lock, the object can be dead and gone. [jkosina@suse.cz: change subject line to be aligned with other patches] Cc: Seth Jennings <sjenning@redhat.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Miroslav Benes <mbenes@suse.cz> Cc: Petr Mladek <pmladek@suse.cz> Cc: Jiri Kosina <jkosina@suse.cz> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Acked-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2015-03-02	NFSv4: Ensure we skip delegations that are already being returned	Trond Myklebust
	In nfs_client_return_marked_delegations() and nfs_delegation_reap_unclaimed() we want to optimise the loop traversal by skipping delegations that are already in the process of being returned. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-03-02	NFSv4: Pin the superblock while we're returning the delegation	Trond Myklebust
	This patch ensures that the superblock doesn't go ahead and disappear underneath us while the state manager thread is returning delegations. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-03-02	NFSv4: Ensure we honour NFS_DELEGATION_RETURNING in nfs_inode_set_delegation()	Trond Myklebust
	Ensure that nfs_inode_set_delegation() doesn't inadvertently detach a delegation that is already in the process of being returned. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-03-02	NFSv4: Ensure that we don't reap a delegation that is being returned	Trond Myklebust
	Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-03-02	NFS: Fix stateid used for NFS v4 closes	Anna Schumaker
	After 566fcec60 the client uses the "current stateid" from the nfs4_state structure to close a file. This could potentially contain a delegation stateid, which is disallowed by the protocol and causes servers to return NFS4ERR_BAD_STATEID. This patch restores the (correct) behavior of sending the open stateid to close a file. Reported-by: Olga Kornievskaia <kolga@netapp.com> Fixes: 566fcec60 (NFSv4: Fix an atomicity problem in CLOSE) Signed-off-by: Anna Schumaker <Anna.Schumaker@netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-03-02	KVM: MIPS: Enable after disabling interrupt	Tapasweni Pathak
	Enable disabled interrupt, on unsuccessful operation. Found by Coccinelle. Signed-off-by: Tapasweni Pathak <tapaswenipathak@gmail.com> Acked-by: Julia Lawall <julia.lawall@lip6.fr> Reviewed-by: James Hogan <james.hogan@imgtec.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2015-03-02	KVM: MIPS: Fix trace event to save PC directly	James Hogan
	Currently the guest exit trace event saves the VCPU pointer to the structure, and the guest PC is retrieved by dereferencing it when the event is printed rather than directly from the trace record. This isn't safe as the printing may occur long afterwards, after the PC has changed and potentially after the VCPU has been freed. Usually this results in the same (wrong) PC being printed for multiple trace events. It also isn't portable as userland has no way to access the VCPU data structure when interpreting the trace record itself. Lets save the actual PC in the structure so that the correct value is accessible later. Fixes: 669e846e6c4e ("KVM/MIPS32: MIPS arch specific APIs for KVM") Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Gleb Natapov <gleb@kernel.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: linux-mips@linux-mips.org Cc: kvm@vger.kernel.org Cc: <stable@vger.kernel.org> # v3.10+ Acked-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2015-03-02	Merge tag 'gpio-v4.0-2' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio Pull GPIO fixes from Linus Walleij: "Two GPIO fixes: - Fix a translation problem in of_get_named_gpiod_flags() - Fix a long standing container_of() mistake in the TPS65912 driver" * tag 'gpio-v4.0-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio: gpio: tps65912: fix wrong container_of arguments gpiolib: of: allow of_gpiochip_find_and_xlate to find more than one chip per node
2015-03-02	Merge branch 'fixes-for-4.0-rc2' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/evalenti/linux-soc-thermal Pull thermal management fixes from Eduardo Valentin: "Specifics: - Several fixes in tmon tool. - Fixes in intel int340x for _ART and _TRT tables. - Add id for Avoton SoC into powerclamp driver. - Fixes in RCAR thermal driver to remove race conditions and fix fail path - Fixes in TI thermal driver: removal of unnecessary code and build fix if !CONFIG_PM_SLEEP - Cleanups in exynos thermal driver - Add stubs for include/linux/thermal.h. Now drivers using thermal calls but that also work without CONFIG_THERMAL will be able to compile for systems that don't care about thermal. Note: I am sending this pull on Rui's behalf while he fixes issues in his Linux box" * 'fixes-for-4.0-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/evalenti/linux-soc-thermal: thermal: int340x_thermal: Ignore missing _ART, _TRT tables thermal/intel_powerclamp: add id for Avoton SoC tools/thermal: tmon: silence 'set but not used' warnings tools/thermal: tmon: use pkg-config to determine library dependencies tools/thermal: tmon: support cross-compiling tools/thermal: tmon: add .gitignore tools/thermal: tmon: fixup tui windowing calculations tools/thermal: tmon: tui: don't hard-code dialog window size assumptions tools/thermal: tmon: add min/max macros tools/thermal: tmon: add --target-temp parameter thermal: exynos: Clean-up code to use oneline entry for exynos compatible table thermal: rcar: Make error and remove paths symmetrical with init thermal: rcar: Fix race condition between init and interrupt thermal: Introduce dummy functions when thermal is not defined ti-soc-thermal: Delete an unnecessary check before the function call "cpufreq_cooling_unregister" thermal: ti-soc-thermal: bandgap: Fix build warning if !CONFIG_PM_SLEEP
2015-03-02	Btrfs: incremental send, don't rename a directory too soon	Filipe Manana
	There's one more case where we can't issue a rename operation for a directory as soon as we process it. We used to delay directory renames only if they have some ancestor directory with a higher inode number that got renamed too, but there's another case where we need to delay the rename too - when a directory A is renamed to the old name of a directory B but that directory B has its rename delayed because it has now (in the send root) an ancestor with a higher inode number that was renamed. If we don't delay the directory rename in this case, the receiving end of the send stream will attempt to rename A to the old name of B before B got renamed to its new name, which results in a "directory not empty" error. So fix this by delaying directory renames for this case too. Steps to reproduce: $ mkfs.btrfs -f /dev/sdb $ mount /dev/sdb /mnt $ mkdir /mnt/a $ mkdir /mnt/b $ mkdir /mnt/c $ touch /mnt/a/file $ btrfs subvolume snapshot -r /mnt /mnt/snap1 $ mv /mnt/c /mnt/x $ mv /mnt/a /mnt/x/y $ mv /mnt/b /mnt/a $ btrfs subvolume snapshot -r /mnt /mnt/snap2 $ btrfs send /mnt/snap1 -f /tmp/1.send $ btrfs send -p /mnt/snap1 /mnt/snap2 -f /tmp/2.send $ mkfs.btrfs -f /dev/sdc $ mount /dev/sdc /mnt2 $ btrfs receive /mnt2 -f /tmp/1.send $ btrfs receive /mnt2 -f /tmp/2.send ERROR: rename b -> a failed. Directory not empty A test case for xfstests follows soon. Reported-by: Ames Cornish <ames@cornishes.net> Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Chris Mason <clm@fb.com>
2015-03-02	btrfs: fix lost return value due to variable shadowing	David Sterba
	A block-local variable stores error code but btrfs_get_blocks_direct may not return it in the end as there's a ret defined in the function scope. CC: <stable@vger.kernel.org> # 3.6+ Fixes: d187663ef24c ("Btrfs: lock extents as we map them in DIO") Signed-off-by: David Sterba <dsterba@suse.cz> Signed-off-by: Chris Mason <clm@fb.com>
2015-03-02	Btrfs: do not ignore errors from btrfs_lookup_xattr in do_setxattr	Filipe Manana
	The return value from btrfs_lookup_xattr() can be a pointer encoding an error, therefore deal with it. This fixes commit 5f5bc6b1e2d5 ("Btrfs: make xattr replace operations atomic"). Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Chris Mason <clm@fb.com>
2015-03-02	Btrfs: fix off-by-one logic error in btrfs_realloc_node	Filipe Manana
	The end_slot variable actually matches the number of pointers in the node and not the last slot (which is 'nritems - 1'). Therefore in order to check that the current slot in the for loop doesn't match the last one, the correct logic is to check if 'i' is less than 'end_slot - 1' and not 'end_slot - 2'. Fix this and set end_slot to be 'nritems - 1', as it's less confusing since the variable name implies it's inclusive rather then exclusive. Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Chris Mason <clm@fb.com>
2015-03-02	Btrfs: add missing inode update when punching hole	Filipe Manana
	When punching a file hole if we endup only zeroing parts of a page, because the start offset isn't a multiple of the sector size or the start offset and length fall within the same page, we were not updating the inode item. This prevented an fsync from doing anything, if no other file changes happened in the current transaction, because the fields in btrfs_inode used to check if the inode needs to be fsync'ed weren't updated. This issue is easy to reproduce and the following excerpt from the xfstest case I made shows how to trigger it: _scratch_mkfs >> $seqres.full 2>&1 _init_flakey _mount_flakey # Create our test file. $XFS_IO_PROG -f -c "pwrite -S 0x22 -b 16K 0 16K" \ $SCRATCH_MNT/foo \| _filter_xfs_io # Fsync the file, this makes btrfs update some btrfs inode specific fields # that are used to track if the inode needs to be written/updated to the fsync # log or not. After this fsync, the new values for those fields indicate that # a subsequent fsync does not need to touch the fsync log. $XFS_IO_PROG -c "fsync" $SCRATCH_MNT/foo # Force a commit of the current transaction. After this point, any operation # that modifies the data or metadata of our file, should update those fields in # the btrfs inode with values that make the next fsync operation write to the # fsync log. sync # Punch a hole in our file. This small range affects only 1 page. # This made the btrfs hole punching implementation write only some zeroes in # one page, but it did not update the btrfs inode fields used to determine if # the next fsync needs to write to the fsync log. $XFS_IO_PROG -c "fpunch 8000 4K" $SCRATCH_MNT/foo # Another variation of the previously mentioned case. $XFS_IO_PROG -c "fpunch 15000 100" $SCRATCH_MNT/foo # Now fsync the file. This was a no-operation because the previous hole punch # operation didn't update the inode's fields mentioned before, so they remained # with the values they had after the first fsync - that is, they indicate that # it is not needed to write to fsync log. $XFS_IO_PROG -c "fsync" $SCRATCH_MNT/foo echo "File content before:" od -t x1 $SCRATCH_MNT/foo # Simulate a crash/power loss. _load_flakey_table $FLAKEY_DROP_WRITES _unmount_flakey # Enable writes and mount the fs. This makes the fsync log replay code run. _load_flakey_table $FLAKEY_ALLOW_WRITES _mount_flakey # Because the last fsync didn't do anything, here the file content matched what # it was after the first fsync, before the holes were punched, and not what it # was after the holes were punched. echo "File content after:" od -t x1 $SCRATCH_MNT/foo This issue has been around since 2012, when the punch hole implementation was added, commit 2aaa66558172 ("Btrfs: add hole punching"). A test case for xfstests follows soon. Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: Chris Mason <clm@fb.com>
2015-03-02	Btrfs: abort the transaction if we fail to update the free space cache inode	Josef Bacik
	Our gluster boxes were hitting a problem where they'd run out of space when updating the block group cache and therefore wouldn't be able to update the free space inode. This is a problem because this is how we invalidate the cache and protect ourselves from errors further down the stack, so if this fails we have to abort the transaction so we make sure we don't end up with stale free space cache. Thanks, Signed-off-by: Josef Bacik <jbacik@fb.com> Signed-off-by: Chris Mason <clm@fb.com>
2015-03-02	Btrfs: fix fsync race leading to ordered extent memory leaks	Filipe Manana
	We can have multiple fsync operations against the same file during the same transaction and they can collect the same ordered extents while they don't complete (still accessible from the inode's ordered tree). If this happens, those ordered extents will never get their reference counts decremented to 0, leading to memory leaks and inode leaks (an iput for an ordered extent's inode is scheduled only when the ordered extent's refcount drops to 0). The following sequence diagram explains this race: CPU 1 CPU 2 btrfs_sync_file() btrfs_sync_file() mutex_lock(inode->i_mutex) btrfs_log_inode() btrfs_get_logged_extents() --> collects ordered extent X --> increments ordered extent X's refcount btrfs_submit_logged_extents() mutex_unlock(inode->i_mutex) mutex_lock(inode->i_mutex) btrfs_sync_log() btrfs_wait_logged_extents() --> list_del_init(&ordered->log_list) btrfs_log_inode() btrfs_get_logged_extents() --> Adds ordered extent X to logged_list because at this point: list_empty(&ordered->log_list) && test_bit(BTRFS_ORDERED_LOGGED, &ordered->flags) == 0 --> Increments ordered extent X's refcount --> check if ordered extent's io is finished or not, start it if necessary and wait for it to finish --> sets bit BTRFS_ORDERED_LOGGED on ordered extent X's flags and adds it to trans->ordered btrfs_sync_log() finishes btrfs_submit_logged_extents() btrfs_log_inode() finishes mutex_unlock(inode->i_mutex) btrfs_sync_file() finishes btrfs_sync_log() btrfs_wait_logged_extents() --> Sees ordered extent X has the bit BTRFS_ORDERED_LOGGED set in its flags --> X's refcount is untouched btrfs_sync_log() finishes btrfs_sync_file() finishes btrfs_commit_transaction() --> called by transaction kthread for e.g. btrfs_wait_pending_ordered() --> waits for ordered extent X to complete --> decrements ordered extent X's refcount by 1 only, corresponding to the increment done by the fsync task ran by CPU 1 In the scenario of the above diagram, after the transaction commit, the ordered extent will remain with a refcount of 1 forever, leaking the ordered extent structure and preventing the i_count of its inode from ever decreasing to 0, since the delayed iput is scheduled only when the ordered extent's refcount drops to 0, preventing the inode from ever being evicted by the VFS. Fix this by using the flag BTRFS_ORDERED_LOGGED differently. Use it to mean that an ordered extent is already being processed by an fsync call, which will attach it to the current transaction, preventing it from being collected by subsequent fsync operations against the same inode. This race was introduced with the following change (added in 3.19 and backported to stable 3.18 and 3.17): Btrfs: make sure logged extents complete in the current transaction V3 commit 50d9aa99bd35c77200e0e3dd7a72274f8304701f I ran into this issue while running xfstests/generic/113 in a loop, which failed about 1 out of 10 runs with the following warning in dmesg: [ 2612.440038] WARNING: CPU: 4 PID: 22057 at fs/btrfs/disk-io.c:3558 free_fs_root+0x36/0x133 [btrfs]() [ 2612.442810] Modules linked in: btrfs crc32c_generic xor raid6_pq nfsd auth_rpcgss oid_registry nfs_acl nfs lockd grace fscache sunrpc loop processor parport_pc parport psmouse therma l_sys i2c_piix4 serio_raw pcspkr evdev microcode button i2c_core ext4 crc16 jbd2 mbcache sd_mod sg sr_mod cdrom virtio_scsi ata_generic virtio_pci ata_piix virtio_ring libata virtio flo ppy e1000 scsi_mod [last unloaded: btrfs] [ 2612.452711] CPU: 4 PID: 22057 Comm: umount Tainted: G W 3.19.0-rc5-btrfs-next-4+ #1 [ 2612.454921] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.7.5-0-ge51488c-20140602_164612-nilsson.home.kraxel.org 04/01/2014 [ 2612.457709] 0000000000000009 ffff8801342c3c78 ffffffff8142425e ffff88023ec8f2d8 [ 2612.459829] 0000000000000000 ffff8801342c3cb8 ffffffff81045308 ffff880046460000 [ 2612.461564] ffffffffa036da56 ffff88003d07b000 ffff880046460000 ffff880046460068 [ 2612.463163] Call Trace: [ 2612.463719] [<ffffffff8142425e>] dump_stack+0x4c/0x65 [ 2612.464789] [<ffffffff81045308>] warn_slowpath_common+0xa1/0xbb [ 2612.466026] [<ffffffffa036da56>] ? free_fs_root+0x36/0x133 [btrfs] [ 2612.467247] [<ffffffff810453c5>] warn_slowpath_null+0x1a/0x1c [ 2612.468416] [<ffffffffa036da56>] free_fs_root+0x36/0x133 [btrfs] [ 2612.469625] [<ffffffffa036f2a7>] btrfs_drop_and_free_fs_root+0x93/0x9b [btrfs] [ 2612.471251] [<ffffffffa036f353>] btrfs_free_fs_roots+0xa4/0xd6 [btrfs] [ 2612.472536] [<ffffffff8142612e>] ? wait_for_completion+0x24/0x26 [ 2612.473742] [<ffffffffa0370bbc>] close_ctree+0x1f3/0x33c [btrfs] [ 2612.475477] [<ffffffff81059d1d>] ? destroy_workqueue+0x148/0x1ba [ 2612.476695] [<ffffffffa034e3da>] btrfs_put_super+0x19/0x1b [btrfs] [ 2612.477911] [<ffffffff81153e53>] generic_shutdown_super+0x73/0xef [ 2612.479106] [<ffffffff811540e2>] kill_anon_super+0x13/0x1e [ 2612.480226] [<ffffffffa034e1e3>] btrfs_kill_super+0x17/0x23 [btrfs] [ 2612.481471] [<ffffffff81154307>] deactivate_locked_super+0x3b/0x50 [ 2612.482686] [<ffffffff811547a7>] deactivate_super+0x3f/0x43 [ 2612.483791] [<ffffffff8116b3ed>] cleanup_mnt+0x59/0x78 [ 2612.484842] [<ffffffff8116b44c>] __cleanup_mnt+0x12/0x14 [ 2612.485900] [<ffffffff8105d019>] task_work_run+0x8f/0xbc [ 2612.486960] [<ffffffff810028d8>] do_notify_resume+0x5a/0x6b [ 2612.488083] [<ffffffff81236e5b>] ? trace_hardirqs_on_thunk+0x3a/0x3f [ 2612.489333] [<ffffffff8142a17f>] int_signal+0x12/0x17 [ 2612.490353] ---[ end trace 54a960a6bdcb8d93 ]--- [ 2612.557253] VFS: Busy inodes after unmount of sdb. Self-destruct in 5 seconds. Have a nice day... Kmemleak confirmed the ordered extent leak (and btrfs inode specific structures such as delayed nodes): $ cat /sys/kernel/debug/kmemleak unreferenced object 0xffff880154290db0 (size 576): comm "btrfsck", pid 21980, jiffies 4295542503 (age 1273.412s) hex dump (first 32 bytes): 01 40 00 00 01 00 00 00 b0 1d f1 4e 01 88 ff ff .@.........N.... 00 00 00 00 00 00 00 00 c8 0d 29 54 01 88 ff ff ..........)T.... backtrace: [<ffffffff8141d74d>] kmemleak_update_trace+0x4c/0x6a [<ffffffff8122f2c0>] radix_tree_node_alloc+0x6d/0x83 [<ffffffff8122fb26>] __radix_tree_create+0x109/0x190 [<ffffffff8122fbdd>] radix_tree_insert+0x30/0xac [<ffffffffa03b9bde>] btrfs_get_or_create_delayed_node+0x130/0x187 [btrfs] [<ffffffffa03bb82d>] btrfs_delayed_delete_inode_ref+0x32/0xac [btrfs] [<ffffffffa0379dae>] __btrfs_unlink_inode+0xee/0x288 [btrfs] [<ffffffffa037c715>] btrfs_unlink_inode+0x1e/0x40 [btrfs] [<ffffffffa037c797>] btrfs_unlink+0x60/0x9b [btrfs] [<ffffffff8115d7f0>] vfs_unlink+0x9c/0xed [<ffffffff8115f5de>] do_unlinkat+0x12c/0x1fa [<ffffffff811601a7>] SyS_unlinkat+0x29/0x2b [<ffffffff81429e92>] system_call_fastpath+0x12/0x17 [<ffffffffffffffff>] 0xffffffffffffffff unreferenced object 0xffff88014ef11db0 (size 576): comm "rm", pid 22009, jiffies 4295542593 (age 1273.052s) hex dump (first 32 bytes): 02 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00 ................ 00 00 00 00 00 00 00 00 c8 1d f1 4e 01 88 ff ff ...........N.... backtrace: [<ffffffff8141d74d>] kmemleak_update_trace+0x4c/0x6a [<ffffffff8122f2c0>] radix_tree_node_alloc+0x6d/0x83 [<ffffffff8122fb26>] __radix_tree_create+0x109/0x190 [<ffffffff8122fbdd>] radix_tree_insert+0x30/0xac [<ffffffffa03b9bde>] btrfs_get_or_create_delayed_node+0x130/0x187 [btrfs] [<ffffffffa03bb82d>] btrfs_delayed_delete_inode_ref+0x32/0xac [btrfs] [<ffffffffa0379dae>] __btrfs_unlink_inode+0xee/0x288 [btrfs] [<ffffffffa037c715>] btrfs_unlink_inode+0x1e/0x40 [btrfs] [<ffffffffa037c797>] btrfs_unlink+0x60/0x9b [btrfs] [<ffffffff8115d7f0>] vfs_unlink+0x9c/0xed [<ffffffff8115f5de>] do_unlinkat+0x12c/0x1fa [<ffffffff811601a7>] SyS_unlinkat+0x29/0x2b [<ffffffff81429e92>] system_call_fastpath+0x12/0x17 [<ffffffffffffffff>] 0xffffffffffffffff unreferenced object 0xffff8800336feda8 (size 584): comm "aio-stress", pid 22031, jiffies 4295543006 (age 1271.400s) hex dump (first 32 bytes): 00 40 3e 00 00 00 00 00 00 00 8f 42 00 00 00 00 .@>........B.... 00 00 01 00 00 00 00 00 00 00 01 00 00 00 00 00 ................ backtrace: [<ffffffff8114eb34>] create_object+0x172/0x29a [<ffffffff8141d790>] kmemleak_alloc+0x25/0x41 [<ffffffff81141ae6>] kmemleak_alloc_recursive.constprop.52+0x16/0x18 [<ffffffff81145288>] kmem_cache_alloc+0xf7/0x198 [<ffffffffa0389243>] __btrfs_add_ordered_extent+0x43/0x309 [btrfs] [<ffffffffa038968b>] btrfs_add_ordered_extent_dio+0x12/0x14 [btrfs] [<ffffffffa03810e2>] btrfs_get_blocks_direct+0x3ef/0x571 [btrfs] [<ffffffff81181349>] do_blockdev_direct_IO+0x62a/0xb47 [<ffffffff8118189a>] __blockdev_direct_IO+0x34/0x36 [<ffffffffa03776e5>] btrfs_direct_IO+0x16a/0x1e8 [btrfs] [<ffffffff81100373>] generic_file_direct_write+0xb8/0x12d [<ffffffffa038615c>] btrfs_file_write_iter+0x24b/0x42f [btrfs] [<ffffffff8118bb0d>] aio_run_iocb+0x2b7/0x32e [<ffffffff8118c99a>] do_io_submit+0x26e/0x2ff [<ffffffff8118ca3b>] SyS_io_submit+0x10/0x12 [<ffffffff81429e92>] system_call_fastpath+0x12/0x17 CC: <stable@vger.kernel.org> # 3.19, 3.18 and 3.17 Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Chris Mason <clm@fb.com>
2015-03-02	KVM: SVM: fix interrupt injection (apic->isr_count always 0)	Radim Krčmář
	In commit b4eef9b36db4, we started to use hwapic_isr_update() != NULL instead of kvm_apic_vid_enabled(vcpu->kvm). This didn't work because SVM had it defined and "apicv" path in apic_{set,clear}_isr() does not change apic->isr_count, because it should always be 1. The initial value of apic->isr_count was based on kvm_apic_vid_enabled(vcpu->kvm), which is always 0 for SVM, so KVM could have injected interrupts when it shouldn't. Fix it by implicitly setting SVM's hwapic_isr_update to NULL and make the initial isr_count depend on hwapic_isr_update() for good measure. Fixes: b4eef9b36db4 ("kvm: x86: vmx: NULL out hwapic_isr_update() in case of !enable_apicv") Reported-and-tested-by: Borislav Petkov <bp@suse.de> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2015-03-02	Merge tag 'md/4.0-fixes' of git://neil.brown.name/md	Linus Torvalds
	Pull md fixes from Neil Brown: "Three md fixes: - fix a read-balance problem that was reported 2 years ago, but that I never noticed the report :-( - fix for rare RAID6 problem causing incorrect bitmap updates when two devices fail. - add __ATTR_PREALLOC annotation now that it is possible" * tag 'md/4.0-fixes' of git://neil.brown.name/md: md: mark some attributes as pre-alloc raid5: check faulty flag for array status during recovery. md/raid1: fix read balance when a drive is write-mostly.