linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2015-03-26	cdc-wdm: error returns need to be translated	Oliver Neukum
	One more case of error codes not correctly being correctly returned to user space. Signed-off-by: Olive Neukum <oneukum@suse.com>0 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-03-26	cdc-wdm: fix endianness bug in debug statements	Oliver Neukum
	Values directly from descriptors given in debug statements must be converted to native endianness. Signed-off-by: Oliver Neukum <oneukum@suse.de> CC: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-03-26	cdc-wdm: unify error handling in write	Oliver Neukum
	This makes sure the error handling path is the same for all error conditions, thus reducing code duplication. Signed-off-by: Oliver Neukum <oneukum@suse.de>0 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-03-26	cdc-acm: convert to not directly using urb->status	Oliver Neukum
	A step on the road to passing status as a parameter Signed-off-by: Oliver Neukum <oneukum@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-03-26	cdc-acm: surpress misleading message	Oliver Neukum
	During the entry intro suspend a misleading message can be printed. Surpress it by checking the specific error. Signed-off-by: Oliver Neukum <oneukum@suse.de>0 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-03-26	cdc-acm: fix race between callback and unthrottle	Oliver Neukum
	Abn URB may be may marked free only after the buffer has been processed or there is a small window during which it could be submitted on another CPU and overwrite an unprocessed buffer Signed-off-by: Oliver Neukum <oneukum@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-03-26	usb/misc/usb3503: Always read refclk frequency from DT	Ben Gamari
	This is necessary to set REF_SEL appropriately in uses where refclk is always available. Signed-off-by: Ben Gamari <ben@smart-cactus.org> Acked-by: Marek Szyprowski <m.szyprowski@samsung.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-03-26	cdc-wdm: return correct error codes	Oliver Neukum
	Lieing to user space is wrong. The real reason for a failure to write should be returned to user space. Signed-off-by: Oliver Neukum <oneukum@suse.de>0 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-03-26	MAINTAINERS: change my git address to kernel.org	Peter Chen
	Signed-off-by: Peter Chen <peter.chen@freescale.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-03-26	usb: ehci-orion: add more constants for register values	Thomas Petazzoni
	This commit adds new register values for the USB_CMD and USB_MODE registers, which allows to avoid the usage of a number of magic values in orion_usb_phy_v1_setup(). Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Acked-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-03-26	USB: Move usb_disabled() towards top of the file	Viresh Kumar
	Move usb_disabled() and module_param()/core_param() towards the top of the file, where 'nousb' is defined, as they are all related. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-03-26	USB: Use usb_disabled() consistently	Viresh Kumar
	At few places we have used usb_disabled() and at other places used 'nousb' directly. Lets be consistent and use usb_disabled(); Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-03-26	usb: Add driver for Altus Metrum ChaosKey device (v2)	Keith Packard
	This is a hardware random number generator. The driver provides both a /dev/chaoskeyX entry and hooks the entropy source up to the kernel hwrng interface. More information about the device can be found at http://chaoskey.org The USB ID for ChaosKey was allocated from the OpenMoko USB vendor space and is visible as 'USBtrng' here: http://wiki.openmoko.org/wiki/USB_Product_IDs v2: Respond to review from Oliver Neukum <oneukum@suse.de> * Delete extensive debug infrastructure and replace it with calls to dev_dbg. * Allocate I/O buffer separately from device structure to obey requirements for non-coherant architectures. * Initialize mutexes before registering device to ensure that open cannot be invoked before the device is ready to proceed. * Return number of bytes read instead of -EINTR when partial read operation is aborted due to a signal. * Make sure device mutex is unlocked in read error paths. * Add MAINTAINERS entry for the driver Signed-off-by: Keith Packard <keithp@keithp.com> Cc: Oliver Neukum <oneukum@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-03-26	usb: chipidea: usbmisc_imx: fix returnvar.cocci warnings	kbuild test robot
	drivers/usb/chipidea/usbmisc_imx.c:277:5-8: Unneeded variable: "ret". Return "0" on line 297 Removes unneeded variable used to store return value. Generated by: scripts/coccinelle/misc/returnvar.cocci Cc: Julia Lawall <julia.lawall@lip6.fr> Signed-off-by: Peter Chen <peter.chen@freescale.com> Signed-off-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-03-26	Staging: octeon: Remove extern from .c file	Helen Fornazier
	This patch fixes the checkpatch.pl warning: WARNING: externs should be avoided in .c files +extern void octeon_mdiobus_force_mod_depencency(void); Signed-off-by: Helen Fornazier <helen.fornazier@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-03-26	Staging: i2o: Remove indentation of labels	Helen Fornazier
	This patch fixes the checkpatche.pl warnings: WARNING: labels should not be indented + context_remove: WARNING: labels should not be indented + nop_msg: WARNING: labels should not be indented + exit: Signed-off-by: Helen Fornazier <helen.fornazier@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-03-26	staging: fbtft: Remove do {} while(0) in single statement macro	Helen Fornazier
	This patch fixes the checkpatch.pl warning: WARNING: Single statement macros should not use a do {} while (0) loop +#define write_reg(par, ...) \ +do { \ + par->fbtftops.write_register(par, NUMARGS(__VA_ARGS__), __VA_ARGS__); \ +} while (0) Signed-off-by: Helen Fornazier <helen.fornazier@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-03-26	staging: fbtft: Add space around '='	Helen Fornazier
	This patch fixes the checkpatch.pl error: ERROR: spaces required around that '=' (ctx:VxV) + sdev->bits_per_word=9; ^ Signed-off-by: Helen Fornazier <helen.fornazier@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-03-26	Staging: iio: use the BIT macro in adc	Haneen Mohammed
	This patch replaces bit shifting on: 0,1,2, and 3 with the BIT(x) macro. Issue addressed by checkpatcg.pl. This was done with the help of Coccinelle: @r1@ identifier x; constant int g; @@ ( 0<<$x\\|g$ \| 1<<$x\\|g$ \| 2<<$x\\|g$ \| 3<<$x\\|g$ ) @script:python b@ g2 <<r1.g; y; @@ coccinelle.y = int(g2) + 1 @c@ constant int r1.g; identifier b.y; @@ ( -(1 << g) +BIT(g) \| -(0 << g) + 0 \| -(2 << g) +BIT(y) \| -(3 << g) +(BIT(y)\| BIT(g)) ) Signed-off-by: Haneen Mohammed <hamohammed.sa@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-03-26	Staging: iio: Add braces on all arms of if statement	Haneen Mohammed
	The following patch adds braces on all arms of if statement. Issue addressed by checkpatch.pl. Signed-off-by: Haneen Mohammed <hamohammed.sa@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-03-26	Staging: iio: add blank line after function declaration	Haneen Mohammed
	This patch adds blank line after function declaration. Issue addressed by checkpatch.pl. Signed-off-by: Haneen Mohammed <hamohammed.sa@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-03-26	Staging: iio: remove multible blank lines	Haneen Mohammed
	This patch removes extra blank lines to address checkpatch.pl warnings regarding that. Signed-off-by: Haneen Mohammed <hamohammed.sa@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-03-26	Staging: iio: Adjust alignment for function parameters	Haneen Mohammed
	This patch adjust parameters alignment in functions to match open parenthesis. Issue addressed by checkpatch.pl. Signed-off-by: Haneen Mohammed <hamohammed.sa@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-03-26	Staging: rtl8192u: Fix space issues before '(' and after ')'	Haneen Mohammed
	Space is required before the open and after the close parenthesis. This patch adds space after 'if' and before '{'. This was done with the help of the following Coccinelle script: @r@ expression E; position p1,p2,p3; @@ if@p1 (E) @p3{@p2 ... } @script:python@ p1 << r.p1; p2 << r.p2; p3 << r.p3; @@ l1 = int (p1[0].line) l2 = int (p2[0].line) l3 = int (p3[0].line) c1 = int (p1[0].column_end) c2 = int (p2[0].column) c3 = int (p3[0].column) if (l1 != l2): cocci.include_match(False) if (l2 == l3 and c3 + 2 == c2): cocci.include_match(False) @@ position r.p1,r.p2,r.p3; expression r.E; @@ -if@p1 (E) @p3{@p2 +if (E) { ... } Signed-off-by: Haneen Mohammed <hamohammed.sa@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-03-26	Staging: rtl8192u: Add space before open parenthesis	Haneen Mohammed
	Space is required before the open parenthesis. This patch adds space after if to address that issue. This was done with the help of the following Coccinelle script: @r@ position p1,p2; @@ if@p1 (@p2 ...) { ... } @script:python@ p1 << r.p1; p2 << r.p2; @@ l1 = int (p1[0].line) l2 = int (p2[0].line) c1 = int (p1[0].column) c2 = int (p2[0].column) if (l2 == l1 and c1 + 2 != c2): cocci.include_match(False) @@ position r.p1,r.p2; @@ - if@p1 ( + if ( ...) { ... } Signed-off-by: Haneen Mohammed <hamohammed.sa@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-03-26	Staging: lustre: Remove extern from function declaration	Vatika Harlalka
	Functions have the extern storage class specifier by default, so this keyword can be removed. Signed-off-by: Vatika Harlalka <vatikaharlalka@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-03-26	Staging: rtl8188eu: Add new variable to make code compact	Vatika Harlalka
	Introducing this variable leads to overall more code compactness and increases readability. Signed-off-by: Vatika Harlalka <vatikaharlalka@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-03-26	Staging: rtl8188eu: Refactor repititive code to loop to increase compactness	Vatika Harlalka
	Refactor repetitive code to loop so as to increase compactness and introduce newlines for readability. Signed-off-by: Vatika Harlalka <vatikaharlalka@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-03-26	Staging: rtl8188eu: Reduce line size to increase readability	Vatika Harlalka
	Reduce line size to increase readability. Signed-off-by: Vatika Harlalka <vatikaharlalka@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-03-26	drm/i915: Keep ring->active_list and ring->requests_list consistent	Chris Wilson
	If we retire requests last, we may use a later seqno and so clear the requests lists without clearing the active list, leading to confusion. Hence we should retire requests first for consistency with the early return. The order used to be important as the lifecycle for the object on the active list was determined by request->seqno. However, the requests themselves are now reference counted removing the constraint from the order of retirement. Fixes regression from commit 1b5a433a4dd967b125131da42b89b5cc0d5b1f57 Author: John Harrison <John.C.Harrison@Intel.com> Date: Mon Nov 24 18:49:42 2014 +0000 drm/i915: Convert 'i915_seqno_passed' calls into 'i915_gem_request_completed ' and a WARNING: CPU: 0 PID: 1383 at drivers/gpu/drm/i915/i915_gem_evict.c:279 i915_gem_evict_vm+0x10c/0x140() WARN_ON(!list_empty(&vm->active_list)) Identified by updating WATCH_LISTS: [drm:i915_verify_lists] ERROR blitter ring: active list not empty, but no requests WARNING: CPU: 0 PID: 681 at drivers/gpu/drm/i915/i915_gem.c:2751 i915_gem_retire_requests_ring+0x149/0x230() WARN_ON(i915_verify_lists(ring->dev)) Note that this is only a problem in evict_vm where the following happens after a retire_request has cleaned out all requests, but not all active bo: - intel_ring_idle called from i915_gpu_idle notices that no requests are outstanding and immediately returns. - i915_gem_retire_requests_ring called from i915_gem_retire_requests also immediately returns when there's no request, still leaving the bo on the active list. - evict_vm hits the WARN_ON(!list_empty(&vm->active_list)) after evicting all active objects that there's still stuff left that shouldn't be there. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: John Harrison <John.C.Harrison@Intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2015-03-26	ALSA: hda_intel: apply the Seperate stream_tag for Sunrise Point	Libin Yang
	The total stream number of Sunrise Point's input and output stream exceeds 15, which will cause some streams do not work because of the overflow on SDxCTL.STRM field if using the legacy stream tag allocation method. This patch uses the new stream tag allocation method by add the flag AZX_DCAPS_SEPARATE_STREAM_TAG for Skylake platform. Signed-off-by: Libin Yang <libin.yang@intel.com> Signed-off-by: Takashi Iwai <tiwai@suse.de>
2015-03-26	ARC: signal handling robustify	Vineet Gupta
	A malicious signal handler / restorer can DOS the system by fudging the user regs saved on stack, causing weird things such as sigreturn returning to user mode PC but cpu state still being kernel mode.... Ensure that in sigreturn path status32 always has U bit; any other bogosity (gargbage PC etc) will be taken care of by normal user mode exceptions mechanisms. Reproducer signal handler: void handle_sig(int signo, siginfo_t info, void context) { ucontext_t uc = context; struct user_regs_struct regs = &(uc->uc_mcontext.regs); regs->scratch.status32 = 0; } Before the fix, kernel would go off to weeds like below: --------->8----------- [ARCLinux]$ ./signal-test Path: /signal-test CPU: 0 PID: 61 Comm: signal-test Not tainted 4.0.0-rc5+ #65 task: 8f177880 ti: 5ffe6000 task.ti: 8f15c000 [ECR ]: 0x00220200 => Invalid Write @ 0x00000010 by insn @ 0x00010698 [EFA ]: 0x00000010 [BLINK ]: 0x2007c1ee [ERET ]: 0x10698 [STAT32]: 0x00000000 : <-------- BTA: 0x00010680 SP: 0x5ffe7e48 FP: 0x00000000 LPS: 0x20003c6c LPE: 0x20003c70 LPC: 0x00000000 ... --------->8----------- Reported-by: Alexey Brodkin <abrodkin@synopsys.com> Cc: <stable@vger.kernel.org> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-03-26	ARC: SA_SIGINFO ucontext regs off-by-one	Vineet Gupta
	The regfile provided to SA_SIGINFO signal handler as ucontext was off by one due to pt_regs gutter cleanups in 2013. Before handling signal, user pt_regs are copied onto user_regs_struct and copied back later. Both structs are binary compatible. This was all fine until commit 2fa919045b72 (ARC: pt_regs update #2) which removed the empty stack slot at top of pt_regs (corresponding to first pad) and made the corresponding fixup in struct user_regs_struct (the pad in there was moved out of @scratch - not removed altogether as it is part of ptrace ABI) struct user_regs_struct { + long pad; struct { - long pad; long bta, lp_start, lp_end,.... } scratch; ... } This meant that now user_regs_struct was off by 1 reg w.r.t pt_regs and signal code needs to user_regs_struct.scratch to reflect it as pt_regs, which is what this commit does. This problem was hidden for 2 years, because both save/restore, despite using wrong location, were using the same location. Only an interim inspection (reproducer below) exposed the issue. void handle_segv(int signo, siginfo_t info, void context) { ucontext_t uc = context; struct user_regs_struct regs = &(uc->uc_mcontext.regs); printf("regs %x %x\n", <=== prints 7 8 (vs. 8 9) regs->scratch.r8, regs->scratch.r9); } int main() { struct sigaction sa; sa.sa_sigaction = handle_segv; sa.sa_flags = SA_SIGINFO; sigemptyset(&sa.sa_mask); sigaction(SIGSEGV, &sa, NULL); asm volatile( "mov r7, 7 \n" "mov r8, 8 \n" "mov r9, 9 \n" "mov r10, 10 \n" :::"r7","r8","r9","r10"); ((unsigned int)0x10) = 0; } Fixes: 2fa919045b72ec892e "ARC: pt_regs update #2: Remove unused gutter at start of pt_regs" CC: <stable@vger.kernel.org> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-03-25	NFSD: Fix bad update of layout in nfsd4_return_file_layout	Kinglong Mee
	With return layout as, (seg is return layout, lo is record layout) seg->offset <= lo->offset and layout_end(seg) < layout_end(lo), nfsd should update lo's offset to seg's end, and, seg->offset > lo->offset and layout_end(seg) >= layout_end(lo), nfsd should update lo's end to seg's offset. Fixes: 9cf514ccfa ("nfsd: implement pNFS operations") Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-03-25	NFSD: Take care the return value from nfsd4_encode_stateid	Kinglong Mee
	Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-03-25	NFSD: Printk blocklayout length and offset as format 0x%llx	Kinglong Mee
	When testing pnfs with nfsd_debug on, nfsd print a negative number of layout length and foff in nfsd4_block_proc_layoutget as, "GET: -xxxx:-xxx 2" Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-03-25	nfsd: return correct lockowner when there is a race on hash insert	J. Bruce Fields
	alloc_init_lock_stateowner can return an already freed entry if there is a race to put openowners in the hashtable. Noticed by inspection after Jeff Layton fixed the same bug for open owners. Depending on client behavior, this one may be trickier to trigger in practice. Fixes: c58c6610ec24 "nfsd: Protect adding/removing lock owners using client_lock" Cc: <stable@vger.kernel.org> Cc: Trond Myklebust <trond.myklebust@primarydata.com> Acked-by: Jeff Layton <jeff.layton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-03-25	nfsd: return correct openowner when there is a race to put one in the hash	Jeff Layton
	alloc_init_open_stateowner can return an already freed entry if there is a race to put openowners in the hashtable. In commit 7ffb588086e9, we changed it so that we allocate and initialize an openowner, and then check to see if a matching one got stuffed into the hashtable in the meantime. If it did, then we free the one we just allocated and take a reference on the one already there. There is a bug here though. The code will then return the pointer to the one that was allocated (and has now been freed). This wasn't evident before as this race almost never occurred. The Linux kernel client used to serialize requests for a single openowner. That has changed now with v4.0 kernels, and this race can now easily occur. Fixes: 7ffb588086e9 Cc: <stable@vger.kernel.org> # v3.17+ Cc: Trond Myklebust <trond.myklebust@primarydata.com> Reported-by: Christoph Hellwig <hch@infradead.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jeff Layton <jeff.layton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-03-25	Merge tag 'metag-fixes-v4.0-2' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/jhogan/metag Pull arch/metag fix from James Hogan: "Another metag architecture fix for v4.0 This is another single fix, for an include dependency problem when using ioremap_wc() from asm/io.h without also including asm/pgtable.h" * tag 'metag-fixes-v4.0-2' of git://git.kernel.org/pub/scm/linux/kernel/git/jhogan/metag: metag: Fix ioremap_wc/ioremap_cached build errors
2015-03-26	phy: samsung_usb2: Fixup samsung_usb2_phy_power_on/off paths	Axel Lin
	Ensure we have balanced clk_prepare_enable/clk_disable_unprepare calls if .power_on or .power_off callbacks return error. Signed-off-by: Axel Lin <axel.lin@ingics.com> Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
2015-03-26	phy: exynos5-usbdrd: Add to support for Exynos5433 SoC	Jaewon Kim
	This patch adds driver data to support for Exynos5433 SoC. The Exynos5433 has one USB3.0 Host and USB3.0 DRD(Dual Role Device). Exynos5433 is simplar to Eyxnos7 but Exynos5433 have one more USB3.0 Host controller. Signed-off-by: Jaewon Kim <jaewon02.kim@samsung.com> Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
2015-03-26	phy: qcom-ufs: Catch devm_phy_create failure in ufs_qcom_phy_generic_probe	Axel Lin
	Current code does NULL test against return value of ufs_qcom_phy_generic_probe. However, in the case of devm_phy_create() failure, ufs_qcom_phy_generic_probe does not return NULL. Fix it. Signed-off-by: Axel Lin <axel.lin@ingics.com> Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
2015-03-26	phy: stih41x-usb: Fixup stih41x_usb_phy_power_on failure path	Axel Lin
	If stih41x_usb_phy_power_on() fails, we need to call clk_disable_unprepare() before return error. This is to ensure we have balanced clk_enable/disable calls. Signed-off-by: Axel Lin <axel.lin@ingics.com> Acked-by: Patrice Chotard <patrice.chotard@st.com> Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
2015-03-25	Merge branch 'akpm' (patches from Andrew)	Linus Torvalds
	Merge misc fixes from Andrew Morton: "15 fixes" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: mm: numa: mark huge PTEs young when clearing NUMA hinting faults mm: numa: slow PTE scan rate if migration failures occur mm: numa: preserve PTE write permissions across a NUMA hinting fault mm: numa: group related processes based on VMA flags instead of page table flags hfsplus: fix B-tree corruption after insertion at position 0 MAINTAINERS: add Jan as DMI/SMBIOS support maintainer fs/affs/file.c: unlock/release page on error mm/page_alloc.c: call kernel_map_pages in unset_migrateype_isolate mm/slub: fix lockups on PREEMPT && !SMP kernels mm/memory hotplug: postpone the reset of obsolete pgdat MAINTAINERS: correct rtc armada38x pattern entry mm/pagewalk.c: prevent positive return value of walk_page_test() from being passed to callers mm: fix anon_vma->degree underflow in anon_vma endless growing prevention drivers/rtc/rtc-mrst: fix suspend/resume aoe: update aoe maintainer information
2015-03-25	Merge tag 'signed-for-4.0' of git://github.com/agraf/linux-2.6	Marcelo Tosatti
	Patch queue for 4.0 - 2015-03-25 A few bug fixes for Book3S HV KVM: - Fix spinlock ordering - Fix idle guests on LE hosts - Fix instruction emulation
2015-03-25	mm: numa: mark huge PTEs young when clearing NUMA hinting faults	Mel Gorman
	Base PTEs are marked young when the NUMA hinting information is cleared but the same does not happen for huge pages which this patch addresses. Note that migrated pages are not marked young as the base page migration code does not assume that migrated pages have been referenced. This could be addressed but beyond the scope of this series which is aimed at Dave Chinners shrink workload that is unlikely to be affected by this issue. Signed-off-by: Mel Gorman <mgorman@suse.de> Cc: Dave Chinner <david@fromorbit.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Aneesh Kumar <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-03-25	mm: numa: slow PTE scan rate if migration failures occur	Mel Gorman
	Dave Chinner reported the following on https://lkml.org/lkml/2015/3/1/226 Across the board the 4.0-rc1 numbers are much slower, and the degradation is far worse when using the large memory footprint configs. Perf points straight at the cause - this is from 4.0-rc1 on the "-o bhash=101073" config: - 56.07% 56.07% [kernel] [k] default_send_IPI_mask_sequence_phys - default_send_IPI_mask_sequence_phys - 99.99% physflat_send_IPI_mask - 99.37% native_send_call_func_ipi smp_call_function_many - native_flush_tlb_others - 99.85% flush_tlb_page ptep_clear_flush try_to_unmap_one rmap_walk try_to_unmap migrate_pages migrate_misplaced_page - handle_mm_fault - 99.73% __do_page_fault trace_do_page_fault do_async_page_fault + async_page_fault 0.63% native_send_call_func_single_ipi generic_exec_single smp_call_function_single This is showing excessive migration activity even though excessive migrations are meant to get throttled. Normally, the scan rate is tuned on a per-task basis depending on the locality of faults. However, if migrations fail for any reason then the PTE scanner may scan faster if the faults continue to be remote. This means there is higher system CPU overhead and fault trapping at exactly the time we know that migrations cannot happen. This patch tracks when migration failures occur and slows the PTE scanner. Signed-off-by: Mel Gorman <mgorman@suse.de> Reported-by: Dave Chinner <david@fromorbit.com> Tested-by: Dave Chinner <david@fromorbit.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Aneesh Kumar <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-03-25	mm: numa: preserve PTE write permissions across a NUMA hinting fault	Mel Gorman
	Protecting a PTE to trap a NUMA hinting fault clears the writable bit and further faults are needed after trapping a NUMA hinting fault to set the writable bit again. This patch preserves the writable bit when trapping NUMA hinting faults. The impact is obvious from the number of minor faults trapped during the basis balancing benchmark and the system CPU usage; autonumabench 4.0.0-rc4 4.0.0-rc4 baseline preserve Time System-NUMA01 107.13 ( 0.00%) 103.13 ( 3.73%) Time System-NUMA01_THEADLOCAL 131.87 ( 0.00%) 83.30 ( 36.83%) Time System-NUMA02 8.95 ( 0.00%) 10.72 (-19.78%) Time System-NUMA02_SMT 4.57 ( 0.00%) 3.99 ( 12.69%) Time Elapsed-NUMA01 515.78 ( 0.00%) 517.26 ( -0.29%) Time Elapsed-NUMA01_THEADLOCAL 384.10 ( 0.00%) 384.31 ( -0.05%) Time Elapsed-NUMA02 48.86 ( 0.00%) 48.78 ( 0.16%) Time Elapsed-NUMA02_SMT 47.98 ( 0.00%) 48.12 ( -0.29%) 4.0.0-rc4 4.0.0-rc4 baseline preserve User 44383.95 43971.89 System 252.61 201.24 Elapsed 998.68 1000.94 Minor Faults 2597249 1981230 Major Faults 365 364 There is a similar drop in system CPU usage using Dave Chinner's xfsrepair workload 4.0.0-rc4 4.0.0-rc4 baseline preserve Amean real-xfsrepair 454.14 ( 0.00%) 442.36 ( 2.60%) Amean syst-xfsrepair 277.20 ( 0.00%) 204.68 ( 26.16%) The patch looks hacky but the alternatives looked worse. The tidest was to rewalk the page tables after a hinting fault but it was more complex than this approach and the performance was worse. It's not generally safe to just mark the page writable during the fault if it's a write fault as it may have been read-only for COW so that approach was discarded. Signed-off-by: Mel Gorman <mgorman@suse.de> Reported-by: Dave Chinner <david@fromorbit.com> Tested-by: Dave Chinner <david@fromorbit.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Aneesh Kumar <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-03-25	mm: numa: group related processes based on VMA flags instead of page table flags	Mel Gorman
	These are three follow-on patches based on the xfsrepair workload Dave Chinner reported was problematic in 4.0-rc1 due to changes in page table management -- https://lkml.org/lkml/2015/3/1/226. Much of the problem was reduced by commit 53da3bc2ba9e ("mm: fix up numa read-only thread grouping logic") and commit ba68bc0115eb ("mm: thp: Return the correct value for change_huge_pmd"). It was known that the performance in 3.19 was still better even if is far less safe. This series aims to restore the performance without compromising on safety. For the test of this mail, I'm comparing 3.19 against 4.0-rc4 and the three patches applied on top autonumabench 3.19.0 4.0.0-rc4 4.0.0-rc4 4.0.0-rc4 4.0.0-rc4 vanilla vanilla vmwrite-v5r8 preserve-v5r8 slowscan-v5r8 Time System-NUMA01 124.00 ( 0.00%) 161.86 (-30.53%) 107.13 ( 13.60%) 103.13 ( 16.83%) 145.01 (-16.94%) Time System-NUMA01_THEADLOCAL 115.54 ( 0.00%) 107.64 ( 6.84%) 131.87 (-14.13%) 83.30 ( 27.90%) 92.35 ( 20.07%) Time System-NUMA02 9.35 ( 0.00%) 10.44 (-11.66%) 8.95 ( 4.28%) 10.72 (-14.65%) 8.16 ( 12.73%) Time System-NUMA02_SMT 3.87 ( 0.00%) 4.63 (-19.64%) 4.57 (-18.09%) 3.99 ( -3.10%) 3.36 ( 13.18%) Time Elapsed-NUMA01 570.06 ( 0.00%) 567.82 ( 0.39%) 515.78 ( 9.52%) 517.26 ( 9.26%) 543.80 ( 4.61%) Time Elapsed-NUMA01_THEADLOCAL 393.69 ( 0.00%) 384.83 ( 2.25%) 384.10 ( 2.44%) 384.31 ( 2.38%) 380.73 ( 3.29%) Time Elapsed-NUMA02 49.09 ( 0.00%) 49.33 ( -0.49%) 48.86 ( 0.47%) 48.78 ( 0.63%) 50.94 ( -3.77%) Time Elapsed-NUMA02_SMT 47.51 ( 0.00%) 47.15 ( 0.76%) 47.98 ( -0.99%) 48.12 ( -1.28%) 49.56 ( -4.31%) 3.19.0 4.0.0-rc4 4.0.0-rc4 4.0.0-rc4 4.0.0-rc4 vanilla vanillavmwrite-v5r8preserve-v5r8slowscan-v5r8 User 46334.60 46391.94 44383.95 43971.89 44372.12 System 252.84 284.66 252.61 201.24 249.00 Elapsed 1062.14 1050.96 998.68 1000.94 1026.78 Overall the system CPU usage is comparable and the test is naturally a bit variable. The slowing of the scanner hurts numa01 but on this machine it is an adverse workload and patches that dramatically help it often hurt absolutely everything else. Due to patch 2, the fault activity is interesting 3.19.0 4.0.0-rc4 4.0.0-rc4 4.0.0-rc4 4.0.0-rc4 vanilla vanillavmwrite-v5r8preserve-v5r8slowscan-v5r8 Minor Faults 2097811 2656646 2597249 1981230 1636841 Major Faults 362 450 365 364 365 Note the impact preserving the write bit across protection updates and fault reduces faults. NUMA alloc hit 1229008 1217015 1191660 1178322 1199681 NUMA alloc miss 0 0 0 0 0 NUMA interleave hit 0 0 0 0 0 NUMA alloc local 1228514 1216317 1190871 1177448 1199021 NUMA base PTE updates 245706197 240041607 238195516 244704842 115012800 NUMA huge PMD updates 479530 468448 464868 477573 224487 NUMA page range updates 491225557 479886983 476207932 489222218 229950144 NUMA hint faults 659753 656503 641678 656926 294842 NUMA hint local faults 381604 373963 360478 337585 186249 NUMA hint local percent 57 56 56 51 63 NUMA pages migrated 5412140 6374899 6266530 5277468 5755096 AutoNUMA cost 5121% 5083% 4994% 5097% 2388% Here the impact of slowing the PTE scanner on migratrion failures is obvious as "NUMA base PTE updates" and "NUMA huge PMD updates" are massively reduced even though the headline performance is very similar. As xfsrepair was the reported workload here is the impact of the series on it. xfsrepair 3.19.0 4.0.0-rc4 4.0.0-rc4 4.0.0-rc4 4.0.0-rc4 vanilla vanilla vmwrite-v5r8 preserve-v5r8 slowscan-v5r8 Min real-fsmark 1183.29 ( 0.00%) 1165.73 ( 1.48%) 1152.78 ( 2.58%) 1153.64 ( 2.51%) 1177.62 ( 0.48%) Min syst-fsmark 4107.85 ( 0.00%) 4027.75 ( 1.95%) 3986.74 ( 2.95%) 3979.16 ( 3.13%) 4048.76 ( 1.44%) Min real-xfsrepair 441.51 ( 0.00%) 463.96 ( -5.08%) 449.50 ( -1.81%) 440.08 ( 0.32%) 439.87 ( 0.37%) Min syst-xfsrepair 195.76 ( 0.00%) 278.47 (-42.25%) 262.34 (-34.01%) 203.70 ( -4.06%) 143.64 ( 26.62%) Amean real-fsmark 1188.30 ( 0.00%) 1177.34 ( 0.92%) 1157.97 ( 2.55%) 1158.21 ( 2.53%) 1182.22 ( 0.51%) Amean syst-fsmark 4111.37 ( 0.00%) 4055.70 ( 1.35%) 3987.19 ( 3.02%) 3998.72 ( 2.74%) 4061.69 ( 1.21%) Amean real-xfsrepair 450.88 ( 0.00%) 468.32 ( -3.87%) 454.14 ( -0.72%) 442.36 ( 1.89%) 440.59 ( 2.28%) Amean syst-xfsrepair 199.66 ( 0.00%) 290.60 (-45.55%) 277.20 (-38.84%) 204.68 ( -2.51%) 150.55 ( 24.60%) Stddev real-fsmark 4.12 ( 0.00%) 10.82 (-162.29%) 4.14 ( -0.28%) 5.98 (-45.05%) 4.60 (-11.53%) Stddev syst-fsmark 2.63 ( 0.00%) 20.32 (-671.82%) 0.37 ( 85.89%) 16.47 (-525.59%) 15.05 (-471.79%) Stddev real-xfsrepair 6.87 ( 0.00%) 4.55 ( 33.75%) 3.46 ( 49.58%) 1.78 ( 74.12%) 0.52 ( 92.50%) Stddev syst-xfsrepair 3.02 ( 0.00%) 10.30 (-241.37%) 13.17 (-336.37%) 0.71 ( 76.63%) 5.00 (-65.61%) CoeffVar real-fsmark 0.35 ( 0.00%) 0.92 (-164.73%) 0.36 ( -2.91%) 0.52 (-48.82%) 0.39 (-12.10%) CoeffVar syst-fsmark 0.06 ( 0.00%) 0.50 (-682.41%) 0.01 ( 85.45%) 0.41 (-543.22%) 0.37 (-478.78%) CoeffVar real-xfsrepair 1.52 ( 0.00%) 0.97 ( 36.21%) 0.76 ( 49.94%) 0.40 ( 73.62%) 0.12 ( 92.33%) CoeffVar syst-xfsrepair 1.51 ( 0.00%) 3.54 (-134.54%) 4.75 (-214.31%) 0.34 ( 77.20%) 3.32 (-119.63%) Max real-fsmark 1193.39 ( 0.00%) 1191.77 ( 0.14%) 1162.90 ( 2.55%) 1166.66 ( 2.24%) 1188.50 ( 0.41%) Max syst-fsmark 4114.18 ( 0.00%) 4075.45 ( 0.94%) 3987.65 ( 3.08%) 4019.45 ( 2.30%) 4082.80 ( 0.76%) Max real-xfsrepair 457.80 ( 0.00%) 474.60 ( -3.67%) 457.82 ( -0.00%) 444.42 ( 2.92%) 441.03 ( 3.66%) Max syst-xfsrepair 203.11 ( 0.00%) 303.65 (-49.50%) 294.35 (-44.92%) 205.33 ( -1.09%) 155.28 ( 23.55%) The really relevant lines as syst-xfsrepair which is the system CPU usage when running xfsrepair. Note that on my machine the overhead was 45% higher on 4.0-rc4 which may be part of what Dave is seeing. Once we preserve the write bit across faults, it's only 2.51% higher on average. With the full series applied, system CPU usage is 24.6% lower on average. Again, the impact of preserving the write bit on minor faults is obvious and the impact of slowing scanning after migration failures is obvious on the PTE updates. Note also that the number of pages migrated is much reduced even though the headline performance is comparable. 3.19.0 4.0.0-rc4 4.0.0-rc4 4.0.0-rc4 4.0.0-rc4 vanilla vanillavmwrite-v5r8preserve-v5r8slowscan-v5r8 Minor Faults 153466827 254507978 249163829 153501373 105737890 Major Faults 610 702 690 649 724 NUMA base PTE updates 217735049 210756527 217729596 216937111 144344993 NUMA huge PMD updates 129294 85044 106921 127246 79887 NUMA pages migrated 21938995 29705270 28594162 22687324 16258075 3.19.0 4.0.0-rc4 4.0.0-rc4 4.0.0-rc4 4.0.0-rc4 vanilla vanillavmwrite-v5r8preserve-v5r8slowscan-v5r8 Mean sdb-avgqusz 13.47 2.54 2.55 2.47 2.49 Mean sdb-avgrqsz 202.32 140.22 139.50 139.02 138.12 Mean sdb-await 25.92 5.09 5.33 5.02 5.22 Mean sdb-r_await 4.71 0.19 0.83 0.51 0.11 Mean sdb-w_await 104.13 5.21 5.38 5.05 5.32 Mean sdb-svctm 0.59 0.13 0.14 0.13 0.14 Mean sdb-rrqm 0.16 0.00 0.00 0.00 0.00 Mean sdb-wrqm 3.59 1799.43 1826.84 1812.21 1785.67 Max sdb-avgqusz 111.06 12.13 14.05 11.66 15.60 Max sdb-avgrqsz 255.60 190.34 190.01 187.33 191.78 Max sdb-await 168.24 39.28 49.22 44.64 65.62 Max sdb-r_await 660.00 52.00 280.00 76.00 12.00 Max sdb-w_await 7804.00 39.28 49.22 44.64 65.62 Max sdb-svctm 4.00 2.82 2.86 1.98 2.84 Max sdb-rrqm 8.30 0.00 0.00 0.00 0.00 Max sdb-wrqm 34.20 5372.80 5278.60 5386.60 5546.15 FWIW, I also checked SPECjbb in different configurations but it's similar observations -- minor faults lower, PTE update activity lower and performance is roughly comparable against 3.19. This patch (of 3): Threads that share writable data within pages are grouped together as related tasks. This decision is based on whether the PTE is marked dirty which is subject to timing races between the PTE scanner update and when the application writes the page. If the page is file-backed, then background flushes and sync also affect placement. This is unpredictable behaviour which is impossible to reason about so this patch makes grouping decisions based on the VMA flags. Signed-off-by: Mel Gorman <mgorman@suse.de> Reported-by: Dave Chinner <david@fromorbit.com> Tested-by: Dave Chinner <david@fromorbit.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Aneesh Kumar <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-03-25	hfsplus: fix B-tree corruption after insertion at position 0	Sergei Antonov
	Fix B-tree corruption when a new record is inserted at position 0 in the node in hfs_brec_insert(). In this case a hfs_brec_update_parent() is called to update the parent index node (if exists) and it is passed hfs_find_data with a search_key containing a newly inserted key instead of the key to be updated. This results in an inconsistent index node. The bug reproduces on my machine after an extents overflow record for the catalog file (CNID=4) is inserted into the extents overflow B-tree. Because of a low (reserved) value of CNID=4, it has to become the first record in the first leaf node. The resulting first leaf node is correct: ---------------------------------------------------- \| key0.CNID=4 \| key1.CNID=123 \| key2.CNID=456, ... \| ---------------------------------------------------- But the parent index key0 still contains the previous key CNID=123: ----------------------- \| key0.CNID=123 \| ... \| ----------------------- A change in hfs_brec_insert() makes hfs_brec_update_parent() work correctly by preventing it from getting fd->record=-1 value from __hfs_brec_find(). Along the way, I removed duplicate code with unification of the if condition. The resulting code is equivalent to the original code because node is never 0. Also hfs_brec_update_parent() will now return an error after getting a negative fd->record value. However, the return value of hfs_brec_update_parent() is not checked anywhere in the file and I'm leaving it unchanged by this patch. brec.c lacks error checking after some other calls too, but this issue is of less importance than the one being fixed by this patch. Signed-off-by: Sergei Antonov <saproj@gmail.com> Cc: Joe Perches <joe@perches.com> Reviewed-by: Vyacheslav Dubeyko <slava@dubeyko.com> Acked-by: Hin-Tak Leung <htl10@users.sourceforge.net> Cc: Anton Altaparmakov <aia21@cam.ac.uk> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Christoph Hellwig <hch@infradead.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>