summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2015-12-05documentation: Record RCU requirementsPaul E. McKenney
This commit adds RCU requirements as published in a 2015 LWN series. Bringing these requirements in-tree allows them to be updated as changes are discovered. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> [ paulmck: Updates to charset and URLs as suggested by Josh Triplett. ]
2015-12-05Merge branch 'linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Pull crypto fixes from Herbert Xu: "This fixes a couple of crypto drivers that were using memcmp to verify authentication tags. They now use crypto_memneq instead" * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: crypto: talitos - Fix timing leak in ESP ICV verification crypto: nx - Fix timing leak in GCM and CCM decryption
2015-12-05EDAC, sb_edac: Add Knights Landing (Xeon Phi gen 2) supportJim Snow
Knights Landing is the next generation architecture for HPC market. KNL introduces concept of a tile and CHA - Cache/Home Agent for memory accesses. Some things are fixed in KNL: () There's single DIMM slot per channel () There's 2 memory controllers with 3 channels each, however, from EDAC standpoint, it is presented as single memory controller with 6 channels. In order to represent 2 MCs w/ 3 CH, it would require major redesign of EDAC core driver. Basically, two functionalities are added/extended: () during driver initialization KNL topology is being recognized, i.e. which channels are populated with what DIMM sizes (knl_get_dimm_capacity function) () handle MCE errors - channel swizzling Reviewed-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Jim Snow <jim.m.snow@intel.com> Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com> Cc: linux-edac <linux-edac@vger.kernel.org> Cc: lukasz.anaczkowski@intel.com Link: http://lkml.kernel.org/r/1449136134-23706-5-git-send-email-hubert.chrzaniuk@intel.com [ Rebase to 4.4-rc3. ] Signed-off-by: Hubert Chrzaniuk <hubert.chrzaniuk@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de>
2015-12-05EDAC, sb_edac: Add support for duplicate device IDsJim Snow
Add options to sbridge_get_all_devices() to allow for duplicate device IDs and devices that are scattered across mulitple PCI buses. Signed-off-by: Jim Snow <jim.m.snow@intel.com> Acked-by: Tony Luck <tony.luck@intel.com> Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com> Cc: linux-edac <linux-edac@vger.kernel.org> Cc: lukasz.anaczkowski@intel.com Link: http://lkml.kernel.org/r/1449136134-23706-4-git-send-email-hubert.chrzaniuk@intel.com [ Rebase to 4.4-rc3. ] Signed-off-by: Hubert Chrzaniuk <hubert.chrzaniuk@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de>
2015-12-05EDAC, sb_edac: Virtualize several hard-coded functionsJim Snow
SAD limit, interleave mode and DRAM related functionalities are now virtualized, so that overriding them is easier. Signed-off-by: Jim Snow <jim.m.snow@intel.com> Acked-by: Tony Luck <tony.luck@intel.com> Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com> Cc: linux-edac <linux-edac@vger.kernel.org> Cc: lukasz.anaczkowski@intel.com Link: http://lkml.kernel.org/r/1449136134-23706-3-git-send-email-hubert.chrzaniuk@intel.com [ Rebase to 4.4-rc3. ] Signed-off-by: Hubert Chrzaniuk <hubert.chrzaniuk@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de>
2015-12-05x86/signal: Fix restart_syscall number for x32 tasksDmitry V. Levin
When restarting a syscall with regs->ax == -ERESTART_RESTARTBLOCK, regs->ax is assigned to a restart_syscall number. For x32 tasks, this syscall number must have __X32_SYSCALL_BIT set, otherwise it will be an x86_64 syscall number instead of a valid x32 syscall number. This issue has been there since the introduction of x32. Reported-by: strace/tests/restart_syscall.test Reported-and-tested-by: Elvira Khabirova <lineprinter0@gmail.com> Signed-off-by: Dmitry V. Levin <ldv@altlinux.org> Cc: Elvira Khabirova <lineprinter0@gmail.com> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/20151130215436.GA25996@altlinux.org Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-12-05x86/mpx: Fix instruction decoder conditionDave Hansen
MPX decodes instructions in order to tell which bounds register was violated. Part of this decoding involves looking at the "REX prefix" which is a special instrucion prefix used to retrofit support for new registers in to old instructions. The X86_REX_*() macros are defined to return actual bit values: #define X86_REX_R(rex) ((rex) & 4) *not* boolean values. However, the MPX code was checking for them like they were booleans. This might have led to us mis-decoding the "REX prefix" and giving false information out to userspace about bounds violations. X86_REX_B() actually is bit 1, so this is really only broken for the X86_REX_X() case. Fix the conditionals up to tolerate the non-boolean values. Fixes: fcc7ffd67991 "x86, mpx: Decode MPX instruction to get bound violation information" Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Cc: x86@kernel.org Cc: Dave Hansen <dave@sr71.net> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/20151201003113.D800C1E0@viggo.jf.intel.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-12-05dmaengine: mic_x100: add missing spin_unlockSaurabh Sengar
spin lock should be released while returning from function Signed-off-by: Saurabh Sengar <saurabh.truth@gmail.com> Signed-off-by: Vinod Koul <vinod.koul@intel.com>
2015-12-05dmaengine: bcm2835-dma: Convert to use DMA poolPeter Ujfalusi
f93178291712 dmaengine: bcm2835-dma: Fix memory leak when stopping a running transfer Fixed the memleak, but introduced another issue: the terminate_all callback might be called with interrupts disabled and the dma_free_coherent() is not allowed to be called when IRQs are disabled. Convert the driver to use dma_pool_* for managing the list of control blocks for the transfer. Fixes: f93178291712 ("dmaengine: bcm2835-dma: Fix memory leak when stopping a running transfer") Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com> Tested-by: Matthias Reichl <hias@horus.com> Signed-off-by: Vinod Koul <vinod.koul@intel.com>
2015-12-05dmaengine: at_xdmac: fix bad behavior in interleaved modeSylvain ETIENNE
When performing interleaved transfers with numf > 1, an extra line is copied. The mbr.bc field is incremented once too often. The length of the block is (BLEN+1) microblocks. Signed-off-by: Sylvain ETIENNE <Sylvain.ETIENNE@ingenico.com> Signed-off-by: Ludovic Desroches <ludovic.desroches@atmel.com> Fixes: 4e5385784e69 ("dmaengine: at_xdmac: handle numf > 1") Signed-off-by: Vinod Koul <vinod.koul@intel.com>
2015-12-05dmaengine: at_xdmac: fix false condition for memset_sg transfersLudovic Desroches
The code was not in agreement with the comments. Signed-off-by: Ludovic Desroches <ludovic.desroches@atmel.com> Cc: stable@vger.kernel.org # 4.3 and later Signed-off-by: Vinod Koul <vinod.koul@intel.com>
2015-12-05dmaengine: at_xdmac: fix macro typoLudovic Desroches
Fix typo in a macro which was not used until now. It explains why there is no error at compilation time. Signed-off-by: Ludovic Desroches <ludovic.desroches@atmel.com> Fixes: e1f7c9eee707 "dmaengine: at_xdmac: creation of the atmel eXtended DMA Controller driver" Cc: stable@vger.kernel.org # 3.19 and later Acked-by: Nicolas Ferre <nicolas.ferre@atmel.com> Signed-off-by: Vinod Koul <vinod.koul@intel.com>
2015-12-05Merge branch 'drm-fixes-4.4' of git://people.freedesktop.org/~agd5f/linux ↵Dave Airlie
into drm-next A few more last minute fixes for 4.4 on top of my pull request from earlier this week. The big change here is a vblank regression fix due to commit 4dfd6486 "drm: Use vblank timestamps to guesstimate how many vblanks were missed". Beyond that, a hotplug fix and a few VM fixes. * 'drm-fixes-4.4' of git://people.freedesktop.org/~agd5f/linux: drm/amdgpu: Fixup hw vblank counter/ts for new drm_update_vblank_count() (v3) drm/radeon: Fixup hw vblank counter/ts for new drm_update_vblank_count() (v2) drm/radeon: Retry DDC probing on DVI on failure if we got an HPD interrupt drm/amdgpu: add spin lock to protect freed list in vm (v2) drm/amdgpu: partially revert "drm/amdgpu: fix VM_CONTEXT*_PAGE_TABLE_END_ADDR" v2 drm/amdgpu: take a BO reference for the user fence drm/amdgpu: take a BO reference in the display code drm/amdgpu: set snooped flags only on system addresses v2 drm/amdgpu: fix race condition in amd_sched_entity_push_job drm/amdgpu: add err check for pin userptr add blacklist for thinkpad T40p drm/amdgpu: fix VM page table reference counting drm/amdgpu: fix userptr flags check
2015-12-04PCI: hisi: Fix hisi_pcie_cfg_read() 32-bit readsDongdong Liu
For 32-bit config reads (size == 4), hisi_pcie_cfg_read() returned success but never filled in the data we read. Return the register data for 32-bit config reads. Without this fix, PCI doesn't work at all because enumeration depends on 32-bit config reads. The driver was tested internally, but got broken in the process of upstreaming, so this fixes the breakage. Fixes: 500a1d9a43e0 ("PCI: hisi: Add HiSilicon SoC Hip05 PCIe driver") Signed-off-by: Dongdong Liu <liudongdong3@huawei.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Zhou Wang <wangzhou1@hisilicon.com>
2015-12-04PCI: altera: Fix error when INTx is 4Ley Foon Tan
PCI interrupt lines start at 1, not at 0. So, creates additional one interrupt when register for irq domain. Error when PCIe devices have 4 INTx: WARNING: CPU: 1 PID: 1 at kernel/irq/irqdomain.c:280 irq_domain_associate+0x17c/0x1cc() error: hwirq 0x4 is too large for dummy Tested on Ethernet adapter card with multi-functions. Signed-off-by: Ley Foon Tan <lftan@altera.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2015-12-04PCI: altera: Check TLP completion statusLey Foon Tan
Check TLP packet successful completion status. This fix the issue when accessing multi-function devices in enumeration process, TLP will return error when accessing non-exist function number. Returns PCI error code instead of generic errno. Tested on Ethernet adapter card with multi-functions. [bhelgaas: simplify completion status checking code] Signed-off-by: Ley Foon Tan <lftan@altera.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2015-12-04PCI: altera: Fix Requester ID for config accessesLey Foon Tan
The Requester ID should use the Root Port devfn and it should be always 0. Previously we constructed the Requester ID using the *Completer* devfn, i.e., the devfn of the Function we expect to respond to the config access. This causes issues when accessing configuration space for devices other than the Root Port. Build the Requester ID using the Root Port devfn. Tested on Ethernet adapter card with multi-functions. Signed-off-by: Ley Foon Tan <lftan@altera.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2015-12-04PCI: altera: Fix loop in tlp_read_packet()Dan Carpenter
TLP_LOOP is 500 and the "loop" variable was a u8 so "loop < TLP_LOOP" is always true. We only need this condition to work if there is a problem so it would have been easy to miss this in testing. Make it a normal for loop with "int i" instead of over thinking things and making it complicated. Fixes: 6bb4dd154ae8 ("PCI: altera: Add Altera PCIe host controller driver") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Ley Foon Tan <lftan@altera.com>
2015-12-04atl1c: Improve driver not to do order 4 GFP_ATOMIC allocationPavel Machek
atl1c driver is doing order-4 allocation with GFP_ATOMIC priority. That often breaks networking after resume. Switch to GFP_KERNEL. Still not ideal, but should be significantly better. atl1c_setup_ring_resources() is called from .open() function, and already uses GFP_KERNEL, so this change is safe. Signed-off-by: Pavel Machek <pavel@ucw.cz> Acked-by: Michal Hocko <mhocko@suse.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-04rhashtable: Use __vmalloc with GFP_ATOMIC for table allocationHerbert Xu
When an rhashtable user pounds rhashtable hard with back-to-back insertions we may end up growing the table in GFP_ATOMIC context. Unfortunately when the table reaches a certain size this often fails because we don't have enough physically contiguous pages to hold the new table. Eric Dumazet suggested (and in fact wrote this patch) using __vmalloc instead which can be used in GFP_ATOMIC context. Reported-by: Phil Sutter <phil@nwl.cc> Suggested-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-04gre6: allow to update all parameters via rtnlNicolas Dichtel
Parameters were updated only if the kernel was unable to find the tunnel with the new parameters, ie only if core pamareters were updated (keys, addr, link, type). Now it's possible to update ttl, hoplimit, flowinfo and flags. Fixes: c12b395a4664 ("gre: Support GRE over IPv6") Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-04pppoe: fix memory corruption in padt work structureGuillaume Nault
pppoe_connect() mustn't touch the padt_work field of pppoe sockets because that work could be already pending. [ 21.473147] BUG: unable to handle kernel NULL pointer dereference at 00000004 [ 21.474523] IP: [<c1043177>] process_one_work+0x29/0x31c [ 21.475164] *pde = 00000000 [ 21.475513] Oops: 0000 [#1] SMP [ 21.475910] Modules linked in: pppoe pppox ppp_generic slhc crc32c_intel aesni_intel virtio_net xts aes_i586 lrw gf128mul ablk_helper cryptd evdev acpi_cpufreq processor serio_raw button ext4 crc16 mbcache jbd2 virtio_blk virtio_pci virtio_ring virtio [ 21.476168] CPU: 2 PID: 164 Comm: kworker/2:2 Not tainted 4.4.0-rc1 #1 [ 21.476168] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Debian-1.8.2-1 04/01/2014 [ 21.476168] task: f5f83c00 ti: f5e28000 task.ti: f5e28000 [ 21.476168] EIP: 0060:[<c1043177>] EFLAGS: 00010046 CPU: 2 [ 21.476168] EIP is at process_one_work+0x29/0x31c [ 21.484082] EAX: 00000000 EBX: f678b2a0 ECX: 00000004 EDX: 00000000 [ 21.484082] ESI: f6c69940 EDI: f5e29ef0 EBP: f5e29f0c ESP: f5e29edc [ 21.484082] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068 [ 21.484082] CR0: 80050033 CR2: 000000a4 CR3: 317ad000 CR4: 00040690 [ 21.484082] Stack: [ 21.484082] 00000000 f6c69950 00000000 f6c69940 c0042338 f5e29f0c c1327945 00000000 [ 21.484082] 00000008 f678b2a0 f6c69940 f678b2b8 f5e29f30 c1043984 f5f83c00 f6c69970 [ 21.484082] f678b2a0 c10437d3 f6775e80 f678b2a0 c10437d3 f5e29fac c1047059 f5e29f74 [ 21.484082] Call Trace: [ 21.484082] [<c1327945>] ? _raw_spin_lock_irq+0x28/0x30 [ 21.484082] [<c1043984>] worker_thread+0x1b1/0x244 [ 21.484082] [<c10437d3>] ? rescuer_thread+0x229/0x229 [ 21.484082] [<c10437d3>] ? rescuer_thread+0x229/0x229 [ 21.484082] [<c1047059>] kthread+0x8f/0x94 [ 21.484082] [<c1327a32>] ? _raw_spin_unlock_irq+0x22/0x26 [ 21.484082] [<c1327ee9>] ret_from_kernel_thread+0x21/0x38 [ 21.484082] [<c1046fca>] ? kthread_parkme+0x19/0x19 [ 21.496082] Code: 5d c3 55 89 e5 57 56 53 89 c3 83 ec 24 89 d0 89 55 e0 8d 7d e4 e8 6c d8 ff ff b9 04 00 00 00 89 45 d8 8b 43 24 89 45 dc 8b 45 d8 <8b> 40 04 8b 80 e0 00 00 00 c1 e8 05 24 01 88 45 d7 8b 45 e0 8d [ 21.496082] EIP: [<c1043177>] process_one_work+0x29/0x31c SS:ESP 0068:f5e29edc [ 21.496082] CR2: 0000000000000004 [ 21.496082] ---[ end trace e362cc9cf10dae89 ]--- Reported-by: Andrew <nitr0@seti.kr.ua> Fixes: 287f3a943fef ("pppoe: Use workqueue to die properly when a PADT is received") Signed-off-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-04Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client Pull Ceph fix from Sage Weil: "This addresses a refcounting bug that leads to a use-after-free" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: rbd: don't put snap_context twice in rbd_queue_workfn()
2015-12-04list: Use WRITE_ONCE() when initializing list_head structuresPaul E. McKenney
Code that does lockless emptiness testing of non-RCU lists is relying on INIT_LIST_HEAD() to write the list head's ->next pointer atomically, particularly when INIT_LIST_HEAD() is invoked from list_del_init(). This commit therefore adds WRITE_ONCE() to this function's pointer stores that could affect the head's ->next pointer. Reported-by: Andrey Konovalov <andreyknvl@google.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-12-04list: Introduces generic list_splice_tail_init_rcu()Petko Manolov
The list_splice_init_rcu() can be used as a stack onto which full lists are pushed, but queue-like behavior is now needed by some security policies. This requires a list_splice_tail_init_rcu(). This commit therefore supplies a list_splice_tail_init_rcu() by pulling code common it and to list_splice_init_rcu() into a new __list_splice_init_rcu() function. This new function is based on the existing list_splice_init_rcu() implementation. Signed-off-by: Petko Manolov <petkan@mip-labs.com> Cc: Mimi Zohar <zohar@linux.vnet.ibm.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-12-04rcu: Stop disabling interrupts in scheduler fastpathsPaul E. McKenney
We need the scheduler's fastpaths to be, well, fast, and unnecessarily disabling and re-enabling interrupts is not necessarily consistent with this goal. Especially given that there are regions of the scheduler that already have interrupts disabled. This commit therefore moves the call to rcu_note_context_switch() to one of the interrupts-disabled regions of the scheduler, and removes the now-redundant disabling and re-enabling of interrupts from rcu_note_context_switch() and the functions it calls. Reported-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> [ paulmck: Shift rcu_note_context_switch() to avoid deadlock, as suggested by Peter Zijlstra. ]
2015-12-04rcu: Avoid tick_nohz_active checks on NOCBs CPUsPaul E. McKenney
Currently, rcu_prepare_for_idle() checks for tick_nohz_active, even on individual NOCBs CPUs, unless all CPUs are marked as NOCBs CPUs at build time. This check is pointless on NOCBs CPUs because they never have any callbacks posted, given that all of their callbacks are handed off to the corresponding rcuo kthread. There is a check for individually designated NOCBs CPUs, but it pointelessly follows the check for tick_nohz_active. This commit therefore moves the check for individually designated NOCBs CPUs up with the check for CONFIG_RCU_NOCB_CPU_ALL. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-12-04rcu: Fix obsolete rcu_bootup_announce_oddness() commentPaul E. McKenney
This function no longer has #ifdefs, so this commit removes the header comment calling them out. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-12-04rcu: Remove lock-acquisition loop from rcu_read_unlock_special()Paul E. McKenney
Several releases have come and gone without the warning triggering, so remove the lock-acquisition loop. Retain the WARN_ON_ONCE() out of sheer paranoia. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-12-04rcu: Simplify rcu_sched_qs() control flowPaul E. McKenney
This commit applies an early-exit approach to rcu_sched_qs(), reducing the nesting level and saving a line of code. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-12-04kernel: Make rcu/tree_trace.c explicitly non-modularPaul Gortmaker
The Kconfig currently controlling compilation of this code is: init/Kconfig:config TREE_RCU_TRACE init/Kconfig: def_bool RCU_TRACE && ( TREE_RCU || PREEMPT_RCU ) ...meaning that it currently is not being built as a module by anyone. Lets remove the modular code that is essentially orphaned, so that when reading the file there is no doubt it is builtin-only. Since module_init translates to device_initcall in the non-modular case, the init ordering remains unchanged with this commit. We could consider moving this to an earlier initcall if desired. We don't replace module.h with init.h since the file already has that. We also delete the moduleparam.h include that is left over from commit 64db4cfff99c04cd5f550357edcc8780f96b54a2 (""Tree RCU": scalable classic RCU implementation") since it is not needed here either. We morph some tags like MODULE_AUTHOR into the comments at the top of the file for documentation purposes. Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Cc: Josh Triplett <josh@joshtriplett.org> Reviewed-by: Josh Triplett <josh@joshtriplett.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Lai Jiangshan <jiangshanlai@gmail.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-12-04rcu: Move lock_class_key to local scopePaul E. McKenney
Currently, the rcu_node_class[], rcu_fqs_class[], and rcu_exp_class[] arrays needlessly pollute the global namespace within tree.c. This commit therefore converts them to static local variables within rcu_init_one(). Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-12-04rcu: Allow expedited grace periods to be disabled at initPaul E. McKenney
Expedited grace periods can speed up boot, but are undesirable in aggressive real-time systems. This commit therefore introduces a kernel parameter rcupdate.rcu_normal_after_boot that disables expedited grace periods just before init is spawned. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-12-04rcu: Wire up rcu_end_inkernel_boot()Paul E. McKenney
This commit adds the invocation of rcu_end_inkernel_boot() just before init is invoked. This allows the CONFIG_RCU_EXPEDITE_BOOT Kconfig option to do something useful and prepares for the upcoming rcupdate.rcu_normal_after_boot kernel parameter. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-12-04rcu: Add rcu_normal kernel parameter to suppress expeditingPaul E. McKenney
Although expedited grace periods can be quite useful, and although their OS jitter has been greatly reduced, they can still pose problems for extreme real-time workloads. This commit therefore adds a rcu_normal kernel boot parameter (which can also be manipulated via sysfs) to suppress expedited grace periods, that is, to treat requests for expedited grace periods as if they were requests for normal grace periods. If both rcu_expedited and rcu_normal are specified, rcu_normal wins. This means that if you are relying on expedited grace periods to speed up boot, you will want to specify rcu_expedited on the kernel command line, and then specify rcu_normal via sysfs once boot completes. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-12-04rcu: Add more diagnostics to expedited stall warning messages.Paul E. McKenney
This commit adds print statements that check the rcu_node structure to find which ->expmask bits and which ->exp_tasks structures are blocking the current expedited grace period. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-12-04rcu: Make expedited grace periods resolve stall-warning tiesPaul E. McKenney
Currently, if a grace period ends just as the stall-warning timeout fires, an empty stall warning will be printed. This is not helpful, so this commit avoids these useless warnings by rechecking completion after awakening in synchronize_sched_expedited_wait(). Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-12-04rcu: Reduce expedited GP memory contention via per-CPU variablesPaul E. McKenney
Currently, the piggybacked-work checks carried out by sync_exp_work_done() atomically increment a small set of variables (the ->expedited_workdone0, ->expedited_workdone1, ->expedited_workdone2, ->expedited_workdone3 fields in the rcu_state structure), which will form a memory-contention bottleneck given a sufficiently large number of CPUs concurrently invoking either synchronize_rcu_expedited() or synchronize_sched_expedited(). This commit therefore moves these for fields to the per-CPU rcu_data structure, eliminating the memory contention. The show_rcuexp() function also changes to sum up each field in the rcu_data structures. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-12-04rcu: Invert sync_rcu_exp_select_cpus() "if" statementPaul E. McKenney
This commit saves a couple lines of code and reduces indentation by inverting the sense of an "if" statement in the function sync_rcu_exp_select_cpus(). Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-12-04rcu: Move smp_mb() from rcu_seq_snap() to rcu_exp_gp_seq_snap()Paul E. McKenney
The memory barrier in rcu_seq_snap() is needed only for grace periods, so this commit moves it to the grace-period-oriented wrapper rcu_exp_gp_seq_snap(). Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-12-04rcu: Clarify role of ->expmaskinitnextPaul E. McKenney
Analogy with the ->qsmaskinitnext field might lead one to believe that ->expmaskinitnext tracks online CPUs. This belief is incorrect: Any CPU that has ever been online will have its bit set in the ->expmaskinitnext field. This commit therefore adds a comment to make this clear, at least to people who read comments. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-12-04rcu: Short-circuit synchronize_sched_expedited() if only one CPUPaul E. McKenney
If there is only one CPU, then invoking synchronize_sched_expedited() is by definition a grace period. This commit checks for this condition and does a short-circuit return in that case. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-12-04drm/amdgpu: Fixup hw vblank counter/ts for new drm_update_vblank_count() (v3)Alex Deucher
commit 4dfd6486 "drm: Use vblank timestamps to guesstimate how many vblanks were missed" introduced in Linux 4.4-rc1 makes the drm core more fragile to drivers which don't update hw vblank counters and vblank timestamps in sync with firing of the vblank irq and essentially at leading edge of vblank. This exposed a problem with radeon-kms/amdgpu-kms which do not satisfy above requirements: The vblank irq fires a few scanlines before start of vblank, but programmed pageflips complete at start of vblank and vblank timestamps update at start of vblank, whereas the hw vblank counter increments only later, at start of vsync. This leads to problems like off by one errors for vblank counter updates, vblank counters apparently going backwards or vblank timestamps apparently having time going backwards. The net result is stuttering of graphics in games, or little hangs, as well as total failure of timing sensitive applications. See bug #93147 for an example of the regression on Linux 4.4-rc: https://bugs.freedesktop.org/show_bug.cgi?id=93147 This patch tries to align all above events better from the viewpoint of the drm core / of external callers to fix the problem: 1. The apparent start of vblank is shifted a few scanlines earlier, so the vblank irq now always happens after start of this extended vblank interval and thereby drm_update_vblank_count() always samples the updated vblank count and timestamp of the new vblank interval. To achieve this, the reporting of scanout positions by radeon_get_crtc_scanoutpos() now operates as if the vblank starts radeon_crtc->lb_vblank_lead_lines before the real start of the hw vblank interval. This means that the vblank timestamps which are based on these scanout positions will now update at this earlier start of vblank. 2. The driver->get_vblank_counter() function will bump the returned vblank count as read from the hw by +1 if the query happens after the shifted earlier start of the vblank, but before the real hw increment at start of vsync, so the counter appears to increment at start of vblank in sync with the timestamp update. 3. Calls from vblank irq-context and regular non-irq calls are now treated identical, always simulating the shifted vblank start, to avoid inconsistent results for queries happening from vblank irq vs. happening from drm_vblank_enable() or vblank_disable_fn(). 4. The radeon_flip_work_func will delay mmio programming a pageflip until the start of the real vblank iff it happens to execute inside the shifted earlier start of the vblank, so pageflips now also appear to execute at start of the shifted vblank, in sync with vblank counter and timestamp updates. This to avoid some races between updates of vblank count and timestamps that are used for swap scheduling and pageflip execution which could cause pageflips to execute before the scheduled target vblank. The lb_vblank_lead_lines "fudge" value is calculated as the size of the display controllers line buffer in scanlines for the given video mode: Vblank irq's are triggered by the line buffer logic when the line buffer refill for a video frame ends, ie. when the line buffer source read position enters the hw vblank. This means that a vblank irq could fire at most as many scanlines before the current reported scanout position of the crtc timing generator as the number of scanlines the line buffer can maximally hold for a given video mode. This patch has been successfully tested on a RV730 card with DCE-3 display engine and on a evergreen card with DCE-4 display engine, in single-display and dual-display configuration, with different video modes. A similar patch is needed for amdgpu-kms to fix the same problem. Limitations: - Maybe replace the udelay() in the flip_work_func() by a suitable usleep_range() for a bit better efficiency? Will try that. - Line buffer sizes in pixels are hard-coded on < DCE-4 to a value i just guessed to be high enough to work ok, lacking info on the true sizes atm. Probably fixes: fdo#93147 Port of Mario's radeon fix to amdgpu. Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (v1) Reviewed-by: Mario Kleiner <mario.kleiner.de@gmail.com> (v2) Refine amdgpu_flip_work_func() for better efficiency. In amdgpu_flip_work_func, replace the busy waiting udelay(5) with event lock held by a more performance and energy efficient usleep_range() until at least predicted true start of hw vblank, with some slack for scheduler happiness. Release the event lock during waits to not delay other outputs in doing their stuff, as the waiting can last up to 200 usecs in some cases. Also small fix to code comment and formatting in that function. (v2) Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com> (v3) Fix crash in crtc disabled case
2015-12-04Merge branch 'mvpp2-fixes'David S. Miller
Marcin Wojtas says: ==================== Marvell Armada 375 mvpp2 fixes During my work on mvneta driver I revised mvpp2, and it occurred that the initial version of Marvell Armada 375 SoC comprised bugs around DMA-unmapping in both ingress and egress paths - not all buffers were umapped in TX path and none(!) in RX. Three patches that I send fix this situation. Any feedback would be welcome. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-04net: mvpp2: fix refilling BM pools in RX pathMarcin Wojtas
In hitherto code in case of RX buffer allocation error during refill, original buffer is pushed to the network stack, but the amount of available buffer pointers in BM pool is decreased. This commit fixes the situation by moving refill call before skb_put(), and returning original buffer pointer to the pool in case of an error. Signed-off-by: Marcin Wojtas <mw@semihalf.com> Fixes: 3f518509dedc ("ethernet: Add new driver for Marvell Armada 375 network unit") Cc: <stable@vger.kernel.org> # v3.18+ Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-04net: mvpp2: fix buffers' DMA handling on RX pathMarcin Wojtas
Each allocated buffer, whose pointer is put into BM pool is DMA-mapped. Hence it should be properly unmapped after usage or when removing buffers from pool. This commit fixes DMA handling on RX path by adding dma_unmap_single() in mvpp2_rx() and in mvpp2_bufs_free(). The latter function's argument number had to be increased for this purpose. Signed-off-by: Marcin Wojtas <mw@semihalf.com> Fixes: 3f518509dedc ("ethernet: Add new driver for Marvell Armada 375 network unit") Cc: <stable@vger.kernel.org> # v3.18+ Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-04net: mvpp2: fix missing DMA region unmap in egress processingMarcin Wojtas
The Tx descriptor release code currently calls dma_unmap_single() and dev_kfree_skb_any() if the descriptor is associated with a non-NULL skb. This condition is true only for the last fragment of the packet. Since every descriptor's buffer is DMA-mapped it has to be properly unmapped. Signed-off-by: Marcin Wojtas <mw@semihalf.com> Fixes: 3f518509dedc ("ethernet: Add new driver for Marvell Armada 375 network unit") Cc: <stable@vger.kernel.org> # v3.18+ Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-04ALSA: rme96: Fix unexpected volume reset after rate changesTakashi Iwai
rme96 driver needs to reset DAC depending on the sample rate, and this results in resetting to the max volume suddenly. It's because of the missing call of snd_rme96_apply_dac_volume(). However, calling this function right after the DAC reset still may not work, and we need some delay before this call. Since the DAC reset and the procedure after that are performed in the spinlock, we delay the DAC volume restore at the end after the spinlock. Reported-and-tested-by: Sylvain LABOISNE <maeda1@free.fr> Cc: <stable@vger.kernel.org> Signed-off-by: Takashi Iwai <tiwai@suse.de>
2015-12-04rhashtable: Prevent spurious EBUSY errors on insertionHerbert Xu
Thomas and Phil observed that under stress rhashtable insertion sometimes failed with EBUSY, even though this error should only ever been seen when we're under attack and our hash chain length has grown to an unacceptable level, even after a rehash. It turns out that the logic for detecting whether there is an existing rehash is faulty. In particular, when two threads both try to grow the same table at the same time, one of them may see the newly grown table and thus erroneously conclude that it had been rehashed. This is what leads to the EBUSY error. This patch fixes this by remembering the current last table we used during insertion so that rhashtable_insert_rehash can detect when another thread has also done a resize/rehash. When this is detected we will give up our resize/rehash and simply retry the insertion with the new table. Reported-by: Thomas Graf <tgraf@suug.ch> Reported-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Tested-by: Phil Sutter <phil@nwl.cc> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-04Merge branch 'libnvdimm-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm Pull libnvdimm fixes from Dan Williams: - NFIT parsing regression fixes from Linda. The nvdimm hot-add implementation merged in 4.4-rc1 interpreted the specification in a way that breaks actual HPE platforms. We are also closing the loop with the ACPI Working Group to get this clarification added to the spec. - Andy pointed out that his laptop without nvdimm resources is loading the e820-nvdimm module by default, fix that up to only load the module when an e820-type-12 range is present. * 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: nfit: Adjust for different _FIT and NFIT headers nfit: Fix the check for a successful NFIT merge nfit: Account for table size length variation libnvdimm, e820: skip module loading when no type-12