Age | Commit message (Collapse) | Author |
|
Ages ago, drivers could return values greater than zero from their probe
function and this would be regarded as success.
But after f3ec4f87d607 ("PCI: change device runtime PM settings for probe
and remove") and 967577b06241 ("PCI/PM: Keep runtime PM enabled for unbound
PCI devices"), we set dev->driver to NULL if the driver's probe function
returns a value greater than zero.
__pci_device_probe() treats this as success, and drivers can still mostly
work even with dev->driver == NULL, but PCI power management doesn't work,
and we don't call the driver's remove function on rmmod.
To help catch these driver problems, issue a warning in this case.
[bhelgaas: changelog]
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
|
|
Introduce flag KM_ZERO which is used to alloc zeroed entry, and convert
kmem_{zone_}zalloc to call kmem_{zone_}alloc() with KM_ZERO directly,
in order to avoid the setting to zero step.
And following Dave's suggestion, make kmem_{zone_}zalloc static inline
into kmem.h as they're now just a simple wrapper.
V2:
Make kmem_{zone_}zalloc static inline into kmem.h as Dave suggested.
Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
|
|
The early_init_devtree() API was removed in linux-next for 3.13 with
commit "mips: use early_init_dt_scan". This causes Netlogic XLP compile
to fail:
arch/mips/netlogic/xlp/setup.c:101: undefined reference to `early_init_devtree'
Add xlp_early_init_devtree() which uses the __dt_setup_arch() to
handle early device tree related initialization to fix this.
Signed-off-by: Jayachandran C <jchandra@broadcom.com>
Signed-off-by: Rob Herring <rob.herring@calxeda.com>
|
|
|
|
The modpost tool could overflow its stack buffer if someone was running
with an insane shell environment. Regardless, it's technically a bug,
so this fixes it to truncate the string instead of seg-faulting.
Found by Coverity.
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Michal Marek <mmarek@suse.cz>
|
|
Disable static detection: the static currently drops a lot of useful
information including clones generated by gcc. Drop this. The statics
will appear now without static. prefix.
But remove the LTO .NUMBER postfixes that look ugly
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Michal Marek <mmarek@suse.cz>
|
|
Also warn for too long symbols
v2: Add missing newline. Use 255 max (Joe Perches)
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Michal Marek <mmarek@suse.cz>
|
|
[AK: This seems like a ticking time bomb even without LTO,
so should be merged now. It causes very weird problems.
Thanks to Joe for tracking them down.]
With the added postfixes that LTO adds for local
symbols, the longest name in the kernel overflows
the namebuf[KSYM_NAME_LEN] array by two bytes. That name is:
__pci_fixup_resumePCI_VENDOR_ID_SERVERWORKSPCI_DEVICE_ID_SERVERWORKS_HT1000SBquirk_disable_broadcom_boot_interrupt.1488004.672802
Double the max symbol name length.
v2: Use 255 (Joe Perches)
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Michal Marek <mmarek@suse.cz>
|
|
Error
When scripts/kernel-doc cannot understand a function prototype,
it had been generating a fatal error and stopping immediately.
Make this a Warning instead of an Error and keep going.
Note that this can happen if the kernel-doc notation that is being
parsed is not actually a function prototype; maybe it's a struct or
something else, so I added "function" to the warning message to try
to make it clearer that scripts/kernel-doc is looking for a function
prototype here.
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Mark Brown <broonie@kernel.org>
Signed-off-by: Michal Marek <mmarek@suse.cz>
|
|
This patch adds support for completion interrupt coalescing that
allows only every ISERT_COMP_BATCH_COUNT (8) to set IB_SEND_SIGNALED,
thus avoiding completion interrupts for every posted iser_tx_desc.
The batch processing is done using a per isert_conn llist that once
IB_SEND_SIGNALED has been set is saved to tx_desc->comp_llnode_batch,
and completion processing of previously posted iser_tx_descs is done
in a single shot from within isert_send_completion() code.
Note this is only done for response PDUs from ISCSI_OP_SCSI_CMD, and
all other control type of PDU responses will force an implicit batch
drain to occur.
Cc: Or Gerlitz <ogerlitz@mellanox.com>
Cc: Sagi Grimberg <sagig@mellanox.com>
Cc: Kent Overstreet <kmo@daterainc.com>
Cc: Roland Dreier <roland@purestorage.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
|
|
This patch proposes to remove the IRQF_DISABLED flag from drivers/xen
code. It's a NOOP since 2.6.35 and it will be removed one day.
Note that architecture dependent fixes for IRQF_DISABLED
were already submitted through separate patches.
Signed-off-by: Michael Opdenacker <michael.opdenacker@free-electrons.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
|
|
This patch proposes to remove the IRQF_DISABLED flag from x86/xen
code. It's a NOOP since 2.6.35 and it will be removed one day.
Signed-off-by: Michael Opdenacker <michael.opdenacker@free-electrons.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
|
|
The duration field of print_graph_duration() can also be used
to do the space filling by passing an enum in it:
DURATION_FILL_FULL
DURATION_FILL_START
DURATION_FILL_END
The problem is that these are enums and defined as negative,
but the duration field is unsigned long long. Most archs are
fine with this but blackfin fails to compile because of it:
kernel/built-in.o: In function `print_graph_duration':
kernel/trace/trace_functions_graph.c:782: undefined reference to `__ucmpdi2'
Overloading a unsigned long long with an signed enum is just
bad in principle. We can accomplish the same thing by using
part of the flags field instead.
Cc: Mike Frysinger <vapier@gentoo.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
|
|
In the past, ftrace_off_permanent() was called if something
strange was detected. But the ftrace_bug() now handles all the
anomolies that can happen with ftrace (function tracing), and there
are no uses of ftrace_off_permanent(). Get rid of it.
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
|
|
In system_tr_open(), the filp->private_data can be assigned the 'dir'
variable even if it was freed. This is on the error path, and is
harmless because the error return code will prevent filp->private_data
from being used. But for correctness, we should not assign it to
a recently freed variable, as that can cause static tools to give
false warnings.
Also have both subsystem_open() and system_tr_open() return -ENODEV
if tracing has been disabled.
Link: http://lkml.kernel.org/r/1383764571-7318-1-git-send-email-geyslan@gmail.com
Signed-off-by: Geyslan G. Bem <geyslan@gmail.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
|
|
The current default perf paranoid level is "1" which has
"perf_paranoid_kernel()" return false, and giving any operations that
use it, access to normal users. Unfortunately, this includes function
tracing and normal users should not be allowed to enable function
tracing by default.
The proper level is defined at "-1" (full perf access), which
"perf_paranoid_tracepoint_raw()" will only give access to. Use that
check instead for enabling function tracing.
Reported-by: Dave Jones <davej@redhat.com>
Reported-by: Vince Weaver <vincent.weaver@maine.edu>
Tested-by: Vince Weaver <vincent.weaver@maine.edu>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: stable@vger.kernel.org # 3.4+
CVE: CVE-2013-2930
Fixes: ced39002f5ea ("ftrace, perf: Add support to use function tracepoint in perf")
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
|
|
set_swbp() and set_orig_insn() are __weak, but this is pointless
because write_opcode() is static.
Export write_opcode() as uprobe_write_opcode() for the upcoming
arm port, this way it can actually override set_swbp() and use
__opcode_to_mem_arm(bpinsn) instead if UPROBE_SWBP_INSN.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
|
|
Currently xol_get_insn_slot() assumes that we should simply copy
arch_uprobe->insn[] which is (ignoring arch_uprobe_analyze_insn)
just the copy of the original insn.
This is not true for arm which needs to create another insn to
execute it out-of-line.
So this patch simply adds the new member, ->ixol into the union.
This doesn't make any difference for x86 and powerpc, but arm
can divorce insn/ixol and initialize the correct xol insn in
arch_uprobe_analyze_insn().
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
|
|
Turn module_init() into __initcall() and kill module_exit().
This code can't be compiled as a module so these module_*()
calls only add the confusion, especially if arch-dependant
code needs its own initialization hooks.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
|
|
Move the function declarations from the arch headers to the common
header, since only the function bodies are architecture-specific.
These changes are from Vincent Rabin's uprobes patch.
[ oleg: update arch/powerpc/include/asm/uprobes.h ]
Signed-off-by: Rabin Vincent <rabin@rab.in>
Signed-off-by: David A. Long <dave.long@linaro.org>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
|
|
To help track down AGI/AGF lock ordering issues, I added these
tracepoints to tell us when an AGI or AGF is read and locked. With
these we can now determine if the lock ordering goes wrong from
tracing captures.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ben Myers <bpm@sgi.com>
|
|
I debugging a log tail issue on a RHEL6 kernel, I added these trace
points to trace log items being added, moved and removed in the AIL
and how that affected the log tail LSN that was written to the log.
They were very helpful in that they immediately identified the cause
of the problem being seen. Hence I'd like to always have them
available for use.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ben Myers <bpm@sgi.com>
|
|
The variable hv_lapic_frequency causes an unused variable warning if
CONFIG_X86_LOCAL_APIC is disabled. Since the variable is only used
inside a small if statement, move the declaration of that variable
into the if statement itself.
Cc: K. Y. Srinivasan <kys@microsoft.com>
Link: http://lkml.kernel.org/r/1381444224-3303-1-git-send-email-kys@microsoft.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
|
|
With the connector and pipe passed around, we can now set the backlight
on the right pipe on VLV/BYT.
v2: drop combination mode check for VLV (Jani)
add save/restore code for VLV backlight regs (Jani)
check for existing modulation freq when initializing backlight regs (Jani)
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=67245
Tested-by: Joe Konno <joe.konno@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
|
|
The warnings are really harmless but annoying. Since they are only
about debug prints, and it's at most 32bit DMA, let's just cast to
unsigned long.
sound/pci/lx6464es/lx6464es.c:457:22: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]
sound/pci/lx6464es/lx_core.c:1195:21: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]
Signed-off-by: Takashi Iwai <tiwai@suse.de>
|
|
Just add an ifdef CONFIG_PM to shut up the warnings:
sound/isa/cmi8328.c:129:13: warning: ‘snd_cmi8328_cfg_save’ defined but not used [-Wunused-function]
sound/isa/cmi8328.c:136:13: warning: ‘snd_cmi8328_cfg_restore’ defined but not used [-Wunused-function]
Signed-off-by: Takashi Iwai <tiwai@suse.de>
|
|
On VLV/BYT, backlight controls a per-pipe, so when adjusting the
backlight we need to pass the correct info. So make the externally
visible backlight functions take a connector argument, which can be used
internally to figure out the pipe backlight to adjust.
v2: make connector pipe lookup check for NULL crtc (Jani)
fixup connector check in ASLE code (Jani)
v3: make sure we take the mode config lock around lookups (Daniel)
v4: fix double unlock in panel_get_brightness (Daniel)
v5: push ASLE work into a work queue (Daniel)
v6: separate ASLE work to a prep patch, rebase (Jani)
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
|
|
Doing this has been long overdue anyway, but now we really need it in
preparation for per connector backlight handling.
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
|
|
sfr pointed out that with CONFIG_UIDGID_STRICT_TYPE_CHECKS set the audit
tree would not build. This is because the oldsessionid in
audit_set_loginuid() was accidentally being declared as a kuid_t. This
patch fixes that declaration mistake.
Example of problem:
kernel/auditsc.c: In function 'audit_set_loginuid':
kernel/auditsc.c:2003:15: error: incompatible types when assigning to
type 'kuid_t' from type 'int'
oldsessionid = audit_get_sessionid(current);
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Eric Paris <eparis@redhat.com>
|
|
This patch creates the function 'tracing_is_disabled', which
can be used outside of trace.c.
Link: http://lkml.kernel.org/r/1382141754-12155-1-git-send-email-geyslan@gmail.com
Signed-off-by: Geyslan G. Bem <geyslan@gmail.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
|
|
With ftrace_dump_on_oops, we previously did not open the tracer in
question, sometimes causing the trace output to be useless.
For example, the function_graph tracer with tracing_thresh set dumped via
ftrace_dump_on_oops would show a series of '}' indented at different levels,
but no function names.
call trace->open() (and do a few other fixups copied from the normal dump
path) to make the output more intelligible.
Link: http://lkml.kernel.org/r/1382554197-16961-1-git-send-email-cody@linux.vnet.ibm.com
Signed-off-by: Cody P Schafer <cody@linux.vnet.ibm.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
|
|
This patch removes a duplicate define from
arch/mips/boot/ecoff.h
Signed-off-by: Michael Opdenacker <michael.opdenacker@free-electrons.com>
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/6081/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
|
|
If the UID is specified by userspace when calling the KEYCTL_GET_PERSISTENT
function and the process does not have the CAP_SETUID capability, then the
function will return -EPERM if the current process's uid, suid, euid and fsuid
all match the requested UID. This is incorrect.
Fix it such that when a non-privileged caller requests a persistent keyring by
a specific UID they can only request their own (ie. the specified UID matches
either then process's UID or the process's EUID).
This can be tested by logging in as the user and doing:
keyctl get_persistent @p
keyctl get_persistent @p `id -u`
keyctl get_persistent @p 0
The first two should successfully print the same key ID. The third should do
the same if called by UID 0 or indicate Operation Not Permitted otherwise.
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Stephen Gallagher <sgallagh@redhat.com>
|
|
The while loop only peeks at the top request in the queue but does
not yet consume it. Since we only handle fs requests, we need to
dequeue and complete all other request command types with error
just in case we would ever receive such an unforeseen request.
Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com>
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
|
Also check the busy placements before deciding to move a buffer object.
Failing to do this may result in a completely unneccessary move within a
single memory type.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Jakob Bornecrantz <jakob@vmware.com>
Cc: stable@vger.kernel.org
|
|
All error paths will want to keep the mm node, so handle this at the
function exit. This fixes an ioremap failure error path.
Also add some comments to make the function a bit easier to understand.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Jakob Bornecrantz <jakob@vmware.com>
Cc: stable@vger.kernel.org
|
|
Fix the case where the ttm pointer may be NULL causing
a NULL pointer dereference.
Signed-off-by: Jakob Bornecrantz <jakob@vmware.com>
Signed-off-by: Thomas Hellström <thellstrom@vmware.com>
Cc: stable@vger.kernel.org
|
|
NO_EVICT bos that are not idle when all references are dropped are put on
the delayed destroy list. However, since they are not on LRU lists, they
are not available to shrinkers at that point, and buffers on the delayed
destroy list are not checked very often for idle.
So when these buffers are put on the delayed destroy list, clear the
NO_EVICT flag and put them on the right LRU list. This way they are
immediately available for eviction or shrinkers and will not cause false
OOMS.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Jakob Bornecrantz <jakob@vmware.com>
|
|
Make use of the FAULT_FLAG_ALLOW_RETRY flag to allow dropping the
mmap_sem while waiting for bo idle.
FAULT_FLAG_ALLOW_RETRY appears to be primarily designed for disk waits
but should work just as fine for GPU waits..
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Jakob Bornecrantz <jakob@vmware.com>
|
|
The code handles three different cases:
1) physical page addresses. The ttm page array is used.
2) DMA subsystem addresses. A scatter-gather list is used.
3) Coherent pages. The ttm dma pool is used, together with the dma_ttm
array os dma_addr_t
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Jakob Bornecrantz <jakob@vmware.com>
|
|
Used by the vmwgfx driver
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Jakob Bornecrantz <jakob@vmware.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
|
|
Commit 52ea2a560a9d (arm64: locks: introduce ticket-based spinlock
implementation) introduces the arch_spin_is_contended() function making
CONFIG_GENERIC_LOCKBREAK unnecessary.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
|
|
While enabling lockdep on seqlocks, I ran across the warning below
caused by the ipv6 stats being updated in both irq and non-irq context.
This patch changes from IP6_INC_STATS_BH to IP6_INC_STATS (suggested
by Eric Dumazet) to resolve this problem.
[ 11.120383] =================================
[ 11.121024] [ INFO: inconsistent lock state ]
[ 11.121663] 3.12.0-rc1+ #68 Not tainted
[ 11.122229] ---------------------------------
[ 11.122867] inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
[ 11.123741] init/4483 [HC0[0]:SC1[3]:HE1:SE0] takes:
[ 11.124505] (&stats->syncp.seq#6){+.?...}, at: [<c1ab80c2>] ndisc_send_ns+0xe2/0x130
[ 11.125736] {SOFTIRQ-ON-W} state was registered at:
[ 11.126447] [<c10e0eb7>] __lock_acquire+0x5c7/0x1af0
[ 11.127222] [<c10e2996>] lock_acquire+0x96/0xd0
[ 11.127925] [<c1a9a2c3>] write_seqcount_begin+0x33/0x40
[ 11.128766] [<c1a9aa03>] ip6_dst_lookup_tail+0x3a3/0x460
[ 11.129582] [<c1a9e0ce>] ip6_dst_lookup_flow+0x2e/0x80
[ 11.130014] [<c1ad18e0>] ip6_datagram_connect+0x150/0x4e0
[ 11.130014] [<c1a4d0b5>] inet_dgram_connect+0x25/0x70
[ 11.130014] [<c198dd61>] SYSC_connect+0xa1/0xc0
[ 11.130014] [<c198f571>] SyS_connect+0x11/0x20
[ 11.130014] [<c198fe6b>] SyS_socketcall+0x12b/0x300
[ 11.130014] [<c1bbf880>] syscall_call+0x7/0xb
[ 11.130014] irq event stamp: 1184
[ 11.130014] hardirqs last enabled at (1184): [<c1086901>] local_bh_enable+0x71/0x110
[ 11.130014] hardirqs last disabled at (1183): [<c10868cd>] local_bh_enable+0x3d/0x110
[ 11.130014] softirqs last enabled at (0): [<c108014d>] copy_process.part.42+0x45d/0x11a0
[ 11.130014] softirqs last disabled at (1147): [<c1086e05>] irq_exit+0xa5/0xb0
[ 11.130014]
[ 11.130014] other info that might help us debug this:
[ 11.130014] Possible unsafe locking scenario:
[ 11.130014]
[ 11.130014] CPU0
[ 11.130014] ----
[ 11.130014] lock(&stats->syncp.seq#6);
[ 11.130014] <Interrupt>
[ 11.130014] lock(&stats->syncp.seq#6);
[ 11.130014]
[ 11.130014] *** DEADLOCK ***
[ 11.130014]
[ 11.130014] 3 locks held by init/4483:
[ 11.130014] #0: (rcu_read_lock){.+.+..}, at: [<c109363c>] SyS_setpriority+0x4c/0x620
[ 11.130014] #1: (((&ifa->dad_timer))){+.-...}, at: [<c108c1c0>] call_timer_fn+0x0/0xf0
[ 11.130014] #2: (rcu_read_lock){.+.+..}, at: [<c1ab6494>] ndisc_send_skb+0x54/0x5d0
[ 11.130014]
[ 11.130014] stack backtrace:
[ 11.130014] CPU: 0 PID: 4483 Comm: init Not tainted 3.12.0-rc1+ #68
[ 11.130014] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[ 11.130014] 00000000 00000000 c55e5c10 c1bb0e71 c57128b0 c55e5c4c c1badf79 c1ec1123
[ 11.130014] c1ec1484 00001183 00000000 00000000 00000001 00000003 00000001 00000000
[ 11.130014] c1ec1484 00000004 c5712dcc 00000000 c55e5c84 c10de492 00000004 c10755f2
[ 11.130014] Call Trace:
[ 11.130014] [<c1bb0e71>] dump_stack+0x4b/0x66
[ 11.130014] [<c1badf79>] print_usage_bug+0x1d3/0x1dd
[ 11.130014] [<c10de492>] mark_lock+0x282/0x2f0
[ 11.130014] [<c10755f2>] ? kvm_clock_read+0x22/0x30
[ 11.130014] [<c10dd8b0>] ? check_usage_backwards+0x150/0x150
[ 11.130014] [<c10e0e74>] __lock_acquire+0x584/0x1af0
[ 11.130014] [<c10b1baf>] ? sched_clock_cpu+0xef/0x190
[ 11.130014] [<c10de58c>] ? mark_held_locks+0x8c/0xf0
[ 11.130014] [<c10e2996>] lock_acquire+0x96/0xd0
[ 11.130014] [<c1ab80c2>] ? ndisc_send_ns+0xe2/0x130
[ 11.130014] [<c1ab66d3>] ndisc_send_skb+0x293/0x5d0
[ 11.130014] [<c1ab80c2>] ? ndisc_send_ns+0xe2/0x130
[ 11.130014] [<c1ab80c2>] ndisc_send_ns+0xe2/0x130
[ 11.130014] [<c108cc32>] ? mod_timer+0xf2/0x160
[ 11.130014] [<c1aa706e>] ? addrconf_dad_timer+0xce/0x150
[ 11.130014] [<c1aa70aa>] addrconf_dad_timer+0x10a/0x150
[ 11.130014] [<c1aa6fa0>] ? addrconf_dad_completed+0x1c0/0x1c0
[ 11.130014] [<c108c233>] call_timer_fn+0x73/0xf0
[ 11.130014] [<c108c1c0>] ? __internal_add_timer+0xb0/0xb0
[ 11.130014] [<c1aa6fa0>] ? addrconf_dad_completed+0x1c0/0x1c0
[ 11.130014] [<c108c5b1>] run_timer_softirq+0x141/0x1e0
[ 11.130014] [<c1086b20>] ? __do_softirq+0x70/0x1b0
[ 11.130014] [<c1086b70>] __do_softirq+0xc0/0x1b0
[ 11.130014] [<c1086e05>] irq_exit+0xa5/0xb0
[ 11.130014] [<c106cfd5>] smp_apic_timer_interrupt+0x35/0x50
[ 11.130014] [<c1bbfbca>] apic_timer_interrupt+0x32/0x38
[ 11.130014] [<c10936ed>] ? SyS_setpriority+0xfd/0x620
[ 11.130014] [<c10e26c9>] ? lock_release+0x9/0x240
[ 11.130014] [<c10936d7>] ? SyS_setpriority+0xe7/0x620
[ 11.130014] [<c1bbee6d>] ? _raw_read_unlock+0x1d/0x30
[ 11.130014] [<c1093701>] SyS_setpriority+0x111/0x620
[ 11.130014] [<c109363c>] ? SyS_setpriority+0x4c/0x620
[ 11.130014] [<c1bbf880>] syscall_call+0x7/0xb
Signed-off-by: John Stultz <john.stultz@linaro.org>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
Cc: James Morris <jmorris@namei.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Patrick McHardy <kaber@trash.net>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: netdev@vger.kernel.org
Link: http://lkml.kernel.org/r/1381186321-4906-5-git-send-email-john.stultz@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
After adding lockdep support to seqlock/seqcount structures,
I started seeing the following warning:
[ 1.070907] ======================================================
[ 1.072015] [ INFO: SOFTIRQ-safe -> SOFTIRQ-unsafe lock order detected ]
[ 1.073181] 3.11.0+ #67 Not tainted
[ 1.073801] ------------------------------------------------------
[ 1.074882] kworker/u4:2/708 [HC0[0]:SC0[0]:HE0:SE1] is trying to acquire:
[ 1.076088] (&p->mems_allowed_seq){+.+...}, at: [<ffffffff81187d7f>] new_slab+0x5f/0x280
[ 1.077572]
[ 1.077572] and this task is already holding:
[ 1.078593] (&(&q->__queue_lock)->rlock){..-...}, at: [<ffffffff81339f03>] blk_execute_rq_nowait+0x53/0xf0
[ 1.080042] which would create a new lock dependency:
[ 1.080042] (&(&q->__queue_lock)->rlock){..-...} -> (&p->mems_allowed_seq){+.+...}
[ 1.080042]
[ 1.080042] but this new dependency connects a SOFTIRQ-irq-safe lock:
[ 1.080042] (&(&q->__queue_lock)->rlock){..-...}
[ 1.080042] ... which became SOFTIRQ-irq-safe at:
[ 1.080042] [<ffffffff810ec179>] __lock_acquire+0x5b9/0x1db0
[ 1.080042] [<ffffffff810edfe5>] lock_acquire+0x95/0x130
[ 1.080042] [<ffffffff818968a1>] _raw_spin_lock+0x41/0x80
[ 1.080042] [<ffffffff81560c9e>] scsi_device_unbusy+0x7e/0xd0
[ 1.080042] [<ffffffff8155a612>] scsi_finish_command+0x32/0xf0
[ 1.080042] [<ffffffff81560e91>] scsi_softirq_done+0xa1/0x130
[ 1.080042] [<ffffffff8133b0f3>] blk_done_softirq+0x73/0x90
[ 1.080042] [<ffffffff81095dc0>] __do_softirq+0x110/0x2f0
[ 1.080042] [<ffffffff81095fcd>] run_ksoftirqd+0x2d/0x60
[ 1.080042] [<ffffffff810bc506>] smpboot_thread_fn+0x156/0x1e0
[ 1.080042] [<ffffffff810b3916>] kthread+0xd6/0xe0
[ 1.080042] [<ffffffff818980ac>] ret_from_fork+0x7c/0xb0
[ 1.080042]
[ 1.080042] to a SOFTIRQ-irq-unsafe lock:
[ 1.080042] (&p->mems_allowed_seq){+.+...}
[ 1.080042] ... which became SOFTIRQ-irq-unsafe at:
[ 1.080042] ... [<ffffffff810ec1d3>] __lock_acquire+0x613/0x1db0
[ 1.080042] [<ffffffff810edfe5>] lock_acquire+0x95/0x130
[ 1.080042] [<ffffffff810b3df2>] kthreadd+0x82/0x180
[ 1.080042] [<ffffffff818980ac>] ret_from_fork+0x7c/0xb0
[ 1.080042]
[ 1.080042] other info that might help us debug this:
[ 1.080042]
[ 1.080042] Possible interrupt unsafe locking scenario:
[ 1.080042]
[ 1.080042] CPU0 CPU1
[ 1.080042] ---- ----
[ 1.080042] lock(&p->mems_allowed_seq);
[ 1.080042] local_irq_disable();
[ 1.080042] lock(&(&q->__queue_lock)->rlock);
[ 1.080042] lock(&p->mems_allowed_seq);
[ 1.080042] <Interrupt>
[ 1.080042] lock(&(&q->__queue_lock)->rlock);
[ 1.080042]
[ 1.080042] *** DEADLOCK ***
The issue stems from the kthreadd() function calling set_mems_allowed
with irqs enabled. While its possibly unlikely for the actual deadlock
to trigger, a fix is fairly simple: disable irqs before taking the
mems_allowed_seq lock.
Signed-off-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Li Zefan <lizefan@huawei.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: netdev@vger.kernel.org
Link: http://lkml.kernel.org/r/1381186321-4906-4-git-send-email-john.stultz@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Currently seqlocks and seqcounts don't support lockdep.
After running across a seqcount related deadlock in the timekeeping
code, I used a less-refined and more focused variant of this patch
to narrow down the cause of the issue.
This is a first-pass attempt to properly enable lockdep functionality
on seqlocks and seqcounts.
Since seqcounts are used in the vdso gettimeofday code, I've provided
non-lockdep accessors for those needs.
I've also handled one case where there were nested seqlock writers
and there may be more edge cases.
Comments and feedback would be appreciated!
Signed-off-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: netdev@vger.kernel.org
Link: http://lkml.kernel.org/r/1381186321-4906-3-git-send-email-john.stultz@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
In order to enable lockdep on seqcount/seqlock structures, we
must explicitly initialize any locks.
The u64_stats_sync structure, uses a seqcount, and thus we need
to introduce a u64_stats_init() function and use it to initialize
the structure.
This unfortunately adds a lot of fairly trivial initialization code
to a number of drivers. But the benefit of ensuring correctness makes
this worth while.
Because these changes are required for lockdep to be enabled, and the
changes are quite trivial, I've not yet split this patch out into 30-some
separate patches, as I figured it would be better to get the various
maintainers thoughts on how to best merge this change along with
the seqcount lockdep enablement.
Feedback would be appreciated!
Signed-off-by: John Stultz <john.stultz@linaro.org>
Acked-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
Cc: James Morris <jmorris@namei.org>
Cc: Jesse Gross <jesse@nicira.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Mirko Lindner <mlindner@marvell.com>
Cc: Patrick McHardy <kaber@trash.net>
Cc: Roger Luethi <rl@hellgate.ch>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Simon Horman <horms@verge.net.au>
Cc: Stephen Hemminger <stephen@networkplumber.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Cc: Wensong Zhang <wensong@linux-vs.org>
Cc: netdev@vger.kernel.org
Link: http://lkml.kernel.org/r/1381186321-4906-2-git-send-email-john.stultz@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
nr_busy_cpus parameter is used by nohz_kick_needed() to find out the
number of busy cpus in a sched domain which has SD_SHARE_PKG_RESOURCES
flag set. Therefore instead of updating nr_busy_cpus at every level
of sched domain, since it is irrelevant, we can update this parameter
only at the parent domain of the sd which has this flag set. Introduce
a per-cpu parameter sd_busy which represents this parent domain.
In nohz_kick_needed() we directly query the nr_busy_cpus parameter
associated with the groups of sd_busy.
By associating sd_busy with the highest domain which has
SD_SHARE_PKG_RESOURCES flag set, we cover all lower level domains
which could have this flag set and trigger nohz_idle_balancing if any
of the levels have more than one busy cpu.
sd_busy is irrelevant for asymmetric load balancing. However sd_asym
has been introduced to represent the highest sched domain which has
SD_ASYM_PACKING flag set so that it can be queried directly when
required.
While we are at it, we might as well change the nohz_idle parameter to
be updated at the sd_busy domain level alone and not the base domain
level of a CPU. This will unify the concept of busy cpus at just one
level of sched domain where it is currently used.
Signed-off-by: Preeti U Murthy<preeti@linux.vnet.ibm.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: svaidy@linux.vnet.ibm.com
Cc: vincent.guittot@linaro.org
Cc: bitbucket@online.de
Cc: benh@kernel.crashing.org
Cc: anton@samba.org
Cc: Morten.Rasmussen@arm.com
Cc: pjt@google.com
Cc: peterz@infradead.org
Cc: mikey@neuling.org
Link: http://lkml.kernel.org/r/20131030031252.23426.4417.stgit@preeti.in.ibm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Asymmetric scheduling within a core is a scheduler loadbalancing
feature that is triggered when SD_ASYM_PACKING flag is set. The goal
for the load balancer is to move tasks to lower order idle SMT threads
within a core on a POWER7 system.
In nohz_kick_needed(), we intend to check if our sched domain (core)
is completely busy or we have idle cpu.
The following check for SD_ASYM_PACKING:
(cpumask_first_and(nohz.idle_cpus_mask, sched_domain_span(sd)) < cpu)
already covers the case of checking if the domain has an idle cpu,
because cpumask_first_and() will not yield any set bits if this domain
has no idle cpu.
Hence, nr_busy check against group weight can be removed.
Reported-by: Michael Neuling <michael.neuling@au1.ibm.com>
Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
Signed-off-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Tested-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: vincent.guittot@linaro.org
Cc: bitbucket@online.de
Cc: benh@kernel.crashing.org
Cc: anton@samba.org
Cc: Morten.Rasmussen@arm.com
Cc: pjt@google.com
Link: http://lkml.kernel.org/r/20131030031242.23426.13019.stgit@preeti.in.ibm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Unlike other uncore boxes, IRP boxes live in PCI buses with no UBOX
device. For PCI bus without UBOX device, we find the next bus that
has UBOX device and use its 'bus to socket' mapping.
Besides the counter/control registers in IRP boxes are not properly
aligned.
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: eranian@google.com
Cc: "Yan Zheng" <zheng.z.yan@intel.com>
Link: http://lkml.kernel.org/r/1383197815-17706-2-git-send-email-zheng.z.yan@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
The encoding for filter registers of IvyBridge-EP uncore QPI boxes is
completely the same as SandyBridge-EP.
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: eranian@google.com
Cc: "Yan Zheng" <zheng.z.yan@intel.com>
Link: http://lkml.kernel.org/r/1383197815-17706-1-git-send-email-zheng.z.yan@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|