Age | Commit message (Collapse) | Author |
|
With the IRQ stack changes integrated, the XRX200 devices started
emitting a constant stream of kernel messages like this:
[ 565.415310] Spurious IRQ: CAUSE=0x1100c300
This is caused by IP0 getting handled by plat_irq_dispatch() rather than
its vectored interrupt handler, which is fixed by commit de856416e714
("MIPS: IRQ Stack: Fix erroneous jal to plat_irq_dispatch").
Fix plat_irq_dispatch() to handle non-vectored IPI interrupts correctly
by setting up IP2-6 as proper chained IRQ handlers and calling do_IRQ
for all MIPS CPU interrupts.
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Acked-by: John Crispin <john@phrozen.org>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/15077/
[james.hogan@imgtec.com: tweaked commit message]
Signed-off-by: James Hogan <james.hogan@imgtec.com>
|
|
Since commit 4bcc595ccd80 ("printk: reinstate KERN_CONT for printing
continuation lines") the output of counter synchornisation has been
split across lines:
[ 0.665181] Synchronize counters for CPU 1:
[ 0.678578] done.
Fix this by using pr_cont, and replace printk with pr_info.
Signed-off-by: Matt Redfearn <matt.redfearn@imgtec.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/15195/
Signed-off-by: James Hogan <james.hogan@imgtec.com>
|
|
Commit dda45f701c9d ("MIPS: Switch to the irq_stack in interrupts")
changed both the normal and vectored interrupt handlers. Unfortunately
the vectored version, "except_vec_vi_handler", was incorrectly modified
to unconditionally jal to plat_irq_dispatch, rather than doing a jalr to
the vectored handler that has been set up. This is ok for many platforms
which set the vectored handler to plat_irq_dispatch anyway, but will
cause problems with platforms that use other handlers.
Fixes: dda45f701c9d ("MIPS: Switch to the irq_stack in interrupts")
Signed-off-by: Matt Redfearn <matt.redfearn@imgtec.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Paul Burton <paul.burton@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/15110/
Signed-off-by: James Hogan <james.hogan@imgtec.com>
|
|
The postlink Makefile must include include/config/auto.conf to get the
kernel configuration variables. But in a clean kernel directory this
file does not exist, causing make to bail with the error:
arch/mips/Makefile.postlink:10: include/config/auto.conf: No such file or directory
make[1]: *** No rule to make target 'include/config/auto.conf'. Stop.
Makefile:1290: recipe for target 'vmlinuxclean' failed
Fix this by using "-include" to not cause a Make error when the file
does not exist.
Fixes: 44079d3509ae ("MIPS: Use Makefile.postlink to insert relocations into vmlinux")
Signed-off-by: Matt Redfearn <matt.redfearn@imgtec.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/15136/
Signed-off-by: James Hogan <james.hogan@imgtec.com>
|
|
The recently added MIPS cacheinfo support used a macro populate_cache()
to populate the cacheinfo structures depending on which caches are
present. However the macro contains multiple statements without
enclosing them in a do {} while (0) loop, so the L2 and L3 cache
conditionals in populate_cache_leaves() only conditionalised the first
statement in the macro.
This overflows the buffer allocated by detect_cache_attributes(),
resulting in boot failures under QEMU where neither the L2 or L2 caches
are present.
Enclose the macro statements in a do {} while (0) block to keep the
whole macro inside the conditionals.
Fixes: ef462f3b64e9 ("MIPS: Add cacheinfo support")
Reported-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Justin Chen <justin.chen@broadcom.com>
Cc: Florian Fainelli <f.fainelli@gmail.com>
Cc: linux-mips@linux-mips.org
Cc: bcm-kernel-feedback-list@broadcom.com
Patchwork: https://patchwork.linux-mips.org/patch/15276/
|
|
When building for microMIPS we need to ensure that the assembler always
knows that there is code at the target of a branch or jump. Commit
7170bdc77755 ("MIPS: Add return errors to protected cache ops")
introduced a fixup path to protected_cache(e)_op() which does not meet
this requirement. The fixup path jumps to the "2" label but the .section
pseudo-op immediately following it causes the label to be marked as
data. Linking then fails with:
mips-img-linux-gnu-ld: arch/mips/mm/c-r4k.o: .fixup+0x0: Unsupported
jump between ISA modes; consider recompiling with interlinking
enabled.
Fix this by declaring that "2" labels code using the .insn directive.
Fixes: 7170bdc77755 ("MIPS: Add return errors to protected cache ops")
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Reviewed-by: Maciej W. Rozycki <macro@imgtec.com>
Cc: linux-mips@linux-mips.org
Cc: Ralf Baechle <ralf@linux-mips.org>
Patchwork: https://patchwork.linux-mips.org/patch/15274/
Signed-off-by: James Hogan <james.hogan@imgtec.com>
|
|
MIPS dependencies for KVM
Miscellaneous MIPS architecture changes depended on by the MIPS KVM
changes in the KVM tree.
- Move pgd_alloc() out of header.
- Exports so KVM can access page table management and TLBEX functions.
- Add return errors to protected cache ops.
|
|
The code sets the default return code to -ENOSYS but then overrides this
to -EINVAL in the switch() statement's default case, which is clearly
silly.
This patch removes the override and sets the default return code to
-ENOTTY, which is the conventional return for an unimplemented ioctl.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
|
|
Commit a92e7c3d82a1 ("spi: s3c64xx: consider the case when the CS
line is not connected") introduced an inconsistency between the
binding, where the disconnected CS line was marked as
'no-cs-readback', and the driver.
The driver is erroneously checking for that attribute with
property name of 'broken-cs'.
Check for 'no-cs-readback' in the driver as well.
Fixes: a92e7c3d82a1 ("spi: s3c64xx: consider the case when the CS line is not connected")
Signed-off-by: Andi Shyti <andi.shyti@samsung.com>
Reviewed-by: Krzysztof Kozlowski <krzk@kernel.org>
Signed-off-by: Mark Brown <broonie@kernel.org>
Cc: stable@vger.kernel.org
|
|
The tree used for staging pstore changes has moved to my repo. The -next
tree already pulls from here, so update MAINTAINERS to reflect reality.
While at it, add some more pstore-related files to track.
Signed-off-by: Kees Cook <keescook@chromium.org>
|
|
Instead of needing additional checks in callers for unallocated przs,
perform the check in the walker, which gives us a more universal way to
handle the situation.
Signed-off-by: Kees Cook <keescook@chromium.org>
|
|
The ram backend wasn't always initializing its spinlock correctly. Since
it was coming from kzalloc memory, though, it was harmless on
architectures that initialize unlocked spinlocks to 0 (at least x86 and
ARM). This also fixes a possibly ignored flag setting too.
When running under CONFIG_DEBUG_SPINLOCK, the following Oops was visible:
[ 0.760836] persistent_ram: found existing buffer, size 29988, start 29988
[ 0.765112] persistent_ram: found existing buffer, size 30105, start 30105
[ 0.769435] persistent_ram: found existing buffer, size 118542, start 118542
[ 0.785960] persistent_ram: found existing buffer, size 0, start 0
[ 0.786098] persistent_ram: found existing buffer, size 0, start 0
[ 0.786131] pstore: using zlib compression
[ 0.790716] BUG: spinlock bad magic on CPU#0, swapper/0/1
[ 0.790729] lock: 0xffffffc0d1ca9bb0, .magic: 00000000, .owner: <none>/-1, .owner_cpu: 0
[ 0.790742] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.10.0-rc2+ #913
[ 0.790747] Hardware name: Google Kevin (DT)
[ 0.790750] Call trace:
[ 0.790768] [<ffffff900808ae88>] dump_backtrace+0x0/0x2bc
[ 0.790780] [<ffffff900808b164>] show_stack+0x20/0x28
[ 0.790794] [<ffffff9008460ee0>] dump_stack+0xa4/0xcc
[ 0.790809] [<ffffff9008113cfc>] spin_dump+0xe0/0xf0
[ 0.790821] [<ffffff9008113d3c>] spin_bug+0x30/0x3c
[ 0.790834] [<ffffff9008113e28>] do_raw_spin_lock+0x50/0x1b8
[ 0.790846] [<ffffff9008a2d2ec>] _raw_spin_lock_irqsave+0x54/0x6c
[ 0.790862] [<ffffff90083ac3b4>] buffer_size_add+0x48/0xcc
[ 0.790875] [<ffffff90083acb34>] persistent_ram_write+0x60/0x11c
[ 0.790888] [<ffffff90083aab1c>] ramoops_pstore_write_buf+0xd4/0x2a4
[ 0.790900] [<ffffff90083a9d3c>] pstore_console_write+0xf0/0x134
[ 0.790912] [<ffffff900811c304>] console_unlock+0x48c/0x5e8
[ 0.790923] [<ffffff900811da18>] register_console+0x3b0/0x4d4
[ 0.790935] [<ffffff90083aa7d0>] pstore_register+0x1a8/0x234
[ 0.790947] [<ffffff90083ac250>] ramoops_probe+0x6b8/0x7d4
[ 0.790961] [<ffffff90085ca548>] platform_drv_probe+0x7c/0xd0
[ 0.790972] [<ffffff90085c76ac>] driver_probe_device+0x1b4/0x3bc
[ 0.790982] [<ffffff90085c7ac8>] __device_attach_driver+0xc8/0xf4
[ 0.790996] [<ffffff90085c4bfc>] bus_for_each_drv+0xb4/0xe4
[ 0.791006] [<ffffff90085c7414>] __device_attach+0xd0/0x158
[ 0.791016] [<ffffff90085c7b18>] device_initial_probe+0x24/0x30
[ 0.791026] [<ffffff90085c648c>] bus_probe_device+0x50/0xe4
[ 0.791038] [<ffffff90085c35b8>] device_add+0x3a4/0x76c
[ 0.791051] [<ffffff90087d0e84>] of_device_add+0x74/0x84
[ 0.791062] [<ffffff90087d19b8>] of_platform_device_create_pdata+0xc0/0x100
[ 0.791073] [<ffffff90087d1a2c>] of_platform_device_create+0x34/0x40
[ 0.791086] [<ffffff900903c910>] of_platform_default_populate_init+0x58/0x78
[ 0.791097] [<ffffff90080831fc>] do_one_initcall+0x88/0x160
[ 0.791109] [<ffffff90090010ac>] kernel_init_freeable+0x264/0x31c
[ 0.791123] [<ffffff9008a25bd0>] kernel_init+0x18/0x11c
[ 0.791133] [<ffffff9008082ec0>] ret_from_fork+0x10/0x50
[ 0.793717] console [pstore-1] enabled
[ 0.797845] pstore: Registered ramoops as persistent store backend
[ 0.804647] ramoops: attached 0x100000@0xf7edc000, ecc: 0/0
Fixes: 663deb47880f ("pstore: Allow prz to control need for locking")
Fixes: 109704492ef6 ("pstore: Make spinlock per zone instead of global")
Reported-by: Brian Norris <briannorris@chromium.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
|
|
Add a new compatible for sun4i-i2s driver to handle some
SoCs that have a reset line that must be asserted/deasserted.
This new compatible, "allwinner,sun6i-a31-i2s", requires the
property "resets" which should be a phandle to the reset line.
Except these differences, the compatible is identical to previous one
which will not handle a reset line.
Signed-off-by: Mylène Josserand <mylene.josserand@free-electrons.com>
Acked-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
GCC complains about unused variable 'vma' in mark_screen_rdonly() if THP is
disabled:
arch/x86/kernel/vm86_32.c: In function ‘mark_screen_rdonly’:
arch/x86/kernel/vm86_32.c:180:26: warning: unused variable ‘vma’
[-Wunused-variable]
struct vm_area_struct *vma = find_vma(mm, 0xA0000);
That's silly. pmd_trans_huge() resolves to 0 when THP is disabled, so the
whole block should be eliminated.
Moving the variable declaration outside the if() block shuts GCC up.
Reported-by: Jérémy Lefaure <jeremy.lefaure@lse.epita.fr>
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Tested-by: Borislav Petkov <bp@suse.de>
Cc: Carlos O'Donell <carlos@redhat.com>
Link: http://lkml.kernel.org/r/20170213125228.63645-1-kirill.shutemov@linux.intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
When we check for additional DT properties in the current node we
use the device_node passed in with the configuration data, this
will not point to the correct DT node, use the one passed in
for this purpose.
Fixes: d2a2e729a666 ("regulator: tps65086: Add regulator driver for the TPS65086 PMIC")
Reported-by: Steven Kipisz <s-kipisz2@ti.com>
Signed-off-by: Andrew F. Davis <afd@ti.com>
Tested-by: Steven Kipisz <s-kipisz2@ti.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
The three load switches are called SWA1, SWB1, and SWB2. The
node names describing properties for these are expected to be
the same, but due to a typo they are not. Fix this here.
Fixes: d2a2e729a666 ("regulator: tps65086: Add regulator driver for the TPS65086 PMIC")
Reported-by: Steven Kipisz <s-kipisz2@ti.com>
Signed-off-by: Andrew F. Davis <afd@ti.com>
Tested-by: Steven Kipisz <s-kipisz2@ti.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
The FSL SAI can support up to 32 channels using TDM. Report that value so
they can actually be used.
Tested using 8 channels.
Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
Acked-by: Nicolin Chen <nicoleotsuka@gmail.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Add a bit needed during initialization into the driver, where it is supposed
to be. Currently, this is happening in the VCE firmware, and although
functional, this is the correct place to perform this initialization.
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alan Harrison <Alan.Harrison@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
The "delta-abs" compute method will show most changed entries on top.
So users can easily see how much effect between the data. Note that it
also changes the default of -o option to 1 in order to apply the compute
method. To see original-style (sorted by baseline) use -o 0 option.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20170210161856.18422-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The diff.compute config variable is to set the default compute method of
perf diff command (-c option). Possible values 'delta' (default),
'delta-abs', 'ratio' and 'wdiff'.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Taeung Song <treeze.taeung@gmail.com>
Link: http://lkml.kernel.org/r/20170210073614.24584-4-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
In many cases, I need to look at differences between two data so I often
used the -o option to sort the result base on the difference first.
It'd be nice to have a config option to set it by default.
The diff.order config option is to set the default value of -o/--order
option.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Taeung Song <treeze.taeung@gmail.com>
Link: http://lkml.kernel.org/r/20170210073614.24584-3-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The 'delta-abs' compute method is same as 'delta' but shows entries with
bigger absolute delta first instead of sorting numerically. This is
only useful together with -o option.
Below is default output (-c delta):
$ perf diff -o 1 -c delta | grep -v ^# | head
42.22% +4.97% [kernel.kallsyms] [k] cfb_imageblit
0.62% +1.23% [kernel.kallsyms] [k] mutex_lock
+1.15% [kernel.kallsyms] [k] copy_user_generic_string
2.40% +0.95% [kernel.kallsyms] [k] bit_putcs
0.31% +0.79% [kernel.kallsyms] [k] link_path_walk
+0.64% [kernel.kallsyms] [k] kmem_cache_alloc
0.00% +0.57% [kernel.kallsyms] [k] __rcu_read_unlock
+0.45% [kernel.kallsyms] [k] alloc_set_pte
0.16% +0.45% [kernel.kallsyms] [k] menu_select
+0.41% ld-2.24.so [.] do_lookup_x
Now with 'delta-abs' it shows entries have bigger delta value either
positive or negative.
$ perf diff -o 1 -c delta-abs | grep -v ^# | head
42.22% +4.97% [kernel.kallsyms] [k] cfb_imageblit
12.72% -3.01% [kernel.kallsyms] [k] intel_idle
9.72% -1.31% [unknown] [.] 0x0000000000411343
0.62% +1.23% [kernel.kallsyms] [k] mutex_lock
2.40% +0.95% [kernel.kallsyms] [k] bit_putcs
0.31% +0.79% [kernel.kallsyms] [k] link_path_walk
1.35% -0.71% [kernel.kallsyms] [k] smp_call_function_single
0.00% +0.57% [kernel.kallsyms] [k] __rcu_read_unlock
0.16% +0.45% [kernel.kallsyms] [k] menu_select
0.72% -0.44% [kernel.kallsyms] [k] lookup_fast
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20170210073614.24584-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
To match the kernel headers structure, setting up things that are
specific to gcc or to some specific version of gcc.
It gets included by linux/compiler.h when gcc is the compiler being
used.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Joe Perches <joe@perches.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-fabcqfq4asodq9t158hcs8t3@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Using new link mode indices instead deprecated SUPPORTED_/ADVERTISED_
macro.
Added indication for 2500 and 5000mbit link modes (AQtion adapter already
supports these speeds).
Signed-off-by: Pavel Belous <pavel.belous@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The ghostprotocols.net domain is not working, remove it from CREDITS and
MAINTAINERS, and change the status to "Odd fixes", and since I haven't
been maintaining those, remove my address from there.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
stripes which are being reclaimed are still accounted into cached
stripes. The reclaim takes time. r5c_do_reclaim isn't aware of the
stripes and does unnecessary stripe reclaim. In practice, I saw one
stripe is reclaimed one time. This will cause bad IO pattern. Fixing
this by excluding the reclaing stripes in the check.
Cc: Song Liu <songliubraving@fb.com>
Signed-off-by: Shaohua Li <shli@fb.com>
|
|
When log space is tight, we try to reclaim stripes from log head. There
are stripes which can't be reclaimed right now if some conditions are
met. We skip such stripes but accidentally count them, which might cause
no stripes are claimed. Fixing this by only counting valid stripes.
Cc: Song Liu <songliubraving@fb.com>
Signed-off-by: Shaohua Li <shli@fb.com>
|
|
I'm starting document of the raid5-cache feature. Please note this is a
kernel doc instead of a mdadm manual, so I don't add the details about
how to use the feature in mdadm side.
Cc: NeilBrown <neilb@suse.com>
Reviewed-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Shaohua Li <shli@fb.com>
|
|
Signed-off-by: Shaohua Li <shli@fb.com>
|
|
Commit: cbd199837750 ("md: Fix unfortunate interaction with evms")
change mddev_put() so that it would not destroy an md device while
->ctime was non-zero.
Unfortunately, we didn't make sure to clear ->ctime when unloading
the module, so it is possible for an md device to remain after
module unload. An attempt to open such a device will trigger
an invalid memory reference in:
get_gendisk -> kobj_lookup -> exact_lock -> get_disk
when tring to access disk->fops, which was in the module that has
been removed.
So ensure we clear ->ctime in md_exit(), and explain how that is useful,
as it isn't immediately obvious when looking at the code.
Fixes: cbd199837750 ("md: Fix unfortunate interaction with evms")
Tested-by: Guoqing Jiang <gqjiang@suse.com>
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Shaohua Li <shli@fb.com>
|
|
It is important to be able to flush all stripes in raid5-cache.
Therefore, we need reserve some space on the journal device for
these flushes. If flush operation includes pending writes to the
stripe, we need to reserve (conf->raid_disk + 1) pages per stripe
for the flush out. This reduces the efficiency of journal space.
If we exclude these pending writes from flush operation, we only
need (conf->max_degraded + 1) pages per stripe.
With this patch, when log space is critical (R5C_LOG_CRITICAL=1),
pending writes will be excluded from stripe flush out. Therefore,
we can reduce reserved space for flush out and thus improve journal
device efficiency.
Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Shaohua Li <shli@fb.com>
|
|
Chunk aligned read significantly reduces CPU usage of raid456.
However, it is not safe to fully bypass the write back cache.
This patch enables chunk aligned read with write back cache.
For chunk aligned read, we track stripes in write back cache at
a bigger granularity, "big_stripe". Each chunk may contain more
than one stripe (for example, a 256kB chunk contains 64 4kB-page,
so this chunk contain 64 stripes). For chunk_aligned_read, these
stripes are grouped into one big_stripe, so we only need one lookup
for the whole chunk.
For each big_stripe, struct big_stripe_info tracks how many stripes
of this big_stripe are in the write back cache. We count how many
stripes of this big_stripe are in the write back cache. These
counters are tracked in a radix tree (big_stripe_tree).
r5c_tree_index() is used to calculate keys for the radix tree.
chunk_aligned_read() calls r5c_big_stripe_cached() to look up
big_stripe of each chunk in the tree. If this big_stripe is in the
tree, chunk_aligned_read() aborts. This look up is protected by
rcu_read_lock().
It is necessary to remember whether a stripe is counted in
big_stripe_tree. Instead of adding new flag, we reuses existing flags:
STRIPE_R5C_PARTIAL_STRIPE and STRIPE_R5C_FULL_STRIPE. If either of these
two flags are set, the stripe is counted in big_stripe_tree. This
requires moving set_bit(STRIPE_R5C_PARTIAL_STRIPE) to
r5c_try_caching_write(); and moving clear_bit of
STRIPE_R5C_PARTIAL_STRIPE and STRIPE_R5C_FULL_STRIPE to
r5c_finish_stripe_write_out().
Signed-off-by: Song Liu <songliubraving@fb.com>
Reviewed-by: NeilBrown <neilb@suse.com>
Signed-off-by: Shaohua Li <shli@fb.com>
|
|
It will be used in drivers/md/raid5-cache.c
Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Shaohua Li <shli@fb.com>
|
|
We made raid5 stripe handling multi-thread before. It works well for
SSD. But for harddisk, the multi-threading creates more disk seek, so
not always improve performance. For several hard disks based raid5,
multi-threading is required as raid5d becames a bottleneck especially
for sequential write.
To overcome the disk seek issue, we only dispatch IO from raid5d if the
array is harddisk based. Other threads can still handle stripes, but
can't dispatch IO.
Idealy, we should control IO dispatching order according to IO position
interrnally. Right now we still depend on block layer, which isn't very
efficient sometimes though.
My setup has 9 harddisks, each disk can do around 180M/s sequential
write. So in theory, the raid5 can do 180 * 8 = 1440M/s sequential
write. The test machine uses an ATOM CPU. I measure sequential write
with large iodepth bandwidth to raid array:
without patch: ~600M/s
without patch and group_thread_cnt=4: 750M/s
with patch and group_thread_cnt=4: 950M/s
with patch, group_thread_cnt=4, skip_copy=1: 1150M/s
We are pretty close to the maximum bandwidth in the large iodepth
iodepth case. The performance gap of small iodepth sequential write
between software raid and theory value is still very big though, because
we don't have an efficient pipeline.
Cc: NeilBrown <neilb@suse.com>
Cc: Song Liu <songliubraving@fb.com>
Signed-off-by: Shaohua Li <shli@fb.com>
|
|
Recently I receive a bug report that on Linux v3.0 based kerenl, hot add
disk to a md linear device causes kernel crash at linear_congested(). From
the crash image analysis, I find in linear_congested(), mddev->raid_disks
contains value N, but conf->disks[] only has N-1 pointers available. Then
a NULL pointer deference crashes the kernel.
There is a race between linear_add() and linear_congested(), RCU stuffs
used in these two functions cannot avoid the race. Since Linuv v4.0
RCU code is replaced by introducing mddev_suspend(). After checking the
upstream code, it seems linear_congested() is not called in
generic_make_request() code patch, so mddev_suspend() cannot provent it
from being called. The possible race still exists.
Here I explain how the race still exists in current code. For a machine
has many CPUs, on one CPU, linear_add() is called to add a hard disk to a
md linear device; at the same time on other CPU, linear_congested() is
called to detect whether this md linear device is congested before issuing
an I/O request onto it.
Now I use a possible code execution time sequence to demo how the possible
race happens,
seq linear_add() linear_congested()
0 conf=mddev->private
1 oldconf=mddev->private
2 mddev->raid_disks++
3 for (i=0; i<mddev->raid_disks;i++)
4 bdev_get_queue(conf->disks[i].rdev->bdev)
5 mddev->private=newconf
In linear_add() mddev->raid_disks is increased in time seq 2, and on
another CPU in linear_congested() the for-loop iterates conf->disks[i] by
the increased mddev->raid_disks in time seq 3,4. But conf with one more
element (which is a pointer to struct dev_info type) to conf->disks[] is
not updated yet, accessing its structure member in time seq 4 will cause a
NULL pointer deference fault.
To fix this race, there are 2 parts of modification in the patch,
1) Add 'int raid_disks' in struct linear_conf, as a copy of
mddev->raid_disks. It is initialized in linear_conf(), always being
consistent with pointers number of 'struct dev_info disks[]'. When
iterating conf->disks[] in linear_congested(), use conf->raid_disks to
replace mddev->raid_disks in the for-loop, then NULL pointer deference
will not happen again.
2) RCU stuffs are back again, and use kfree_rcu() in linear_add() to
free oldconf memory. Because oldconf may be referenced as mddev->private
in linear_congested(), kfree_rcu() makes sure that its memory will not
be released until no one uses it any more.
Also some code comments are added in this patch, to make this modification
to be easier understandable.
This patch can be applied for kernels since v4.0 after commit:
3be260cc18f8 ("md/linear: remove rcu protections in favour of
suspend/resume"). But this bug is reported on Linux v3.0 based kernel, for
people who maintain kernels before Linux v4.0, they need to do some back
back port to this patch.
Changelog:
- V3: add 'int raid_disks' in struct linear_conf, and use kfree_rcu() to
replace rcu_call() in linear_add().
- v2: add RCU stuffs by suggestion from Shaohua and Neil.
- v1: initial effort.
Signed-off-by: Coly Li <colyli@suse.de>
Cc: Shaohua Li <shli@fb.com>
Cc: Neil Brown <neilb@suse.com>
Cc: stable@vger.kernel.org
Signed-off-by: Shaohua Li <shli@fb.com>
|
|
Inside set_status, transfer need to setup again, so
we have to drain IO before the transition, otherwise
oops may be triggered like the following:
divide error: 0000 [#1] SMP KASAN
CPU: 0 PID: 2935 Comm: loop7 Not tainted 4.10.0-rc7+ #213
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs
01/01/2011
task: ffff88006ba1e840 task.stack: ffff880067338000
RIP: 0010:transfer_xor+0x1d1/0x440 drivers/block/loop.c:110
RSP: 0018:ffff88006733f108 EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff8800688d7000 RCX: 0000000000000059
RDX: 0000000000000000 RSI: 1ffff1000d743f43 RDI: ffff880068891c08
RBP: ffff88006733f160 R08: ffff8800688d7001 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: ffff8800688d7000
R13: ffff880067b7d000 R14: dffffc0000000000 R15: 0000000000000000
FS: 0000000000000000(0000) GS:ffff88006d000000(0000)
knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000006c17e0 CR3: 0000000066e3b000 CR4: 00000000001406f0
Call Trace:
lo_do_transfer drivers/block/loop.c:251 [inline]
lo_read_transfer drivers/block/loop.c:392 [inline]
do_req_filebacked drivers/block/loop.c:541 [inline]
loop_handle_cmd drivers/block/loop.c:1677 [inline]
loop_queue_work+0xda0/0x49b0 drivers/block/loop.c:1689
kthread_worker_fn+0x4c3/0xa30 kernel/kthread.c:630
kthread+0x326/0x3f0 kernel/kthread.c:227
ret_from_fork+0x31/0x40 arch/x86/entry/entry_64.S:430
Code: 03 83 e2 07 41 29 df 42 0f b6 04 30 4d 8d 44 24 01 38 d0 7f 08
84 c0 0f 85 62 02 00 00 44 89 f8 41 0f b6 48 ff 25 ff 01 00 00 99 <f7>
7d c8 48 63 d2 48 03 55 d0 48 89 d0 48 89 d7 48 c1 e8 03 83
RIP: transfer_xor+0x1d1/0x440 drivers/block/loop.c:110 RSP:
ffff88006733f108
---[ end trace 0166f7bd3b0c0933 ]---
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Cc: stable@vger.kernel.org
Signed-off-by: Ming Lei <tom.leiming@gmail.com>
Tested-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
Poll messages that are used to allocate a logical address should
use the same initiator as the destination. Instead, it expected that
the initiator was 0xf which is not according to the standard.
This also had consequences for the message checks in cec_transmit_msg_fh
that incorrectly rejected poll messages with the same initiator and
destination.
Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
|
|
This reverts 'commit 7e0739cd9c40 ("[media] videodev2.h: fix
sYCC/AdobeYCC default quantization range").
The problem is that many drivers can convert R'G'B' content (often
from sensors) to Y'CbCr, but they all produce limited range Y'CbCr.
To stay backwards compatible the default quantization range for
sRGB and AdobeRGB Y'CbCr encoding should be limited range, not full
range, even though the corresponding standards specify full range.
Update the V4L2_MAP_QUANTIZATION_DEFAULT define accordingly and
also update the documentation.
Fixes: 7e0739cd9c40 ("[media] videodev2.h: fix sYCC/AdobeYCC default quantization range")
Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Cc: <stable@vger.kernel.org> # for v4.9 and up
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
|
|
The PCI BAR0 and BAR1 for the PCI-IDIO-16 hold information for the PLX
9052 bridge chip on the device. The PCI BAR2 holds the necessary base
address for I/O control of the PCI-IDIO-16. This patch corrects the PCI
BAR index mismatch for the PCI-IDIO-16 GPIO driver.
Fixes: 02e74fc0401a ("gpio: Add GPIO support for the ACCES PCI-IDIO-16")
Signed-off-by: William Breathitt Gray <vilhelm.gray@gmail.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
|
|
The UEVENT user mode helper is enabled before the initcalls are executed
and is available when the root filesystem has been mounted.
The user mode helper is triggered by device init calls and the executable
might use the futex syscall.
futex_init() is marked __initcall which maps to device_initcall, but there
is no guarantee that futex_init() is invoked _before_ the first device init
call which triggers the UEVENT user mode helper.
If the user mode helper uses the futex syscall before futex_init() then the
syscall crashes with a NULL pointer dereference because the futex subsystem
has not been initialized yet.
Move futex_init() to core_initcall so futexes are initialized before the
root filesystem is mounted and the usermode helper becomes available.
[ tglx: Rewrote changelog ]
Signed-off-by: Yang Yang <yang.yang29@zte.com.cn>
Cc: jiang.biao2@zte.com.cn
Cc: jiang.zhengxiong@zte.com.cn
Cc: zhong.weidong@zte.com.cn
Cc: deng.huali@zte.com.cn
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/1483085875-6130-1-git-send-email-yang.yang29@zte.com.cn
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
In case of error, the function devm_ioremap() returns NULL pointer not
ERR_PTR(). Fix by using devm_ioremap_resource instead of devm_ioremap.
Fixes: 8b1bd11c1f8f ("pinctrl: samsung: Add the support the multiple IORESOURCE_MEM for one pin-bank")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Reviewed-by: Krzysztof Kozlowski <krzk@kernel.org>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
|
|
The ACCES PCI-IDIO-16 has a PCI device ID code of 0x0DC8. It is
incorrect to use the PCI device ID code of the ACCES PCI-IIRO-8
(0x0F00). This patch fixes the said PCI device ID code mismatch.
Fixes: 02e74fc0401a ("gpio: Add GPIO support for the ACCES PCI-IDIO-16")
Signed-off-by: William Breathitt Gray <vilhelm.gray@gmail.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
|
|
We need to unlock before returning -EINVAL on this error path.
Fixes: 04cc058f0c52 ("pinctrl: intel: Add support for 1k additional pull-down")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
|
|
Andrew Lunn says:
====================
mv88e6xxx Watchdog support
The Marvell switches have an in built watchdog over some of the
internal state machine. The watchdog can be configured to raise an
interrupt on error. The problem the watchdog found is then logged to
the kernel log.
The older switches can automagically perform a software reset when the
watchdog triggers. This just resets the internal state machine, but
leaves the switch configuration unchanged.
The 6390 family of switches cannot both raise an interrupt and
automagically perform a software reset. So the interrupt handler has
to perform the switch reset, and then re-enable the watchdog
interrupts.
This has been tested using hacked together debugfs code which allows
the "force" bit to be set, so cause a watchdog interrupt.
v2: Remove g2_prefix
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Implement the ops needed to support the watchdog for the MV88E6390
family.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The switch contains a watchdog looking for issues with the internal
gubbins of the switch. Hook the interrupt the watchdog triggers and
log the value of the control register indicating why the watchdog
fired. The watchdog can only be cleared with a switch reset, which
will destroy the current configuration. Rather than doing this, just
disable the interrupt.
The mv88e6390 family has different watchdog registers. So use an ops
structure, so support for the mv88e6390 family can be added later.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The pcm_stream_info.running field is only set in the PCM trigger
callback but never referred, thus it can be safely removed.
Also, properly cover the spinlock in both the trigger START and STOP
to protect had_enable_audio() calls.
Signed-off-by: Takashi Iwai <tiwai@suse.de>
|
|
Sort the device ids by vendor id.
Signed-off-by: Johan Hovold <johan@kernel.org>
|
|
Currently the driver handles some reset procedure at the trigger STOP
and the underrun functions, where both are executed in the interrupt
context. Especially the underrun function has a sync-loop to clear
the UNDERRUN status bit, and this is supposed to be one of plausible
causes of GPU hangup.
Since the job to be done in the interrupt handler should be minimum,
we move the reset function out of trigger and underrun, and push it
into the prepare (and hw_free) callbacks instead. Here a new flag,
need_reset, is introduced to indicate the requirement of the reset
procedure. This is for avoiding the multiple resets when PCM prepare
is called sequentially.
Also in the UNDERRUN bit-clear sync loop, take a longer pause to be in
the safer side. Taking a longer delay is no longer a problem now
because we're running in the normal context.
Signed-off-by: Takashi Iwai <tiwai@suse.de>
|