Age | Commit message | Author |
|
The cached value of the last selected channel prevents retries on the
next call, even on failure to update the selected channel. Fix that.
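A minimal sketch of the idea in plain C, with hypothetical names rather than
the actual i2c-mux code: only trust the cached channel when the previous
hardware update succeeded, so a failed update is retried on the next call.
struct chan_cache {
        int last_chan;                  /* -1 means "unknown, must reprogram" */
        int (*write_channel)(int chan); /* hypothetical hardware update hook */
};

static int select_channel(struct chan_cache *c, int chan)
{
        int ret;

        if (c->last_chan == chan)       /* cache is only trusted after a success */
                return 0;

        ret = c->write_channel(chan);
        c->last_chan = ret < 0 ? -1 : chan;     /* a failure invalidates the cache */
        return ret;
}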
Signed-off-by: Peter Rosin <peda@axentia.se>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Cc: stable@kernel.org
|
|
In case the high-level controller (HLC) is used, the status code is
reported at a different location. Check that location after HLC
write operations if the ready bit is not set, and return an appropriate
error code instead of always returning -EAGAIN.
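An illustrative sketch of the shape of the fix; the bit positions and status
codes below are placeholders, not the real ThunderX register layout:
#define SKETCH_HLC_READY        (1ULL << 63)    /* placeholder ready bit */
#define SKETCH_HLC_CODE(s)      ((s) & 0xff)    /* placeholder status field */

static int sketch_hlc_write_status(u64 hlc_status)
{
        if (hlc_status & SKETCH_HLC_READY)
                return 0;

        switch (SKETCH_HLC_CODE(hlc_status)) {
        case 0x10:                      /* placeholder: address NAK */
                return -ENXIO;
        case 0x20:                      /* placeholder: arbitration lost */
                return -EAGAIN;
        default:                        /* unknown failure */
                return -EIO;
        }
}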
Signed-off-by: Jan Glauber <jglauber@cavium.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
|
|
Due to a bug in the ThunderX I2C hardware sending STOP during
a recovery attempt could lock up the hardware. To work around
this problem do not send STOP at the beginning of the recovery
but use the override registers to bring the TWSI including
the high-level controller out of the bad state.
Signed-off-by: Dmitry Bazhenov <dmitry.bazhenov@auriga.com>
Signed-off-by: Jan Glauber <jglauber@cavium.com>
[Changed commit message]
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
|
|
The set-SCL recovery function unconditionally pulls the SCL line low.
Only pull the SCL line low according to the val parameter.
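A minimal sketch of the intended behaviour, with hypothetical callbacks
standing in for the driver's register accesses:
struct recovery_ctx {
        void (*release_scl)(struct recovery_ctx *ctx);   /* stop driving the line */
        void (*drive_scl_low)(struct recovery_ctx *ctx); /* actively pull it low */
};

static void sketch_set_scl(struct recovery_ctx *ctx, int val)
{
        if (val)
                ctx->release_scl(ctx);          /* let SCL float high */
        else
                ctx->drive_scl_low(ctx);        /* only now pull SCL low */
}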
Signed-off-by: Dmitry Bazhenov <dmitry.bazhenov@auriga.com>
Signed-off-by: Jan Glauber <jglauber@cavium.com>
[Changed commit message]
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
|
|
Free the memory mapping if lpc32xx_clk_init() is not successful.
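A minimal sketch of the error path, assuming the usual of_iomap()/iounmap()
pairing; the setup helper below is hypothetical, not the driver's function:
static int sketch_setup_clocks(void __iomem *base);     /* hypothetical init step */

static void __init sketch_clk_init(struct device_node *np)
{
        void __iomem *base;

        base = of_iomap(np, 0);
        if (!base)
                return;

        if (sketch_setup_clocks(base) < 0) {
                iounmap(base);          /* free the memory mapping on failure */
                return;
        }
}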
Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com>
Acked-by: Vladimir Zapolskiy <vz@mleia.com>
Acked-by: Sylvain Lemieux <slemieux.tyco@gmail.com>
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
|
|
The PCI_DOMAINS config should be selected for any SoC
having more than a single PCIe controller. Without PCI_DOMAINS,
only one PCIe controller gets registered.
Select PCI_DOMAINS in ARCH_MULTIPLATFORM if PCI is selected, since
it does no harm even if a platform has a single PCIe port.
Also remove the PCI_DOMAINS select from the other platform-specific
configs.
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
Acked-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
|
|
*MIGHT_HAVE_PCI* config is already selected in ARCH_MULTIPLATFORM.
Don't select it redundantly in all ARCH_MULTIPLATFORM based machines.
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
Acked-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
Acked-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
|
|
Calling fill_cmd() with a macro definition that is not handled in the
switch statement causes BUG() to be called.
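A sketch of the safer pattern (illustrative, not the hpsa code): fail unknown
commands instead of hitting BUG().
static int sketch_fill_cmd(struct device *dev, unsigned char cmd)
{
        switch (cmd) {
        case 0x01:                      /* placeholder: known command A */
        case 0x02:                      /* placeholder: known command B */
                /* ... build the controller request ... */
                return 0;
        default:
                dev_warn(dev, "unknown command 0x%02x\n", cmd);
                return -EINVAL;         /* previously: BUG() */
        }
}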
Signed-off-by: Don Brace <don.brace@microsemi.com>
Reviewed-by: Scott Teel <scott.teel@microsemi.com>
Reviewed-by: Kevin Barnett <kevin.barnett@microsemi.com>
Reviewed-by: Tomas Henzl <thenzl@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-integrator into next/drivers
Pull "Qualcomm EBI2 bindings and bus driver" from Linus Walleij
* tag 'qcom-ebi2-arm-soc' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-integrator:
bus: qcom: add EBI2 driver
bus: qcom: add EBI2 device tree bindings
Acked-by: Andy Gross <andy.gross@linaro.org>
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/mripard/linux into next/late
Pull "Allwinner DT changes for 4.9, late edition" from Maxime Ripard:
Here is a bunch of late changes for the 4.9 merge window, mostly:
- Added a bunch of touchscreens nodes to tablets
- Added support for the AXP806 PMIC found in the A80 boards
- Enabled a few pinmux options for the H3
* tag 'sunxi-dt-for-4.9-3' of https://git.kernel.org/pub/scm/linux/kernel/git/mripard/linux:
ARM: dts: sun8i: Add accelerometer to polaroid-mid2407pxe03
ARM: dts: sun8i: enable UART1 for iNet D978 Rev2 board
ARM: dts: sun8i: add pinmux for UART1 at PG
dts: sun8i-h3: add I2C0-2 peripherals to H3 SOC
dts: sun8i-h3: add pinmux definitions for I2C0-2
dts: sun8i-h3: associate exposed UARTs on Orange Pi Boards
dts: sun8i-h3: split off RTS/CTS for UART1 in seperate pinmux
dts: sun8i-h3: add pinmux definitions for UART2-3
ARM: dts: sun9i: a80-optimus: Disable EHCI1
ARM: dts: sun9i: cubieboard4: Add AXP806 PMIC device node and regulators
ARM: dts: sun9i: a80-optimus: Add AXP806 PMIC device node and regulators
ARM: dts: sun9i: cubieboard4: Declare AXP809 SW regulator as unused
ARM: dts: sun9i: a80-optimus: Declare AXP809 SW regulator as unused
ARM: dts: sun8i: Add touchscreen node for sun8i-a33-ga10h
ARM: dts: sun8i: Add touchscreen node for sun8i-a23-polaroid-mid2809pxe04
ARM: dts: sun8i: Add touchscreen node for sun8i-a23-polaroid-mid2407pxe03
ARM: dts: sun8i: Add touchscreen node for sun8i-a23-inet86dz
ARM: dts: sun8i: Add touchscreen node for sun8i-a23-gt90h
|
|
next/drivers
Pull "mvebu drivers for 4.9 (part 1)" from Gregory CLEMENT:
- Add pinctrl and clk support for the Orion5x SoC mv88f5181 variant
* tag 'mvebu-drivers-4.9-1' of git://git.infradead.org/linux-mvebu:
pinctrl: mvebu: orion5x: Generalise mv88f5181l support for 88f5181
clk: mvebu: Add clk support for the orion5x SoC mv88f5181
|
|
Pull "mvebu soc for 4.9 (part 1)" from Gregory CLEMENT:
- irq cleanup for old mvebu SoC
* tag 'mvebu-soc-4.9-1' of git://git.infradead.org/linux-mvebu:
ARM: orion5x: remove extraneous NO_IRQ
ARM: orion: simplify orion_ge00_switch_init
ARM: mvebu/orion: remove NO_IRQ check from device init
ARM: mv78xx0: simplify ethernet device creation
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux into next/soc
Pull "i.MX legacy board file changes for 4.9" from Shawn Guo:
It includes a patch series that moves registrations and initializations
of all peripherals which are GPIO line consumers for all legacy boards
from .init_machine to .init_late init level. This is needed to
proactively prevent boot time issues on the legacy boards due to the
deprioritized init level of the GPIO controller driver (set lower than
IOMUX controller driver init level), which is shared among all i.MX
SoCs.
* tag 'imx-legacy-4.9' of git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux:
ARM: imx legacy: pca100: move peripheral initialization to .init_late
ARM: imx legacy: mx27ads: move peripheral initialization to .init_late
ARM: imx legacy: mx21ads: move peripheral initialization to .init_late
ARM: imx legacy: pcm043: move peripheral initialization to .init_late
ARM: imx legacy: mx35-3ds: move peripheral initialization to .init_late
ARM: imx legacy: mx27-3ds: move peripheral initialization to .init_late
ARM: imx legacy: imx27-visstrim-m10: move peripheral initialization to .init_late
ARM: imx legacy: vpr200: move peripheral initialization to .init_late
ARM: imx legacy: mx31moboard: move peripheral initialization to .init_late
ARM: imx legacy: armadillo5x0: move peripheral initialization to .init_late
ARM: imx legacy: qong: move peripheral initialization to .init_late
ARM: imx legacy: mx31-3ds: move peripheral initialization to .init_late
ARM: imx legacy: pcm037: move peripheral initialization to .init_late
ARM: imx legacy: mx31lilly: move peripheral initialization to .init_late
ARM: imx legacy: mx31ads: move peripheral initialization to .init_late
ARM: imx legacy: mx31lite: move peripheral initialization to .init_late
ARM: imx legacy: kzm: move peripheral initialization to .init_late
|
|
When any UFS host controller receives a TM (Task Management) response
from a UFS device, the UFS driver has so far treated it as a
"Task Management Function Complete" (00h) message in all cases, i.e. as
if there were no pending task in the UFS device for the tag of the
previously sent TM request. That's because byte offset 6 in the TM
response, which has been used to get the TM service response so far,
represents just whether or not the TM transmission passed.
According to the UFS spec, the correct byte offset for the TM service
response is 15, not 6.
I tested that the UFS driver responds properly to the TM response from
a UFS device on a reference board with an exynos8890, as follows:
No pending task -> Task Management Function Complete (00h)
Pending task -> Task Management Function Succeeded (08h)
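In other words, a one-line sketch with a placeholder name: read the service
response from byte 15 of the TM response UPIU rather than byte 6.
static u8 sketch_tm_service_response(const u8 *tm_rsp_upiu)
{
        return tm_rsp_upiu[15]; /* 0x00: Function Complete, 0x08: Function Succeeded */
}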
[mkp: applied by hand]
Signed-off-by: Kiwoong Kim <kwmad.kim@samsung.com>
Signed-off-by: HeonGwang Chu <hg.chu@samsung.com>
Tested-by: Kiwoong Kim <kwmad.kim@samsung.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Some platform firmware may interfere with the RTC alarm over suspend,
resulting in the kernel and hardware having different ideas about system
state but also potentially causing problems with firmware that assumes the
OS will clean this case up. This patch restores the RTC alarm on resume
to ensure that kernel and hardware are in sync.
The case we've seen is Intel Rapid Start, which is a firmware-mediated
feature that automatically transitions systems from suspend-to-RAM to
suspend-to-disk without OS involvement. It does this by setting the RTC
alarm and a flag that indicates that on wake it should perform the
transition rather than re-starting the OS. However, if the OS has set a
wakeup alarm that would wake the machine earlier, it refuses to overwrite
it and allows the system to wake instead.
This fails in the following situation:
1) User configures Intel Rapid Start to transition after (say) 15
minutes
2) User suspends to RAM. Firmware sets the wakeup alarm for 15 minutes
in the future
3) User resumes after 5 minutes. Firmware does not reset the alarm, and
as such it is still set for 10 minutes in the future
4) User suspends after 5 minutes. Firmware notices that the alarm is set
for 5 minutes in the future, which is less than the 15 minute transition
threshold. It therefore assumes that the user wants the machine to wake
in 5 minutes
5) System resumes after 5 minutes
The worst case scenario here is that the user may have put the system in a
bag between (4) and (5), resulting in it running in a confined space and
potentially overheating. This seems reasonably important. The Rapid
Start support code got added in 3.11, but it can be configured in the
firmware regardless of kernel support.
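A rough sketch of the approach; rtc_set_alarm() is the generic RTC class API,
and the saved alarm is a stand-in for whatever the driver recorded before
suspend, not the exact driver code:
static void sketch_restore_alarm(struct rtc_device *rtc, struct rtc_wkalrm *saved)
{
        /* Re-arm the alarm the kernel believes is set, overwriting whatever
         * the firmware may have left in the RTC across suspend. */
        if (saved->enabled && rtc_set_alarm(rtc, saved) < 0)
                dev_warn(&rtc->dev, "failed to restore RTC alarm\n");
}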
Signed-off-by: Gabriele Mazzotta <gabriele.mzt@gmail.com>
Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
|
|
Currently ACPI-driven alarms are not cleared when they wake the
system. As a consequence, expired alarms must be manually cleared to
program a new alarm. Fix this by correctly handling ACPI-driven
alarms.
More specifically, the ACPI specification [1] provides for two
alternative implementations of the RTC. Depending on the
implementation, the driver clears the alarm either from the resume
callback or from the ACPI interrupt handler:
- The platform has the RTC wakeup status fixed in hardware
(ACPI_FADT_FIXED_RTC is 0). In this case the driver can determine
if the RTC was the reason of the wakeup from the resume callback
by reading the RTC status register.
- The platform has no fixed hardware feature event bits. In this
case a GPE is used to wake the system and the driver clears the
alarm from its handler.
[1] http://www.acpi.info/DOWNLOADS/ACPI_5_Errata%20A.pdf
Signed-off-by: Gabriele Mazzotta <gabriele.mzt@gmail.com>
Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
|
|
Support configuration of ext_wakeup sources. This patch makes it
possible to enable ext_wakeup and set its polarity, depending on board
configuration. AM335x's dedicated PMIC (tps65217) uses ext_wakeup to
notify about power-button presses. Handling power-button presses makes
it possible to recover from RTC-only power states correctly.
Signed-off-by: Marcin Niestroj <m.niestroj@grinn-global.com>
Acked-by: Tony Lindgren <tony@atomide.com>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
|
|
When we want to identify whether a proc_id is unreasonable, we can call
the acpi_processor_validate_proc_id() function. It searches the recorded
duplicate IDs; if the proc_id is found among them, true is returned to
the caller. Conversely, false means the ID is valid.
When we establish the full cpuid <-> nodeid mapping to handle CPU
hotplug, we use the proc_id from the ACPI table. We validate the proc_id
when we get it; if the result is true, we stop the mapping.
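A sketch of the lookup side; the array names are stand-ins for whatever the
validation pass recorded, not the actual implementation:
static int duplicate_ids[NR_CPUS] __initdata;   /* filled by the validation pass */
static int nr_duplicate_ids __initdata;

bool __init sketch_validate_proc_id(int proc_id)
{
        int i;

        for (i = 0; i < nr_duplicate_ids; i++)
                if (duplicate_ids[i] == proc_id)
                        return true;    /* duplicate -> unreasonable, skip the mapping */

        return false;                   /* unique -> usable */
}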
[ tglx: Mark the new function __init ]
Signed-off-by: Dou Liyang <douly.fnst@cn.fujitsu.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: mika.j.penttila@gmail.com
Cc: len.brown@intel.com
Cc: rafael@kernel.org
Cc: rjw@rjwysocki.net
Cc: yasu.isimatu@gmail.com
Cc: linux-mm@kvack.org
Cc: linux-acpi@vger.kernel.org
Cc: isimatu.yasuaki@jp.fujitsu.com
Cc: gongzhaogang@inspur.com
Cc: tj@kernel.org
Cc: izumi.taku@jp.fujitsu.com
Cc: cl@linux.com
Cc: chen.tang@easystack.cn
Cc: akpm@linux-foundation.org
Cc: kamezawa.hiroyu@jp.fujitsu.com
Cc: lenb@kernel.org
Link: http://lkml.kernel.org/r/1472114120-3281-8-git-send-email-douly.fnst@cn.fujitsu.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
[Problem]
When we set the cpuid <-> nodeid mapping to be persistent, it will use the
DSDT. As we know, the ACPI tables are just like user input in that respect,
and we don't crash if the user's input is unreasonable.
For example, the mapping of proc_id and pxm in some machine's ACPI table
looks like this:
proc_id | pxm
--------------------
0 <-> 0
1 <-> 0
2 <-> 1
3 <-> 1
89 <-> 0
89 <-> 0
89 <-> 0
89 <-> 1
89 <-> 1
89 <-> 2
89 <-> 3
.....
We can't be sure which entry is correct for proc_id 89, so we may map a
wrong node to a cpu. When pages are allocated, this may cause a kernel panic.
So, we should provide a mechanism to validate the ACPI tables, just like we
validate user input in a web project.
The mechanism is that processor objects which have duplicate IDs are not
valid.
[Solution]
We add a validation function, like this:
foreach Processor in DSDT
    proc_id = get_ACPI_Processor_number(Processor)
    if (proc_id exists)
        mark both of them as being unreasonable;
The function will record the unique or duplicate processor IDs.
Duplicate processor IDs such as 89 are regarded as unreasonable IDs,
which means that the processor objects in question are not valid.
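A rough C sketch of the pseudocode above (illustrative names, not the actual
implementation): remember every processor ID seen during the DSDT walk, and
record an ID as a duplicate the moment it shows up a second time.
static int seen_ids[NR_CPUS] __initdata;
static int duplicate_ids[NR_CPUS] __initdata;
static int nr_seen __initdata;
static int nr_duplicates __initdata;

static void __init sketch_record_proc_id(int proc_id)
{
        int i;

        for (i = 0; i < nr_seen; i++) {
                if (seen_ids[i] == proc_id) {
                        /* Seen before: both objects are unreasonable. */
                        duplicate_ids[nr_duplicates++] = proc_id;
                        return;
                }
        }
        seen_ids[nr_seen++] = proc_id;
}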
[ tglx: Add __init[data] annotations ]
Signed-off-by: Dou Liyang <douly.fnst@cn.fujitsu.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: mika.j.penttila@gmail.com
Cc: len.brown@intel.com
Cc: rafael@kernel.org
Cc: rjw@rjwysocki.net
Cc: yasu.isimatu@gmail.com
Cc: linux-mm@kvack.org
Cc: linux-acpi@vger.kernel.org
Cc: isimatu.yasuaki@jp.fujitsu.com
Cc: gongzhaogang@inspur.com
Cc: tj@kernel.org
Cc: izumi.taku@jp.fujitsu.com
Cc: cl@linux.com
Cc: chen.tang@easystack.cn
Cc: akpm@linux-foundation.org
Cc: kamezawa.hiroyu@jp.fujitsu.com
Cc: lenb@kernel.org
Link: http://lkml.kernel.org/r/1472114120-3281-7-git-send-email-douly.fnst@cn.fujitsu.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
The whole patch set aims at making the cpuid <-> nodeid mapping persistent, so
that when node online/offline happens, caches based on the cpuid <-> nodeid
mapping, such as wq_numa_possible_cpumask, will not cause any problem.
It contains 4 steps:
1. Enable the apic registration flow to handle both enabled and disabled cpus.
2. Introduce a new array storing all possible cpuid <-> apicid mappings.
3. Enable the _MAT and MADT related APIs to return non-present or disabled cpus' apicids.
4. Establish all possible cpuid <-> nodeid mappings.
This patch finishes step 4.
It sets up the persistent cpuid <-> nodeid mapping for all enabled/disabled
processors at boot time via an additional acpi namespace walk for processors.
[ tglx: Remove the unneeded exports ]
Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
Signed-off-by: Zhu Guihua <zhugh.fnst@cn.fujitsu.com>
Signed-off-by: Dou Liyang <douly.fnst@cn.fujitsu.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: mika.j.penttila@gmail.com
Cc: len.brown@intel.com
Cc: rafael@kernel.org
Cc: rjw@rjwysocki.net
Cc: yasu.isimatu@gmail.com
Cc: linux-mm@kvack.org
Cc: linux-acpi@vger.kernel.org
Cc: isimatu.yasuaki@jp.fujitsu.com
Cc: gongzhaogang@inspur.com
Cc: tj@kernel.org
Cc: izumi.taku@jp.fujitsu.com
Cc: cl@linux.com
Cc: chen.tang@easystack.cn
Cc: akpm@linux-foundation.org
Cc: kamezawa.hiroyu@jp.fujitsu.com
Cc: lenb@kernel.org
Link: http://lkml.kernel.org/r/1472114120-3281-6-git-send-email-douly.fnst@cn.fujitsu.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
The whole patch set aims at making the cpuid <-> nodeid mapping persistent, so
that when node online/offline happens, caches based on the cpuid <-> nodeid
mapping, such as wq_numa_possible_cpumask, will not cause any problem.
It contains 4 steps:
1. Enable the apic registration flow to handle both enabled and disabled cpus.
2. Introduce a new array storing all possible cpuid <-> apicid mappings.
3. Enable the _MAT and MADT related APIs to return non-present or disabled cpus' apicids.
4. Establish all possible cpuid <-> nodeid mappings.
This patch finishes step 3.
There are four mappings in the kernel:
1. nodeid (logical node id) <-> pxm (persistent)
2. apicid (physical cpu id) <-> nodeid (persistent)
3. cpuid (logical cpu id) <-> apicid (not persistent, now persistent by step 2)
4. cpuid (logical cpu id) <-> nodeid (not persistent)
So, in order to set up a persistent cpuid <-> nodeid mapping for all possible CPUs,
we should:
1. Set up the cpuid <-> apicid mapping for all possible CPUs, which has been done in steps 1 and 2.
2. Set up the cpuid <-> nodeid mapping for all possible CPUs. But before that, we should
obtain all apicids from the MADT.
All processors' apicids can be obtained via the _MAT method or from the MADT in ACPI.
The current code ignores disabled processors and returns -ENODEV.
After this patch, a new parameter is added to the MADT APIs so that the caller
can control whether disabled processors are ignored.
Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
Signed-off-by: Zhu Guihua <zhugh.fnst@cn.fujitsu.com>
Signed-off-by: Dou Liyang <douly.fnst@cn.fujitsu.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: mika.j.penttila@gmail.com
Cc: len.brown@intel.com
Cc: rafael@kernel.org
Cc: rjw@rjwysocki.net
Cc: yasu.isimatu@gmail.com
Cc: linux-mm@kvack.org
Cc: linux-acpi@vger.kernel.org
Cc: isimatu.yasuaki@jp.fujitsu.com
Cc: gongzhaogang@inspur.com
Cc: tj@kernel.org
Cc: izumi.taku@jp.fujitsu.com
Cc: cl@linux.com
Cc: chen.tang@easystack.cn
Cc: akpm@linux-foundation.org
Cc: kamezawa.hiroyu@jp.fujitsu.com
Cc: lenb@kernel.org
Link: http://lkml.kernel.org/r/1472114120-3281-5-git-send-email-douly.fnst@cn.fujitsu.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
The whole patch set aims at making the cpuid <-> nodeid mapping persistent, so
that when node online/offline happens, caches based on the cpuid <-> nodeid
mapping, such as wq_numa_possible_cpumask, will not cause any problem.
It contains 4 steps:
1. Enable the apic registration flow to handle both enabled and disabled cpus.
2. Introduce a new array storing all possible cpuid <-> apicid mappings.
3. Enable the _MAT and MADT related APIs to return non-present or disabled cpus' apicids.
4. Establish all possible cpuid <-> nodeid mappings.
This patch finishes step 2.
In this patch, we introduce a new static array named cpuid_to_apicid[],
which is large enough to store info for all possible cpus.
We then modify the way the cpuid is calculated. Currently,
generic_processor_info() simply finds the next unused cpuid, which is also why
the cpuid <-> nodeid mapping changes with node hotplug.
After this patch, we find the next unused cpuid, map it to an apicid,
and store the mapping in cpuid_to_apicid[], so that the cpuid <-> apicid
mapping will be persistent.
Finally, we will use this array to make the cpuid <-> nodeid mapping persistent.
The cpuid <-> apicid mapping is established at local apic registration time,
but non-present or disabled cpus are ignored.
In this patch, we establish all possible cpuid <-> apicid mappings when
registering the local apic.
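A rough sketch of the data structure described above (close in spirit to, but
not necessarily identical with, the actual x86 change): an apicid that has
been seen before keeps its old cpuid, otherwise the next free slot is handed
out and the pairing is recorded.
static int nr_logical_cpuids;
static int cpuid_to_apicid[NR_CPUS] = { [0 ... NR_CPUS - 1] = -1 };

static int sketch_allocate_logical_cpuid(int apicid)
{
        int i;

        for (i = 0; i < nr_logical_cpuids; i++)
                if (cpuid_to_apicid[i] == apicid)
                        return i;               /* known apicid keeps its cpuid */

        cpuid_to_apicid[nr_logical_cpuids] = apicid;
        return nr_logical_cpuids++;             /* new apicid gets the next slot */
}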
Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
Signed-off-by: Zhu Guihua <zhugh.fnst@cn.fujitsu.com>
Signed-off-by: Dou Liyang <douly.fnst@cn.fujitsu.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: mika.j.penttila@gmail.com
Cc: len.brown@intel.com
Cc: rafael@kernel.org
Cc: rjw@rjwysocki.net
Cc: yasu.isimatu@gmail.com
Cc: linux-mm@kvack.org
Cc: linux-acpi@vger.kernel.org
Cc: isimatu.yasuaki@jp.fujitsu.com
Cc: gongzhaogang@inspur.com
Cc: tj@kernel.org
Cc: izumi.taku@jp.fujitsu.com
Cc: cl@linux.com
Cc: chen.tang@easystack.cn
Cc: akpm@linux-foundation.org
Cc: kamezawa.hiroyu@jp.fujitsu.com
Cc: lenb@kernel.org
Link: http://lkml.kernel.org/r/1472114120-3281-4-git-send-email-douly.fnst@cn.fujitsu.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
The cpuid <-> nodeid mapping is first established at boot time, and the
workqueue code caches the mapping in wq_numa_possible_cpumask in
wq_numa_init(), also at boot time.
During node online/offline, the cpuid <-> nodeid mapping is established/destroyed,
which means the cpuid <-> nodeid mapping will change if node hotplug happens. But
the workqueue code does not update wq_numa_possible_cpumask.
So here is the problem:
Assume we have the following cpuid <-> nodeid in the beginning:
Node | CPU
------------------------
node 0 | 0-14, 60-74
node 1 | 15-29, 75-89
node 2 | 30-44, 90-104
node 3 | 45-59, 105-119
and we hot-remove node2 and node3, it becomes:
Node | CPU
------------------------
node 0 | 0-14, 60-74
node 1 | 15-29, 75-89
and we hot-add node4 and node5, it becomes:
Node | CPU
------------------------
node 0 | 0-14, 60-74
node 1 | 15-29, 75-89
node 4 | 30-59
node 5 | 90-119
But in wq_numa_possible_cpumask, cpu30 is still mapped to node2, and so on.
When a pool workqueue is initialized, if its cpumask belongs to a node, its
pool->node will be mapped to that node, and memory used by this workqueue will
also be allocated on that node.
static struct worker_pool *get_unbound_pool(const struct workqueue_attrs *attrs)
{
        ...
        /* if cpumask is contained inside a NUMA node, we belong to that node */
        if (wq_numa_enabled) {
                for_each_node(node) {
                        if (cpumask_subset(pool->attrs->cpumask,
                                           wq_numa_possible_cpumask[node])) {
                                pool->node = node;
                                break;
                        }
                }
        }
Since wq_numa_possible_cpumask is not updated, it could be mapped to an offline node,
which will lead to memory allocation failure:
SLUB: Unable to allocate memory on node 2 (gfp=0x80d0)
cache: kmalloc-192, object size: 192, buffer size: 192, default order: 1, min order: 0
node 0: slabs: 6172, objs: 259224, free: 245741
node 1: slabs: 3261, objs: 136962, free: 127656
It happens here:
create_worker(struct worker_pool *pool)
        |--> worker = alloc_worker(pool->node);

static struct worker *alloc_worker(int node)
{
        struct worker *worker;

        worker = kzalloc_node(sizeof(*worker), GFP_KERNEL, node); --> Here, using the wrong node.
        ......
        return worker;
}
[Solution]
There are four mappings in the kernel:
1. nodeid (logical node id) <-> pxm
2. apicid (physical cpu id) <-> nodeid
3. cpuid (logical cpu id) <-> apicid
4. cpuid (logical cpu id) <-> nodeid
1. pxm (proximity domain) is provided by ACPI firmware in SRAT, and nodeid <-> pxm
mapping is setup at boot time. This mapping is persistent, won't change.
2. apicid <-> nodeid mapping is setup using info in 1. The mapping is setup at boot
time and CPU hotadd time, and cleared at CPU hotremove time. This mapping is also
persistent.
3. cpuid <-> apicid mapping is setup at boot time and CPU hotadd time. cpuid is
allocated, lower ids first, and released at CPU hotremove time, reused for other
hotadded CPUs. So this mapping is not persistent.
4. cpuid <-> nodeid mapping is also setup at boot time and CPU hotadd time, and
cleared at CPU hotremove time. As a result of 3, this mapping is not persistent.
To fix this problem, we establish cpuid <-> nodeid mapping for all the possible
cpus at boot time, and make it persistent. And according to init_cpu_to_node(),
cpuid <-> nodeid mapping is based on apicid <-> nodeid mapping and cpuid <-> apicid
mapping. So the key point is obtaining all cpus' apicid.
apicid can be obtained by _MAT (Multiple APIC Table Entry) method or found in
MADT (Multiple APIC Description Table). So we finish the job in the following steps:
1. Enable the apic registration flow to handle both enabled and disabled cpus.
This is done by introducing an extra parameter to generic_processor_info() to
let the caller control whether disabled cpus are ignored.
2. Introduce a new array storing all possible cpuid <-> apicid mappings, and
also modify the way the cpuid is calculated. Establish all possible
cpuid <-> apicid mappings when registering the local apic, and store them in
this array.
3. Enable the _MAT and MADT related APIs to return non-present or disabled
cpus' apicids. This is also done by introducing an extra parameter to these
APIs to let the caller control whether disabled cpus are ignored.
4. Establish all possible cpuid <-> nodeid mappings.
This is done via an additional acpi namespace walk for processors.
This patch finishes step 1.
Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
Signed-off-by: Zhu Guihua <zhugh.fnst@cn.fujitsu.com>
Signed-off-by: Dou Liyang <douly.fnst@cn.fujitsu.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: mika.j.penttila@gmail.com
Cc: len.brown@intel.com
Cc: rafael@kernel.org
Cc: rjw@rjwysocki.net
Cc: yasu.isimatu@gmail.com
Cc: linux-mm@kvack.org
Cc: linux-acpi@vger.kernel.org
Cc: isimatu.yasuaki@jp.fujitsu.com
Cc: gongzhaogang@inspur.com
Cc: tj@kernel.org
Cc: izumi.taku@jp.fujitsu.com
Cc: cl@linux.com
Cc: chen.tang@easystack.cn
Cc: akpm@linux-foundation.org
Cc: kamezawa.hiroyu@jp.fujitsu.com
Cc: lenb@kernel.org
Link: http://lkml.kernel.org/r/1472114120-3281-3-git-send-email-douly.fnst@cn.fujitsu.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
For now, x86 does not support memory-less nodes. A node without memory
will not be onlined, and the cpus on it will be mapped to other online
nodes with memory in init_cpu_to_node(). The reason for doing this is to
ensure each cpu is mapped to a node with memory, so that local memory can
be allocated for that cpu.
But we don't have to do it in this way.
In this series of patches, we are going to construct cpu <-> node mapping
for all possible cpus at boot time, which is a persistent mapping. It means
that the cpu will be mapped to the node which it belongs to, and will never
be changed. If a node has only cpus but no memory, the cpus on it will be
mapped to a memory-less node. And the memory-less node should be onlined.
Allocate pgdats for all memory-less nodes and online them at boot
time. Then build zonelists for these nodes. As a result, when cpus on these
memory-less nodes try to allocate memory from local node, it will
automatically fall back to the proper zones in the zonelists.
Signed-off-by: Zhu Guihua <zhugh.fnst@cn.fujitsu.com>
Signed-off-by: Dou Liyang <douly.fnst@cn.fujitsu.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: mika.j.penttila@gmail.com
Cc: len.brown@intel.com
Cc: Tang Chen <tangchen@cn.fujitsu.com>
Cc: rafael@kernel.org
Cc: rjw@rjwysocki.net
Cc: yasu.isimatu@gmail.com
Cc: linux-mm@kvack.org
Cc: linux-acpi@vger.kernel.org
Cc: isimatu.yasuaki@jp.fujitsu.com
Cc: gongzhaogang@inspur.com
Cc: tj@kernel.org
Cc: izumi.taku@jp.fujitsu.com
Cc: cl@linux.com
Cc: chen.tang@easystack.cn
Cc: akpm@linux-foundation.org
Cc: kamezawa.hiroyu@jp.fujitsu.com
Cc: lenb@kernel.org
Link: http://lkml.kernel.org/r/1472114120-3281-2-git-send-email-douly.fnst@cn.fujitsu.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Every so often, with a special config or an architecture change, running
function or function_graph tracing can cause the machine to hard reboot,
crash, or simply hard lock up. There are some functions in the function graph
tracer that cannot be traced, otherwise they cause the function tracer to
recurse before the recursion protection mechanisms are in place.
When this occurs, the dynamic ftrace feature that allows limiting what
actually gets traced can be used to bisect down to the problem function.
This adds a script that helps with this process to the scripts/tracing
directory, called ftrace-bisect.sh.
The setup is to read all the functions that can be traced from
available_filter_functions into a file (full_file). Then run this script,
passing it the full_file, a "test_file" and a "non_test_file", where the
test_file will be added to set_ftrace_filter. What ftrace-bisect.sh does is
copy half of the functions in full_file into the test_file and the other
half into the non_test_file. This way, one can cat the test_file into
set_ftrace_filter and only test the functions that are in that file. If it
works, copy non_test_file to full_file and repeat the process. If the system
crashes, the bad function is in the test_file; after a reboot, the test_file
becomes the new full_file for the next iteration.
When we get down to a single function in the full_file, ftrace-bisect.sh
will report that as the bad function.
Full documentation of how to use this simple script is within the script
file itself.
Link: http://lkml.kernel.org/r/20160920100716.131d3647@gandalf.local.home
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
|
|
Provide a nicer to_sa1111_device macro to convert a struct device to a
sa1111_dev. We will need this for drivers when converting them to
dev_pm_ops, or removing shutdown methods.
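A sketch of what such a helper typically looks like, assuming the usual
container_of() pattern (the real definition lives in the sa1111 code; the
suspend callback below is only an illustration of a user):
#define to_sa1111_device(d)     container_of(d, struct sa1111_dev, dev)

static int sketch_sa1111_suspend(struct device *dev)
{
        struct sa1111_dev *sadev = to_sa1111_device(dev);

        /* ... operate on sadev ... */
        return 0;
}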
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
|
|
register_shrinker() can fail after commit 1d3d4437eae1 ("vmscan: per-node
deferred work"). We should detect the failure, otherwise we may silently
fail to register the shrinker even though the gfs2 module has been
initialized successfully.
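The pattern, sketched with illustrative names (the shrinker callbacks and the
unwind path are omitted):
static struct shrinker sketch_shrinker;  /* .count_objects/.scan_objects set elsewhere */

static int __init sketch_module_init(void)
{
        int error;

        error = register_shrinker(&sketch_shrinker);
        if (error)
                return error;   /* fail init instead of silently losing the shrinker */

        return 0;
}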
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
|
|
Ben Greear reported:
> I see lots of instability as soon as I load up the carl9710 NIC.
> My application is going to be poking at it's debugfs files...
>
> BUG: KASAN: slab-out-of-bounds in carl9170_debugfs_read+0xd5/0x2a0
> [carl9170] at addr 0xffff8801bc1208b0
> Read of size 8 by task btserver/5888
> =======================================================================
> BUG kmalloc-256 (Tainted: G W ): kasan: bad access detected
> -----------------------------------------------------------------------
>
> INFO: Allocated in seq_open+0x50/0x100 age=2690 cpu=2 pid=772
>...
This breakage was caused by the introduction of intermediate
fops in debugfs by commit 9fd4dcece43a
("debugfs: prevent access to possibly dead file_operations at file open")
Thankfully, the original/real fops are still available in d_fsdata.
Reported-by: Ben Greear <greearb@candelatech.com>
Signed-off-by: Christian Lamparter <chunkeey@gmail.com>
Cc: stable <stable@vger.kernel.org> # 4.7+
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
OrangeFS 2.9.6 was released without support for the features op. Thus
OrangeFS 2.9.7 will be required to use it.
Signed-off-by: Martin Brandenburg <martin@omnibond.com>
|
|
If platform firmware fails to populate unique / non-zero serial number
data for each nvdimm in an interleave-set it may cause pmem region
initialization to fail. Add a debug message for this case.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
Add an nfit_test specific attribute for gating whether a get_config_size
DSM, or any DSM for that matter, succeeds or fails. The get_config_size
DSM is the initial motivation, since that is the first command the libnvdimm
core issues to determine the state of the namespace label area.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
For the DSMs where the kernel knows the format of the output buffer and
originates those DSMs from within the kernel, return -EIO for any
non-zero status. If the BIOS is indicating a status that we do not know
how to handle, fail the DSM.
Cc: <stable@vger.kernel.org>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
The internal alloc_nvdimm_map() helper might fail, particularly if the
memory region is already busy. Check for that failure and report
request_mem_region() failures.
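A sketch of the check (an illustrative wrapper, not the libnvdimm code):
static void __iomem *sketch_map_region(struct device *dev,
                                       resource_size_t start, resource_size_t size)
{
        if (!request_mem_region(start, size, dev_name(dev))) {
                dev_err(dev, "could not reserve region %pa+%pa\n", &start, &size);
                return NULL;    /* region already busy */
        }

        return ioremap(start, size);
}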
Reported-by: Ryan Chen <ryan.chan105@gmail.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
This patch fixes a NULL pointer dereference caused by a race condition in
the probe function of the legousbtower driver. It restructures the
probe function to only register the interface after successfully reading
the board's firmware ID.
The probe function does not deregister the usb interface after an error
receiving the device's firmware ID. The device file registered
(/dev/usb/legousbtower%d) may be read/written globally before the probe
function returns. When tower_delete is called in the probe function
(after an r/w has been initiated), core dev structures are deleted while
the file operation functions are still running. If the 0 address is
mappable on the machine, this vulnerability can be used to create a
Local Privilege Escalation exploit via a write-what-where condition by
remapping dev->interrupt_out_buffer in tower_write. A forged USB device
and local program execution would be required for LPE. The USB device
would have to delay the control message in tower_probe and accept
the control urb in tower_open whilst guest code initiated a write to the
device file as tower_delete is called from the error in tower_probe.
This bug has existed since 2003. The patch was tested with an emulated device.
Reported-by: James Patrick-Evans <james@jmp-e.com>
Tested-by: James Patrick-Evans <james@jmp-e.com>
Signed-off-by: James Patrick-Evans <james@jmp-e.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
register_shrinker() can now fail. When it does, shrinker.nr_deferred is
NULL; we use that to determine whether unregister_shrinker() is required.
Signed-off-by: Shaohua Li <shli@fb.com>
|
|
register_shrinker() can fail after commit 1d3d4437eae1 ("vmscan: per-node
deferred work"). We should detect the failure, otherwise we may silently
fail to register the shrinker even though the raid5 configuration was set
up successfully.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Shaohua Li <shli@fb.com>
|
|
lockdep reports a potential deadlock. Fix this by dropping the mutex
before calling md_import_device().
[ 1137.126601] ======================================================
[ 1137.127013] [ INFO: possible circular locking dependency detected ]
[ 1137.127013] 4.8.0-rc4+ #538 Not tainted
[ 1137.127013] -------------------------------------------------------
[ 1137.127013] mdadm/16675 is trying to acquire lock:
[ 1137.127013] (&bdev->bd_mutex){+.+.+.}, at: [<ffffffff81243cf3>] __blkdev_get+0x63/0x450
[ 1137.127013]
but task is already holding lock:
[ 1137.127013] (detected_devices_mutex){+.+.+.}, at: [<ffffffff81a5138c>] md_ioctl+0x2ac/0x1f50
[ 1137.127013]
which lock already depends on the new lock.
[ 1137.127013]
the existing dependency chain (in reverse order) is:
[ 1137.127013]
-> #1 (detected_devices_mutex){+.+.+.}:
[ 1137.127013] [<ffffffff810b6f19>] lock_acquire+0xb9/0x220
[ 1137.127013] [<ffffffff81c51647>] mutex_lock_nested+0x67/0x3d0
[ 1137.127013] [<ffffffff81a4eeaf>] md_autodetect_dev+0x3f/0x90
[ 1137.127013] [<ffffffff81595be8>] rescan_partitions+0x1a8/0x2c0
[ 1137.127013] [<ffffffff81590081>] __blkdev_reread_part+0x71/0xb0
[ 1137.127013] [<ffffffff815900e5>] blkdev_reread_part+0x25/0x40
[ 1137.127013] [<ffffffff81590c4b>] blkdev_ioctl+0x51b/0xa30
[ 1137.127013] [<ffffffff81242bf1>] block_ioctl+0x41/0x50
[ 1137.127013] [<ffffffff81214c96>] do_vfs_ioctl+0x96/0x6e0
[ 1137.127013] [<ffffffff81215321>] SyS_ioctl+0x41/0x70
[ 1137.127013] [<ffffffff81c56825>] entry_SYSCALL_64_fastpath+0x18/0xa8
[ 1137.127013]
-> #0 (&bdev->bd_mutex){+.+.+.}:
[ 1137.127013] [<ffffffff810b6af2>] __lock_acquire+0x1662/0x1690
[ 1137.127013] [<ffffffff810b6f19>] lock_acquire+0xb9/0x220
[ 1137.127013] [<ffffffff81c51647>] mutex_lock_nested+0x67/0x3d0
[ 1137.127013] [<ffffffff81243cf3>] __blkdev_get+0x63/0x450
[ 1137.127013] [<ffffffff81244307>] blkdev_get+0x227/0x350
[ 1137.127013] [<ffffffff812444f6>] blkdev_get_by_dev+0x36/0x50
[ 1137.127013] [<ffffffff81a46d65>] lock_rdev+0x35/0x80
[ 1137.127013] [<ffffffff81a49bb4>] md_import_device+0xb4/0x1b0
[ 1137.127013] [<ffffffff81a513d6>] md_ioctl+0x2f6/0x1f50
[ 1137.127013] [<ffffffff815909b3>] blkdev_ioctl+0x283/0xa30
[ 1137.127013] [<ffffffff81242bf1>] block_ioctl+0x41/0x50
[ 1137.127013] [<ffffffff81214c96>] do_vfs_ioctl+0x96/0x6e0
[ 1137.127013] [<ffffffff81215321>] SyS_ioctl+0x41/0x70
[ 1137.127013] [<ffffffff81c56825>] entry_SYSCALL_64_fastpath+0x18/0xa8
[ 1137.127013]
other info that might help us debug this:
[ 1137.127013] Possible unsafe locking scenario:
[ 1137.127013] CPU0 CPU1
[ 1137.127013] ---- ----
[ 1137.127013] lock(detected_devices_mutex);
[ 1137.127013] lock(&bdev->bd_mutex);
[ 1137.127013] lock(detected_devices_mutex);
[ 1137.127013] lock(&bdev->bd_mutex);
[ 1137.127013]
*** DEADLOCK ***
Cc: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Shaohua Li <shli@fb.com>
|
|
If bitmap_create() fails, the bitmap is already cleaned up and the returned
value is an error number. We can't do the cleanup again.
Reported-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Shaohua Li <shli@fb.com>
|
|
raid5 will split bios to the proper size internally, so there is no point in
using the underlying disks' max_hw_sectors. In my qemu system, without the
change, raid5 only receives 128k-sized bios, which reduces the chance of
merging bios before sending them to the underlying disks.
Signed-off-by: Shaohua Li <shli@fb.com>
|
|
Optimize the RAID6 xor_syndrome functions to take advantage of the 512-bit
ZMM integer instructions introduced in AVX512.
The AVX512-optimized xor_syndrome functions are simply based on sse2.c
written by hpa.
The patch was tested and benchmarked before submission on hardware that
has the AVX512 flags to support such instructions.
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jim Kukunas <james.t.kukunas@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Megha Dey <megha.dey@linux.intel.com>
Signed-off-by: Gayatri Kammela <gayatri.kammela@intel.com>
Reviewed-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Shaohua Li <shli@fb.com>
|
|
Add AVX512 gen_syndrome and recovery functions so that the code can be
compiled and tested successfully in userspace.
This patch was tested in userspace and an improvement in performance was
observed.
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jim Kukunas <james.t.kukunas@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Megha Dey <megha.dey@linux.intel.com>
Signed-off-by: Gayatri Kammela <gayatri.kammela@intel.com>
Reviewed-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Shaohua Li <shli@fb.com>
|
|
Optimize the RAID6 recovery functions to take advantage of the 512-bit
ZMM integer instructions introduced in AVX512.
The AVX512-optimized recovery functions are simply based on recov_avx2.c
written by Jim Kukunas.
This patch was tested and benchmarked before submission on hardware that
has the AVX512 flags to support such instructions.
Cc: Jim Kukunas <james.t.kukunas@linux.intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Megha Dey <megha.dey@linux.intel.com>
Signed-off-by: Gayatri Kammela <gayatri.kammela@intel.com>
Reviewed-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Shaohua Li <shli@fb.com>
|
|
Optimize the RAID6 gen_syndrome functions to take advantage of the 512-bit
ZMM integer instructions introduced in AVX512.
The AVX512-optimized gen_syndrome functions are simply based on avx2.c
written by Yuanhan Liu and sse2.c written by hpa.
The patch was tested and benchmarked before submission on hardware that
has the AVX512 flags to support such instructions.
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jim Kukunas <james.t.kukunas@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Megha Dey <megha.dey@linux.intel.com>
Signed-off-by: Gayatri Kammela <gayatri.kammela@intel.com>
Reviewed-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Shaohua Li <shli@fb.com>
|
|
When one node is performing resync or recovery, other nodes
can't get the resync lock and could block for a while before
holding the lock, so we can't stop the array immediately in
this scenario.
To make the array stop quickly, check MD_CLOSING in
dlm_lock_sync_interruptible so that we can interrupt the
lock request.
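A rough sketch of the shape of dlm_lock_sync_interruptible as described here;
the resource type and its fields are stand-ins, not the md-cluster code:
struct sketch_res {
        wait_queue_head_t wq;
        bool done;              /* set by the DLM ast callback */
};

static int sketch_lock_sync_interruptible(struct mddev *mddev, struct sketch_res *res)
{
        wait_event(res->wq, res->done ||
                            test_bit(MD_CLOSING, &mddev->flags));

        return res->done ? 0 : -EINTR;  /* give up if the array is being closed */
}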
Reviewed-by: NeilBrown <neilb@suse.com>
Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
Signed-off-by: Shaohua Li <shli@fb.com>
|
|
When a node leaves the cluster, its bitmap needs to be
synced by another node, so the "md*_recover" thread is triggered
for that purpose. However, with the steps below, we can see tasks
hang in either B or C.
1. Node A creates a resyncing cluster raid1 and assembles it on
the other two nodes (B and C).
2. Stop the array on B and C.
3. Stop the array on A.
linux44:~ # ps aux|grep md|grep D
root 5938 0.0 0.1 19852 1964 pts/0 D+ 14:52 0:00 mdadm -S md0
root 5939 0.0 0.0 0 0 ? D 14:52 0:00 [md0_recover]
linux44:~ # cat /proc/5939/stack
[<ffffffffa04cf321>] dlm_lock_sync+0x71/0x90 [md_cluster]
[<ffffffffa04d0705>] recover_bitmaps+0x125/0x220 [md_cluster]
[<ffffffffa052105d>] md_thread+0x16d/0x180 [md_mod]
[<ffffffff8107ad94>] kthread+0xb4/0xc0
[<ffffffff8152a518>] ret_from_fork+0x58/0x90
linux44:~ # cat /proc/5938/stack
[<ffffffff8107afde>] kthread_stop+0x6e/0x120
[<ffffffffa0519da0>] md_unregister_thread+0x40/0x80 [md_mod]
[<ffffffffa04cfd20>] leave+0x70/0x120 [md_cluster]
[<ffffffffa0525e24>] md_cluster_stop+0x14/0x30 [md_mod]
[<ffffffffa05269ab>] bitmap_free+0x14b/0x150 [md_mod]
[<ffffffffa0523f3b>] do_md_stop+0x35b/0x5a0 [md_mod]
[<ffffffffa0524e83>] md_ioctl+0x873/0x1590 [md_mod]
[<ffffffff81288464>] blkdev_ioctl+0x214/0x7d0
[<ffffffff811dd3dd>] block_ioctl+0x3d/0x40
[<ffffffff811b92d4>] do_vfs_ioctl+0x2d4/0x4b0
[<ffffffff811b9538>] SyS_ioctl+0x88/0xa0
[<ffffffff8152a5c9>] system_call_fastpath+0x16/0x1b
The problem is that recover_bitmaps can't reliably abort
when the thread is unregistered. So dlm_lock_sync_interruptible
is introduced to detect the thread's situation and fix the problem.
Reviewed-by: NeilBrown <neilb@suse.com>
Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
Signed-off-by: Shaohua Li <shli@fb.com>
|
|
Previously, we used a completion to synchronize between requesting the
dlm lock and sync_ast. However, we would have to expose completion.wait
and completion.done in dlm_lock_sync_interruptible (introduced later),
which is not a common usage of completion, so convert the related code
to a wait queue.
Reviewed-by: NeilBrown <neilb@suse.com>
Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
Signed-off-by: Shaohua Li <shli@fb.com>
|
|
We need to use rcu_read_lock/unlock to avoid potential
race.
Reported-by: Shaohua Li <shli@fb.com>
Reviewed-by: NeilBrown <neilb@suse.com>
Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
Signed-off-by: Shaohua Li <shli@fb.com>
|
|
cluster_info and bitmap_info.nodes also need to be
cleared when array is stopped.
Reviewed-by: NeilBrown <neilb@suse.com>
Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
Signed-off-by: Shaohua Li <shli@fb.com>
|
|
When stopping a clustered raid while it is pending on resync,
the MD_STILL_CLOSED flag could be cleared because a udev rule
is triggered to open the mddev. So obviously the array can't
be stopped soon and EBUSY is returned.
mdadm -Ss                          md-raid-arrays.rules
set MD_STILL_CLOSED                md_open()
... ... ...                        clear MD_STILL_CLOSED
do_md_stop
We make the changes below to resolve this issue:
1. Rename MD_STILL_CLOSED to MD_CLOSING, since it is set
when stopping the array and it means we are stopping the array.
2. Let md_open return early if CLOSING is set, so no
other threads will open the array while one thread is trying
to close it.
3. There is no need to clear the CLOSING bit in md_open, because (1)
has ensured the bit is cleared; then we also don't need to
test the CLOSING bit in do_md_stop.
Reviewed-by: NeilBrown <neilb@suse.com>
Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
Signed-off-by: Shaohua Li <shli@fb.com>
|
|
Since DLM_LKF_FORCEUNLOCK is used in lockres_free,
we don't need to call dlm_unlock_sync before freeing
the lock resource.
Reviewed-by: NeilBrown <neilb@suse.com>
Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
Signed-off-by: Shaohua Li <shli@fb.com>
|