|
- It makes no sense that we deal with an inode in the dead tree.
- fix the race between dio and the page copy by waiting for dio completion
- avoid the race between the page copy and truncate/punch hole
- check if the page is in the page cache or not
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
|
|
- It makes no sense to continue doing something after an error has
happened; with this patch we just bail out.
- remove some checks in copy_nocow_pages_for_inode(), such as the page check
after the write and the inode check at the end of the function, because we
are sure they exist.
- remove the unnecessary goto in the return value check of the write
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
|
|
We got an oops while running the btrfs replace start test:
------------[ cut here ]------------
kernel BUG at mm/filemap.c:608!
[SNIP]
Call Trace:
[<ffffffffa04b36c7>] copy_nocow_pages_for_inode+0x217/0x3f0 [btrfs]
[<ffffffffa04b34b0>] ? scrub_print_warning_inode+0x230/0x230 [btrfs]
[<ffffffffa04b34b0>] ? scrub_print_warning_inode+0x230/0x230 [btrfs]
[<ffffffffa04bb8ce>] iterate_extent_inodes+0x1ae/0x300 [btrfs]
[<ffffffffa04bbab2>] iterate_inodes_from_logical+0x92/0xb0 [btrfs]
[<ffffffffa04b34b0>] ? scrub_print_warning_inode+0x230/0x230 [btrfs]
[<ffffffffa04b3b07>] copy_nocow_pages_worker+0x97/0x150 [btrfs]
[<ffffffffa048eed4>] worker_loop+0x134/0x540 [btrfs]
[<ffffffff816274ea>] ? __schedule+0x3ca/0x7f0
[<ffffffffa048eda0>] ? btrfs_queue_worker+0x300/0x300 [btrfs]
[<ffffffff8106f2f0>] kthread+0xc0/0xd0
[<ffffffff8106f230>] ? flush_kthread_worker+0x80/0x80
[<ffffffff8163181c>] ret_from_fork+0x7c/0xb0
[<ffffffff8106f230>] ? flush_kthread_worker+0x80/0x80
[SNIP]
RIP [<ffffffff8111f4c5>] unlock_page+0x35/0x40
RSP <ffff88010316bb98>
---[ end trace 421e79ad0dd72c7d ]---
This is because we forgot to lock the page again after we read data into
the page. Fix it.
Signed-off-by: Lin Feng <linfeng@cn.fujitsu.com>
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
|
|
When adjusting the enospc rules for relocation I ran into a deadlock because we
were relocating the only system chunk and that forced us to try and allocate a
new system chunk while holding locks in the chunk tree, which caused us to
deadlock. To fix this I've moved all of the dev extent addition and chunk
addition out to the delayed chunk completion stuff. We still keep the in-memory
stuff which makes sure everything is consistent.
One change I had to make was to search the commit root of the device tree to
find a free dev extent, and hold onto any chunk em's that we allocated in that
transaction so we do not allocate the same dev extent twice. This has the side
effect of fixing a bug with balance that has been there ever since balance
existed. Basically you can free a block group and its dev extent and then
immediately allocate that dev extent for a new block group and write stuff to
that dev extent, all within the same transaction. So if you happen to crash
during a balance you could come back to a completely broken file system. This
patch should keep this sort of thing from happening in the future since we
won't be able to allocate freed dev extents until after the transaction
commits. This has passed all of the xfstests and my super annoying stress test
followed by a balance. Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
|
|
I hit a weird problem where my root item had been deleted but the orphan item had
not. This isn't necessarily a problem, but it keeps the file system from being
mounted. To fix this we just need to axe the orphan item if we can't find the
fs root when we're putting them altogether. With this patch I was able to
successfully mount my file system. Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
|
|
Now that reading the data from the target device of the replace operation is
allowed, a mirror number greater than the stripe count of a chunk is valid;
we will tune it later, when we find there is no target device. Fix it.
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
|
|
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
|
|
Using the structure btrfs_sector_sum to keep the checksum value is
unnecessary, because the extents that btrfs_sector_sum points to are
contiguous: we can find the expected checksums from btrfs_ordered_sum's
bytenr and the offset, so we can remove btrfs_sector_sum's bytenr. After
removing bytenr, there is only one member left in the structure, so it makes
no sense to keep the structure; just remove it, and use a u32 array to
store the checksum values.
With this change, we no longer use a while loop to get the checksums one by
one. Now we can get several checksum values at a time, which improved the
performance by ~74% on my SSD (31MB/s -> 54MB/s).
test command:
# dd if=/dev/zero of=/mnt/btrfs/file0 bs=1M count=1024 oflag=sync
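A sketch of the data-structure change described above, before and after
(field layout assumed from this message; the real structures carry
additional members):

	/* before: a bytenr/checksum pair per sector */
	struct btrfs_sector_sum {
		u64 bytenr;
		u32 sum;
	};

	struct btrfs_ordered_sum {
		u64 bytenr;		/* start of the contiguous range */
		int len;
		struct btrfs_sector_sum *sums;
	};

	/* after: the per-sector bytenr is implied by the range start plus
	 * the offset, so a plain u32 array is enough */
	struct btrfs_ordered_sum {
		u64 bytenr;
		int len;
		u32 *sums;		/* one checksum per sector */
	};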
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
|
|
We always just try and reserve data space when we write, but if we are out of
space but have prealloc'ed extents we should still successfully write. This
patch will try and see if we can write to prealloc'ed space, and if we can,
go ahead and allow the write to continue. With this patch we now pass xfstests
generic/274. Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
|
|
try_to_writeback_inodes_sb_nr returns 1 if writeback is already underway, which
is completely fraking useless for us as we need to make sure pages are actually
written before we go and check if there are ordered extents. So replace this
with an open coding of try_to_writeback_inodes_sb_nr minus the writeback
underway check so that we are sure to actually have flushed some dirty pages out
and will have ordered extents to use. With this patch xfstests generic/273 now
passes. Thanks,
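A minimal sketch of the open coding described above (assuming the 3.10-era
writeback API; the btrfs helper name is hypothetical):

	/* like try_to_writeback_inodes_sb_nr(), minus the
	 * writeback_in_progress() early return, so we always queue
	 * writeback and end up with ordered extents to use */
	static void btrfs_writeback_inodes_sb_nr(struct super_block *sb,
						 unsigned long nr_pages)
	{
		if (down_read_trylock(&sb->s_umount)) {
			writeback_inodes_sb_nr(sb, nr_pages,
					       WB_REASON_FS_FREE_SPACE);
			up_read(&sb->s_umount);
		}
	}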
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
|
|
There are all of these checks in the ENOSPC code to see if committing the
transaction would free up enough space to make the allocation. This is because
early on we just committed the transaction and hoped and prayed, which resulted
in cases where it took _forever_ to get an ENOSPC when we really were out of
space. So we check space_info->bytes_pinned, except this isn't completely
accurate because it doesn't account for space we may free but that is stuck
in delayed refs.
So tests like xfstests 226 would fail because we wouldn't commit the transaction
to free up the data space. So instead add a percpu counter that will be a
little fuzzier: it will add bytes as soon as we try to free up the space, and
remove any space it doesn't actually free up when we get around to doing the
actual free. We then zero out this counter every transaction period so we have a
better idea of how much space we will actually free up by committing this
transaction. With this patch we now pass xfstests 226. Thanks,
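A minimal sketch of the counter lifecycle described above (the field name and
call sites are assumptions):

	#include <linux/percpu_counter.h>

	struct btrfs_space_info {
		/* ... existing fields ... */
		struct percpu_counter total_bytes_pinned; /* name assumed */
	};

	/* as soon as we try to free up the space: */
	percpu_counter_add(&sinfo->total_bytes_pinned, num_bytes);

	/* at the actual free, remove anything we didn't really free: */
	percpu_counter_sub(&sinfo->total_bytes_pinned, num_bytes);

	/* once per transaction period, start from scratch: */
	percpu_counter_set(&sinfo->total_bytes_pinned, 0);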
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
|
|
We have an optimization that will go ahead and cache no acls on an inode if
there are no xattrs on the inode. This saves us a lookup later to check the
acls for writes or any other access. The problem is I use selinux so I always
have an xattr on inodes, so make this test a little smarter and check for the
actual acl hash on the key; if it isn't there then we still get to cache no
acl, which makes everybody who uses selinux a little happier. Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
|
|
drivers/leds/leds-mc13783.c: In function 'mc13xxx_led_probe':
drivers/leds/leds-mc13783.c:195:2: warning: 'ret' may be used uninitialized in this function [-Wmaybe-uninitialized]
Signed-off-by: Alexander Shiyan <shc_work@mail.ru>
Signed-off-by: Bryan Wu <cooloney@gmail.com>
|
|
The route_irq() function needs to preserve the irq mask by using the
_irqsave/irqrestore variants of raw spin lock functions instead of the
_irq variants. This is because it is called from __cpu_disable() (via
migrate_irqs()), which is called with IRQs disabled, so using the _irq
variants re-enables IRQs.
This appears to have been causing occasional hits of the
BUG_ON(!irqs_disabled()) in __irq_work_run() during CPU hotplug soak
testing:
BUG: failure at kernel/irq_work.c:122/__irq_work_run()!
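A minimal sketch of the fixed locking (function body assumed from the
description):

	static void route_irq(struct irq_data *data, unsigned int irq,
			      unsigned int cpu)
	{
		unsigned long flags;
		struct irq_desc *desc = irq_to_desc(irq);
		struct irq_chip *chip = irq_data_get_irq_chip(data);

		/* _irqsave/_irqrestore preserve the caller's IRQ mask;
		 * the plain _irq variants would unconditionally re-enable
		 * IRQs, breaking __cpu_disable() which runs with IRQs
		 * disabled */
		raw_spin_lock_irqsave(&desc->lock, flags);
		if (chip->irq_set_affinity)
			chip->irq_set_affinity(data, cpumask_of(cpu), false);
		raw_spin_unlock_irqrestore(&desc->lock, flags);
	}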
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
|
|
kick_handler() doesn't have an irq_enter/exit pair, but it's used for
handling SMP IPIs which require work to be done in softirqs, which are
invoked from irq_exit() when the hard irq nest count reaches 0.
The scheduler_ipi() callback in the IPI handler calls irq_enter/exit
itself, but this is inside kick_handler()'s spin lock critical section,
so if an invoked softirq issues an IPI the kick_handler() will be
re-entered on the same CPU and will deadlock.
This is easily fixed by adding the missing irq_enter/exit to
kick_handler() so that the hard irq nest count doesn't reach 0 until
after the spin lock has been released.
Ideally the spin lock protected handler list would also be replaced by a
lockless RCU protected list, since it is mostly read. That can be done in a
later change though.
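A minimal sketch of where the new calls sit (handler-list details and the
lock name are assumptions):

	irq_enter();			/* raise the hard irq nest count */
	spin_lock(&kick_handlers_lock);
	/* ... walk the handler list; scheduler_ipi() may raise softirqs
	 * and do its own irq_enter()/irq_exit() here ... */
	spin_unlock(&kick_handlers_lock);
	irq_exit();			/* nest count reaches 0 here, so
					 * pending softirqs (and any IPIs
					 * they send) only run after the
					 * lock has been released */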
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
|
|
The recent implementation of a generic dummy timer resulted in a
different registration order of per cpu local timers which made the
broadcast control logic go belly up.
If the dummy timer is the first clock event device which is registered
for a CPU, then it is installed, the broadcast timer is initialized
and the CPU is marked as broadcast target.
If a real clock event device is installed after that, we can fail to
take the CPU out of the broadcast mask. In the worst case we end up
with two periodic timer events firing for the same CPU. One from the
per cpu hardware device and one from the broadcast.
Now the problem is that we have no way to distinguish whether the
system is in a state which makes broadcasting necessary or the
broadcast bit was set due to the nonfunctional dummy timer
installment.
To solve this we need to keep track of the system state separately and
provide a more detailed decision logic for whether we keep the CPU in
broadcast mode or not.
The old decision logic only clears the broadcast mode if the newly
installed clock event device is not affected by power states.
The new logic clears the broadcast mode if one of the following is
true:
- The new device is not affected by power states.
- The system is not in a power state affected mode
- The system has switched to oneshot mode. The oneshot broadcast is
controlled from the deep idle state. The CPU is not in idle at
this point, so it's safe to remove it from the mask.
If we clear the broadcast bit for the CPU when a new device is
installed, we also shut down the broadcast device when this was the
last CPU in the broadcast mask.
If the broadcast bit is kept, then we leave the new device in shutdown
state and rely on the broadcast to deliver the timer interrupts via
the broadcast ipis.
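A pseudocode sketch of the three clearing conditions above (the system-state
helper is hypothetical; CLOCK_EVT_FEAT_C3STOP marks devices affected by
power states):

	/* keep the CPU in broadcast mode unless one of the three holds */
	if (!(newdev->features & CLOCK_EVT_FEAT_C3STOP) ||
	    !tick_broadcast_system_affected() ||	/* hypothetical */
	    tick_broadcast_oneshot_active())
		cpumask_clear_cpu(cpu, tick_broadcast_mask);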
Reported-and-tested-by: Stehle Vincent-B46079 <B46079@freescale.com>
Reviewed-by: Stephen Boyd <sboyd@codeaurora.org>
Cc: John Stultz <john.stultz@linaro.org>,
Cc: Mark Rutland <mark.rutland@arm.com>
Link: http://lkml.kernel.org/r/alpine.DEB.2.02.1307012153060.4013@ionos.tec.linutronix.de
Cc: stable@vger.kernel.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
When the system switches from periodic to oneshot mode, the broadcast
logic causes a possibility that a CPU which has not yet switched to
oneshot mode puts its own clock event device into oneshot mode without
updating the state and the timer handler.
CPU0                                CPU1
                                    per cpu tickdev is in periodic mode
                                    and switched to broadcast
Switch to oneshot mode
  tick_broadcast_switch_to_oneshot()
    cpumask_copy(tick_broadcast_oneshot_mask,
                 tick_broadcast_mask);
    broadcast device mode = oneshot
                                    Timer interrupt
                                    irq_enter()
                                      tick_check_oneshot_broadcast()
                                        dev->set_mode(ONESHOT);
                                      tick_handle_periodic()
                                        if (dev->mode == ONESHOT)
                                          dev->next_event += period;
                                          FAIL.
We fail, because dev->next_event contains KTIME_MAX, if the device was
in periodic mode before the uncontrolled switch to oneshot happened.
We must copy the broadcast bits over to the oneshot mask, because otherwise
a CPU which relies on the broadcast would not be woken up anymore after the
broadcast device switched to oneshot mode.
So we need to verify in tick_check_oneshot_broadcast() whether the CPU
has already switched to oneshot mode. If not, leave the device
untouched and let the CPU switch controlled into oneshot mode.
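A minimal sketch of that verification (exact accessors assumed):

	void tick_check_oneshot_broadcast(int cpu)
	{
		if (cpumask_test_cpu(cpu, tick_broadcast_oneshot_mask)) {
			struct tick_device *td = &per_cpu(tick_cpu_device, cpu);

			/* only touch the device if this CPU has itself
			 * switched to oneshot mode; otherwise leave it
			 * alone so the CPU can switch over controlled */
			if (td->mode == TICKDEV_MODE_ONESHOT)
				clockevents_set_mode(td->evtdev,
						     CLOCK_EVT_MODE_ONESHOT);
		}
	}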
This is a long standing bug, which was never noticed, because the main
user of the broadcast x86 cannot run into that scenario, AFAICT. The
nonarchitected timer mess of ARM creates a gazillion of differently
broken abominations which trigger the shortcomings of that broadcast
code, which better had never been necessary in the first place.
Reported-and-tested-by: Stehle Vincent-B46079 <B46079@freescale.com>
Reviewed-by: Stephen Boyd <sboyd@codeaurora.org>
Cc: John Stultz <john.stultz@linaro.org>,
Cc: Mark Rutland <mark.rutland@arm.com>
Link: http://lkml.kernel.org/r/alpine.DEB.2.02.1307012153060.4013@ionos.tec.linutronix.de
Cc: stable@vger.kernel.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
In periodic mode we remove offline cpus from the broadcast propagation
mask. In oneshot mode we fail to do so. This was not a problem so far,
but the recent changes to the broadcast propagation introduced a
constellation which can result in a NULL pointer dereference.
What happens is:
CPU0                                CPU1
                                    idle()
                                      arch_idle()
                                        tick_broadcast_oneshot_control(OFF);
                                          set cpu1 in tick_broadcast_force_mask
                                      if (cpu_offline())
                                        arch_cpu_dead()
cpu_dead_cleanup(cpu1)
  cpu1 tickdevice pointer = NULL
broadcast interrupt
  dereference cpu1 tickdevice pointer -> OOPS
We dereference the pointer because cpu1 is still set in
tick_broadcast_force_mask and tick_do_broadcast() expects a valid cpumask and
therefore lacks any further checks.
Remove the cpu from the tick_broadcast_force_mask before we set the
tick device pointer to NULL. Also add a sanity check to the oneshot
broadcast function, so we can detect such issues w/o crashing the
machine.
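A minimal sketch of the two changes (surrounding functions assumed):

	/* oneshot shutdown path: drop the dying CPU from the force mask
	 * before its tick device pointer is cleared */
	cpumask_clear_cpu(cpu, tick_broadcast_force_mask);

	/* sanity check in the oneshot broadcast handler: never broadcast
	 * to offline CPUs */
	if (WARN_ON_ONCE(!cpumask_subset(tmpmask, cpu_online_mask)))
		cpumask_and(tmpmask, tmpmask, cpu_online_mask);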
Reported-by: Prarit Bhargava <prarit@redhat.com>
Cc: athorlton@sgi.com
Cc: CAI Qian <caiqian@redhat.com>
Link: http://lkml.kernel.org/r/alpine.DEB.2.02.1306261303260.4013@ionos.tec.linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Use a completion to block until a secondary CPU has started up, as ARM
does, instead of a loop of udelays.
On Meta, SMP is really SMT, with each "CPU" being a different hardware
thread on the same Meta processor core, so as well as being more
efficient and latency friendly, using a completion prevents the bogomips
of the secondary CPU from being drastically skewed every time by the
execution of the tight in-cache udelay loop on the other CPU.
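A minimal sketch of the pattern, mirroring what ARM does (timeout value
assumed):

	static DECLARE_COMPLETION(cpu_running);

	/* boot CPU, after triggering the secondary thread: */
	if (!wait_for_completion_timeout(&cpu_running,
					 msecs_to_jiffies(1000)))
		pr_crit("CPU%u: failed to come online\n", cpu);

	/* secondary CPU, near the end of secondary_start_kernel(): */
	complete(&cpu_running);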
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: "Srivatsa S. Bhat" <srivatsa.bhat@linux.vnet.ibm.com>
|
|
In secondary_start_kernel() interrupts should be enabled with
local_irq_enable() after the cpu is marked as online with
set_cpu_online(). Otherwise it's possible for a timer interrupt to
trigger a softirq, which, if the cpu is marked as offline, may have its
affinity altered.
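A minimal sketch of the corrected ordering:

	/* mark ourselves online before enabling interrupts, so softirqs
	 * raised by a timer interrupt never see an offline CPU */
	set_cpu_online(cpu, true);
	local_irq_enable();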
Reported-by: Kirill Tkhai <tkhai@yandex.ru>
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: "Srivatsa S. Bhat" <srivatsa.bhat@linux.vnet.ibm.com>
Cc: Kirill Tkhai <tkhai@yandex.ru>
|
|
Checking for process->mm is not enough because the process's main thread may
exit or detach its mm via use_mm(), but other threads may still have a valid
mm.
To fix this we would need to use find_lock_task_mm(), which walks up all
threads and returns an appropriate task (with task lock held).
clear_tasks_mm_cpumask() was introduced in v3.5-rc1 to fix this issue,
so let's use it for metag too.
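A minimal sketch of the pattern the helper implements (essentially what is
described above):

	struct task_struct *p, *t;

	for_each_process(p) {
		/* p->mm alone is unreliable: the main thread may have
		 * exited or detached its mm while other threads live on */
		t = find_lock_task_mm(p);	/* returns with task_lock held */
		if (!t)
			continue;
		cpumask_clear_cpu(cpu, mm_cpumask(t->mm));
		task_unlock(t);
	}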
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: "Srivatsa S. Bhat" <srivatsa.bhat@linux.vnet.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Anton Vorontsov <anton.vorontsov@linaro.org>
|
|
Incorporate the addition of the hsize argument in the write_buf callback of
pstore. This was forgotten in commit
6bbbca735936e15b9431882eceddcf6dff76e03c
("pstore: Pass header size in the pstore write callback"),
causing a build failure when ftrace and pstore are enabled.
Signed-off-by: Aruna Balakrishnaiah <aruna@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
From Anatolij:
"There are small cleanups and fixes for mpc512x common code,
mpc512x_defconfig updates and soft reboot support for mpc5125
based boards."
|
|
Merge Freescale updates
|
|
If a user requests many data writes and an fsync together, the last updated
i_size should be stored to the inode block consistently.
But the previous write_end just marks the inode as dirty and doesn't update
its metadata in its inode block.
After that, fsync just writes the inode block with the newly updated data
index, excluding the inode metadata updates.
So, this patch introduces a write_end which also updates the inode block when
the i_size is changed.
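A minimal sketch of such a write_end (f2fs helper names assumed):

	static int f2fs_write_end(struct file *file,
				  struct address_space *mapping,
				  loff_t pos, unsigned len, unsigned copied,
				  struct page *page, void *fsdata)
	{
		struct inode *inode = mapping->host;

		set_page_dirty(page);

		if (pos + copied > i_size_read(inode)) {
			i_size_write(inode, pos + copied);
			/* also push i_size into the on-disk inode block so
			 * a following fsync sees consistent metadata
			 * (helper name assumed) */
			update_inode_page(inode);
		}

		unlock_page(page);
		page_cache_release(page);
		return copied;
	}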
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
|
|
As destroy_fsync_dnodes() is a simple list-cleanup function, delete its
unused and unrelated f2fs_sb_info argument.
Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
|
|
This patch removes check_prefree_segments, which was initially designed to
enhance performance by narrowing the range of LBA usage across the whole
block device.
When allocating a new segment, the previous f2fs tries to find proper prefree
segments, and then, if it finds one, it reuses the segment for further data
or node block allocation.
However, I found that this was a totally wrong approach, since the prefree
segments have several data or node blocks that will be used by the
roll-forward mechanism operated after a sudden power-off.
Let's assume the following scenario.
/* write 8MB with fsync */
for (i = 0; i < 2048; i++) {
offset = i * 4096;
write(fd, offset, 4KB);
fsync(fd);
}
In this case, naive segment allocation sequence will be like:
data segment: x, x+1, x+2, x+3
node segment: y, y+1, y+2, y+3.
But, if we can reuse prefree segments, the sequence can be like:
data segment: x, x+1, y, y+1
node segment: y, y+1, y+2, y+3.
This is because y, y+1, and y+2 become prefree segments one by one, and they
are reused for data allocation.
After conducting this workload, we should consider how to recover the latest
inode with its data.
If we reuse prefree segments such as y or y+1, we lose the old node blocks,
so f2fs cannot even start roll-forward recovery.
Therefore, I suggest that we stop reusing prefree segments.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
|
|
This patch simplifies list operations in find_gc_inode and add_gc_inode.
Just simple code cleanup.
Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
[Jaegeuk Kim: add description]
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
|
|
Optimize the while loop condition: the condition is always true, and the
loop is terminated by the following check in the code:
if (segno >= TOTAL_SEGS(sbi))
	break;
Hence we can replace the while loop condition with while (1) instead of
always checking that segno is less than the total segment count.
Also, we do not need to call TOTAL_SEGS() every time; we can store its value
in a local variable, since it is constant.
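A minimal sketch of the resulting loop (the segment-search helper is
hypothetical):

	unsigned int total_segs = TOTAL_SEGS(sbi);	/* constant, cached once */

	while (1) {
		segno = find_next_free_segment(sbi, segno); /* hypothetical */
		if (segno >= total_segs)
			break;
		/* ... use segno ... */
	}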
Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
Signed-off-by: Pankaj Kumar <pankaj.km@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
|
|
This patch should fix the following bug reported by the kbuild test robot:
fs/f2fs/recovery.c:233:33: sparse: incorrect type in assignment
(different base types)
parse warnings: (new ones prefixed by >>)
>> recovery.c:233: sparse: incorrect type in assignment (different base types)
recovery.c:233: expected unsigned int [unsigned] [assigned] ofs_in_node
recovery.c:233: got restricted __le16 [assigned] [usertype] ofs_in_node
>> recovery.c:238: sparse: incorrect type in assignment (different base types)
recovery.c:238: expected unsigned int [unsigned] ofs_in_node
recovery.c:238: got restricted __le16 [assigned] [usertype] ofs_in_node
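A minimal sketch of the usual fix for this class of sparse warning (field
access assumed):

	/* convert the on-disk little-endian value before assigning it to
	 * the in-memory unsigned int */
	ofs_in_node = le16_to_cpu(sum.ofs_in_node);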
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
|
|
While calculating CRC for the checkpoint block, we use __u32, but when storing
the crc value to the disk, we use __le32.
Let's fix the inconsistency.
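A minimal sketch of the endianness-consistent pattern (the crc helper is
hypothetical):

	__u32 crc = f2fs_crc32(ckpt, offset);	/* compute in cpu order */

	/* store explicitly little-endian on disk */
	*((__le32 *)((unsigned char *)ckpt + offset)) = cpu_to_le32(crc);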
Reported-and-Tested-by: Oded Gabbay <ogabbay@advaoptical.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
|
|
The driver provides a way to wake up the system by the MPIC timer.
For example,
echo 5 > /sys/devices/system/mpic/timer_wakeup
echo standby > /sys/power/state
After 5 seconds the MPIC timer will generate an interrupt to wake up
the system.
Signed-off-by: Wang Dongsheng <dongsheng.wang@freescale.com>
Signed-off-by: Zhao Chenhui <chenhui.zhao@freescale.com>
Signed-off-by: Li Yang <leoli@freescale.com>
|
|
Register an mpic subsystem at /sys/devices/system/.
Signed-off-by: Wang Dongsheng <dongsheng.wang@freescale.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>
|
|
The MPIC global timer is a hardware timer inside the Freescale PIC complying
with the OpenPIC standard. When the specified interval times out, the
hardware timer generates an interrupt. The driver is currently only tested on
fsl chips, but it can potentially support other global timers complying with
the OpenPIC standard.
The two independent groups of global timers on fsl chips, group A and group
B, are identical in their functionality, except that they appear at different
locations within the PIC register map. The hardware timers can be cascaded to
create timers larger than the default 31-bit global timers; the timer cascade
fields allow configuration of up to two 63-bit timers, but the two groups of
timers cannot be cascaded together.
It can be used as a wakeup source for low power modes. It can also be used as
a periodic timer for protocols, drivers, etc.
Signed-off-by: Wang Dongsheng <dongsheng.wang@freescale.com>
Signed-off-by: Li Yang <leoli@freescale.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>
|
|
Add irq_set_wake support: just add IRQF_NO_SUSPEND to desc->action->flags,
so the wakeup interrupt will not be disabled in suspend_device_irqs().
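A minimal sketch of such a set_wake callback (function name assumed):

	static int mpic_irq_set_wake(struct irq_data *d, unsigned int on)
	{
		struct irq_desc *desc = container_of(d, struct irq_desc,
						     irq_data);

		if (on)
			desc->action->flags |= IRQF_NO_SUSPEND;
		else
			desc->action->flags &= ~IRQF_NO_SUSPEND;
		return 0;
	}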
Signed-off-by: Wang Dongsheng <dongsheng.wang@freescale.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>
|
|
With the patch 7230c564 (powerpc: Rework lazy-interrupt handling),
it seems that the coreint works pretty well on the 85xx 64bit kernel.
So use the coreint by default for these boards.
Signed-off-by: Kevin Hao <haokexin@gmail.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>
|
|
irq_eoi() is already called by generic_handle_irq(), so it shall not be
called again.
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Scott Wood <scottwood@freescale.com>
|
|
Most boards of the Freescale platform use the PCI-Express Intel(R) PRO/1000
gigabit ethernet card, so enable the corresponding driver for it.
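The defconfig change presumably amounts to a single option (name assumed;
the PCI-Express PRO/1000 is driven by e1000e):

	CONFIG_E1000E=y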
Signed-off-by: Chunhe Lan <Chunhe.Lan@freescale.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>
|
|
The MPIC version is useful information for both mpic_alloc() and
mpic_init(). This patch provides an API to get the MPIC version so the code
can be reused. Also, some other IP blocks may need the MPIC version for
their own use, so an API for external use is provided as well.
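A sketch of the two prototypes this suggests (names assumed from the
description):

	/* internal use, given an mpic instance: */
	u32 fsl_mpic_get_version(struct mpic *mpic);

	/* external use, for other IP blocks: */
	u32 fsl_mpic_primary_get_version(void);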
Signed-off-by: Jia Hongtao <hongtao.jia@freescale.com>
Signed-off-by: Li Yang <leoli@freescale.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>
|
|
When Jeremy introduced the new device-tree based reserve map, he made
the code in early_reserve_mem_dt() bail out if it found one, thus not
reserving the initrd nor processing the old style map.
I hit problems with variants of kexec that didn't put the initrd in
the new style map either. While these could/will be fixed, I believe
we should be safe here and rather reserve more than not enough.
We could have a firmware passing stuff via the new style map, and
in the middle, a kexec that knows nothing about it and adding other
things to the old style map.
I don't see a big issue with processing both and reserving everything
that needs to be. memblock_reserve() supports overlaps fine these days.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
The Data Address Watchpoint Register (DAWR) on POWER8 can take a 512
byte range but this range must not cross a 512 byte boundary.
Unfortunately we were off by one when calculating the end of the region, and
hence we were not allowing some breakpoint regions which were actually valid.
This fixes that error.
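A worked example of the off-by-one, under the constraint described above:

	/* A 512-byte request at a 512-byte-aligned address covers
	 * [0x200, 0x3ff]: the last byte is start + len - 1. Computing the
	 * end as start + len (0x400) makes the region appear to cross the
	 * next 512-byte boundary and get rejected, although it is valid. */
	u64 start = attr->bp_addr;
	u64 end   = attr->bp_addr + attr->bp_len - 1;	/* inclusive end */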
Signed-off-by: Michael Neuling <mikey@neuling.org>
Reported-by: Edjunior Barbosa Machado <emachado@linux.vnet.ibm.com>
Cc: stable@vger.kernel.org # 3.9+
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
There is no need to roll our own polling scheme when we already have
one implemented by the core.
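A minimal sketch of the core polling API referred to above (3.x-era
input_polled_dev interface; driver specifics assumed):

	struct input_polled_dev *poll_dev;

	poll_dev = input_allocate_polled_device();
	if (!poll_dev)
		return -ENOMEM;

	poll_dev->poll = foo_poll;		/* hypothetical callback */
	poll_dev->poll_interval = 100;		/* ms, assumed */
	poll_dev->input->name = "foo";		/* hypothetical name */

	error = input_register_polled_device(poll_dev);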
Tested-by: Manish Badarkhe <badarkhe.manish@gmail.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
|
|
The vref field was not used by the driver and was removed from the
platform data structure.
Signed-off-by: Manish Badarkhe <badarkhe.manish@gmail.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
|
|
I introduced a new temporary variable "info" instead of "m->m_info[mds]". I
also reversed the if condition and pulled everything in by one indent level.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Alex Elder <elder@inktank.com>
|
|
This patch makes the following improvements to the error handling
in the ceph_mdsmap_decode function:
- Add a NULL check for the return value from kcalloc (sketched below)
- Make use of the variable err
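A minimal sketch of the first improvement (surrounding decode logic and
error label assumed):

	info = kcalloc(n, sizeof(*info), GFP_NOFS);
	if (!info) {
		err = -ENOMEM;
		goto bad;		/* hypothetical error label */
	}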
Signed-off-by: Emil Goode <emilgoode@gmail.com>
Signed-off-by: Sage Weil <sage@inktank.com>
|
|
The reference to the original request dropped at the end of
rbd_img_obj_exists_callback() corresponds to the reference taken
in rbd_img_obj_exists_submit() to account for the stat request
referring to it. Move the put of that reference up right after
clearing that pointer to make its purpose more obvious.
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
|
|
Signed-off-by: Jim Schutt <jaschut@sandia.gov>
Reviewed-by: Alex Elder <elder@inktank.com>
|
|
drivers/block/rbd.c: In function ‘zero_pages’:
drivers/block/rbd.c:1102: warning: comparison of distinct pointer types lacks a cast
Remove the hackish casts and use min_t() to fix this.
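A minimal sketch of the min_t() pattern (variable names assumed from the
warning context):

	/* min_t() casts both sides to the named type, avoiding the
	 * distinct-pointer-types warning that plain min() gives */
	length = min_t(size_t, PAGE_SIZE - page_offset, remaining);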
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Reviewed-by: Alex Elder <elder@inktank.com>
|
|
Whenever a DASD request encounters a timeout we might
need to abort all outstanding requests on this or
even other devices.
This is especially useful if one wants to fail all
devices on one side of a RAID10 configuration, even
though only one device exhibited an error.
To handle this I've introduced a new device flag
DASD_FLAG_ABORTIO.
This flag is evaluated in __dasd_process_request_queue()
and will invoke blk_abort_request() for all
outstanding requests with DASD_CQR_FLAGS_FAILFAST set.
This will cause any of these requests to be aborted
immediately if the blk_timeout function is activated.
DASD_FLAG_ABORTIO is also evaluated in
__dasd_process_request_queue() to abort all new requests which
would have the DASD_CQR_FLAGS_FAILFAST bit set.
The flag can be set with the new ioctls 'BIODASDABORTIO'
and removed with 'BIODASDALLOWIO'.
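A minimal sketch of the evaluation described above (request-queue loop
context assumed):

	/* in __dasd_process_request_queue(), per queued request: */
	if (test_bit(DASD_FLAG_ABORTIO, &device->flags) &&
	    test_bit(DASD_CQR_FLAGS_FAILFAST, &cqr->flags))
		/* let the block layer time the request out immediately */
		blk_abort_request(req);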
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Stefan Weinhuber <wein@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|
|
This patch adds a 'timeout' attribute to the DASD driver.
When set to non-zero, the blk_timeout function will
be enabled with the timeout specified in the attribute.
Setting 'timeout' to '0' will disable block timeouts.
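A minimal sketch of wiring the attribute into the block layer
(store-function context assumed; units assumed to be seconds):

	device->blk_timeout = val;
	blk_queue_rq_timeout(device->block->request_queue,
			     device->blk_timeout * HZ);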
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Stefan Weinhuber <wein@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
|