Age | Commit message (Collapse) | Author |
|
This reverts commit 1e5965bf1f018cc30a4659fa3f1a40146e4276f6. Ezequiel
Garcia has a better fix.
|
|
Jan Kara have spotted interesting issue:
There are potential data corruption issue with direct IO overwrites
racing with truncate:
Like:
dio write truncate_task
->ext4_ext_direct_IO
->overwrite == 1
->down_read(&EXT4_I(inode)->i_data_sem);
->mutex_unlock(&inode->i_mutex);
->ext4_setattr()
->inode_dio_wait()
->truncate_setsize()
->ext4_truncate()
->down_write(&EXT4_I(inode)->i_data_sem);
->__blockdev_direct_IO
->ext4_get_block
->submit_io()
->up_read(&EXT4_I(inode)->i_data_sem);
# truncate data blocks, allocate them to
# other inode - bad stuff happens because
# dio is still in flight.
In order to serialize with truncate dio worker should grab extra i_dio_count
reference before drop i_mutex.
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
|
If we have enough aggressive DIO readers, truncate and other dio
waiters will wait forever inside inode_dio_wait(). It is reasonable
to disable nonlock DIO read optimization during truncate.
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
|
Current serialization will works only for DIO which holds
i_mutex, but nonlocked DIO following race is possible:
dio_nolock_read_task truncate_task
->ext4_setattr()
->inode_dio_wait()
->ext4_ext_direct_IO
->ext4_ind_direct_IO
->__blockdev_direct_IO
->ext4_get_block
->truncate_setsize()
->ext4_truncate()
#alloc truncated blocks
#to other inode
->submit_io()
#INFORMATION LEAK
In order to serialize with unlocked DIO reads we have to
rearrange wait sequence
1) update i_size first
2) if i_size about to be reduced wait for outstanding DIO requests
3) and only after that truncate inode blocks
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
|
Inode's block defrag and ext4_change_inode_journal_flag() may
affect nonlocked DIO reads result, so proper synchronization
required.
- Add missed inode_dio_wait() calls where appropriate
- Check inode state under extra i_dio_count reference.
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
|
Current unwritten extent conversion state-machine is very fuzzy.
- For unknown reason it performs conversion under i_mutex. What for?
My diagnosis:
We already protect extent tree with i_data_sem, truncate and punch_hole
should wait for DIO, so the only data we have to protect is end_io->flags
modification, but only flush_completed_IO and end_io_work modified this
flags and we can serialize them via i_completed_io_lock.
Currently all these games with mutex_trylock result in the following deadlock
truncate: kworker:
ext4_setattr ext4_end_io_work
mutex_lock(i_mutex)
inode_dio_wait(inode) ->BLOCK
DEADLOCK<- mutex_trylock()
inode_dio_done()
#TEST_CASE1_BEGIN
MNT=/mnt_scrach
unlink $MNT/file
fallocate -l $((1024*1024*1024)) $MNT/file
aio-stress -I 100000 -O -s 100m -n -t 1 -c 10 -o 2 -o 3 $MNT/file
sleep 2
truncate -s 0 $MNT/file
#TEST_CASE1_END
Or use 286's xfstests https://github.com/dmonakhov/xfstests/blob/devel/286
This patch makes state machine simple and clean:
(1) xxx_end_io schedule final extent conversion simply by calling
ext4_add_complete_io(), which append it to ei->i_completed_io_list
NOTE1: because of (2A) work should be queued only if
->i_completed_io_list was empty, otherwise the work is scheduled already.
(2) ext4_flush_completed_IO is responsible for handling all pending
end_io from ei->i_completed_io_list
Flushing sequence consists of following stages:
A) LOCKED: Atomically drain completed_io_list to local_list
B) Perform extents conversion
C) LOCKED: move converted io's to to_free list for final deletion
This logic depends on context which we was called from.
D) Final end_io context destruction
NOTE1: i_mutex is no longer required because end_io->flags modification
is protected by ei->ext4_complete_io_lock
Full list of changes:
- Move all completion end_io related routines to page-io.c in order to improve
logic locality
- Move open coded logic from various xx_end_xx routines to ext4_add_complete_io()
- remove EXT4_IO_END_FSYNC
- Improve SMP scalability by removing useless i_mutex which does not
protect io->flags anymore.
- Reduce lock contention on i_completed_io_lock by optimizing list walk.
- Rename ext4_end_io_nolock to end4_end_io and make it static
- Check flush completion status to ext4_ext_punch_hole(). Because it is
not good idea to punch blocks from corrupted inode.
Changes since V3 (in request to Jan's comments):
Fall back to active flush_completed_IO() approach in order to prevent
performance issues with nolocked DIO reads.
Changes since V2:
Fix use-after-free caused by race truncate vs end_io_work
Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
|
ext4_set_io_unwritten_flag() will increment i_unwritten counter, so
once we mark end_io with EXT4_END_IO_UNWRITTEN we have to revert it back
on error path.
- add missed error checks to prevent counter leakage
- ext4_end_io_nolock() will clear EXT4_END_IO_UNWRITTEN flag to signal
that conversion finished.
- add BUG_ON to ext4_free_end_io() to prevent similar leakage in future.
Visible effect of this bug is that unaligned aio_stress may deadlock
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
|
AIO/DIO prefix is wrong because it account unwritten extents which
also may be scheduled from buffered write endio
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
|
Generic inode has unused i_private pointer which may be used as cur_aio_dio
storage.
TODO: If cur_aio_dio will be passed as an argument to get_block_t this allow
to have concurent AIO_DIO requests.
Reviewed-by: Zheng Liu <wenqing.lz@taobao.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
|
|
We shouldn't need more than 1 worker thread per cpu, since rpciod
is designed to run without sleeping in most cases.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
|
For building perf without libelf, we can set NO_LIBELF=1 as a argument
of make. It then defines NO_LIBELF_SUPPORT macro for C code to do the
proper handling. However it usually used in a negative semantics -
e.g. #ifndef - so we saw double negations which can be misleading.
Convert it to a positive form to make it more readable.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1348824728-14025-4-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
It seems that the PYRF_OBJS variable is not used anymore or has no
effect at least. The util/setup.py tracks its dependency using
util/python-ext-sources file and resulting objects are saved under
python_ext_build/tmp/.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1348824728-14025-3-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Since NO_DWARF is used in arch/$(ARCH)/Makefiles, it should be checked
before including those files. It was moved by mistake during libelf
dependency removal work by me, sorry.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1348824728-14025-2-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
This is the DA9055 MFD core driver that instantiate all the dependent
component drivers and provides them the device access via I2C.
This patch is functionally tested on Samsung SMDK6410.
Signed-off-by: David Dajun Chen <dchen@diasemi.com>
Signed-off-by: Ashish Jangam <ashish.jangam@kpitcummins.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
|
|
fib6_add_1() should consistently return errno pointers,
rather than a mixture of NULL and errno pointers.
Signed-off-by: Lin Ming <mlin@ss.pku.edu.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Marc Kleine-Budde says:
====================
this pull request is for net-next, for the v3.7 release cycle.
AnilKumar Ch contributed a fix for a segfault in the c_can driver,
which is triggered by an earlier commit [1] in net-next (so no backport
is needed).
...
[1] 4cdd34b can: c_can: Add runtime PM support to Bosch C_CAN/D_CAN controller
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch enables wake from system suspend on magic packet.
Patch updated to change BUG_ON to WARN_ON_ONCE.
Signed-off-by: Steve Glendinning <steve.glendinning@shawell.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch instructs the device to enter its lowest power SUSPEND2
state during system suspend.
This patch also explicitly wakes the device after resume, which
should address reports of the device not automatically coming
back after system suspend:
Patch updated to change BUG_ON to WARN_ON_ONCE.
http://code.google.com/p/chromium-os/issues/detail?id=31871
Signed-off-by: Steve Glendinning <steve.glendinning@shawell.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch adds an explicit test that the READY bit is set on
the device when attempting to initialize it.
If this bit is clear then the device hasn't succesfully started
all its clocks, and this patch helps make the resulting logged
error more helpful.
Signed-off-by: Steve Glendinning <steve.glendinning@shawell.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch enables wake from system suspend on magic packet.
Patch updated to replace BUG_ON with WARN_ON_ONCE and return.
Signed-off-by: Steve Glendinning <steve.glendinning@shawell.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch enables the device to enter its lowest power SUSPEND2
state during system suspend, instead of staying up using full power.
Patch updated to not add two pointers to .suspend & .resume.
Patch updated to replace BUG_ON with WARN_ON_ONCE and return.
Signed-off-by: Steve Glendinning <steve.glendinning@shawell.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch fixes an issue on some systems, where after suspend the
link is re-established but the ethernet interface does not resume.
Signed-off-by: Steve Glendinning <steve.glendinning@shawell.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch adds additional checks of the values returned by
smsc95xx_(read|write)_reg, and wraps their common patterns
in macros.
Signed-off-by: Steve Glendinning <steve.glendinning@shawell.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Removes unnecessary variables as smsc95xx_write_reg takes its
value by parameter. Early versions passed this parameter by
reference.
Also replace hardcoded interrupt status value with a #define
Signed-off-by: Steve Glendinning <steve.glendinning@shawell.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
During init, the device reset is unexpected to complete immediately,
so sleep before testing the condition rather than after it.
Signed-off-by: Steve Glendinning <steve.glendinning@shawell.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add RTC alarm interrupt details to TPS65910 MFD device list, to support
RTC alarm events.
Signed-off-by: Venu Byravarasu <vbyravarasu@nvidia.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
|
|
Sort out conflict with Arnd's patch that preserves
the unconditional LATCH value.
Signed-off-by: John Stultz <john.stultz@linaro.org>
|
|
Patch a7ea3bbf5d "time/jiffies: Allow CLOCK_TICK_RATE to be undefined"
breaks the compilation of targets that rely on the LATCH definition,
because of recursive header file inclusion not defining CLOCK_TICK_RATE
before it is checked here.
This fixes the problem by moving LATCH back to where it was, but it
seems that there are still cases where SHIFTED_HZ is defined incorrectly
because of the same problem. Need to investigate further.
Without this patch, building h7201_defconfig results in:
arch/arm/mach-h720x/common.c: In function 'h720x_gettimeoffset':
arch/arm/mach-h720x/common.c:50:73: error: 'LATCH' undeclared (first use in this function)
arch/arm/mach-h720x/common.c:50:73: note: each undeclared identifier is reported only once for each function it appears in
arch/arm/mach-h720x/common.c:51:1: warning: control reaches end of non-void function [-Wreturn-type]
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Ingo Molnar <mingo@kernel.org>
Signed-off-by: John Stultz <john.stultz@linaro.org>
|
|
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
|
|
It can misreport.
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
|
|
Rebased and resending the patch.
Path based queries can fail for lack of access, especially during lookup
during open.
open itself would actually succeed becasue of back up intent bit
but queries (either path or file handle based) do not have a means to
specifiy backup intent bit.
So query the file info during lookup using
trans2 / findfirst / file_id_full_dir_info
to obtain file info as well as file_id/inode value.
Signed-off-by: Shirish Pargaonkar <shirishpargaonkar@gmail.com>
Acked-by: Jeff Layton <jlayton@samba.org>
Signed-off-by: Steve French <smfrench@gmail.com>
|
|
Moving ARCH_MVEBU for multi-platform support caused several breakages in
recently added addr-map and pinctrl support for mvebu. This adds the
necessary selects and include paths to fix the build.
Signed-off-by: Rob Herring <rob.herring@calxeda.com>
Cc: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Cc: Gregory CLEMENT <gregory.clement@free-electrons.com>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
|
|
The nhk8815 board files uses NAND_NO_READRDY in its platform data, but
this macro is getting removed because it was not being used anywhere.
Without this patch, building nhk8815_defconfig results in:
arch/arm/mach-nomadik/board-nhk8815.c:118:6: error: 'NAND_NO_READRDY' undeclared here (not in a function)
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Alessandro Rubini <rubini@unipv.it>
Cc: Linus Walleij <linus.walleij@linaro.org>
|
|
arm: Add ARM ERRATA 775420 workaround
Workaround for the 775420 Cortex-A9 (r2p2, r2p6,r2p8,r2p10,r3p0) erratum.
In case a date cache maintenance operation aborts with MMU exception, it
might cause the processor to deadlock. This workaround puts DSB before
executing ISB if an abort may occur on cache maintenance.
Based on work by Kouei Abe and feedback from Catalin Marinas.
Signed-off-by: Kouei Abe <kouei.abe.cp@rms.renesas.com>
[ horms@verge.net.au: Changed to implementation
suggested by catalin.marinas@arm.com ]
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
|
|
Some architectures like xscale and feroceon have cache API variants that
map cache flushing functions as aliases to the base architecture.
This patch adds the required aliases to complete the implementation of
cache flushing LoUIS API.
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
|
|
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
* cleanup/__iomem:
ARM: Orion5x: ts78xx: Add IOMEM for virtual addresses.
ARM: ux500: use __iomem pointers for MMIO
Two new cleanup patches that were not already part of the
first cleanup branch.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
|
|
Also convert logical or to + for register offsets from base
addresses. This fixes a number of warnings currently seen in
linux-next:
warning: passing argument 2 of '__raw_writeb' makes pointer from
interger without cast.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
|
|
In the earlier sweeping changes, the ux500 uncompress.h file was missed
because other problems were hiding this one.
Without this patch, building u8500_defconfig results in:
In file included from arch/arm/boot/compressed/misc.c:33:0:
arch/arm/mach-ux500/include/mach/uncompress.h: In function 'putc':
arch/arm/mach-ux500/include/mach/uncompress.h:32:2: warning: passing argument 1 of '__raw_readb' makes pointer from integer without a cast [enabled by default]
arch/arm/include/asm/io.h:95:89: note: expected 'const volatile void *' but argument is of type 'u32'
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Linus Walleij <linus.walleij@linaro.org>
|
|
Currently it does not do so if the RPC call failed to start. Fix is to
move the decrement of plh_block_lgets into nfs4_layoutreturn_release.
Also remove a redundant test of task->tk_status in nfs4_layoutreturn_done:
if lrp->res.lrs_present is set, then obviously the RPC call succeeded.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
|
Failure of the layoutreturn allocation fails is not a good reason to
mark the pnfs_layout_hdr as having failed a layoutget or i/o. Just
exit cleanly.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
|
It serves no purpose that the test for whether or not we have valid
layout segments doesn't already serve.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
|
Once all the affected layout segments have been freed up, clear the
NFS_LAYOUT_BULK_RECALL flag so that we can reuse the pnfs_layout_hdr
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
|
We already have a mechanism for blocking LAYOUTGET by means of the
plh_block_lgets counter. The only "service" that NFS_LAYOUT_DESTROYED
provides at this point is to block layoutget once the layout segment
list is empty, which basically means that you have to wait until
the pnfs_layout_hdr is destroyed before you can do pNFS on that file
again.
This patch enables the reuse of the pnfs_layout_hdr if the layout
segment list is empty.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
|
...and ditto for pnfs_free_layout_hdr()
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
|
These are all in static declared functions that are called only once.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
|
Ensure that the reference count for pnfs_layout_hdr reverts to the
original value after a call to pnfs_layout_remove_lseg().
Note that the caller is expected to hold a reference to the struct
pnfs_layout_hdr.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
|
There is no longer a need to use pnfs_free_lseg_list(). Just call
pnfs_free_lseg() directly.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
|
Move the code into pnfs_free_layout_hdr(), and add checks to
get_layout_by_fh_locked to ensure that they don't reference a layout
that is being freed.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
|
None of the existing pNFS layout drivers seem to require the inode
to be locked while they free the layout header.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|