Age | Commit message | Author |
|
When exercising error injection on IBM pseries machine, I hit the
following warning:
[ 251.450043] RTAS: event: 89, Type: Platform Error, Severity: 2
[ 253.549822] cxgb3 0006:01:00.0: enabling device (0140 -> 0142)
[ 253.713560] cxgb3 0006:01:00.0: adapter recovering, PEX ERR 0x100
[ 254.895437] RTNL: assertion failed at net/core/dev.c (2031)
[ 254.895467] CPU: 6 PID: 5449 Comm: eehd Tainted: G W 3.10.0-rc7-00157-gea461ab #19
[ 254.895474] Call Trace:
[ 254.895483] [c000000fac56f7d0] [c000000000014dcc] .show_stack+0x7c/0x1f0 (unreliable)
[ 254.895493] [c000000fac56f8a0] [c0000000007ba318] .dump_stack+0x28/0x3c
[ 254.895500] [c000000fac56f910] [c0000000006c0384] .netif_set_real_num_tx_queues+0x224/0x230
[ 254.895515] [c000000fac56f9b0] [d00000000ef35510] .cxgb_open+0x80/0x3f0 [cxgb3]
[ 254.895525] [c000000fac56fa50] [d00000000ef35914] .t3_resume_ports+0x94/0x100 [cxgb3]
[ 254.895533] [c000000fac56fae0] [c00000000005fc8c] .eeh_report_resume+0x8c/0xd0
[ 254.895539] [c000000fac56fb60] [c00000000005e9fc] .eeh_pe_dev_traverse+0x9c/0x190
[ 254.895545] [c000000fac56fc10] [c000000000060000] .eeh_handle_event+0x110/0x330
[ 254.895551] [c000000fac56fca0] [c000000000060350] .eeh_event_handler+0x130/0x1a0
[ 254.895558] [c000000fac56fd30] [c0000000000ad758] .kthread+0xe8/0xf0
[ 254.895566] [c000000fac56fe30] [c00000000000a05c] .ret_from_kernel_thread+0x5c/0x80
It appears that t3_resume_ports() is called with the rtnl_lock held from
the fatal error task but not from the PCI error callbacks. This fixes it.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
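The locking mismatch described in this message can be modelled in user space. The following is only a sketch: the `model_*` helpers, the flag standing in for the RTNL mutex, and the failure counter are all hypothetical and are not the cxgb3 or networking-core code. It shows a helper that expects the lock to be held, one caller that forgets it, and one caller that takes it around the resume call.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for the RTNL mutex: a flag recording whether it is held. */
static bool rtnl_held;
static int assertion_failures;

static void model_rtnl_lock(void)   { rtnl_held = true; }
static void model_rtnl_unlock(void) { rtnl_held = false; }

/* Models a core networking helper that expects the caller to hold RTNL. */
static void model_set_real_num_tx_queues(void)
{
    if (!rtnl_held)
        assertion_failures++;   /* the kernel would log "RTNL: assertion failed" */
}

/* Models the driver's resume path, which reopens the ports. */
static void model_resume_ports(void)
{
    model_set_real_num_tx_queues();
}

/* Error-recovery callback without the lock: trips the assertion. */
static int resume_unlocked(void)
{
    assertion_failures = 0;
    model_resume_ports();
    return assertion_failures;
}

/* Same callback with the lock taken around the resume call. */
static int resume_locked(void)
{
    assertion_failures = 0;
    model_rtnl_lock();
    model_resume_ports();
    model_rtnl_unlock();
    return assertion_failures;
}
```

Calling `resume_unlocked()` reports one failed assertion, while `resume_locked()` reports none.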
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core updates from Greg KH:
"Here's the big driver core merge for 3.11-rc1
Lots of little things, and larger firmware subsystem updates, all
described in the shortlog. Nice thing here is that we finally get rid
of CONFIG_HOTPLUG, after 10+ years, thanks to Stephen Rothwell (it had
been always on for a number of kernel releases, now it's just
removed)"
* tag 'driver-core-3.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (27 commits)
driver core: device.h: fix doc compilation warnings
firmware loader: fix another compile warning with PM_SLEEP unset
build some drivers only when compile-testing
firmware loader: fix compile warning with PM_SLEEP set
kobject: sanitize argument for format string
sysfs_notify is only possible on file attributes
firmware loader: simplify holding module for request_firmware
firmware loader: don't export cache_firmware and uncache_firmware
drivers/base: Use attribute groups to create sysfs memory files
firmware loader: fix compile warning
firmware loader: fix build failure with !CONFIG_FW_LOADER_USER_HELPER
Documentation: Updated broken link in HOWTO
Finally eradicate CONFIG_HOTPLUG
driver core: firmware loader: kill FW_ACTION_NOHOTPLUG requests before suspend
driver core: firmware loader: don't cache FW_ACTION_NOHOTPLUG firmware
Documentation: Tidy up some drivers/base/core.c kerneldoc content.
platform_device: use a macro instead of platform_driver_register
firmware: move EXPORT_SYMBOL annotations
firmware: Avoid deadlock of usermodehelper lock at shutdown
dell_rbu: Select CONFIG_FW_LOADER_USER_HELPER explicitly
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char/misc updates from Greg KH:
"Here's the big char/misc driver tree merge for 3.11-rc1
A variety of different driver patches here. All of these have been in
linux-next for a while, and the networking patches were acked-by David
Miller, as it made sense for those patches to come through this tree"
* tag 'char-misc-3.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (102 commits)
Revert "char: misc: assign file->private_data in all cases"
drivers: uio_pdrv_genirq: Use of_match_ptr() macro
mei: check whether hw start has succeeded
mei: check if the hardware reset succeeded
mei: mei_cl_connect: don't multiply the timeout twice
mei: do not override a client writing state when buffering
mei: move mei_cl_irq_write_complete to client.c
UIO: Fix concurrency issue
drivers: uio_dmem_genirq: Use of_match_ptr() macro
char: misc: assign file->private_data in all cases
drivers: hv: allocate synic structures before hv_synic_init()
drivers: hv: check interrupt mask before read_index
vme: vme_tsi148.c: fix error return code in tsi148_probe()
FMC: fix error handling in probe() function
fmc: avoid readl/writel namespace conflict
FMC: NULL dereference on allocation failure
UIO: fix uio_pdrv_genirq with device tree but no interrupt
UIO: allow binding uio_pdrv_genirq.c to devices using command line option
FMC: add a char-device mezzanine driver
FMC: add a driver to write mezzanine EEPROM
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging
Pull staging tree update from Greg KH:
"Here's the large staging tree merge for 3.11-rc1
Huge thing here is the Lustre client code. Unfortunately, due to it
not building properly on a wide variety of different architectures
(this was production code???), it is currently disabled from the build
so as to not annoy people.
Other than Lustre, there are loads of comedi patches, working to clean
up that subsystem, iio updates and new drivers, and a load of cleanups
from the OPW applicants in their quest to get a summer internship.
All of these have been in the linux-next releases for a while (hence
the Lustre code being disabled)"
Fixed up trivial conflict in drivers/staging/serqt_usb2/serqt_usb2.c due
to independent renamings in the staging driver cleanup and the USB
tree.
* tag 'staging-3.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging: (868 commits)
Revert "Revert "Revert "staging/lustre: drop CONFIG_BROKEN dependency"""
staging: rtl8192u: fix line length in r819xU_phy.h
staging: rtl8192u: rename variables in r819xU_phy.h
staging: rtl8192u: fix comments in r819xU_phy.h
staging: rtl8192u: fix whitespace in r819xU_phy.h
staging: rtl8192u: fix newlines in r819xU_phy.c
staging: comedi: unioxx5: use comedi_alloc_spriv()
staging: comedi: unioxx5: fix unioxx5_detach()
silicom: checkpatch: errors caused by macros
Staging: silicom: remove the board_t typedef in bpctl_mod.c
Staging: silicom: capitalize labels in the bp_media_type enum
Staging: silicom: remove bp_media_type enum typedef
staging: rtl8192u: replace msleep(1) with usleep_range() in r819xU_phy.c
staging: rtl8192u: rename dwRegRead and rtStatus in r819xU_phy.c
staging: rtl8192u: replace __FUNCTION__ in r819xU_phy.c
staging: rtl8192u: limit line size in r819xU_phy.c
zram: allow request end to coincide with disksize
staging: drm/imx: use generic irq chip unused field to block out invalid irqs
staging: drm/imx: use generic irqchip
staging: drm/imx: ipu-dmfc: use defines for ipu channel numbers
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
Pull tty/serial updates from Greg KH:
"Here is the big TTY / Serial driver merge for 3.11-rc1.
It's not all that big, nothing major changed in the tty api, which is
a nice change, just a number of serial driver fixes and updates and
new drivers, along with some n_tty fixes to help resolve some reported
issues.
All of these have been in the linux-next releases for a while, with
the exception of the last revert patch, which was reported this past
weekend by two different people as being needed."
* tag 'tty-3.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: (51 commits)
Revert "serial: 8250_pci: add support for another kind of NetMos Technology PCI 9835 Multi-I/O Controller"
pch_uart: Add uart_clk selection for the MinnowBoard
tty: atmel_serial: prepare clk before calling enable
tty: Reset itty for other pty
n_tty: Buffer work should not reschedule itself
n_tty: Fix unsafe update of available buffer space
n_tty: Untangle read completion variables
n_tty: Encapsulate minimum_to_wake within N_TTY
serial: omap: Fix device tree based PM runtime
serial: imx: Fix serial clock unbalance
serial/mpc52xx_uart: fix kernel panic when system reboot
serial: mfd: Add sysrq support
serial: imx: enable the clocks for console
tty: serial: add Freescale lpuart driver support
serial: imx: Improve Kconfig text
serial: imx: Allow module build
serial: imx: Fix warning when !CONFIG_SERIAL_IMX_CONSOLE
tty/serial/sirf: fix error propagation in sirfsoc_uart_probe()
serial: omap: fix potential NULL pointer dereference in serial_omap_runtime_suspend()
tty: serial: Enable uartlite for ARM zynq
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB updates from Greg KH:
"Here's the big USB 3.11-rc1 merge request.
Lots of gadget and finally, chipidea driver updates (they were much
needed), along with a new host controller driver, lots of little
serial driver fixes, the removal of the 255 usb-serial device
limitation, and a variety of other minor things.
All of these have been in the linux-next releases for a while"
* tag 'usb-3.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (254 commits)
usb: musb: omap2430: make it compile again
usb: chipidea: ci_hdrc_imx: access phy via private data
xhci: Add missing unlocks on error paths
USB: option,qcserial: move Novatel Gobi1K IDs to qcserial
ehci-atmel.c: prepare clk before calling enable
USB: ohci-at91: prepare clk before calling enable
USB: HWA: fix device probe failure
wusbcore: add entries in Documentation/ABI for new wusbhc sysfs attributes
wusbcore: add sysfs attribute for retry count
wusbcore: add sysfs attribute for DNTS count and interval
usb: chipidea: drop "13xxx" infix
usb: phy: tegra: remove duplicated include from phy-tegra-usb.c
usb: host: xhci-plat: release mem region while removing module
usbmisc_imx: allow autoloading on according to dt ids
usb: fix build error without CONFIG_USB_PHY
usb: check usb_hub_to_struct_hub() return value
xhci: check for failed dma pool allocation
usb: gadget: f_subset: fix missing unlock on error in geth_alloc()
usb: gadget: f_ncm: fix missing unlock on error in ncm_alloc()
usb: gadget: f_ecm: fix missing unlock on error in ecm_alloc()
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs
Pull FS-Cache updates from David Howells:
"This contains a number of fixes for various FS-Cache issues plus some
cleanups. The commits are, in order:
1) Provide a system wait_on_atomic_t() and wake_up_atomic_t() sharing
the bit-wait table (enhancement for #8).
2) Don't put spin_lock() in a while-condition as spin_lock() may have
a do {} while(0) wrapper (cleanup).
3) Symbolically name i_mutex lock classes rather than using numbers
in CacheFiles (cleanup).
4) Don't sleep in page release if __GFP_FS is not set (deadlock vs
ext4).
5) Uninline fscache_object_init() (cleanup for #7).
6) Wrap checks on object state (cleanup for #7).
7) Simplify the object state machine by separating work states from
wait states.
8) Simplify cookie retention by objects (NULL pointer deref fix).
9) Remove unused list_to_page() macro (cleanup).
10) Make the remaining-pages counter in the retrieval op atomic
(assertion failure fix).
11) Don't use spin_is_locked() in assertions (assertion failure fix)"
* tag 'fscache-20130702' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs:
FS-Cache: Don't use spin_is_locked() in assertions
FS-Cache: The retrieval remaining-pages counter needs to be atomic_t
cachefiles: remove unused macro list_to_page()
FS-Cache: Simplify cookie retention for fscache_objects, fixing oops
FS-Cache: Fix object state machine to have separate work and wait states
FS-Cache: Wrap checks on object state
FS-Cache: Uninline fscache_object_init()
FS-Cache: Don't sleep in page release if __GFP_FS is not set
CacheFiles: name i_mutex lock class explicitly
fs/fscache: remove spin_lock() from the condition in while()
Add wait_on_atomic_t() and wake_up_atomic_t()
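Item 2 of the FS-Cache list above (no spin_lock() in a while-condition) can be illustrated with a small sketch. The `toy_lock` type and the `toy_spin_lock()`/`toy_spin_unlock()` macros are hypothetical stand-ins, assumed here only to show a lock primitive that expands to a do { } while (0) statement. They are not the kernel's definitions.

```c
#include <assert.h>

/* Hypothetical lock type and macros, standing in for a spin_lock()
 * implementation that expands to a statement rather than an expression. */
struct toy_lock { int held; };
#define toy_spin_lock(l)   do { (l)->held = 1; } while (0)
#define toy_spin_unlock(l) do { (l)->held = 0; } while (0)

/*
 * Would not compile, because a do { } while (0) block is not an expression:
 *
 *     while (toy_spin_lock(&lock), remaining > 0) { ... }
 */

/* Drains `remaining` items, taking the lock as a statement in the loop body. */
static int drain(struct toy_lock *lock, int remaining)
{
    int processed = 0;

    for (;;) {
        toy_spin_lock(lock);
        if (remaining <= 0)
            break;              /* leave the loop with the lock still held */
        remaining--;
        processed++;
        toy_spin_unlock(lock);
    }
    toy_spin_unlock(lock);      /* drop the lock taken by the final iteration */
    return processed;
}
```

The loop keeps the semantics of the while-condition form: the lock is taken before every test of the condition and is held when the loop exits.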
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm
Pull dlm updates from David Teigland:
"This set includes a number of SCTP related fixes in the dlm, and a few
other minor fixes and changes."
* tag 'dlm-3.11' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm:
dlm: Avoid LVB truncation
dlm: log an error for unmanaged lockspaces
dlm: config: using strlcpy instead of strncpy
dlm: remove duplicated include from lowcomms.c
dlm: disable nagle for SCTP
dlm: retry failed SCTP sends
dlm: try other IPs when sctp init assoc fails
dlm: clear correct bit during sctp init failure handling
dlm: set sctp assoc id during setup
dlm: clear correct init bit during sctp setup
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs
Pull f2fs updates from Jaegeuk Kim:
"This patch-set includes the following major enhancement patches:
- remount_fs callback function
- restore parent inode number to enhance the fsync performance
- xattr security labels
- reduce the number of redundant lock/unlock data pages
- avoid frequent write_inode calls
The other minor bug fixes are as follows.
- endian conversion bugs
- various bugs in the roll-forward recovery routine"
* tag 'for-f2fs-3.11' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (56 commits)
f2fs: fix to recover i_size from roll-forward
f2fs: remove the unused argument "sbi" of func destroy_fsync_dnodes()
f2fs: remove reusing any prefree segments
f2fs: code cleanup and simplify in func {find/add}_gc_inode
f2fs: optimize the init_dirty_segmap function
f2fs: fix an endian conversion bug detected by sparse
f2fs: fix crc endian conversion
f2fs: add remount_fs callback support
f2fs: recover wrong pino after checkpoint during fsync
f2fs: optimize do_write_data_page()
f2fs: make locate_dirty_segment() as static
f2fs: remove unnecessary parameter "offset" from __add_sum_entry()
f2fs: avoid freqeunt write_inode calls
f2fs: optimise the truncate_data_blocks_range() range
f2fs: use the F2FS specific flags in f2fs_ioctl()
f2fs: sync dir->i_size with its block allocation
f2fs: fix i_blocks translation on various types of files
f2fs: set sb->s_fs_info before calling parse_options()
f2fs: support xattr security labels
f2fs: fix iget/iput of dir during recovery
...
|
|
Pull GFS2 updates from Steven Whitehouse:
"There are a few bug fixes for various, mostly very minor corner cases,
plus some interesting new features.
The new features include atomic_open whose main benefit will be the
reduction in locking overhead in case of combined lookup/create and
open operations, sorting the log buffer lists by block number to
improve the efficiency of AIL writeback, and aggressively issuing
revokes in gfs2_log_flush to reduce overhead when dropping glocks."
* git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-3.0-nmw:
GFS2: Reserve journal space for quota change in do_grow
GFS2: Fix fstrim boundary conditions
GFS2: fix warning message
GFS2: aggressively issue revokes in gfs2_log_flush
GFS2: fix regression in dir_double_exhash
GFS2: Add atomic_open support
GFS2: Only do one directory search on create
GFS2: fix error propagation in init_threads()
GFS2: Remove no-op wrapper function
GFS2: Cocci spatch "ptr_ret.spatch"
GFS2: Eliminate gfs2_rg_lops
GFS2: Sort buffer lists by inplace block number
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4
Pull ext4 update from Ted Ts'o:
"Lots of bug fixes, cleanups and optimizations. In the bug fixes
category, of note is a fix for on-line resizing file systems where the
block size is smaller than the page size (i.e., file systems with 1k blocks
on x86, or more interestingly file systems with 4k blocks on Power or
ia64 systems.)
In the cleanup category, the ext4's punch hole implementation was
significantly improved by Lukas Czerner, and now supports bigalloc
file systems. In addition, Jan Kara significantly cleaned up the
write submission code path. We also improved error checking and added
a few sanity checks.
In the optimizations category, two major optimizations deserve
mention. The first is that ext4_writepages() is now used for
nodelalloc and ext3 compatibility mode. This allows writes to be
submitted much more efficiently as a single bio request, instead of
being sent as individual 4k writes into the block layer (which then
relied on the elevator code to coalesce the requests in the block
queue). Secondly, the extent cache shrink mechanism, which was
introduced in 3.9, no longer has a scalability bottleneck caused by the
i_es_lru spinlock. Other optimizations include some changes to reduce
CPU usage and to avoid issuing empty commits unnecessarily."
* tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (86 commits)
ext4: optimize starting extent in ext4_ext_rm_leaf()
jbd2: invalidate handle if jbd2_journal_restart() fails
ext4: translate flag bits to strings in tracepoints
ext4: fix up error handling for mpage_map_and_submit_extent()
jbd2: fix theoretical race in jbd2__journal_restart
ext4: only zero partial blocks in ext4_zero_partial_blocks()
ext4: check error return from ext4_write_inline_data_end()
ext4: delete unnecessary C statements
ext3,ext4: don't mess with dir_file->f_pos in htree_dirblock_to_tree()
jbd2: move superblock checksum calculation to jbd2_write_superblock()
ext4: pass inode pointer instead of file pointer to punch hole
ext4: improve free space calculation for inline_data
ext4: reduce object size when !CONFIG_PRINTK
ext4: improve extent cache shrink mechanism to avoid to burn CPU time
ext4: implement error handling of ext4_mb_new_preallocation()
ext4: fix corruption when online resizing a fs with 1K block size
ext4: delete unused variables
ext4: return FIEMAP_EXTENT_UNKNOWN for delalloc extents
jbd2: remove debug dependency on debug_fs and update Kconfig help text
jbd2: use a single printk for jbd_debug()
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull VFS patches (part 1) from Al Viro:
"The major change in this pile is ->readdir() replacement with
->iterate(), dealing with ->f_pos races in ->readdir() instances for
good.
There's a lot more, but I'd prefer to split the pull request into
several stages and this is the first obvious cutoff point."
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (67 commits)
[readdir] constify ->actor
[readdir] ->readdir() is gone
[readdir] convert ecryptfs
[readdir] convert coda
[readdir] convert ocfs2
[readdir] convert fatfs
[readdir] convert xfs
[readdir] convert btrfs
[readdir] convert hostfs
[readdir] convert afs
[readdir] convert ncpfs
[readdir] convert hfsplus
[readdir] convert hfs
[readdir] convert befs
[readdir] convert cifs
[readdir] convert freevxfs
[readdir] convert fuse
[readdir] convert hpfs
reiserfs: switch reiserfs_readdir_dentry to inode
reiserfs: is_privroot_deh() needs only directory inode, actually
...
|
|
When sync does its WB_SYNC_ALL writeback, it issues data IO and
then immediately waits for IO completion. This is done in the
context of the flusher thread, and hence completely ties up the
flusher thread for the backing device until all the dirty inodes
have been synced. On filesystems that are dirtying inodes constantly
and quickly, this means the flusher thread can be tied up for
minutes per sync call and hence badly affect system level write IO
performance as the page cache cannot be cleaned quickly.
We already have a wait loop for IO completion for sync(2), so cut
this out of the flusher thread and delegate it to wait_sb_inodes().
Hence we can do rapid IO submission, and then wait for it all to
complete.
Effect of sync on fsmark before the patch:
FSUse% Count Size Files/sec App Overhead
.....
0 640000 4096 35154.6 1026984
0 720000 4096 36740.3 1023844
0 800000 4096 36184.6 916599
0 880000 4096 1282.7 1054367
0 960000 4096 3951.3 918773
0 1040000 4096 40646.2 996448
0 1120000 4096 43610.1 895647
0 1200000 4096 40333.1 921048
And a single sync pass took:
real 0m52.407s
user 0m0.000s
sys 0m0.090s
After the patch, there is no impact on fsmark results, and each
individual sync(2) operation run concurrently with the same fsmark
workload takes roughly 7s:
real 0m6.930s
user 0m0.000s
sys 0m0.039s
IOWs, sync is 7-8x faster on a busy filesystem and does not have an
adverse impact on ongoing async data write operations.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
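For reference, the 7-8x figure follows directly from the two wall-clock times quoted in the message:

```latex
\frac{t_{\text{before}}}{t_{\text{after}}} = \frac{52.407\,\text{s}}{6.930\,\text{s}} \approx 7.56
```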
|
|
Prepare first set of updates for 3.11 merge window.
|
|
My recent truncate patch uncovered this bug, but I can reproduce it without the
truncate patch. If you mount with -o compress-force, do a direct write to some
area, do a buffered write to some other area, and then do a direct read you will
get the wrong data for where you did the buffered write. This is because the
generic direct io helpers only call filemap_write_and_wait once, and for
compression we need it twice. So to be safe add the btrfs_wait_ordered_range to
the start of the direct io function to make sure any compressed writes have
truly been written. This patch makes xfstests 130 pass when you mount with -o
compress-force=lzo. Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
|
|
There is another bug in the tree mod log stuff in that we're calling
tree_mod_log_free_eb every single time a block is cow'ed. The problem with this
is that if this block is shared by multiple snapshots we will call this multiple
times per block, so if we go to rewind the mod log for this block we'll BUG_ON()
in __tree_mod_log_rewind because we try to rewind a free twice. We only want to
call tree_mod_log_free_eb if we are actually freeing the block. With this patch
I no longer hit the panic in __tree_mod_log_rewind. Thanks,
Cc: stable@vger.kernel.org
Reviewed-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
|
|
We need to hold the tree mod log lock in __tree_mod_log_rewind since we walk
forward in the tree mod entries, otherwise we'll end up with random entries and
trip the BUG_ON() at the front of __tree_mod_log_rewind. This fixes the panics
people were seeing when running
find /whatever -type f -exec btrfs fi defrag {} \;
Thanks,
Cc: stable@vger.kernel.org
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
|
|
I missed fixing the backref stuff when I introduced the skinny metadata. If you
try and do things like snapshot aware defrag with skinny metadata you are going
to see tons of warnings related to the backref count being less than 0. This is
because the delayed refs will be found for stuff just fine, but it won't find
the skinny metadata extent refs. With this patch I'm not seeing warnings
anymore. Thanks,
Reviewed-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
|
|
Several users reported this crash, which shows up as a NULL pointer
dereference or a general protection fault. The story is that we added an
rbtree to speed up ulist iteration, and we use krealloc() to handle ulist
growth. krealloc() uses memcpy() to copy the old data to the new memory
area, which is OK for an array as it doesn't use pointers, but not OK for
an rbtree as it uses pointers.
So krealloc() will mess up our rbtree and we end up with a crash.
Reviewed-by: Wang Shilong <wangsl-fnst@cn.fujitsu.com>
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
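The failure mode in this message (a byte-wise copy of elements that point at each other) can be reproduced in user space. The sketch below is an illustration under stated assumptions, not the btrfs ulist code: `toy_node` is a hypothetical element type, and a fresh allocation plus memcpy() stands in for a krealloc() that has to move the block. Rebuilding the links after the copy is shown as one possible remedy and is not claimed to be the approach the patch takes.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical node: each element links to its neighbour by pointer,
 * the way tree nodes embedded in an array link to each other. */
struct toy_node {
    int val;
    struct toy_node *next;
};

/* Returns 1 if every link in buf[0..n) points inside buf itself. */
static int links_point_inside(const struct toy_node *buf, size_t n)
{
    for (size_t i = 0; i + 1 < n; i++)
        if (buf[i].next != &buf[i + 1])
            return 0;
    return 1;
}

/* Grows the array the way a moving realloc does: new block + memcpy.
 * If fix_links is set, the internal pointers are rebuilt afterwards.
 * Returns 1 if the links are valid, 0 if stale, -1 on allocation failure. */
static int grow_and_check(int fix_links)
{
    const size_t old_n = 4, new_n = 8;
    struct toy_node *old = calloc(old_n, sizeof(*old));
    struct toy_node *grown = calloc(new_n, sizeof(*grown));
    int ok;

    if (!old || !grown) {
        free(old);
        free(grown);
        return -1;
    }

    for (size_t i = 0; i + 1 < old_n; i++)
        old[i].next = &old[i + 1];

    /* The copy carries the old addresses along with the data. */
    memcpy(grown, old, old_n * sizeof(*old));

    if (fix_links)
        for (size_t i = 0; i + 1 < old_n; i++)
            grown[i].next = &grown[i + 1];

    ok = links_point_inside(grown, old_n);
    free(old);
    free(grown);
    return ok;
}
```

Without the fix-up pass the copied nodes still point into the old block, which is freed memory once the reallocation completes.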
|
|
- It makes no sense to deal with an inode in the dead tree.
- fix the race between dio and page copy by waiting for the dio completion
- avoid the page copy vs truncate/punch hole
- check if the page is in the page cache or not
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
|
|
- It makes no sense to continue after an error has happened; with this
patch we just return.
- remove some checks from copy_nocow_pages_for_inode(), such as the page
check after the write and the inode check at the end of the function,
because we are sure they exist.
- remove the unnecessary goto in the return value check of the write
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
|
|
We get an oops while running the btrfs replace start test:
------------[ cut here ]------------
kernel BUG at mm/filemap.c:608!
[SNIP]
Call Trace:
[<ffffffffa04b36c7>] copy_nocow_pages_for_inode+0x217/0x3f0 [btrfs]
[<ffffffffa04b34b0>] ? scrub_print_warning_inode+0x230/0x230 [btrfs]
[<ffffffffa04b34b0>] ? scrub_print_warning_inode+0x230/0x230 [btrfs]
[<ffffffffa04bb8ce>] iterate_extent_inodes+0x1ae/0x300 [btrfs]
[<ffffffffa04bbab2>] iterate_inodes_from_logical+0x92/0xb0 [btrfs]
[<ffffffffa04b34b0>] ? scrub_print_warning_inode+0x230/0x230 [btrfs]
[<ffffffffa04b3b07>] copy_nocow_pages_worker+0x97/0x150 [btrfs]
[<ffffffffa048eed4>] worker_loop+0x134/0x540 [btrfs]
[<ffffffff816274ea>] ? __schedule+0x3ca/0x7f0
[<ffffffffa048eda0>] ? btrfs_queue_worker+0x300/0x300 [btrfs]
[<ffffffff8106f2f0>] kthread+0xc0/0xd0
[<ffffffff8106f230>] ? flush_kthread_worker+0x80/0x80
[<ffffffff8163181c>] ret_from_fork+0x7c/0xb0
[<ffffffff8106f230>] ? flush_kthread_worker+0x80/0x80
[SNIP]
RIP [<ffffffff8111f4c5>] unlock_page+0x35/0x40
RSP <ffff88010316bb98>
---[ end trace 421e79ad0dd72c7d ]---
This is because we forgot to lock the page again after reading data into
the page. Fix it.
Signed-off-by: Lin Feng <linfeng@cn.fujitsu.com>
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
|
|
When adjusting the enospc rules for relocation I ran into a deadlock because we
were relocating the only system chunk and that forced us to try and allocate a
new system chunk while holding locks in the chunk tree, which caused us to
deadlock. To fix this I've moved all of the dev extent addition and chunk
addition out to the delayed chunk completion stuff. We still keep the in-memory
stuff which makes sure everything is consistent.
One change I had to make was to search the commit root of the device tree to
find a free dev extent, and hold onto any chunk em's that we allocated in that
transaction so we do not allocate the same dev extent twice. This has the side
effect of fixing a bug with balance that has been there ever since balance
existed. Basically you can free a block group and its dev extent and then
immediately allocate that dev extent for a new block group and write stuff to
that dev extent, all within the same transaction. So if you happen to crash
during a balance you could come back to a completely broken file system. This
patch should keep these sort of things from happening in the future since we
won't be able to allocate free'd dev extents until after the transaction
commits. This has passed all of the xfstests and my super annoying stress test
followed by a balance. Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
|
|
I hit a weird problem where my root item had been deleted but the orphan item had
not. This isn't necessarily a problem, but it keeps the file system from being
mounted. To fix this we just need to axe the orphan item if we can't find the
fs root when we're putting them all together. With this patch I was able to
successfully mount my file system. Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
|
|
Reading data from the target device of the replace operation is now allowed,
so a mirror number greater than the number of stripes in a chunk is valid;
we will adjust it later if we find there is no target device. Fix it.
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
|
|
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
|
|
Using the structure btrfs_sector_sum to keep the checksum value is
unnecessary, because the extents that btrfs_sector_sum points to are
continuous, we can find out the expected checksums by btrfs_ordered_sum's
bytenr and the offset, so we can remove btrfs_sector_sum's bytenr. After
removing bytenr, there is only one member in the structure, so it makes
no sense to keep the structure, just remove it, and use a u32 array to
store the checksum value.
With this change, we no longer use a while loop to get the checksums one by
one. We can now get several checksum values at one time, which improved
performance by ~74% on my SSD (31MB/s -> 54MB/s).
test command:
# dd if=/dev/zero of=/mnt/btrfs/file0 bs=1M count=1024 oflag=sync
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
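The indexing idea in this message (one start address plus a flat checksum array, instead of a per-sector record carrying its own bytenr) might look roughly like the sketch below. `toy_ordered_sum` and `toy_lookup_sum()` are hypothetical, simplified names chosen for illustration. The real btrfs_ordered_sum layout and helpers are not reproduced here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for an ordered-sum record: one start
 * address for a contiguous range plus a flat array of per-sector checksums. */
struct toy_ordered_sum {
    uint64_t bytenr;       /* disk address of the first sector */
    uint32_t sectorsize;   /* bytes covered by each checksum */
    size_t nsums;          /* number of entries in sums[] */
    const uint32_t *sums;  /* sums[i] covers bytenr + i * sectorsize */
};

/* Looks up the checksum for the sector at disk_bytenr.
 * Returns 0 and stores the checksum on success, -1 if out of range. */
static int toy_lookup_sum(const struct toy_ordered_sum *os,
                          uint64_t disk_bytenr, uint32_t *out)
{
    uint64_t index;

    if (disk_bytenr < os->bytenr)
        return -1;
    /* The range is contiguous, so the offset alone identifies the slot. */
    index = (disk_bytenr - os->bytenr) / os->sectorsize;
    if (index >= os->nsums)
        return -1;
    *out = os->sums[index];
    return 0;
}
```

Because a sector's position determines its slot, a run of consecutive sectors maps to a run of consecutive array entries, which is what allows several checksums to be fetched in one step.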
|
|
We always just try and reserve data space when we write, but if we are out of
space but have prealloc'ed extents we should still successfully write. This
patch will try and see if we can write to prealloc'ed space and if we can go
ahead and allow the write to continue. With this patch we now pass xfstests
generic/274. Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
|
|
try_to_writeback_inodes_sb_nr returns 1 if writeback is already underway, which
is completely fraking useless for us as we need to make sure pages are actually
written before we go and check if there are ordered extents. So replace this
with an open coding of try_to_writeback_inodes_sb_nr minus the writeback
underway check so that we are sure to actually have flushed some dirty pages out
and will have ordered extents to use. With this patch xfstests generic/273 now
passes. Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
|
|
There are all of these checks in the ENOSPC code to see if committing the
transaction would free up enough space to make the allocation. This is because
early on we just committed the transaction and hoped and prayed, which resulted
in cases where it took _forever_ to get an ENOSPC when we really were out of
space. So we check space_info->bytes_pinned, except this isn't completely true
because it doesn't account for space that we may free but that is stuck in delayed refs.
So tests like xfstests 226 would fail because we wouldn't commit the transaction
to free up the data space. So instead add a percpu counter that will be a
little fuzzier, it will add bytes as soon as we try to free up the space, and
remove any space it doesn't actually free up when we get around to doing the
actual free. We then 0 out this counter every transaction period so we have a
better idea of how much space we will actually free up by committing this
transaction. With this patch we now pass xfstests 226. Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
|
|
We have an optimization that will go ahead and cache no acls on an inode if
there are no xattrs on the inode. This saves us a lookup later to check the
acls for writes or any other access. The problem is I use selinux so I always
have an xattr on inodes, so make this test a little smarter and check for the
actual acl hash on the key and if it isn't there then we still get to cache no
acl which makes everybody who uses selinux a little happier. Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
|
|
drivers/leds/leds-mc13783.c: In function 'mc13xxx_led_probe':
drivers/leds/leds-mc13783.c:195:2: warning: 'ret' may be used uninitialized in this function [-Wmaybe-uninitialized]
Signed-off-by: Alexander Shiyan <shc_work@mail.ru>
Signed-off-by: Bryan Wu <cooloney@gmail.com>
|
|
For chips without debugfs dpm support say that it's not
implemented rather than not supported to avoid confusion
about DPM support in general.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
The route_irq() function needs to preserve the irq mask by using the
_irqsave/irqrestore variants of raw spin lock functions instead of the
_irq variants. This is because it is called from __cpu_disable() (via
migrate_irqs()), which is called with IRQs disabled, so using the _irq
variants re-enables IRQs.
This appears to have been causing occasional hits of the
BUG_ON(!irqs_disabled()) in __irq_work_run() during CPU hotplug soak
testing:
BUG: failure at kernel/irq_work.c:122/__irq_work_run()!
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
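The difference between the two lock variants can be shown with a toy model of the interrupt-enable state. This is a user-space sketch only: the `toy_*` helpers and the boolean flag are hypothetical and merely mimic the save/restore behaviour described above. They are not the kernel's raw spin lock API.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of the CPU's interrupt-enable state. */
static bool irqs_enabled = true;

/* _irq style: disable on lock, unconditionally enable on unlock. */
static void toy_lock_irq(void)   { irqs_enabled = false; }
static void toy_unlock_irq(void) { irqs_enabled = true; }

/* _irqsave style: remember the prior state and restore exactly that. */
static void toy_lock_irqsave(bool *flags)
{
    *flags = irqs_enabled;
    irqs_enabled = false;
}

static void toy_unlock_irqrestore(const bool *flags)
{
    irqs_enabled = *flags;
}

/* Runs a critical section from a context that already has IRQs off and
 * reports whether IRQs are still off afterwards. */
static bool irqs_stay_disabled(bool use_irqsave)
{
    bool flags;

    irqs_enabled = false;       /* caller context: IRQs already disabled */
    if (use_irqsave) {
        toy_lock_irqsave(&flags);
        toy_unlock_irqrestore(&flags);
    } else {
        toy_lock_irq();
        toy_unlock_irq();
    }
    return !irqs_enabled;
}
```

With the _irq style the caller gets interrupts back enabled even though it entered with them disabled. The save/restore style leaves the caller's state untouched.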
|
|
kick_handler() doesn't have an irq_enter/exit pair, but it's used for
handling SMP IPIs which require work to be done in softirqs, which are
invoked from irq_exit() when the hard irq nest count reaches 0.
The scheduler_ipi() callback in the IPI handler calls irq_enter/exit
itself, but this is inside kick_handler()'s spin lock critical section,
so if an invoked softirq issues an IPI the kick_handler() will be
re-entered on the same CPU and will deadlock.
This is easily fixed by adding the missing irq_enter/exit to
kick_handler() so that the hard irq nest count doesn't reach 0 until
after the spin lock has been released.
Ideally the spin lock protected handler list will also be replaced by a
lockless RCU protected list since it is certainly mostly read. That can
be done in a later change though.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
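The re-entrancy problem in the kick_handler() message above can be
sketched as a small userspace model. Everything here is an assumption
made for illustration: struct sim_cpu, the sim_* functions and the
single "softirq sends one IPI" scenario are hypothetical and only mirror
the nest-count behaviour the message describes.

```c
#include <stdbool.h>

struct sim_cpu {
	int hardirq_nest;	/* simulated hard irq nest count */
	bool lock_held;		/* simulated handler-list spin lock */
	bool softirq_pending;
	bool deadlocked;	/* set where real code would spin forever */
};

static void sim_kick_handler(struct sim_cpu *cpu, bool with_fix, bool raise);

/* Softirqs run when the nest count drops to 0, as in irq_exit(). */
static void sim_irq_exit(struct sim_cpu *cpu, bool with_fix)
{
	if (--cpu->hardirq_nest == 0 && cpu->softirq_pending) {
		cpu->softirq_pending = false;
		/* The softirq issues an IPI that lands on this same CPU. */
		sim_kick_handler(cpu, with_fix, false);
	}
}

/* Model of the IPI callback, which does its own irq_enter/exit. */
static void sim_scheduler_ipi(struct sim_cpu *cpu, bool with_fix, bool raise)
{
	cpu->hardirq_nest++;
	if (raise)
		cpu->softirq_pending = true;
	sim_irq_exit(cpu, with_fix);
}

static void sim_kick_handler(struct sim_cpu *cpu, bool with_fix, bool raise)
{
	if (with_fix)
		cpu->hardirq_nest++;		/* the added irq_enter() */

	if (cpu->lock_held) {
		cpu->deadlocked = true;		/* re-entered inside the lock */
	} else {
		cpu->lock_held = true;
		sim_scheduler_ipi(cpu, with_fix, raise);
		cpu->lock_held = false;
	}

	if (with_fix)
		sim_irq_exit(cpu, with_fix);	/* the added irq_exit() */
}

/* Returns true if the modelled handler re-enters while holding the lock. */
bool kick_deadlocks(bool with_fix)
{
	struct sim_cpu cpu = {0};

	sim_kick_handler(&cpu, with_fix, true);
	return cpu.deadlocked;
}
```

Without the outer enter/exit pair the nest count reaches 0 inside the
critical section, so the softirq (and the IPI it sends) runs with the
lock held. With the pair added, the softirq only runs after the lock has
been dropped.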
|
|
The recent implementation of a generic dummy timer resulted in a
different registration order of per cpu local timers which made the
broadcast control logic go belly up.
If the dummy timer is the first clock event device which is registered
for a CPU, then it is installed, the broadcast timer is initialized
and the CPU is marked as broadcast target.
If a real clock event device is installed after that, we can fail to
take the CPU out of the broadcast mask. In the worst case we end up
with two periodic timer events firing for the same CPU. One from the
per cpu hardware device and one from the broadcast.
Now the problem is that we have no way to distinguish whether the
system is in a state which makes broadcasting necessary or the
broadcast bit was set due to the nonfunctional dummy timer
installment.
To solve this we need to keep track of the system state separately and
provide a more detailed decision logic whether we keep the CPU in
broadcast mode or not.
The old decision logic only clears the broadcast mode, if the newly
installed clock event device is not affected by power states.
The new logic clears the broadcast mode if one of the following is
true:
- The new device is not affected by power states.
- The system is not in a power state affected mode
- The system has switched to oneshot mode. The oneshot broadcast is
controlled from the deep idle state. The CPU is not in idle at
this point, so it's safe to remove it from the mask.
If we clear the broadcast bit for the CPU when a new device is
installed, we also shutdown the broadcast device when this was the
last CPU in the broadcast mask.
If the broadcast bit is kept, then we leave the new device in shutdown
state and rely on the broadcast to deliver the timer interrupts via
the broadcast ipis.
Reported-and-tested-by: Stehle Vincent-B46079 <B46079@freescale.com>
Reviewed-by: Stephen Boyd <sboyd@codeaurora.org>
Cc: John Stultz <john.stultz@linaro.org>,
Cc: Mark Rutland <mark.rutland@arm.com>
Link: http://lkml.kernel.org/r/alpine.DEB.2.02.1307012153060.4013@ionos.tec.linutronix.de
Cc: stable@vger.kernel.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
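The old and new decision rules listed in the message above can be
written down as two predicates. This is a minimal sketch rather than the
tick-broadcast code itself: struct bc_state and its field names are
hypothetical, chosen only to mirror the three conditions in the list.

```c
#include <stdbool.h>

/* Hypothetical snapshot of the inputs to the decision. */
struct bc_state {
	bool dev_affected_by_power_state;  /* new device stops in deep idle */
	bool system_power_state_affected;  /* system is in such a mode */
	bool oneshot_mode;                 /* system switched to oneshot */
};

/* Old rule: clear only if the new device is unaffected by power states. */
bool old_clears_broadcast(const struct bc_state *s)
{
	return !s->dev_affected_by_power_state;
}

/* New rule: clear if any one of the three listed conditions holds. */
bool new_clears_broadcast(const struct bc_state *s)
{
	return !s->dev_affected_by_power_state ||
	       !s->system_power_state_affected ||
	       s->oneshot_mode;
}
```

The two rules differ for a power-state-affected device: the old rule
always keeps the CPU in broadcast mode, while the new rule releases it
when the system is not in a power-state-affected mode or has switched to
oneshot mode.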
|
|
When the system switches from periodic to oneshot mode, the broadcast
logic causes a possibility that a CPU which has not yet switched to
oneshot mode puts its own clock event device into oneshot mode without
updating the state and the timer handler.
CPU0                                            CPU1
                                                per cpu tickdev is in periodic mode
                                                and switched to broadcast
Switch to oneshot mode
 tick_broadcast_switch_to_oneshot()
  cpumask_copy(tick_oneshot_broacast_mask,
               tick_broadcast_mask);
  broadcast device mode = oneshot
                                                Timer interrupt
                                                irq_enter()
                                                 tick_check_oneshot_broadcast()
                                                  dev->set_mode(ONESHOT);
                                                tick_handle_periodic()
                                                 if (dev->mode == ONESHOT)
                                                   dev->next_event += period;
                                                   FAIL.
We fail, because dev->next_event contains KTIME_MAX, if the device was
in periodic mode before the uncontrolled switch to oneshot happened.
We must copy the broadcast bits over to the oneshot mask, because
otherwise a CPU which relies on the broadcast would not be woken up
anymore after the broadcast device switched to oneshot mode.
So we need to verify in tick_check_oneshot_broadcast() whether the CPU
has already switched to oneshot mode. If not, leave the device
untouched and let the CPU switch controlled into oneshot mode.
This is a long standing bug, which was never noticed, because the main
user of the broadcast, x86, cannot run into that scenario, AFAICT. The
nonarchitected timer mess of ARM creates a gazillion of differently
broken abominations which trigger the shortcomings of that broadcast
code, which better had never been necessary in the first place.
Reported-and-tested-by: Stehle Vincent-B46079 <B46079@freescale.com>
Reviewed-by: Stephen Boyd <sboyd@codeaurora.org>
Cc: John Stultz <john.stultz@linaro.org>,
Cc: Mark Rutland <mark.rutland@arm.com>
Link: http://lkml.kernel.org/r/alpine.DEB.2.02.1307012153060.4013@ionos.tec.linutronix.de
Cc: stable@vger.kernel.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
In periodic mode we remove offline cpus from the broadcast propagation
mask. In oneshot mode we fail to do so. This was not a problem so far,
but the recent changes to the broadcast propagation introduced a
constellation which can result in a NULL pointer dereference.
What happens is:
CPU0                            CPU1
                                idle()
                                 arch_idle()
                                  tick_broadcast_oneshot_control(OFF);
                                   set cpu1 in tick_broadcast_force_mask
                                 if (cpu_offline())
                                  arch_cpu_dead()
cpu_dead_cleanup(cpu1)
 cpu1 tickdevice pointer = NULL
broadcast interrupt
 dereference cpu1 tickdevice pointer -> OOPS
We dereference the pointer because cpu1 is still set in
tick_broadcast_force_mask and tick_do_broadcast() expects a valid
cpumask and therefore lacks any further checks.
Remove the cpu from the tick_broadcast_force_mask before we set the
tick device pointer to NULL. Also add a sanity check to the oneshot
broadcast function, so we can detect such issues w/o crashing the
machine.
Reported-by: Prarit Bhargava <prarit@redhat.com>
Cc: athorlton@sgi.com
Cc: CAI Qian <caiqian@redhat.com>
Link: http://lkml.kernel.org/r/alpine.DEB.2.02.1306261303260.4013@ionos.tec.linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
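The ordering fix and the added sanity check from the message above can
be modelled with a bitmask and an array of pointers. This is a userspace
sketch under stated assumptions: all sim_* names, the fixed CPU count
and the event counter are hypothetical and do not correspond to the real
tick code.

```c
#include <stdbool.h>
#include <stddef.h>

#define SIM_NR_CPUS 4

struct sim_tick_device { int events; };

static struct sim_tick_device sim_devices[SIM_NR_CPUS];
static struct sim_tick_device *sim_tick_dev[SIM_NR_CPUS];
static unsigned int sim_force_mask;	/* stand-in for the force mask */

void sim_reset(void)
{
	for (int cpu = 0; cpu < SIM_NR_CPUS; cpu++) {
		sim_devices[cpu].events = 0;
		sim_tick_dev[cpu] = &sim_devices[cpu];
	}
	sim_force_mask = 0;
}

void sim_set_forced(int cpu)
{
	sim_force_mask |= 1u << cpu;
}

/* With the fix, the mask bit is cleared before the pointer goes away. */
void sim_cpu_dead_cleanup(int cpu, bool with_fix)
{
	if (with_fix)
		sim_force_mask &= ~(1u << cpu);
	sim_tick_dev[cpu] = NULL;
}

/*
 * Deliver a broadcast to every CPU in the mask. The NULL test is the
 * sanity check; the return value counts entries that would otherwise
 * have been a NULL pointer dereference.
 */
int sim_broadcast(void)
{
	int stale = 0;

	for (int cpu = 0; cpu < SIM_NR_CPUS; cpu++) {
		if (!(sim_force_mask & (1u << cpu)))
			continue;
		if (!sim_tick_dev[cpu]) {
			stale++;
			continue;
		}
		sim_tick_dev[cpu]->events++;
	}
	return stale;
}
```

Calling sim_cpu_dead_cleanup() without the fix leaves a stale bit that
sim_broadcast() has to skip; with the fix the mask and the pointer array
stay consistent and no stale entry is seen.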
|
|
Use a completion to block until a secondary CPU has started up, as ARM
does, instead of a loop of udelays.
On Meta, SMP is really SMT, with each "CPU" being a different hardware
thread on the same Meta processor core, so as well as being more
efficient and latency friendly, using a completion prevents the bogomips
of the secondary CPU from being drastically skewed every time by the
execution of the tight in-cache udelay loop on the other CPU.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: "Srivatsa S. Bhat" <srivatsa.bhat@linux.vnet.ibm.com>
|
|
In secondary_start_kernel() interrupts should be enabled with
local_irq_enable() after the cpu is marked as online with
set_cpu_online(). Otherwise it's possible for a timer interrupt to
trigger a softirq, which, if the cpu is marked as offline, may have its
affinity altered.
Reported-by: Kirill Tkhai <tkhai@yandex.ru>
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: "Srivatsa S. Bhat" <srivatsa.bhat@linux.vnet.ibm.com>
Cc: Kirill Tkhai <tkhai@yandex.ru>
|
|
Checking for process->mm is not enough because process' main thread may
exit or detach its mm via use_mm(), but other threads may still have a
valid mm.
To fix this we would need to use find_lock_task_mm(), which walks
all threads and returns an appropriate task (with task lock held).
clear_tasks_mm_cpumask() was introduced in v3.5-rc1 to fix this issue,
so let's use it for metag too.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: "Srivatsa S. Bhat" <srivatsa.bhat@linux.vnet.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Anton Vorontsov <anton.vorontsov@linaro.org>
|
|
Every other place properly checks whether we've managed to set
up the stolen allocator at boot-up properly, with the exception
of the cleanup code, which results in an ugly
*ERROR* Memory manager not clean. Delaying takedown
at module unload time since the drm_mm isn't initialized at all.
v2: While at it check whether the stolen drm_mm is initialized instead
of the more obscure stolen_base == 0 check.
v3: Fix up the logic. Also we need to keep the stolen_base check in
i915_gem_object_create_stolen_for_preallocated since that can be
called before stolen memory is fully set up. Spotted by Chris Wilson.
v4: Re-add the conversion in i915_gem_object_create_stolen_for_preallocated,
the check is for the dev_priv->mm.gtt_space drm_mm, the stolen
allocator must already be initialized when calling that function (if
we indeed have stolen memory).
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=65953
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Tested-by: lu hua <huax.lu@intel.com> (v3)
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
|
|
Some Icera based Huawei modems handled by this driver are not
completely CDC ECM compliant, using the same USB interface for both
control and data. The CDC functional descriptors include a Union
naming this interface as both master and slave, so it is supportable
by relaxing the descriptor parsing in case these interfaces are
identical.
This has been tested on a Huawei K3806 and verified to add support
for that device.
Reported-and-tested-by: Enrico Mioso <mrkiko.rs@gmail.com>
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Acked-by: Oliver Neukum <oliver@neukum.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The tx_bytes field was not being updated so the
network card statistics showed 0.0B transmitted.
Signed-off-by: Jim Baxter <jim_baxter@mentor.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
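As a minimal illustration of the tx_bytes fix above, a transmit path has
to account for both the packet count and the byte count. The struct and
function below are hypothetical userspace stand-ins, not the driver's
actual code, which the message does not show.

```c
#include <stddef.h>

/* Hypothetical subset of the interface statistics. */
struct sim_net_stats {
	unsigned long tx_packets;
	unsigned long tx_bytes;
};

/* Account for one transmitted frame of len bytes. */
void sim_tx_account(struct sim_net_stats *stats, size_t len)
{
	stats->tx_packets++;
	stats->tx_bytes += len;	/* the kind of update that was missing */
}
```

If the second statement is left out, tx_packets keeps growing while
tx_bytes stays at zero, which matches the 0.0B symptom described.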
|
|
Incorporate the addition of hsize argument in write_buf callback
of pstore. This was forgotten in
6bbbca735936e15b9431882eceddcf6dff76e03c
pstore: Pass header size in the pstore write callback
Causing a build failure when ftrace and pstore are enabled.
Signed-off-by: Aruna Balakrishnaiah <aruna@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|
|
From Anatolij:
"There are small cleanups and fixes for mpc512x common code,
mpc512x_defconfig updates and soft reboot support for mpc5125
based boards."
|
|
This is a regression introduced by
commit fd58156e456d9f68fe0448 (IPIP: Use ip-tunneling code.)
Similar to GRE tunnel, previously we only check the parameters
for SIOCADDTUNNEL and SIOCCHGTUNNEL, after that commit, the
check is moved for all commands.
So, just check for SIOCADDTUNNEL and SIOCCHGTUNNEL.
Also, the check for i_key, o_key etc. is suspicious too,
which did not exist before, reset them before passing
to ip_tunnel_ioctl().
Cc: Pravin B Shelar <pshelar@nicira.com>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Cong Wang <amwang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
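The shape of the IPIP fix described above, validating parameters only
for the add and change commands and resetting the key fields
unconditionally, can be sketched as follows. This is a userspace model
under stated assumptions: the enum, struct sim_parm, the exact header
fields tested and the inclusion of i_flags/o_flags in the "etc." are
hypothetical, not taken from the message.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-ins for the SIOC*TUNNEL ioctl commands. */
enum sim_cmd { SIM_GETTUNNEL, SIM_ADDTUNNEL, SIM_CHGTUNNEL, SIM_DELTUNNEL };

#define SIM_IPPROTO_IPIP 4	/* IP-in-IP protocol number */

/* Hypothetical subset of the tunnel parameters. */
struct sim_parm {
	uint8_t version;
	uint8_t ihl;
	uint8_t protocol;
	uint32_t i_key, o_key;
	uint16_t i_flags, o_flags;
};

int sim_ipip_ioctl_prepare(enum sim_cmd cmd, struct sim_parm *p)
{
	/* Validate only when a tunnel is being added or changed. */
	if (cmd == SIM_ADDTUNNEL || cmd == SIM_CHGTUNNEL) {
		if (p->version != 4 || p->ihl != 5 ||
		    p->protocol != SIM_IPPROTO_IPIP)
			return -EINVAL;
	}

	/* Reset the key fields before handing off to the generic code. */
	p->i_key = p->o_key = 0;
	p->i_flags = p->o_flags = 0;
	return 0;
}
```

With this structure a get or delete request carrying an unfilled header
is no longer rejected, while add and change requests are still checked.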
|
|
Add missing .owner of struct pppox_proto. This prevents the
module from being removed from underneath its users.
Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
davinci_emac_probe()
There is already an error message within devm_ioremap_resource,
so remove the dev_err call to avoid a redundant
error message.
Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Acked-by: Lad, Prabhakar <prabhakar.csengg@gmail.com>
Acked-by: Mugunthan V N <mugunthanvnm@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Convert one printk to pr_<level>.
Add a missing newline in several places to avoid message interleaving,
coalesce formats, reflow modified lines to 80 columns.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|