summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2009-06-10nilfs2: modify list of unsupported features in caveatsRyusuke Konishi
This clarifies missing features of nilfs as a regular filesystem. Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
2009-06-10nilfs2: enable sync_page methodRyusuke Konishi
This adds a missing sync_page method which unplugs bio requests when waiting for page locks. This will improve read performance of nilfs. Here is a measurement result using dd command. Without this patch: # mount -t nilfs2 /dev/sde1 /test # dd if=/test/aaa of=/dev/null bs=512k 1024+0 records in 1024+0 records out 536870912 bytes (537 MB) copied, 6.00688 seconds, 89.4 MB/s With this patch: # mount -t nilfs2 /dev/sde1 /test # dd if=/test/aaa of=/dev/null bs=512k 1024+0 records in 1024+0 records out 536870912 bytes (537 MB) copied, 3.54998 seconds, 151 MB/s Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
2009-06-10nilfs2: set bio unplug flag for the last bio in segmentRyusuke Konishi
This sets BIO_RW_UNPLUG flag on the last bio of each segment during write. The last bio should be unplugged immediately because the caller waits for the completion after the submission. Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
2009-06-10nilfs2: allow future expansion of metadata read out via get info ioctlRyusuke Konishi
Nilfs has some ioctl commands to read out metadata from meta data files: - NILFS_IOCTL_GET_CPINFO for checkpoint file, - NILFS_IOCTL_GET_SUINFO for segment usage file, and - NILFS_IOCTL_GET_VINFO for Disk Address Transalation (DAT) file, respectively. Every routine on these metadata files is implemented so that it allows future expansion of on-disk format. But, the above ioctl commands do not support expansion even though nilfs_argv structure can handle arbitrary size for data exchanged via ioctl. This allows future expansion of the following structures which give basic format of the "get information" ioctls: - struct nilfs_cpinfo - struct nilfs_suinfo - struct nilfs_vinfo So, this introduces forward compatility of such ioctl commands. In this patch, a sanity check in nilfs_ioctl_get_info() function is changed to accept larger data structure [1], and metadata read routines are rewritten so that they become compatible for larger structures; the routines will just ignore the remaining fields which the current version of nilfs doesn't know. [1] The ioctl function already has another upper limit (PAGE_SIZE against a structure, which appears in nilfs_ioctl_wrap_copy function), and this will not cause security problem. Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
2009-06-10NILFS2: Pagecache usage optimization on NILFS2Hisashi Hifumi
Hi, I introduced "is_partially_uptodate" aops for NILFS2. A page can have multiple buffers and even if a page is not uptodate, some buffers can be uptodate on pagesize != blocksize environment. This aops checks that all buffers which correspond to a part of a file that we want to read are uptodate. If so, we do not have to issue actual read IO to HDD even if a page is not uptodate because the portion we want to read are uptodate. "block_is_partially_uptodate" function is already used by ext2/3/4. With the following patch random read/write mixed workloads or random read after random write workloads can be optimized and we can get performance improvement. I did a performance test using the sysbench. 1 --file-block-size=8K --file-total-size=2G --file-test-mode=rndrw --file-fsync-freq=0 --fil e-rw-ratio=1 run -2.6.30-rc5 Test execution summary: total time: 151.2907s total number of events: 200000 total time taken by event execution: 2409.8387 per-request statistics: min: 0.0000s avg: 0.0120s max: 0.9306s approx. 95 percentile: 0.0439s Threads fairness: events (avg/stddev): 12500.0000/238.52 execution time (avg/stddev): 150.6149/0.01 -2.6.30-rc5-patched Test execution summary: total time: 140.8828s total number of events: 200000 total time taken by event execution: 2240.8577 per-request statistics: min: 0.0000s avg: 0.0112s max: 0.8750s approx. 95 percentile: 0.0418s Threads fairness: events (avg/stddev): 12500.0000/218.43 execution time (avg/stddev): 140.0536/0.01 arch: ia64 pagesize: 16k Thanks. Signed-off-by: Hisashi Hifumi <hifumi.hisashi@oss.ntt.co.jp> Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
2009-06-10nilfs2: remove nilfs_btree_operations from btree mappingRyusuke Konishi
will remove indirect function calls using nilfs_btree_operations table. Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
2009-06-10nilfs2: remove nilfs_direct_operations from direct mappingRyusuke Konishi
will remove indirect function calls using nilfs_direct_operations table. Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
2009-06-10nilfs2: remove bmap pointer operationsRyusuke Konishi
Previously, the bmap codes of nilfs used three types of function tables. The abuse of indirect function calls decreased source readability and suffered many indirect jumps which would confuse branch prediction of processors. This eliminates one type of the function tables, nilfs_bmap_ptr_operations, which was used to dispatch low level pointer operations of the nilfs bmap. This adds a new integer variable "b_ptr_type" to nilfs_bmap struct, and uses the value to select the pointer operations. Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
2009-06-10nilfs2: remove useless b_low and b_high fields from nilfs_bmap structRyusuke Konishi
This will cut off 16 bytes from the nilfs_bmap struct which is embedded in the on-memory inode of nilfs. The b_high field was never used, and the b_low field stores a constant value which can be determined by whether the inode uses btree for block mapping or not. Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
2009-06-10nilfs2: remove pointless NULL check of bpop_commit_alloc_ptr functionRyusuke Konishi
This indirect function is set to NULL only for gc cache inodes, but the gc cache inodes never call this function. Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
2009-06-10nilfs2: move get block functions in bmap.c into btree codesRyusuke Konishi
Two get block function for btree nodes, nilfs_bmap_get_block() and nilfs_bmap_get_new_block(), are called only from the btree codes. This relocation will increase opportunities of compiler optimization. Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
2009-06-10nilfs2: remove nilfs_bmap_delete_blockRyusuke Konishi
nilfs_bmap_delete_block() is a wrapper function calling nilfs_btnode_delete(). This removes it for simplicity. Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
2009-06-10nilfs2: remove nilfs_bmap_put_blockRyusuke Konishi
nilfs_bmap_put_block() is a wrapper function calling brelse(). This eliminates the wrapper for simplicity. Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
2009-06-10nilfs2: remove header file for segment list operationsRyusuke Konishi
This will eliminate obsolete list operations of nilfs_segment_entry structure which has been used to handle mutiple segment numbers. The patch ("nilfs2: remove list of freeing segments") removed use of the structure from the segment constructor code, and this patch simplifies the remaining code by integrating it into recovery.c. Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
2009-06-10nilfs2: eliminate removal list of segmentsRyusuke Konishi
This will clean up the removal list of segments and the related functions from segment.c and ioctl.c, which have hurt code readability. This elimination is applied by using nilfs_sufile_updatev() previously introduced in the patch ("nilfs2: add sufile function that can modify multiple segment usages"). Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
2009-06-10nilfs2: add sufile function that can modify multiple segment usagesRyusuke Konishi
This is a preparation for the later cleanup patch ("nilfs2: remove list of freeing segments"). This adds nilfs_sufile_updatev() to sufile, which can modify multiple segment usages at a time. Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
2009-06-10nilfs2: unify bmap operations starting use of indirect block addressRyusuke Konishi
This simplifies some low level functions of bmap. Three bmap pointer operations, nilfs_bmap_start_v(), nilfs_bmap_commit_v(), and nilfs_bmap_abort_v(), are unified into one nilfs_bmap_start_v() function. And the related indirect function calls are replaced with it. Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
2009-06-10nilfs2: remove nilfs_dat_prepare_free functionRyusuke Konishi
This function is unused. Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
2009-06-10[SCSI] osd: Remove out-of-tree left oversBoaz Harrosh
* Delete Makefile. It is only used for out-of-tree compilation and was never needed. It slipped in by mistake. * Remove from Kbuild all the out of tree stuff as promised. Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-06-10[SCSI] libosd: Use REQ_QUIET requests.Boaz Harrosh
libosd has it's own sense decoding and printout. Don't let scsi_lib duplicate that printout. (Which is done wrong in regard to osd commands) Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-06-10[SCSI] osduld: use filp_open() when looking up an osd-deviceBoaz Harrosh
This patch was inspired by Al Viro, for simplifying and fixing the retrieval of osd-devices by in-kernel users, eg: file systems. In-Kernel users, now, go through the same path user-mode does by opening a file on the osd char-device and though holding a reference to both the device and the Module. A file pointer was added to the osd_dev structure which is now allocated for each user. The internal osd_dev is no longer exposed outside of the uld. I wanted to do that for a long time so each libosd user can have his own defaults on the device. The API is left the same, so user code need not change. It is no longer needed to open/close a file handle on the osd char-device from user-mode, before mounting an exofs on it. Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> CC: Al Viro <viro@ZenIV.linux.org.uk> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-06-10[SCSI] libosd: Define an osd_dev wrapper to retrieve the request_queueBoaz Harrosh
libosd users that need to work with bios, must sometime use the request_queue associated with the osd_dev. Make a wrapper for that, and convert all in-tree users. Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-06-10[SCSI] libosd: osd_req_{read,write} takes a length parameterBoaz Harrosh
For supporting of chained-bios we can not inspect the first bio only, as before. Caller shall pass the total length of the request, ie. sum_bytes(bio-chain). Also since the bio might be a chain we don't set it's direction on behalf of it's callers. The bio direction should be properly set prior to this call. So fix a couple of write users that now need to set the bio direction properly [In this patch I change both library code and user sites at exofs, to make it easy on integration. It should be submitted via James's scsi-misc tree.] Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> CC: Jeff Garzik <jeff@garzik.org> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-06-10[SCSI] libosd: Let _osd_req_finalize_data_integrity receive number of out_bytesBoaz Harrosh
_osd_req_finalize_data_integrity was trying to deduce the number of out_bytes from passed osd_request->out.bio. This is wrong when the bio is chained. The caller of _osd_req_finalize_data_integrity has more ready available information and should just pass it. Also in the light of future support for CDB-continuation segment this is a better solution. Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-06-10[SCSI] libosd: osd_req_{read,write}_kern new APIBoaz Harrosh
By popular demand, define usefull wrappers for osd_req_read/write that recieve kernel pointers. All users had their own. Also remove these from exofs Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-06-10[SCSI] libosd: Better printout of OSD target system informationBoaz Harrosh
Shorten out the Attributes names. Align all results on column 24. Print system ID in a new line. Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-06-10[SCSI] libosd: OSD2r05: Attribute definitionsBoaz Harrosh
Some New revision 5 Attribute definitions. Some missing definitions of Attributes and pages Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-06-10[SCSI] libosd: OSD2r05: Additional command enumsBoaz Harrosh
Add all constant definitions of new OSD commands added in revision 4 & 5. Mainly for creating snapshots and clones. Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-06-10ASoC: Fix lm4857 controlMark Brown
Commit 4eaa9819dc08d7bfd1065ce530e31b18a119dcaf changed semantics of private_value member of kcontrol. This resulted in inability to control amplifier and subsequently in very low output volume. Tested-by: Johannes Schauer <josch@pyneo.org> Signed-off-by: Paul Fertser <fercerpav@gmail.com> Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
2009-06-10ide: re-implement ide_pci_init_one() on top of ide_pci_init_two()Bartlomiej Zolnierkiewicz
There should be no functional changes caused by this patch. Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2009-06-10ide: unexport ide_find_dma_mode()Bartlomiej Zolnierkiewicz
Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2009-06-10ide: fix PowerMac bootup oopsHugh Dickins
PowerMac bootup with CONFIG_IDE=y oopses in ide_pio_cycle_time(): because "ide: try to use PIO Mode 0 during probe if possible" causes pmac_ide_set_pio_mode() to be called before drive->id has been set. Bart points out other places which now need drive->id set earlier, so follow his advice to allocate it in ide_port_alloc_devices() (using kzalloc_node, without error message, as when allocating drive) and memset it for reuse in ide_port_init_devices_data(). Fixed in passing: ide_host_alloc() was missing ide_port_free_devices() from an error path. Signed-off-by: Hugh Dickins <hugh.dickins@tiscali.co.uk> Cc: Joao Ramos <joao.ramos@inov.pt> Cc: Sergei Shtylyov <sshtylyov@ru.mvista.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2009-06-10netfilter: nf_conntrack: use per-conntrack locks for protocol dataPatrick McHardy
Introduce per-conntrack locks and use them instead of the global protocol locks to avoid contention. Especially tcp_lock shows up very high in profiles on larger machines. This will also allow to simplify the upcoming reliable event delivery patches. Signed-off-by: Patrick McHardy <kaber@trash.net>
2009-06-10KVM: Prevent overflow in largepages calculationAvi Kivity
If userspace specifies a memory slot that is larger than 8 petabytes, it could overflow the largepages variable. Cc: stable@kernel.org Signed-off-by: Avi Kivity <avi@redhat.com>
2009-06-10KVM: Disable large pages on misaligned memory slotsAvi Kivity
If a slots guest physical address and host virtual address unequal (mod large page size), then we would erronously try to back guest large pages with host large pages. Detect this misalignment and diable large page support for the trouble slot. Cc: stable@kernel.org Signed-off-by: Avi Kivity <avi@redhat.com>
2009-06-10ata_piix: Remove stale commentAlan Cox
Combined mode pci quirk hacks went away - so the table to keep in sync no longer exists. Signed-off-by: Alan Cox <alan@linux.intel.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2009-06-10ata_piix: Turn on hotplugging support for older chipsAlan Cox
We can't do this for the later ones as they have all sorts of magic boot time stuff that needs reviewing and the like. However we can do it for the older ones and it turns out we need to as some IBM docking stations have a second PIIX series device in them and without this change you can't use it very well Signed-off-by: Alan Cox <alan@linux.intel.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2009-06-10ahci: misc cleanups for EM stuffTejun Heo
Make the following EM related cleanups. * Use msleep(1) instead of udelay(100) and reduce retry count to 5. * s/MAX_SLOTS/EM_MAX_SLOTS/, s/MAX_RETRY/EM_MAX_RETRY/ * Make EM constants enums as suggested by Jeff. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: David Milburn <dmilburn@redhat.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2009-06-10[libata] get rid of ATA_MAX_QUEUE loop in ata_qc_complete_multiple() v2Jens Axboe
We very rarely (if ever) complete more than one command in the sactive mask at the time, even for extremely high IO rates. So looping over the entire range of possible tags is pointless, instead use __ffs() to just find the completed tags directly. Updated to clear the tag from the done_mask instead of shifting done_mask down as suggested by From: Tejun Heo <htejun@gmail.com> Verified with a user space tester to produce the same results. Signed-off-by: Jens Axboe <jens.axboe@oracle.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2009-06-10sata_sil: enable 32-bit PIORobert Hancock
32-bit PIO seems to work fine on sata_sil hardware (tested on SiI3114) and is listed as OK in the Silicon Image datasheets. Enable it. Signed-off-by: Robert Hancock <hancockrwd@gmail.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2009-06-10sata_sx4: speed up ECC initializationAlexander Beregalov
ECC initialization takes too long. It writes zeroes by portions of 4 byte, it takes more than 6 minutes on my machine to initialize 512Mb ECC DIMM module. Change portion to 128Kb - it significantly reduces initialization time. Signed-off-by: Alexander Beregalov <a.beregalov@gmail.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2009-06-10libata-sff: avoid byte swapping in ata_sff_data_xfer()Sergei Shtylyov
Handling of the trailing byte in ata_sff_data_xfer() is suboptimal bacause: - it always initializes the padding buffer to 0 which is not really needed in both the read and write cases; - it has to use memcpy() to transfer a single byte from/to the padding buffer; - it uses io{read|write}16() accessors which swap bytes on the big endian CPUs and so have to additionally convert the data from/to the little endian format instead of using io{read|write}16_rep() accessors which are not supposed to change the byte ordering. Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2009-06-10[libata] ahci: use less error-prone array initializersJeff Garzik
Also, remove unneeded prototype. Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2009-06-10perf_counter/x86: Fix the model number of Intel Core2 processorsYong Wang
Fix the model number of Intel Core2 processors according to the documentation: Intel Processor Identification with the CPUID Instruction: http://www.intel.com/support/processors/sb/cs-009861.htm Signed-off-by: Yong Wang <yong.y.wang@intel.com> Also-Reported-by: Arnd Bergmann <arnd@arndb.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <20090610090612.GA26580@ywang-moblin2.bj.intel.com> [ Added two more model numbers suggested by Arnd Bergmann ] Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-06-10amd64_edac: add MAINTAINERS entryBorislav Petkov
Acked-by: Doug Thompson <dougthompson@xmission.com> Acked-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2009-06-10EDAC: do not enable modules by defaultBorislav Petkov
Prevent EDAC compilation units from being built by default and let the user explicitly select the needed modules. Acked-by: Randy Dunlap <randy.dunlap@oracle.com> Tested-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2009-06-10amd64_edac: do not enable module by defaultBorislav Petkov
While at it, fix a link failure when !K8_NB. Acked-by: Doug Thompson <dougthompson@xmission.com> Acked-by: Randy Dunlap <randy.dunlap@oracle.com> Tested-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2009-06-10amd64_edac: add module registration routinesDoug Thompson
Also, link into Kbuild by adding Kconfig and Makefile entries. Borislav: - Kconfig/Makefile splitting - use zero-sized arrays for the sysfs attrs if not enabled - rename sysfs attrs to more conform values - shorten CONFIG_ names - make multiple structure members assignment vertically aligned - fix/cleanup comments - fix function return value patterns - fix err labels - fix a memleak bug caught by Ingo - remove the NUMA dependency and use num_k8_northbrides for initializing a driver instance per NB. - do not copy the pvt contents into the mci struct in amd64_init_2nd_stage() and save it in the mci->pvt_info void ptr instead. - cleanup debug calls - simplify amd64_setup_pci_device() Reviewed-by: Mauro Carvalho Chehab <mchehab@redhat.com> Signed-off-by: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2009-06-10amd64_edac: add ECC reporting initializersDoug Thompson
Borislav: - convert to the new {rd|wr}msr_on_cpus interfaces. - convert pvt->old_mcgctl to a bitmask thus saving some bytes - fix/cleanup comments - fix function return value patterns - add a proper bugfix found by Doug to amd64_check_ecc_enabled where we missed checking for the ECC enabled bit in NB CFG. - cleanup debug calls Reviewed-by: Mauro Carvalho Chehab <mchehab@redhat.com> Signed-off-by: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2009-06-10amd64_edac: add EDAC core-related initializersDoug Thompson
Borislav: - add a amd64_free_mc_sibling_devices() helper instead of opencoding the release-path. - fix/cleanup comments - fix function return value patterns - cleanup debug calls Reviewed-by: Mauro Carvalho Chehab <mchehab@redhat.com> Signed-off-by: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>