linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2016-09-19	staging: lustre: statahead: ll_intent_drop_lock() called in spinlock	Lai Siyao
	ll_intent_drop_lock() may sleep, which should not be called inside spinlock. Signed-off-by: Lai Siyao <lai.siyao@intel.com> Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-2272 Reviewed-on: http://review.whamcloud.com/9665 Reviewed-by: Fan Yong <fan.yong@intel.com> Reviewed-by: James Simmons <uja.ornl@gmail.com> Reviewed-by: Oleg Drokin <oleg.drokin@intel.com> Signed-off-by: James Simmons <jsimmons@infradead.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-19	staging: lustre: statahead: use dcache-like interface for sa entry	Lai Siyao
	Rename ll_sa_entry to sa_entry, and manage sa_entry cache with dcache-like interfaces. sa_entry is not needed to be refcounted, because only scanner can free it, so after it's put in stat list, statahead thread shouldn't access it any longer. ll_statahead_interpret() doesn't need to take sai refcount, because statahead thread will wait for all inflight RPC to finish. Signed-off-by: Lai Siyao <lai.siyao@intel.com> Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-3270 Reviewed-on: http://review.whamcloud.com/9664 Reviewed-by: James Simmons <uja.ornl@gmail.com> Reviewed-by: Fan Yong <fan.yong@intel.com> Reviewed-by: Oleg Drokin <oleg.drokin@intel.com> Signed-off-by: James Simmons <jsimmons@infradead.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-19	staging: lustre: llite: allow setting stripes to specify OSTs	Jinshan Xiong
	Extend the llite layer to support specifying individual target OSTs. Support specifying OSTs for regular files only. Directory support will be implemented later in a separate project. With this a file could have for example a OST index layout of 2,4,5,9,11. In addition, duplicate indices will be eliminated automatically. Calculate the max easize by ld_active_tgt_count instead of ld_tgt_count. However this may introduce problems when the OSTs are in recovery because non sufficient buffer may be allocated to store EA. Signed-off-by: Jian Yu <jian.yu@intel.com> Signed-off-by: Jinshan Xiong <jinshan.xiong@intel.com> Signed-off-by: James Simmons <uja.ornl@gmail.com> Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-4665 Reviewed-on: http://review.whamcloud.com/9383 Reviewed-by: Andreas Dilger <andreas.dilger@intel.com> Reviewed-by: John L. Hammond <john.hammond@intel.com> Reviewed-by: Oleg Drokin <oleg.drokin@intel.com> Signed-off-by: James Simmons <jsimmons@infradead.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-19	staging: lustre: llite: Add ioctl to get parent fids from link EA.	Henri Doreau
	Added LL_IOC_GETPARENT to retrieve the <parent_fid>/name(s) of a given entry, based on its link EA. This saves multiple calls to path2fid/fid2path. Merged with second later patch that does various cleanups. Avoid unneeded allocation. Get read-only attributes from the user getparent structure and write the modified attributes only, instead of populating a whole structure in kernel and copying it back. Signed-off-by: Thomas Leibovici <thomas.leibovici@cea.fr> Signed-off-by: Henri Doreau <henri.doreau@cea.fr> Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-3613 Reviewed-on: http://review.whamcloud.com/7069 Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-5837 Reviewed-on: http://review.whamcloud.com/12527 Reviewed-by: Ned Bass <bass6@llnl.gov> Reviewed-by: frank zago <fzago@cray.com> Reviewed-by: John L. Hammond <john.hammond@intel.com> Reviewed-by: Oleg Drokin <oleg.drokin@intel.com> Signed-off-by: James Simmons <jsimmons@infradead.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-19	staging: lustre: obd: restore linkea support	James Simmons
	Original linkea was only used for the lustre server code so it was removed from the upstream client. Now it needs to be restored for client work that uses this infrastructure. Signed-off-by: James Simmons <jsimmons@infradead.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-19	staging: lustre: lmv: add testing for bad name hash	Fan Yong
	Enable testing of the lfsck recovery feature in the client code for the case when name hash for some entry becomes corrupt. Signed-off-by: Fan Yong <fan.yong@intel.com> Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-5519 Reviewed-on: http://review.whamcloud.com/11846 Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com> Reviewed-by: Lai Siyao <lai.siyao@intel.com> Reviewed-by: Oleg Drokin <oleg.drokin@intel.com> Signed-off-by: James Simmons <jsimmons@infradead.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-19	staging: lustre: llite: ensure all data flush out when umount	Yang Sheng
	Write out all extents when clear inode. Otherwise we may lose data while umount. Signed-off-by: Yang Sheng <yang.sheng@intel.com> Signed-off-by: Jinshan Xiong <jinshan.xiong@intel.com> Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-5584 Reviewed-on: http://review.whamcloud.com/12103 Reviewed-by: Bobi Jam <bobijam@gmail.com> Reviewed-by: Oleg Drokin <oleg.drokin@intel.com> Signed-off-by: James Simmons <jsimmons@infradead.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-19	staging: lustre: ldlm: per-export lock callback timeout	Vitaly Fertman
	The lock callback timeout is calculated as an average per namespace. This does not reflect individual client behavior. Instead, we should calculate it on a per-export basis. This is the client side changes for upstream client. Signed-off-by: Vitaly Fertman <vitaly_fertman@xyratex.com> Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-4942 Reviewed-by: Andriy Skulysh <Andriy_Skulysh@xyratex.com> Reviewed-by: Alexey Lyashkov <Alexey_Lyashkov@xyratex.com> Xyratex-bug-id: MRP-417 Reviewed-on: http://review.whamcloud.com/9336 Reviewed-by: Oleg Drokin <oleg.drokin@intel.com> Reviewed-by: James Simmons <uja.ornl@gmail.com> Signed-off-by: James Simmons <jsimmons@infradead.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-19	staging: lustre: lmv: move some inline functions to lustre_lmv.h	Fan Yong
	Move some inline code out of lmv core into lustre_lmv.h. This is to prepare for use outside of the lmv layer in the future of these functions. Change from passing in struct lmv_stripe_md to just int for lmv_is_known_hash_type. Signed-off-by: Fan Yong <fan.yong@intel.com> Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-5519 Reviewed-on: http://review.whamcloud.com/11845 Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com> Reviewed-by: Lai Siyao <lai.siyao@intel.com> Reviewed-by: Oleg Drokin <oleg.drokin@intel.com> Signed-off-by: James Simmons <jsimmons@infradead.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-19	staging: lustre: Flexible changelog format.	Henri Doreau
	Added jobid fields to Changelog records (and extended records). The CLF_JOBID flags allows to check if the field is present or not (old format) when reading an entry. Jobids are expressed as 32 chars long, zero-terminated strings. Updated test_205 in sanity.sh. Signed-off-by: Henri Doreau <henri.doreau@cea.fr> Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-1996 Reviewed-on: http://review.whamcloud.com/4060 Reviewed-by: Andreas Dilger <andreas.dilger@intel.com> Reviewed-by: Aurelien Degremont <aurelien.degremont@cea.fr> Signed-off-by: James Simmons <jsimmons@infradead.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-19	staging: lustre: llite: enforce pool name length limit	Li Xi
	The pool related codes have some inconsistency about the length of pool name. Creating and setting a pool name of length 16 to a directory will succeed. However, creating a file under that directory will fail. This patch disables any pool name which is longer or equal to 16. And it changes LOV_MAXPOOLNAME from 16 to 15 which might cause some invalid LLOG records of OST pools with 16 byte names. It is not a problem since invalid LLOG records are just ignored. And OST pools with 16 byte names won't work well anyway on the old versions. There will be problem of inconsistency if part of the servers have this patch and part of the servers don't. But it would be safe to assume that this is not a normal configuration. Signed-off-by: Li Xi <lixi@ddn.com> Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-5054 Reviewed-on: http://review.whamcloud.com/10306 Reviewed-by: Andreas Dilger <andreas.dilger@intel.com> Reviewed-by: Fan Yong <fan.yong@intel.com> Signed-off-by: James Simmons <jsimmons@infradead.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-19	staging: lustre: llite: handle concurrent use of cob_transient_pages	Stephen Champion
	With the lockless __generic_file_aio_write introduced in LU-1669, ll_direct_IO_26 is no longer protected by the inode i_isem. This renders obsoltete checks that all transient pages have been handled before and after entry, and requires atomic access to their counter. Signed-off-by: Stephen Champion <schamp@sgi.com> Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-5700 Reviewed-on: http://review.whamcloud.com/12179 Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com> Reviewed-by: Bobi Jam <bobijam@gmail.com> Reviewed-by: Oleg Drokin <oleg.drokin@intel.com> Signed-off-by: James Simmons <jsimmons@infradead.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-19	staging: lustre: lmv: remove dead code	Dmitry Eremin
	The member lmv_obd->server_timeout and function lmv_set_timeouts() are not used. Signed-off-by: Dmitry Eremin <dmitry.eremin@intel.com> Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-991 Reviewed-on: http://review.whamcloud.com/11880 Reviewed-by: John L. Hammond <john.hammond@intel.com> Reviewed-by: James Simmons <uja.ornl@gmail.com> Reviewed-by: Oleg Drokin <oleg.drokin@intel.com> Signed-off-by: James Simmons <jsimmons@infradead.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-19	staging: lustre: misc: Reduce exposure to overflow on page counters.	Stephen Champion
	When the number of an object in use or circulation is tied to memory size of the system, very large memory systems can overflow 32 bit counters. This patch addresses overflow on page counters in the osc LRU and obd accounting. Signed-off-by: Stephen Champion <schamp@sgi.com> Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-4856 Reviewed-on: http://review.whamcloud.com/10537 Reviewed-by: Andreas Dilger <andreas.dilger@intel.com> Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com> Reviewed-by: James Simmons <uja.ornl@gmail.com> Reviewed-by: Oleg Drokin <oleg.drokin@intel.com> Signed-off-by: James Simmons <jsimmons@infradead.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-19	staging: lustre: lmv: change type of lmv_obd->tgts_size to u32	Dmitry Eremin
	tgts_size is used as unsigned. Signed-off-by: Dmitry Eremin <dmitry.eremin@intel.com> Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-5577 Reviewed-on: http://review.whamcloud.com/11881 Reviewed-by: Fan Yong <fan.yong@intel.com> Reviewed-by: John L. Hammond <john.hammond@intel.com> Reviewed-by: Oleg Drokin <oleg.drokin@intel.com> Signed-off-by: James Simmons <jsimmons@infradead.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-19	staging: lustre: obd: change type of lmv_tgt_desc->ltd_idx to u32	Dmitry Eremin
	ltd_idx is used as unsigned. Signed-off-by: Dmitry Eremin <dmitry.eremin@intel.com> Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-5577 Reviewed-on: http://review.whamcloud.com/11879 Reviewed-by: John L. Hammond <john.hammond@intel.com> Reviewed-by: Fan Yong <fan.yong@intel.com> Reviewed-by: Oleg Drokin <oleg.drokin@intel.com> Signed-off-by: James Simmons <jsimmons@infradead.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-19	staging: lustre: llite: don't call make_bad_inode() on an old inode	John L. Hammond
	In ll_iget() if ll_update_inode() fails then do not call make_bad_inode() on the inode since it may still be in use. Signed-off-by: John L. Hammond <john.hammond@intel.com> Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-5468 Reviewed-on: http://review.whamcloud.com/11609 Reviewed-by: Lai Siyao <lai.siyao@intel.com> Reviewed-by: Fan Yong <fan.yong@intel.com> Reviewed-by: Oleg Drokin <oleg.drokin@intel.com> Signed-off-by: James Simmons <jsimmons@infradead.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-19	staging: lustre: obd: rename LUSTRE_STRIPE_MAXBYTES	John L. Hammond
	Rename LUSTRE_STRIPE_MAXBYTES to LUSTRE_EXT3_STRIPE_MAXBYTES and correct the comment describing its use. Signed-off-by: John L. Hammond <john.hammond@intel.com> Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-2675 Reviewed-on: http://review.whamcloud.com/11800 Reviewed-by: Andreas Dilger <andreas.dilger@intel.com> Reviewed-by: Bob Glossman <bob.glossman@intel.com> Signed-off-by: James Simmons <jsimmons@infradead.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-19	staging: lustre: remove lustre_lite.h	John L. Hammond
	Move several definition only used in lustre/llite/ to lustre/llite/llite_internal.h. Remove lustre/include/{,linux/}lustre_lite.h and fixup the missing includes in other headers that this exposes. Signed-off-by: John L. Hammond <john.hammond@intel.com> Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-2675 Reviewed-on: http://review.whamcloud.com/11501 Reviewed-by: Bob Glossman <bob.glossman@intel.com> Reviewed-by: Andreas Dilger <andreas.dilger@intel.com> Signed-off-by: James Simmons <jsimmons@infradead.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-19	staging: lustre: osc: debug to match extent to brw RPC	Patrick Farrell
	Currently, it's difficult to match brw RPCs to objects and extents from client logs. This patch adds a D_RPCTRACE debug message giving the necessary information. Signed-off-by: Patrick Farrell <paf@cray.com> Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-5531 Reviewed-on: http://review.whamcloud.com/11548 Reviewed-by: Alexey Lyashkov <alexey.lyashkov@seagate.com> Reviewed-by: Ann Koehler <amk@cray.com> Reviewed-by: Ryan Haasken <haasken@cray.com> Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com> Reviewed-by: Oleg Drokin <oleg.drokin@intel.com> Signed-off-by: James Simmons <jsimmons@infradead.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-19	staging: lustre: cleanup lustre_lib.h	John L. Hammond
	Remove some unused declarations from lustre_lib.h and move some others to more natural headers. Signed-off-by: John L. Hammond <john.hammond@intel.com> Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-2675 Reviewed-on: http://review.whamcloud.com/11500 Reviewed-by: Andreas Dilger <andreas.dilger@intel.com> Reviewed-by: Bob Glossman <bob.glossman@intel.com> Reviewed-by: Oleg Drokin <oleg.drokin@intel.com> Signed-off-by: James Simmons <jsimmons@infradead.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-19	staging: lustre: llite: no need to check dentry is NULL	John L. Hammond
	We are already touching dentry in CDEBUG macros so it will crash long before these checks. Since this is the case no need to do an additional check. Signed-off-by: John L. Hammond <john.hammond@intel.com> Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-2675 Reviewed-on: http://review.whamcloud.com/10769 Reviewed-by: Yang Sheng <yang.sheng@intel.com> Reviewed-by: Lai Siyao <lai.siyao@intel.com> Reviewed-by: Oleg Drokin <oleg.drokin@intel.com> Signed-off-by: James Simmons <jsimmons@infradead.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-19	staging: lustre: llite: style cleanup for ll_mkdir	John L. Hammond
	Style cleanup to make the code readable. Signed-off-by: John L. Hammond <john.hammond@intel.com> Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-2675 Reviewed-on: http://review.whamcloud.com/10769 Reviewed-by: Yang Sheng <yang.sheng@intel.com> Reviewed-by: Lai Siyao <lai.siyao@intel.com> Reviewed-by: Oleg Drokin <oleg.drokin@intel.com> Signed-off-by: James Simmons <jsimmons@infradead.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-19	staging: lustre: llite: turn mode to umode_t for ll_new_inode()	John L. Hammond
	Change int mode to umode_t. Signed-off-by: John L. Hammond <john.hammond@intel.com> Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-2675 Reviewed-on: http://review.whamcloud.com/10769 Reviewed-by: Yang Sheng <yang.sheng@intel.com> Reviewed-by: Lai Siyao <lai.siyao@intel.com> Reviewed-by: Oleg Drokin <oleg.drokin@intel.com> Signed-off-by: James Simmons <jsimmons@infradead.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-19	staging: lustre: llite: remove mode from ll_create_it()	John L. Hammond
	Remove the unused mode parameter from ll_create_it(). Signed-off-by: John L. Hammond <john.hammond@intel.com> Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-2675 Reviewed-on: http://review.whamcloud.com/10769 Reviewed-by: Yang Sheng <yang.sheng@intel.com> Reviewed-by: Lai Siyao <lai.siyao@intel.com> Reviewed-by: Oleg Drokin <oleg.drokin@intel.com> Signed-off-by: James Simmons <jsimmons@infradead.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-19	staging: lustre: llite: remove lookup_flags from ll_lookup_it()	John L. Hammond
	Remove the effectively unused lookup_flags parameter from ll_lookup_it(). Signed-off-by: John L. Hammond <john.hammond@intel.com> Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-2675 Reviewed-on: http://review.whamcloud.com/10769 Reviewed-by: Yang Sheng <yang.sheng@intel.com> Reviewed-by: Lai Siyao <lai.siyao@intel.com> Reviewed-by: Oleg Drokin <oleg.drokin@intel.com> Signed-off-by: James Simmons <jsimmons@infradead.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-19	staging: lustre: vvp: Use lockless __generic_file_aio_write	Prakash Surya
	Testing multi-threaded single shard file write performance has shown the inode mutex to be a limiting factor when using the generic_file_write_iter function. To work around this bottle neck, this change replaces the locked version of that call with the lock less version, specifically, __generic_file_write_iter. In order to maintain posix consistency, Lustre must now employ it's own locking mechanism in the higher layers. Currently writes are protected using the lli_write_mutex in the ll_inode_info structure. To protect against simultaneous write and truncate operations, since we no longer take the inode mutex during writes, we must down the lli_trunc_sem semaphore. Unfortunately, this change by itself does not garner any performance benefits. Using FIO on a single machine with 32 GB of RAM, write performance tests were ran with and without this change applied; the results are below: +---------+-----------+---------+--------+--------+ \| fio v2.0.13 \| Write Bandwidth (KB/s) \| +---------+-----------+---------+--------+--------+ \| # Tasks \| GB / Task \| Test 1 \| Test 2 \| Test 3 \| +---------+-----------+---------+--------+--------+ \| 1 \| 64 \| 452446 \| 454623 \| 457653 \| \| 2 \| 32 \| 850318 \| 565373 \| 602498 \| \| 4 \| 16 \| 1058900 \| 463546 \| 529107 \| \| 8 \| 8 \| 1026300 \| 468190 \| 576451 \| \| 16 \| 4 \| 1065500 \| 503160 \| 462902 \| \| 32 \| 2 \| 1068600 \| 462228 \| 466963 \| \| 64 \| 1 \| 991830 \| 556618 \| 557863 \| +---------+-----------+---------+--------+--------+ * Test 1: Lustre client running 04ec54f. File per process write workload. This test was used as a baseline for what we _could_ achieve in the single shared file tests if the bottle necks were removed. * Test 2: Lustre client running 04ec54f. Single shared file workload, each task writing to a unique region. * Test 3: Lustre client running 04ec54f + this patch. Single shared file workload, each task writing to a unique region. In order to garner any real performance benefits out of a single shared file workload, the lli_write_mutex needs to be broken up into a range lock. That would allow write operations to unique regions of a file to be executed concurrently. This work is left to be done in a follow up patch. Signed-off-by: Prakash Surya <surya1@llnl.gov> Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-1669 Reviewed-on: http://review.whamcloud.com/6672 Reviewed-by: Lai Siyao <lai.siyao@intel.com> Reviewed-by: Andreas Dilger <andreas.dilger@intel.com> Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com> Reviewed-by: Oleg Drokin <oleg.drokin@intel.com> Signed-off-by: James Simmons <jsimmons@infradead.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-19	staging: lustre: llite: Replace write mutex with range lock	Prakash Surya
	Testing has shown the ll_inode_inode's lli_write_mutex to be a limiting factor with single shared file write performance, when using many writing threads on a single machine. Even if each thread is writing to a unique portion of the file, the lli_write_mutex will prevent no more than a single thread to ever write to the file simultaneously. This change attempts to remove this bottle neck, by replacing this mutex with a range lock. This should allow multiple threads to write to a single file simultaneously iff the threads are writing to unique regions of the file. Performance testing shows this change to garner a significant performance boost to write bandwidth. Using FIO on a single machine with 32 GB of RAM, write performance tests were run with and without this change applied; the results are below: +---------+-----------+---------+--------+--------+--------+ \| fio v2.0.13 \| Write Bandwidth (KB/s) \| +---------+-----------+---------+--------+--------+--------+ \| # Tasks \| GB / Task \| Test 1 \| Test 2 \| Test 3 \| Test 4 \| +---------+-----------+---------+--------+--------+--------+ \| 1 \| 64 \| 452446 \| 454623 \| 457653 \| 463737 \| \| 2 \| 32 \| 850318 \| 565373 \| 602498 \| 733027 \| \| 4 \| 16 \| 1058900 \| 463546 \| 529107 \| 976284 \| \| 8 \| 8 \| 1026300 \| 468190 \| 576451 \| 963404 \| \| 16 \| 4 \| 1065500 \| 503160 \| 462902 \| 830065 \| \| 32 \| 2 \| 1068600 \| 462228 \| 466963 \| 749733 \| \| 64 \| 1 \| 991830 \| 556618 \| 557863 \| 710912 \| +---------+-----------+---------+--------+--------+--------+ * Test 1: Lustre client running 04ec54f. File per process write workload. This test was used as a baseline for what we _could_ achieve in the single shared file tests if the bottle necks were removed. * Test 2: Lustre client running 04ec54f. Single shared file workload, each task writing to a unique region. * Test 3: Lustre client running 04ec54f + I0023132b. Single shared file workload, each task writing to a unique region. * Test 4: Lustre client running 04ec54f + this patch. Single shared file workload, each task writing to a unique region. Direct IO does not use the page cache like normal IO, so concurrent direct IO reads of the same pages are not safe. As a result, direct IO reads must take the range lock in ll_file_io_generic, otherwise they will attempt to work on the same pages and hit assertions like: (osc_request.c:1219:osc_brw_prep_request()) ASSERTION( i == 0 \|\| pg->off > pg_prev->off ) failed: i 3 p_c 10 pg ffffea00017a5208 [pri 0 ind 2771] off 16384 prev_pg ffffea00017a51d0 [pri 0 ind 2256] off 16384 Signed-off-by: Prakash Surya <surya1@llnl.gov> Signed-off-by: Patrick Farrell <paf@cray.com> Signed-off-by: Jinshan Xiong <jinshan.xiong@intel.com> Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-1669 Reviewed-on: http://review.whamcloud.com/6320 Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-6227 Reviewed-on: http://review.whamcloud.com/14385 Reviewed-by: Bobi Jam <bobijam@hotmail.com> Reviewed-by: Alexander Boyko <alexander.boyko@seagate.com> Reviewed-by: Hiroya Nozaki <nozaki.hiroya@jp.fujitsu.com> Reviewed-by: Andreas Dilger <andreas.dilger@intel.com> Reviewed-by: Oleg Drokin <oleg.drokin@intel.com> Signed-off-by: James Simmons <jsimmons@infradead.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-19	staging: lustre: ldlm: restore some of the interval functionality	James Simmons
	Earlier a bunch of interval handling got removed since it wasn't used by the upstream client. Now some of it is needed again for the client code so this patch restores what is needed. Signed-off-by: James Simmons <jsimmons@infradead.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-19	staging: lustre: ldlm: resend AST callbacks	Vitaly Fertman
	While clients will resend client->server RPCs, servers would not resend server->client RPCs such as LDLM callbacks (blocking or completion callbacks/ASTs). This could result in clients being evicted from the server if blocking callbacks were dropped by the network (a failed router or lossy network) and the client did not cancel the requested lock in time. In order to fix this problem, this patch adds the ability to resend LDLM callbacks from the server and give the client a chance to respond within the timeout period before it is evicted: - resend BL AST within lock callback timeout period; - still do not resend CANCEL_ON_BLOCK; - regular resend for CP AST without BL AST embedded; - prolong lock callback timeout on resend; some fixes: - recovery-small test_10 to actually evict the client with dropped BL AST; - ETIMEDOUT to be returned if send limit is expired; Signed-off-by: Vitaly Fertman <vitaly_fertman@xyratex.com> Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-5520 Reviewed-by: Alexey Lyashkov <Alexey_Lyashkov@xyratex.com> Reviewed-by: Andriy Skulysh <Andriy_Skulysh@xyratex.com> Xyratex-bug-id: MRP-417 Reviewed-on: http://review.whamcloud.com/9335 Reviewed-by: Andreas Dilger <andreas.dilger@intel.com> Reviewed-by: Johann Lombardi <johann.lombardi@intel.com> Reviewed-by: Oleg Drokin <oleg.drokin@intel.com> Signed-off-by: James Simmons <jsimmons@infradead.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-19	staging: lustre: ldlm: reconstruct proper flags on enqueue resend	Vitaly Fertman
	otherwise, waiting lock may get granted as no BLOCKED_GRANTED flag is returned Signed-off-by: Vitaly Fertman <vitaly_fertman@xyratex.com> Xyratex-bug-id: MRP-1944 Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-5496 Reviewed-on: http://review.whamcloud.com/11644 Reviewed-by: James Simmons <uja.ornl@gmail.com> Reviewed-by: Oleg Drokin <oleg.drokin@intel.com> Signed-off-by: James Simmons <jsimmons@infradead.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-19	staging: lustre: statahead: statahead thread wait for RPCs to finish	Lai Siyao
	Statahead thread should wait for inflight stat RPCs to finish in case statahead RPC callback may access data allocated in statahead thread context. ll_sa_entry_fini() should keep old entry if stat RPC is not finished yet. Simplify sai refcounting: * newly allocated sai will hold one refcount, and it will put it after starting statahead thread. * statahead thread holds one refcount. * agl thread holds one refcount. * stat process calls do_statahead_enter() which will try to get sai, and if it's valid, it will revalidate from statahead cache, and put refcount after use. Signed-off-by: Lai Siyao <lai.siyao@intel.com> Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-3270 Reviewed-on: http://review.whamcloud.com/9663 Reviewed-by: Fan Yong <fan.yong@intel.com> Reviewed-by: James Simmons <uja.ornl@gmail.com> Reviewed-by: Oleg Drokin <oleg.drokin@intel.com> Signed-off-by: James Simmons <jsimmons@infradead.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-19	staging: lustre: llite: Compare of unsigned value against 0 is always true	Dmitry Eremin
	Comparison of unsigned value against 0 is always true. Signed-off-by: Dmitry Eremin <dmitry.eremin@intel.com> Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-5200 Reviewed-on: http://review.whamcloud.com/11217 Reviewed-by: John L. Hammond <john.hammond@intel.com> Reviewed-by: Fan Yong <fan.yong@intel.com> Reviewed-by: Oleg Drokin <oleg.drokin@intel.com> Signed-off-by: James Simmons <jsimmons@infradead.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-19	staging: lustre: remove RCU2HANDLE macro	John L. Hammond
	Remove RCU2HANDLE macro from lustre_handles.h. Signed-off-by: John L. Hammond <john.hammond@intel.com> Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-2675 Reviewed-on: http://review.whamcloud.com/11498 Reviewed-by: Andreas Dilger <andreas.dilger@intel.com> Reviewed-by: Bob Glossman <bob.glossman@intel.com> Signed-off-by: James Simmons <jsimmons@infradead.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-19	staging: lustre: mdc: Report D_CHANGELOG messages as D_HSM	Henri Doreau
	Removed the D_CHANGELOG pseudo-debug flag that wasn't actually defined as a usable one. Report the D_CHANGELOG messages as D_HSM ones instead since this is the primary user of these messages. Signed-off-by: Henri Doreau <henri.doreau@cea.fr> Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-5538 Reviewed-on: http://review.whamcloud.com/11558 Reviewed-by: Andreas Dilger <andreas.dilger@intel.com> Reviewed-by: James Nunez <james.a.nunez@intel.com> Signed-off-by: James Simmons <jsimmons@infradead.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-19	staging: lustre: llog: add newly opened llog at tail of handle list	Li Xi
	Add newly opened llog handle at the tail of handle list to increase lookup speed, especially for cancel operation, because the canceled log will be removed from the list, and lookup is from the beginning. Signed-off-by: Lai Siyao <lai.siyao@intel.com> Signed-off-by: Li Xi <lixi@ddn.com> Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-5405 Reviewed-on: http://review.whamcloud.com/11575 Reviewed-by: Ian Costello <costello.ian@gmail.com> Reviewed-by: Mike Pershin <mike.pershin@intel.com> Reviewed-by: Oleg Drokin <oleg.drokin@intel.com> Signed-off-by: James Simmons <jsimmons@infradead.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-19	staging: lustre: build: bump build version warnings to x.y.53	Andreas Dilger
	Move the LUSTRE_VERSION_CODE checks to trigger on x.y.53 instead of x.y.50, so that it is into the development cycle that they are hit instead of right at the start. In many cases, the #warning has been removed (to prevent build errors) and instead the code is just disabled outright. The dead code can be seen easily and removed in the future with less interruption to the development process. Signed-off-by: Andreas Dilger <andreas.dilger@intel.com> Signed-off-by: Emoly Liu <emoly.liu@intel.com> Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-4217 Reviewed-on: http://review.whamcloud.com/8630 Reviewed-by: Oleg Drokin <oleg.drokin@intel.com> Tested-by: Oleg Drokin <oleg.drokin@intel.com> Signed-off-by: James Simmons <jsimmons@infradead.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-19	staging: lustre: lmv: release request in lmv_revalidate_slaves()	John L. Hammond
	In lmv_revalidate_slaves() ensure that the request returned by md_intent_lock() is properly released on all paths. Signed-off-by: John L. Hammond <john.hammond@intel.com> Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-5452 Reviewed-on: http://review.whamcloud.com/11326 Reviewed-by: wang di <di.wang@intel.com> Reviewed-by: Lai Siyao <lai.siyao@intel.com> Reviewed-by: Oleg Drokin <oleg.drokin@intel.com> Signed-off-by: James Simmons <jsimmons@infradead.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-19	staging: lustre: ptlrpc: fix magic return value of ptlrpc_init_portals	Wang Shilong
	Previously, when running 'modprobe lustre', it hit the following error message which is becaue of network initialisation failure: modprobe: ERROR: could not insert 'lustre': Input/output error However, error code is there, just let it return to caller, after this patch, error message will be something like: modprobe: ERROR: could not insert 'lustre': Network is down Signed-off-by: Wang Shilong <wshilong@ddn.com> Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-5455 Reviewed-on: http://review.whamcloud.com/11337 Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com> Reviewed-by: James Simmons <uja.ornl@gmail.com> Reviewed-by: John L. Hammond <john.hammond@intel.com> Reviewed-by: Oleg Drokin <oleg.drokin@intel.com> Signed-off-by: James Simmons <jsimmons@infradead.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-19	staging: lustre: obdclass: fix comparison between signed and unsigned	Dmitry Eremin
	Make lu_buf->lb_len unsigned. Signed-off-by: Dmitry Eremin <dmitry.eremin@intel.com> Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-5417 Reviewed-on: http://review.whamcloud.com/11281 Reviewed-by: Fan Yong <fan.yong@intel.com> Reviewed-by: John L. Hammond <john.hammond@intel.com> Reviewed-by: Oleg Drokin <oleg.drokin@intel.com> Signed-off-by: James Simmons <jsimmons@infradead.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-19	staging: lustre: lov: adjust page bufsize after layout change	Jinshan Xiong
	Otherwise, the coh_page_bufsize keeps increasing when the file's layout keeps changing in lov_init_raid0(). Signed-off-by: Jinshan Xiong <jinshan.xiong@intel.com> Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-5459 Reviewed-on: http://review.whamcloud.com/11394 Reviewed-by: frank zago <fzago@cray.com> Reviewed-by: Bobi Jam <bobijam@gmail.com> Reviewed-by: John L. Hammond <john.hammond@intel.com> Reviewed-by: Oleg Drokin <oleg.drokin@intel.com> Signed-off-by: James Simmons <jsimmons@infradead.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-19	staging: lustre: fix comparison between signed and unsigned	Dmitry Eremin
	Cleanup in general headers. * use size_t in cfs_size_round() make unsigned index and len in lustre_cfg_() make iteration variable the same type as comparing value * make unsigned pages counters Signed-off-by: Dmitry Eremin <dmitry.eremin@intel.com> Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-5417 Reviewed-on: http://review.whamcloud.com/11327 Reviewed-by: John L. Hammond <john.hammond@intel.com> Reviewed-by: Fan Yong <fan.yong@intel.com> Reviewed-by: Oleg Drokin <oleg.drokin@intel.com> Signed-off-by: James Simmons <jsimmons@infradead.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-19	staging: lustre: clio: lu_ref_del() mismatch ref add scope	Bobi Jam
	'commit 77605e41a26f ("staging/lustre/clio: add pages into writeback cache in batches")' adds a page to a list aggregate issuing them to writeback cache; A page add is referenced in llite/vvp io scope, while writeback cache commit de-refers it under osc sub io scope, and enabling -lu_ref will detect this scope mismatch. Signed-off-by: Bobi Jam <bobijam.xu@intel.com> Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-4503 Reviewed-on: http://review.whamcloud.com/8970 Reviewed-by: frank zago <fzago@cray.com> Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com> Reviewed-by: Oleg Drokin <oleg.drokin@intel.com> Signed-off-by: James Simmons <jsimmons@infradead.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-19	staging: lustre: lmv: release locks if lmv_intent_lock() fails	John L. Hammond
	In lmv_intent_lock() if we will return an error then first release any locks referenced by the intent. Signed-off-by: John L. Hammond <john.hammond@intel.com> Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-5431 Reviewed-on: http://review.whamcloud.com/11319 Reviewed-by: wang di <di.wang@intel.com> Reviewed-by: Lai Siyao <lai.siyao@intel.com> Reviewed-by: Oleg Drokin <oleg.drokin@intel.com> Signed-off-by: James Simmons <jsimmons@infradead.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-19	staging: lustre: osc: update kms in brw_interpret() properly	Niu Yawei
	In brw_interpret(), we forgot page offset when calculating write offset, that leads to wrong kms for sync write. Signed-off-by: Niu Yawei <yawei.niu@intel.com> Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-5463 Reviewed-on: http://review.whamcloud.com/11374 Reviewed-by: Bobi Jam <bobijam@gmail.com> Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com> Reviewed-by: Li Dongyang <dongyang.li@anu.edu.au> Reviewed-by: Oleg Drokin <oleg.drokin@intel.com> Signed-off-by: James Simmons <jsimmons@infradead.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-19	staging: lustre: lmv: fix some byte order issues	John L. Hammond
	In the handler for LL_IOC_LMV_GETSTRIPE convert stripe FIDs from little to CPU endian when unpacking lmv_user_md. In lmv_unpack_md_v1() fix a double conversion of the stripe count. Signed-off-by: John L. Hammond <john.hammond@intel.com> Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-5342 Reviewed-on: http://review.whamcloud.com/11106 Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com> Reviewed-by: Jian Yu <jian.yu@intel.com> Reviewed-by: Oleg Drokin <oleg.drokin@intel.com> Signed-off-by: James Simmons <jsimmons@infradead.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-19	staging: lustre: llite: update ras stride offset	Bobi Jam
	When a read ahead does not reach the end of the region reserved from ras, we'd set ras::ras_next_readahead back to where we left off; For stride read ahead, it needs to make sure that the offset is no less than ras_stride_offset, so that the stride read ahead can work correctly. Signed-off-by: Bobi Jam <bobijam.xu@intel.com> Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-5263 Reviewed-on: http://review.whamcloud.com/11181 Reviewed-by: Fan Yong <fan.yong@intel.com> Reviewed-by: wang di <di.wang@intel.com> Reviewed-by: Oleg Drokin <oleg.drokin@intel.com> Signed-off-by: James Simmons <jsimmons@infradead.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-19	staging: lustre: llite: add LL_LEASE_{RD,WR,UN}LCK	John L. Hammond
	Define new constants LL_LEASE_{RD,WR,UN}LCK for use as the argument to and return value from the LL_IOC_{GET,SET}_LEASE ioctls. As arguments, these contants replace the use of F_{RD,WR,UN}LCK from fcntl.h. As return values they replace the use of FMODE_{READ,WRITE} which are internal to the Linux kernel source and not under the control of the Lustre ioctl interface. Signed-off-by: John L. Hammond <john.hammond@intel.com> Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-5013 Reviewed-on: http://review.whamcloud.com/10233 Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com> Reviewed-by: Lai Siyao <lai.siyao@intel.com> Reviewed-by: Oleg Drokin <oleg.drokin@intel.com> Signed-off-by: James Simmons <jsimmons@infradead.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-19	staging: lustre: obdclass: serialize lu_site purge	Niu Yawei
	Umount process relies on lu_site_purge(-1) to purge all objects before umount, however, if there happen to have a cache shrinker which calls lu_site_purge(nr) in parallel, some objects may still being freed by cache shrinker even after the lu_site_purge(-1) called by umount done. This can be simply fixed by serializing purge threads, since it doesn't make any sense to have them in parallel. Signed-off-by: Niu Yawei <yawei.niu@intel.com> Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-5331 Reviewed-on: http://review.whamcloud.com/11099 Reviewed-by: Lai Siyao <lai.siyao@intel.com> Reviewed-by: Mike Pershin <mike.pershin@intel.com> Reviewed-by: Oleg Drokin <oleg.drokin@intel.com> Signed-off-by: James Simmons <jsimmons@infradead.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-19	staging: lustre: obd: add rnb_ prefix to struct niobuf_remote members	John L. Hammond
	Add the prefix rnb_ to the members of struct niobuf_remote. Delete the relevant compat macros from ofd_internal.h. Signed-off-by: John L. Hammond <john.hammond@intel.com> Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-5061 Reviewed-on: http://review.whamcloud.com/10452 Reviewed-by: Andreas Dilger <andreas.dilger@intel.com> Reviewed-by: Mike Pershin <mike.pershin@intel.com> Signed-off-by: James Simmons <jsimmons@infradead.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>