summaryrefslogtreecommitdiff
path: root/fs
AgeCommit message (Collapse)Author
2015-06-30fs: document seq_open()'s usage of file->private_dataYann Droneaud
seq_open() stores its struct seq_file in file->private_data, thus it must not be modified by user of seq_file. Link: http://lkml.kernel.org/r/cover.1433193673.git.ydroneaud@opteya.com Signed-off-by: Yann Droneaud <ydroneaud@opteya.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-06-30fs: allocate structure unconditionally in seq_open()Yann Droneaud
Since patch described below, from v2.6.15-rc1, seq_open() could use a struct seq_file already allocated by the caller if the pointer to the structure is stored in file->private_data before calling the function. Commit 1abe77b0fc4b485927f1f798ae81a752677e1d05 Author: Al Viro <viro@zeniv.linux.org.uk> Date: Mon Nov 7 17:15:34 2005 -0500 [PATCH] allow callers of seq_open do allocation themselves Allow caller of seq_open() to kmalloc() seq_file + whatever else they want and set ->private_data to it. seq_open() will then abstain from doing allocation itself. As there's no more use for such feature, as it could be easily replaced by calls to seq_open_private() (see commit 39699037a5c9 ("[FS] seq_file: Introduce the seq_open_private()")) and seq_release_private() (see v2.6.0-test3), support for this uncommon feature can be removed from seq_open(). Link: http://lkml.kernel.org/r/cover.1433193673.git.ydroneaud@opteya.com Signed-off-by: Yann Droneaud <ydroneaud@opteya.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-06-30fs: use seq_open_private() for proc_mountsYann Droneaud
A patchset to remove support for passing pre-allocated struct seq_file to seq_open(). Such feature is undocumented and prone to error. In particular, if seq_release() is used in release handler, it will kfree() a pointer which was not allocated by seq_open(). So this patchset drops support for pre-allocated struct seq_file: it's only of use in proc_namespace.c and can be easily replaced by using seq_open_private()/seq_release_private(). Additionally, it documents the use of file->private_data to hold pointer to struct seq_file by seq_open(). This patch (of 3): Since patch described below, from v2.6.15-rc1, seq_open() could use a struct seq_file already allocated by the caller if the pointer to the structure is stored in file->private_data before calling the function. Commit 1abe77b0fc4b485927f1f798ae81a752677e1d05 Author: Al Viro <viro@zeniv.linux.org.uk> Date: Mon Nov 7 17:15:34 2005 -0500 [PATCH] allow callers of seq_open do allocation themselves Allow caller of seq_open() to kmalloc() seq_file + whatever else they want and set ->private_data to it. seq_open() will then abstain from doing allocation itself. Such behavior is only used by mounts_open_common(). In order to drop support for such uncommon feature, proc_mounts is converted to use seq_open_private(), which take care of allocating the proc_mounts structure, making it available through ->private in struct seq_file. Conversely, proc_mounts is converted to use seq_release_private(), in order to release the private structure allocated by seq_open_private(). Then, ->private is used directly instead of proc_mounts() macro to access to the proc_mounts structure. Link: http://lkml.kernel.org/r/cover.1433193673.git.ydroneaud@opteya.com Signed-off-by: Yann Droneaud <ydroneaud@opteya.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-06-29Merge tag 'libnvdimm-for-4.2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/djbw/nvdimm Pull libnvdimm subsystem from Dan Williams: "The libnvdimm sub-system introduces, in addition to the libnvdimm-core, 4 drivers / enabling modules: NFIT: Instantiates an "nvdimm bus" with the core and registers memory devices (NVDIMMs) enumerated by the ACPI 6.0 NFIT (NVDIMM Firmware Interface table). After registering NVDIMMs the NFIT driver then registers "region" devices. A libnvdimm-region defines an access mode and the boundaries of persistent memory media. A region may span multiple NVDIMMs that are interleaved by the hardware memory controller. In turn, a libnvdimm-region can be carved into a "namespace" device and bound to the PMEM or BLK driver which will attach a Linux block device (disk) interface to the memory. PMEM: Initially merged in v4.1 this driver for contiguous spans of persistent memory address ranges is re-worked to drive PMEM-namespaces emitted by the libnvdimm-core. In this update the PMEM driver, on x86, gains the ability to assert that writes to persistent memory have been flushed all the way through the caches and buffers in the platform to persistent media. See memcpy_to_pmem() and wmb_pmem(). BLK: This new driver enables access to persistent memory media through "Block Data Windows" as defined by the NFIT. The primary difference of this driver to PMEM is that only a small window of persistent memory is mapped into system address space at any given point in time. Per-NVDIMM windows are reprogrammed at run time, per-I/O, to access different portions of the media. BLK-mode, by definition, does not support DAX. BTT: This is a library, optionally consumed by either PMEM or BLK, that converts a byte-accessible namespace into a disk with atomic sector update semantics (prevents sector tearing on crash or power loss). The sinister aspect of sector tearing is that most applications do not know they have a atomic sector dependency. At least today's disk's rarely ever tear sectors and if they do one almost certainly gets a CRC error on access. NVDIMMs will always tear and always silently. Until an application is audited to be robust in the presence of sector-tearing the usage of BTT is recommended. Thanks to: Ross Zwisler, Jeff Moyer, Vishal Verma, Christoph Hellwig, Ingo Molnar, Neil Brown, Boaz Harrosh, Robert Elliott, Matthew Wilcox, Andy Rudoff, Linda Knippers, Toshi Kani, Nicholas Moulin, Rafael Wysocki, and Bob Moore" * tag 'libnvdimm-for-4.2' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/nvdimm: (33 commits) arch, x86: pmem api for ensuring durability of persistent memory updates libnvdimm: Add sysfs numa_node to NVDIMM devices libnvdimm: Set numa_node to NVDIMM devices acpi: Add acpi_map_pxm_to_online_node() libnvdimm, nfit: handle unarmed dimms, mark namespaces read-only pmem: flag pmem block devices as non-rotational libnvdimm: enable iostat pmem: make_request cleanups libnvdimm, pmem: fix up max_hw_sectors libnvdimm, blk: add support for blk integrity libnvdimm, btt: add support for blk integrity fs/block_dev.c: skip rw_page if bdev has integrity libnvdimm: Non-Volatile Devices tools/testing/nvdimm: libnvdimm unit test infrastructure libnvdimm, nfit, nd_blk: driver for BLK-mode access persistent memory nd_btt: atomic sector updates libnvdimm: infrastructure for btt devices libnvdimm: write blk label set libnvdimm: write pmem label set libnvdimm: blk labels and namespace instantiation ...
2015-06-28Merge branch 'for-linus-4.2-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml Pull UML updates from Richard Weinberger: - remove hppfs ("HonePot ProcFS") - initial support for musl libc - uaccess cleanup - random cleanups and bug fixes all over the place * 'for-linus-4.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml: (21 commits) um: Don't pollute kernel namespace with uapi um: Include sys/types.h for makedev(), major(), minor() um: Do not use stdin and stdout identifiers for struct members um: Do not use __ptr_t type for stack_t's .ss pointer um: Fix mconsole dependency um: Handle tracehook_report_syscall_entry() result um: Remove copy&paste code from init.h um: Stop abusing __KERNEL__ um: Catch unprotected user memory access um: Fix warning in setup_signal_stack_si() um: Rework uaccess code um: Add uaccess.h to ldt.c um: Add uaccess.h to syscalls_64.c um: Add asm/elf.h to vma.c um: Cleanup mem_32/64.c headers um: Remove hppfs um: Move syscall() declaration into os.h um: kernel: ksyms: Export symbol syscall() for fixing modpost issue um/os-Linux: Use char[] for syscall_stub declarations um: Use char[] for linker script address declarations ...
2015-06-27Merge branch 'next' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security Pull security subsystem updates from James Morris: "The main change in this kernel is Casey's generalized LSM stacking work, which removes the hard-coding of Capabilities and Yama stacking, allowing multiple arbitrary "small" LSMs to be stacked with a default monolithic module (e.g. SELinux, Smack, AppArmor). See https://lwn.net/Articles/636056/ This will allow smaller, simpler LSMs to be incorporated into the mainline kernel and arbitrarily stacked by users. Also, this is a useful cleanup of the LSM code in its own right" * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (38 commits) tpm, tpm_crb: fix le64_to_cpu conversions in crb_acpi_add() vTPM: set virtual device before passing to ibmvtpm_reset_crq tpm_ibmvtpm: remove unneccessary message level. ima: update builtin policies ima: extend "mask" policy matching support ima: add support for new "euid" policy condition ima: fix ima_show_template_data_ascii() Smack: freeing an error pointer in smk_write_revoke_subj() selinux: fix setting of security labels on NFS selinux: Remove unused permission definitions selinux: enable genfscon labeling for sysfs and pstore files selinux: enable per-file labeling for debugfs files. selinux: update netlink socket classes signals: don't abuse __flush_signals() in selinux_bprm_committed_creds() selinux: Print 'sclass' as string when unrecognized netlink message occurs Smack: allow multiple labels in onlycap Smack: fix seq operations in smackfs ima: pass iint to ima_add_violation() ima: wrap event related data to the new ima_event_data structure integrity: add validity checks for 'path' parameter ...
2015-06-27Merge branch 'for-4.2' of git://linux-nfs.org/~bfields/linuxLinus Torvalds
Pull nfsd updates from Bruce Fields: "A relatively quiet cycle, with a mix of cleanup and smaller bugfixes" * 'for-4.2' of git://linux-nfs.org/~bfields/linux: (24 commits) sunrpc: use sg_init_one() in krb5_rc4_setup_enc/seq_key() nfsd: wrap too long lines in nfsd4_encode_read nfsd: fput rd_file from XDR encode context nfsd: take struct file setup fully into nfs4_preprocess_stateid_op nfsd: refactor nfs4_preprocess_stateid_op nfsd: clean up raparams handling nfsd: use swap() in sort_pacl_range() rpcrdma: Merge svcrdma and xprtrdma modules into one svcrdma: Add a separate "max data segs macro for svcrdma svcrdma: Replace GFP_KERNEL in a loop with GFP_NOFAIL svcrdma: Keep rpcrdma_msg fields in network byte-order svcrdma: Fix byte-swapping in svc_rdma_sendto.c nfsd: Update callback sequnce id only CB_SEQUENCE success nfsd: Reset cb_status in nfsd4_cb_prepare() at retrying svcrdma: Remove svc_rdma_xdr_decode_deferred_req() SUNRPC: Move EXPORT_SYMBOL for svc_process uapi/nfs: Add NFSv4.1 ACL definitions nfsd: Remove dead declarations nfsd: work around a gcc-5.1 warning nfsd: Checking for acl support does not require fetching any acls ...
2015-06-27Merge tag 'gfs2-merge-window' of ↵Linus Torvalds
git://git.kernel.org:/pub/scm/linux/kernel/git/gfs2/linux-gfs2 Pull GFS2 updates from Bob Peterson: "Here are the patches we've accumulated for GFS2 for the current upstream merge window. We have a good mixture this time. Here are some of the features: - Fix a problem with RO mounts writing to the journal. - Further improvements to quotas on GFS2. - Added support for rename2 and RENAME_EXCHANGE on GFS2. - Increase performance by making glock lru_list less of a bottleneck. - Increase performance by avoiding unnecessary buffer_head releases. - Increase performance by using average glock round trip time from all CPUs. - Fixes for some compiler warnings and minor white space issues. - Other misc bug fixes" * tag 'gfs2-merge-window' of git://git.kernel.org:/pub/scm/linux/kernel/git/gfs2/linux-gfs2: GFS2: Don't brelse rgrp buffer_heads every allocation GFS2: Don't add all glocks to the lru gfs2: Don't support fallocate on jdata files gfs2: s64 cast for negative quota value gfs2: limit quota log messages gfs2: fix quota updates on block boundaries gfs2: fix shadow warning in gfs2_rbm_find() gfs2: kerneldoc warning fixes gfs2: convert simple_str to kstr GFS2: make sure S_NOSEC flag isn't overwritten GFS2: add support for rename2 and RENAME_EXCHANGE gfs2: handle NULL rgd in set_rgrp_preferences GFS2: inode.c: indent with TABs, not spaces GFS2: mark the journal idle to fix ro mounts GFS2: Average in only non-zero round-trip times for congestion stats GFS2: Use average srttb value in congestion calculations
2015-06-27Revert "jbd2: speedup jbd2_journal_dirty_metadata()"Linus Torvalds
This reverts commit 2143c1965a761332ae417b22fd477b636e4f54ec. This commit seems to be the cause of the following jbd2 assertion failure: ------------[ cut here ]------------ kernel BUG at fs/jbd2/transaction.c:1325! invalid opcode: 0000 [#1] SMP Modules linked in: bnep bluetooth fuse ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 nf_conntrack_ipv6 ... CPU: 7 PID: 5509 Comm: gcc Not tainted 4.1.0-10944-g2a298679b411 #1 Hardware name: /DH87RL, BIOS RLH8710H.86A.0327.2014.0924.1645 09/24/2014 task: ffff8803bf866040 ti: ffff880308528000 task.ti: ffff880308528000 RIP: jbd2_journal_dirty_metadata+0x237/0x290 Call Trace: __ext4_handle_dirty_metadata+0x43/0x1f0 ext4_handle_dirty_dirent_node+0xde/0x160 ? jbd2_journal_get_write_access+0x36/0x50 ext4_delete_entry+0x112/0x160 ? __ext4_journal_start_sb+0x52/0xb0 ext4_unlink+0xfa/0x260 vfs_unlink+0xec/0x190 do_unlinkat+0x24a/0x270 SyS_unlink+0x11/0x20 entry_SYSCALL_64_fastpath+0x12/0x6a ---[ end trace ae033ebde8d080b4 ]--- which is not easily reproducible (I've seen it just once, and then Ted was able to reproduce it once). Revert it while Ted and Jan try to figure out what is wrong. Cc: Jan Kara <jack@suse.cz> Acked-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-06-26Merge branch 'for-4.2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup Pull cgroup updates from Tejun Heo: - threadgroup_lock got reorganized so that its users can pick the actual locking mechanism to use. Its only user - cgroups - is updated to use a percpu_rwsem instead of per-process rwsem. This makes things a bit lighter on hot paths and allows cgroups to perform and fail multi-task (a process) migrations atomically. Multi-task migrations are used in several places including the unified hierarchy. - Delegation rule and documentation added to unified hierarchy. This will likely be the last interface update from the cgroup core side for unified hierarchy before lifting the devel mask. - Some groundwork for the pids controller which is scheduled to be merged in the coming devel cycle. * 'for-4.2' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: cgroup: add delegation section to unified hierarchy documentation cgroup: require write perm on common ancestor when moving processes on the default hierarchy cgroup: separate out cgroup_procs_write_permission() from __cgroup_procs_write() kernfs: make kernfs_get_inode() public MAINTAINERS: add a cgroup core co-maintainer cgroup: fix uninitialised iterator in for_each_subsys_which cgroup: replace explicit ss_mask checking with for_each_subsys_which cgroup: use bitmask to filter for_each_subsys cgroup: add seq_file forward declaration for struct cftype cgroup: simplify threadgroup locking sched, cgroup: replace signal_struct->group_rwsem with a global percpu_rwsem sched, cgroup: reorganize threadgroup locking cgroup: switch to unsigned long for bitmasks cgroup: reorganize include/linux/cgroup.h cgroup: separate out include/linux/cgroup-defs.h cgroup: fix some comment typos
2015-06-26Merge tag 'driver-core-4.2-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core updates from Greg KH: "Here is the driver core / firmware changes for 4.2-rc1. A number of small changes all over the place in the driver core, and in the firmware subsystem. Nothing really major, full details in the shortlog. Some of it is a bit of churn, given that the platform driver probing changes was found to not work well, so they were reverted. All of these have been in linux-next for a while with no reported issues" * tag 'driver-core-4.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (31 commits) Revert "base/platform: Only insert MEM and IO resources" Revert "base/platform: Continue on insert_resource() error" Revert "of/platform: Use platform_device interface" Revert "base/platform: Remove code duplication" firmware: add missing kfree for work on async call fs: sysfs: don't pass count == 0 to bin file readers base:dd - Fix for typo in comment to function driver_deferred_probe_trigger(). base/platform: Remove code duplication of/platform: Use platform_device interface base/platform: Continue on insert_resource() error base/platform: Only insert MEM and IO resources firmware: use const for remaining firmware names firmware: fix possible use after free on name on asynchronous request firmware: check for file truncation on direct firmware loading firmware: fix __getname() missing failure check drivers: of/base: move of_init to driver_init drivers/base: cacheinfo: fix annoying typo when DT nodes are absent sysfs: disambiguate between "error code" and "failure" in comments driver-core: fix build for !CONFIG_MODULES driver-core: make __device_attach() static ...
2015-06-26Merge branch 'akpm' (patches from Andrew)Linus Torvalds
Merge second patchbomb from Andrew Morton: - most of the rest of MM - lots of misc things - procfs updates - printk feature work - updates to get_maintainer, MAINTAINERS, checkpatch - lib/ updates * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (96 commits) exit,stats: /* obey this comment */ coredump: add __printf attribute to cn_*printf functions coredump: use from_kuid/kgid when formatting corename fs/reiserfs: remove unneeded cast NILFS2: support NFSv2 export fs/befs/btree.c: remove unneeded initializations fs/minix: remove unneeded cast init/do_mounts.c: add create_dev() failure log kasan: remove duplicate definition of the macro KASAN_FREE_PAGE fs/efs: femove unneeded cast checkpatch: emit "NOTE: <types>" message only once after multiple files checkpatch: emit an error when there's a diff in a changelog checkpatch: validate MODULE_LICENSE content checkpatch: add multi-line handling for PREFER_ETHER_ADDR_COPY checkpatch: suggest using eth_zero_addr() and eth_broadcast_addr() checkpatch: fix processing of MEMSET issues checkpatch: suggest using ether_addr_equal*() checkpatch: avoid NOT_UNIFIED_DIFF errors on cover-letter.patch files checkpatch: remove local from codespell path checkpatch: add --showfile to allow input via pipe to show filenames ...
2015-06-26fs/block_dev.c: skip rw_page if bdev has integrityVishal Verma
If a block device has bio integrity enabled, rw_page will bypass the integrity payload, which is undesirable. Skip rw_page if this is the case. Currently brd and zram provide rw_page, and the proposed 'nd' drivers will too. Cc: Jens Axboe <axboe@fb.com> Cc: Martin K. Petersen <martin.petersen@oracle.com> Suggested-by: Matthew Wilcox <matthew.r.wilcox@intel.com> Signed-off-by: Vishal Verma <vishal.l.verma@linux.intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2015-06-25coredump: add __printf attribute to cn_*printf functionsNicolas Iooss
This allows detecting improper format string at build time, like: fs/coredump.c:225:5: warning: format '%ld' expects argument of type 'long int', but argument 3 has type 'int' [-Wformat=] err = cn_printf(cn, "%ld", cprm->siginfo->si_signo); ^ As si_signo is always an int, the format should be %d here. Signed-off-by: Nicolas Iooss <nicolas.iooss_linux@m4x.org> Acked-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-06-25coredump: use from_kuid/kgid when formatting corenameNicolas Iooss
When adding __printf attribute to cn_printf, gcc reports some issues: fs/coredump.c:213:5: warning: format '%d' expects argument of type 'int', but argument 3 has type 'kuid_t' [-Wformat=] err = cn_printf(cn, "%d", cred->uid); ^ fs/coredump.c:217:5: warning: format '%d' expects argument of type 'int', but argument 3 has type 'kgid_t' [-Wformat=] err = cn_printf(cn, "%d", cred->gid); ^ These warnings come from the fact that the value of uid/gid needs to be extracted from the kuid_t/kgid_t structure before being used as an integer. More precisely, cred->uid and cred->gid need to be converted to either user-namespace uid/gid or to init_user_ns uid/gid. Use init_user_ns in order not to break existing ABI, and document this in Documentation/sysctl/kernel.txt. While at it, format uid and gid values with %u instead of %d because uid_t/__kernel_uid32_t and gid_t/__kernel_gid32_t are unsigned int. Signed-off-by: Nicolas Iooss <nicolas.iooss_linux@m4x.org> Acked-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-06-25fs/reiserfs: remove unneeded castFiro Yang
kmem_cache_alloc() returns void*. Signed-off-by: Firo Yang <firogm@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-06-25NILFS2: support NFSv2 exportNeilBrown
The "fh_len" passed to ->fh_to_* is not guaranteed to be that same as that returned by encode_fh - it may be larger. With NFSv2, the filehandle is fixed length, so it may appear longer than expected and be zero-padded. So we must test that fh_len is at least some value, not exactly equal to it. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-06-25fs/befs/btree.c: remove unneeded initializationsFabian Frederick
bh, od_sup and this_node are unconditionally initialized in befs_bt_read_super() and befs_btree_find() Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-06-25fs/minix: remove unneeded castFiro Yang
kmem_cache_alloc() returns void*. Signed-off-by: Firo Yang <firogm@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-06-25fs/efs: femove unneeded castFiro Yang
kmem_cache_alloc() returns void*. Signed-off-by: Firo Yang <firogm@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-06-25fs/ext4/super.c: use strreplace() in ext4_fill_super()Rasmus Villemoes
This makes a very large function a little smaller. Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Cc: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-06-25fs/jbd2/journal.c: use strreplace()Rasmus Villemoes
In one case, we eliminate a local variable; in the other a strlen() call and some .text. Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Cc: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-06-25fs, proc: introduce CONFIG_PROC_CHILDRENIago López Galeiras
Commit 818411616baf ("fs, proc: introduce /proc/<pid>/task/<tid>/children entry") introduced the children entry for checkpoint restore and the file is only available on kernels configured with CONFIG_EXPERT and CONFIG_CHECKPOINT_RESTORE. This is available in most distributions (Fedora, Debian, Ubuntu, CoreOS) because they usually enable CONFIG_EXPERT and CONFIG_CHECKPOINT_RESTORE. But Arch does not enable CONFIG_EXPERT or CONFIG_CHECKPOINT_RESTORE. However, the children proc file is useful outside of checkpoint restore. I would like to use it in rkt. The rkt process exec() another program it does not control, and that other program will fork()+exec() a child process. I would like to find the pid of the child process from an external tool without iterating in /proc over all processes to find which one has a parent pid equal to rkt. This commit introduces CONFIG_PROC_CHILDREN and makes CONFIG_CHECKPOINT_RESTORE select it. This allows enabling /proc/<pid>/task/<tid>/children without needing to enable CONFIG_CHECKPOINT_RESTORE and CONFIG_EXPERT. Alban tested that /proc/<pid>/task/<tid>/children is present when the kernel is configured with CONFIG_PROC_CHILDREN=y but without CONFIG_CHECKPOINT_RESTORE Signed-off-by: Iago López Galeiras <iago@endocode.com> Tested-by: Alban Crequy <alban@endocode.com> Reviewed-by: Cyrill Gorcunov <gorcunov@openvz.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Kees Cook <keescook@chromium.org> Cc: Pavel Emelyanov <xemul@parallels.com> Cc: Serge Hallyn <serge.hallyn@canonical.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Djalal Harouni <djalal@endocode.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-06-25proc: fix PAGE_SIZE limit of /proc/$PID/cmdlineAlexey Dobriyan
/proc/$PID/cmdline truncates output at PAGE_SIZE. It is easy to see with $ cat /proc/self/cmdline $(seq 1037) 2>/dev/null However, command line size was never limited to PAGE_SIZE but to 128 KB and relatively recently limitation was removed altogether. People noticed and ask questions: http://stackoverflow.com/questions/199130/how-do-i-increase-the-proc-pid-cmdline-4096-byte-limit seq file interface is not OK, because it kmalloc's for whole output and open + read(, 1) + sleep will pin arbitrary amounts of kernel memory. To not do that, limit must be imposed which is incompatible with arbitrary sized command lines. I apologize for hairy code, but this it direct consequence of command line layout in memory and hacks to support things like "init [3]". The loops are "unrolled" otherwise it is either macros which hide control flow or functions with 7-8 arguments with equal line count. There should be real setproctitle(2) or something. [akpm@linux-foundation.org: fix a billion min() warnings] Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Tested-by: Jarod Wilson <jarod@redhat.com> Acked-by: Jarod Wilson <jarod@redhat.com> Cc: Cyrill Gorcunov <gorcunov@openvz.org> Cc: Jan Stancek <jstancek@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-06-25Merge branch 'for-4.2/writeback' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull cgroup writeback support from Jens Axboe: "This is the big pull request for adding cgroup writeback support. This code has been in development for a long time, and it has been simmering in for-next for a good chunk of this cycle too. This is one of those problems that has been talked about for at least half a decade, finally there's a solution and code to go with it. Also see last weeks writeup on LWN: http://lwn.net/Articles/648292/" * 'for-4.2/writeback' of git://git.kernel.dk/linux-block: (85 commits) writeback, blkio: add documentation for cgroup writeback support vfs, writeback: replace FS_CGROUP_WRITEBACK with SB_I_CGROUPWB writeback: do foreign inode detection iff cgroup writeback is enabled v9fs: fix error handling in v9fs_session_init() bdi: fix wrong error return value in cgwb_create() buffer: remove unusued 'ret' variable writeback: disassociate inodes from dying bdi_writebacks writeback: implement foreign cgroup inode bdi_writeback switching writeback: add lockdep annotation to inode_to_wb() writeback: use unlocked_inode_to_wb transaction in inode_congested() writeback: implement unlocked_inode_to_wb transaction and use it for stat updates writeback: implement [locked_]inode_to_wb_and_lock_list() writeback: implement foreign cgroup inode detection writeback: make writeback_control track the inode being written back writeback: relocate wb[_try]_get(), wb_put(), inode_{attach|detach}_wb() mm: vmscan: disable memcg direct reclaim stalling if cgroup writeback support is in use writeback: implement memcg writeback domain based throttling writeback: reset wb_domain->dirty_limit[_tstmp] when memcg domain size changes writeback: implement memcg wb_domain writeback: update wb_over_bg_thresh() to use wb_domain aware operations ...
2015-06-25Merge branch 'for-4.2/core' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull core block IO update from Jens Axboe: "Nothing really major in here, mostly a collection of smaller optimizations and cleanups, mixed with various fixes. In more detail, this contains: - Addition of policy specific data to blkcg for block cgroups. From Arianna Avanzini. - Various cleanups around command types from Christoph. - Cleanup of the suspend block I/O path from Christoph. - Plugging updates from Shaohua and Jeff Moyer, for blk-mq. - Eliminating atomic inc/dec of both remaining IO count and reference count in a bio. From me. - Fixes for SG gap and chunk size support for data-less (discards) IO, so we can merge these better. From me. - Small restructuring of blk-mq shared tag support, freeing drivers from iterating hardware queues. From Keith Busch. - A few cfq-iosched tweaks, from Tahsin Erdogan and me. Makes the IOPS mode the default for non-rotational storage" * 'for-4.2/core' of git://git.kernel.dk/linux-block: (35 commits) cfq-iosched: fix other locations where blkcg_to_cfqgd() can return NULL cfq-iosched: fix sysfs oops when attempting to read unconfigured weights cfq-iosched: move group scheduling functions under ifdef cfq-iosched: fix the setting of IOPS mode on SSDs blktrace: Add blktrace.c to BLOCK LAYER in MAINTAINERS file block, cgroup: implement policy-specific per-blkcg data block: Make CFQ default to IOPS mode on SSDs block: add blk_set_queue_dying() to blkdev.h blk-mq: Shared tag enhancements block: don't honor chunk sizes for data-less IO block: only honor SG gap prevention for merges that contain data block: fix returnvar.cocci warnings block, dm: don't copy bios for request clones block: remove management of bi_remaining when restoring original bi_end_io block: replace trylock with mutex_lock in blkdev_reread_part() block: export blkdev_reread_part() and __blkdev_reread_part() suspend: simplify block I/O handling block: collapse bio bit space block: remove unused BIO_RW_BLOCK and BIO_EOF flags block: remove BIO_EOPNOTSUPP ...
2015-06-25Merge tag 'upstream-4.2-rc1' of git://git.infradead.org/linux-ubifsLinus Torvalds
Pull UBI/UBIFS updates from Richard Weinberger: "Minor fixes for UBI and UBIFS" * tag 'upstream-4.2-rc1' of git://git.infradead.org/linux-ubifs: UBI: Remove unnecessary `\' UBI: Use static class and attribute groups UBI: add a helper function for updatting on-flash layout volumes UBI: Fastmap: Do not add vol if it already exists UBI: Init vol->reserved_pebs by assignment UBI: Fastmap: Rename variables to make them meaningful UBI: Fastmap: Remove unnecessary `\' UBI: Fastmap: Use max() to get the larger value ubifs: fix to check error code of register_shrinker UBI: block: Dynamically allocate minor numbers
2015-06-25Merge tag 'ext4_for_linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Pull ext4 updates from Ted Ts'o: "A very large number of cleanups and bug fixes --- in particular for the ext4 encryption patches, which is a new feature added in the last merge window. Also fix a number of long-standing xfstest failures. (Quota writes failing due to ENOSPC, a race between truncate and writepage in data=journalled mode that was causing generic/068 to fail, and other corner cases.) Also add support for FALLOC_FL_INSERT_RANGE, and improve jbd2 performance eliminating locking when a buffer is modified more than once during a transaction (which is very common for allocation bitmaps, for example), in which case the state of the journalled buffer head doesn't need to change" [ I renamed "ext4_follow_link()" to "ext4_encrypted_follow_link()" in the merge resolution, to make it clear that that function is _only_ used for encrypted symlinks. The function doesn't actually work for non-encrypted symlinks at all, and they use the generic helpers - Linus ] * tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (52 commits) ext4: set lazytime on remount if MS_LAZYTIME is set by mount ext4: only call ext4_truncate when size <= isize ext4: make online defrag error reporting consistent ext4: minor cleanup of ext4_da_reserve_space() ext4: don't retry file block mapping on bigalloc fs with non-extent file ext4: prevent ext4_quota_write() from failing due to ENOSPC ext4: call sync_blockdev() before invalidate_bdev() in put_super() jbd2: speedup jbd2_journal_dirty_metadata() jbd2: get rid of open coded allocation retry loop ext4: improve warning directory handling messages jbd2: fix ocfs2 corrupt when updating journal superblock fails ext4: mballoc: avoid 20-argument function call ext4: wait for existing dio workers in ext4_alloc_file_blocks() ext4: recalculate journal credits as inode depth changes jbd2: use GFP_NOFS in jbd2_cleanup_journal_tail() ext4: use swap() in mext_page_double_lock() ext4: use swap() in memswap() ext4: fix race between truncate and __ext4_journalled_writepage() ext4 crypto: fail the mount if blocksize != pagesize ext4: Add support FALLOC_FL_INSERT_RANGE for fallocate ...
2015-06-24Merge branch 'akpm' (patches from Andrew)Linus Torvalds
Merge first patchbomb from Andrew Morton: - a few misc things - ocfs2 udpates - kernel/watchdog.c feature work (took ages to get right) - most of MM. A few tricky bits are held up and probably won't make 4.2. * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (91 commits) mm: kmemleak_alloc_percpu() should follow the gfp from per_alloc() mm, thp: respect MPOL_PREFERRED policy with non-local node tmpfs: truncate prealloc blocks past i_size mm/memory hotplug: print the last vmemmap region at the end of hot add memory mm/mmap.c: optimization of do_mmap_pgoff function mm: kmemleak: optimise kmemleak_lock acquiring during kmemleak_scan mm: kmemleak: avoid deadlock on the kmemleak object insertion error path mm: kmemleak: do not acquire scan_mutex in kmemleak_do_cleanup() mm: kmemleak: fix delete_object_*() race when called on the same memory block mm: kmemleak: allow safe memory scanning during kmemleak disabling memcg: convert mem_cgroup->under_oom from atomic_t to int memcg: remove unused mem_cgroup->oom_wakeups frontswap: allow multiple backends x86, mirror: x86 enabling - find mirrored memory ranges mm/memblock: allocate boot time data structures from mirrored memory mm/memblock: add extra "flags" to memblock to allow selection of memory based on attribute mm: do not ignore mapping_gfp_mask in page cache allocation paths mm/cma.c: fix typos in comments mm/oom_kill.c: print points as unsigned int mm/hugetlb: handle races in alloc_huge_page and hugetlb_reserve_pages ...
2015-06-24Merge tag 'please-pull-pstore' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux Pull pstore updates from Tony Luck: "Miscellaneous pstore improvements" * tag 'please-pull-pstore' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux: ramoops: make it possible to change mem_type param. pstore/ram: verify ramoops header before saving record fs/pstore: Optimization function ramoops_init_przs fs/pstore: update the backend parameter in pstore module pstore: do not use message compression without lock
2015-06-24Merge tag 'for-f2fs-4.2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs Pull f2fs updates from Jaegeuk Kim: "New features: - per-file encryption (e.g., ext4) - FALLOC_FL_ZERO_RANGE - FALLOC_FL_COLLAPSE_RANGE - RENAME_WHITEOUT Major enhancement/fixes: - recovery broken superblocks - enhance f2fs_trim_fs with a discard_map - fix a race condition on dentry block allocation - fix a deadlock during summary operation - fix a missing fiemap result .. and many minor bug fixes and clean-ups were done" * tag 'for-f2fs-4.2' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (83 commits) f2fs: do not trim preallocated blocks when truncating after i_size f2fs crypto: add alloc_bounce_page f2fs crypto: fix to handle errors likewise ext4 f2fs: drop the volatile_write flag only f2fs: skip committing valid superblock f2fs: setting discard option in parse_options() f2fs: fix to return exact trimmed size f2fs: support FALLOC_FL_INSERT_RANGE f2fs: hide common code in f2fs_replace_block f2fs: disable the discard option when device doesn't support f2fs crypto: remove alloc_page for bounce_page f2fs: fix a deadlock for summary page lock vs. sentry_lock f2fs crypto: clean up error handling in f2fs_fname_setup_filename f2fs crypto: avoid f2fs_inherit_context for symlink f2fs crypto: do not set encryption policy for non-directory by ioctl f2fs crypto: allow setting encryption policy once f2fs crypto: check context consistent for rename2 f2fs: avoid duplicated code by reusing f2fs_read_end_io f2fs crypto: use per-inode tfm structure f2fs: recovering broken superblock during mount ...
2015-06-24Merge branch 'for_linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs Pull UDF fixes and cleanups from Jan Kara: "The contains some small fixes and improvements in error handling for UDF. Bundled is also one ext3 coding style fix and a fix in quota documentation" * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs: udf: fix udf_load_pvoldesc() udf: remove double err declaration in udf_file_write_iter() UDF: support NFSv2 export fs: ext3: super: fixed a space coding style issue quota: Update documentation udf: Return error from udf_find_entry() udf: Make udf_get_filename() return error instead of 0 length file name udf: bug on exotic flag in udf_get_filename() udf: improve error management in udf_CS0toNLS() udf: improve error management in udf_CS0toUTF8() udf: unicode: update function name in comments udf: remove unnecessary test in udf_build_ustr_exact() udf: Return -ENOMEM when allocation fails in udf_get_filename()
2015-06-24mm: do not ignore mapping_gfp_mask in page cache allocation pathsMichal Hocko
page_cache_read, do_generic_file_read, __generic_file_splice_read and __ntfs_grab_cache_pages currently ignore mapping_gfp_mask when calling add_to_page_cache_lru which might cause recursion into fs down in the direct reclaim path if the mapping really relies on GFP_NOFS semantic. This doesn't seem to be the case now because page_cache_read (page fault path) doesn't seem to suffer from the reclaim recursion issues and do_generic_file_read and __generic_file_splice_read also shouldn't be called under fs locks which would deadlock in the reclaim path. Anyway it is better to obey mapping gfp mask and prevent from later breakage. [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Michal Hocko <mhocko@suse.cz> Cc: Dave Chinner <david@fromorbit.com> Cc: Neil Brown <neilb@suse.de> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Mel Gorman <mgorman@suse.de> Cc: Rik van Riel <riel@redhat.com> Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Cc: Anton Altaparmakov <anton@tuxera.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-06-24mm/hugetlb: reduce arch dependent code about hugetlb_prefault_arch_hookZhang Zhen
Currently we have many duplicates in definitions of hugetlb_prefault_arch_hook. In all architectures this function is empty. Signed-off-by: Zhang Zhen <zhenzhang.zhang@huawei.com> Acked-by: David Rientjes <rientjes@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-06-24procfs: treat parked tasks as sleeping for task stateChris Metcalf
Allowing watchdog threads to be parked means that we now have the opportunity of actually seeing persistent parked threads in the output of /proc/<pid>/stat and /proc/<pid>/status. The existing code reported such threads as "Running", which is kind-of true if you think of the case where we park them as part of taking cpus offline. But if we allow parking them indefinitely, "Running" is pretty misleading, so we report them as "Sleeping" instead. We could simply report them with a new string, "Parked", but it feels like it's a bit risky for userspace to see unexpected new values; the output is already documented in Documentation/filesystems/proc.txt, and it seems like a mistake to change that lightly. The scheduler does report parked tasks with a "P" in debugging output from sched_show_task() or dump_cpu_task(), but that's a different API. Similarly, the trace_ctxwake_* routines report a "P" for parked tasks, but again, different API. This change seemed slightly cleaner than updating the task_state_array to have additional rows. TASK_DEAD should be subsumed by the exit_state bits; TASK_WAKEKILL is just a modifier; and TASK_WAKING can very reasonably be reported as "Running" (as it is now). Only TASK_PARKED shows up with unreasonable output here. Signed-off-by: Chris Metcalf <cmetcalf@ezchip.com> Cc: Don Zickus <dzickus@redhat.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Ulrich Obergfell <uobergfe@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-06-24ocfs2: mark local functions as staticJoseph Qi
Some functions are only used locally, so mark them as static. Signed-off-by: Joseph Qi <joseph.qi@huawei.com> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-06-24ocfs2: use swap() in ocfs2_double_lock()Fabian Frederick
Use kernel.h macro definition. Thanks to Julia Lawall for Coccinelle scripting support. Signed-off-by: Fabian Frederick <fabf@skynet.be> Cc: Julia Lawall <julia.lawall@lip6.fr> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-06-24ocfs2: use swap() in swap_refcount_rec()Fabian Frederick
Use kernel.h macro definition. Thanks to Julia Lawall for Coccinelle scripting support. Signed-off-by: Fabian Frederick <fabf@skynet.be> Cc: Julia Lawall <julia.lawall@lip6.fr> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-06-24ocfs2: use swap() in dx_leaf_sort_swap()Fabian Frederick
Use kernel.h macro definition. Thanks to Julia Lawall for Coccinelle scripting support. Signed-off-by: Fabian Frederick <fabf@skynet.be> Cc: Julia Lawall <julia.lawall@lip6.fr> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-06-24ocfs2: fix wrong check in ocfs2_direct_IO_get_blocksJoseph Qi
contig_blocks gotten from ocfs2_extent_map_get_blocks cannot be compared with clusters_to_alloc. So convert it to clusters first. Signed-off-by: Joseph Qi <joseph.qi@huawei.com> Reviewed-by: Weiwei Wang <wangww631@huawei.com> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-06-24ocfs2: fix NULL pointer dereference in function ocfs2_abort_trigger()Xue jiufei
ocfs2_abort_trigger() use bh->b_assoc_map to get sb. But there's no function to set bh->b_assoc_map in ocfs2, it will trigger NULL pointer dereference while calling this function. We can get sb from bh->b_bdev->bd_super instead of b_assoc_map. [akpm@linux-foundation.org: update comment, per Joseph] Signed-off-by: joyce.xue <xuejiufei@huawei.com> Cc: Joseph Qi <joseph.qi@huawei.com> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-06-24ocfs2: o2net: should remove debugfs in o2net_init() out branchalex chen
Signed-off-by: Alex Chen <alex.chen@huawei.com> Reviewed-by: Joseph Qi <joseph.qi@huawei.com> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-06-24ocfs2: remove OCFS2_IOCB_SEM lock type in direct ioWeiWei Wang
In ocfs2 direct read/write, OCFS2_IOCB_SEM lock type is used to protect inode->i_alloc_sem rw semaphore lock in the earlier kernel version. However, in the latest kernel, inode->i_alloc_sem rw semaphore lock is not used at all, so OCFS2_IOCB_SEM lock type needs to be removed. Signed-off-by: Weiwei Wang <wangww631@huawei.com> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Reviewed-by: Junxiao Bi <junxiao.bi@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-06-24ocfs2: do not BUG if jbd2_journal_dirty_metadata failsJoseph Qi
jbd2_journal_dirty_metadata may fail. Currently it cannot take care of non zero return value and just BUG in ocfs2_journal_dirty. This patch is aborting the handle and journal instead of BUG. Signed-off-by: Joseph Qi <joseph.qi@huawei.com> Cc: joyce.xue <xuejiufei@huawei.com> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-06-24ocfs2: remove BUG_ON(!empty_extent) in __ocfs2_rotate_tree_left()Xue jiufei
ocfs2_rotate_tree_left() calls __ocfs2_rotate_tree_left() for left rotation while non-rightmost path containing an empty extent in the leaf block. __ocfs2_rotate_tree_left() returns -EAGAIN if right subtree having an empty extent and pass the empty_extent_path to caller. The caller ocfs2_rotate_tree_left() will restart rotation from the returned path. It will trigger the BUG_ON(!ocfs2_is_empty_extent) when the et on disk is as follows: eb0 is the leaf block of path(say path_a) passed to ocfs2_rotate_tree_left, which has an empty rec[0]. eb1 is the leaf block of path(say path_b) that just right to path_a, which has no empty record. eb2 is the leaf block of path(say path_c) that just right to path_b, which has an empty rec[0]. And path_c is also the rightmost path. Now we want to remove the empty rec[0] in eb0: ocfs2_rotate_tree_left: -> call __ocfs2_rotate_tree_left with path_a as its input *path* -> call ocfs2_rotate_subtree_left with path_a as its input *left_path* and path_b as its input *right_path*. it will move rec[0] in eb1 to eb0, and rec[0] in eb0 is not empty now. -> continue to call ocfs2_rotate_subtree_left with path_b as its input *left_path* and path_c as its input *right_path*, and return -EAGAIN because eb2 has an empty rec[0] -> call __ocfs2_rotate_tree_left with path_c as it input, rotate all records in eb2 to left and return 0. -> call __ocfs2_rotate_tree_left with path_a as its input, and triggers the BUG_ON(!ocfs2_is_empty_extent) as the rec[0] in eb0 is not empty. So the BUG_ON() should be removed and return 0 if rec[0] is no longer an empty extent. Signed-off-by: joyce.xue <xuejiufei@huawei.com> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-06-24ocfs2: return error when ocfs2_figure_merge_contig_type() failsXue jiufei
ocfs2_figure_merge_contig_type() still returns CONTIG_NONE when some error occurs which will cause an unpredictable error. So return a proper errno when ocfs2_figure_merge_contig_type() fails. Signed-off-by: joyce.xue <xuejiufei@huawei.com> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-06-24ocfs2/dlm: cleanup unused function __dlm_wait_on_lockres_flags_setJoseph Qi
__dlm_wait_on_lockres_flags_set() is declared but not implemented and used. So remove it. Signed-off-by: Joseph Qi <joseph.qi@huawei.com> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-06-24ocfs2: use retval instead of status for checking errorDaeseok Youn
The use of 'status' in __ocfs2_add_entry() can return wrong value. Some functions' return value in __ocfs2_add_entry(), i.e ocfs2_journal_access_di() is saved to 'status'. But 'status' is not used in 'bail' label for returning result of __ocfs2_add_entry(). So use retval instead of status. Signed-off-by: Daeseok Youn <daeseok.youn@gmail.com> Reviewed-by: Joseph Qi <joseph.qi@huawei.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Mark Fasheh <mfasheh@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-06-24ocfs2: fix a tiny race when truncate dio orohaned entryJoseph Qi
Once dio crashed it will leave an entry in orphan dir. And orphan scan will take care of the clean up. There is a tiny race case that the same entry will be truncated twice and then trigger the BUG in ocfs2_del_inode_from_orphan. Signed-off-by: Joseph Qi <joseph.qi@huawei.com> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-06-24ocfs2: remove __mlog_cpu_guessAndrew Morton
raw_smp_processor_id() is the means of avoiding the runtime preemptibility check. [akpm@linux-foundation.org: fix printk warning] Cc: Joe Perches <joe@perches.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Mark Fasheh <mfasheh@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>