path: root/fs
Age | Commit message | Author
2017-08-07 | Merge tag 'xfs-4.13-fixes-3' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux | Linus Torvalds
Pull xfs fixes from Darrick Wong:
"I have a couple more bug fixes for you today:
 - fix memory leak when issuing discard
 - fix propagation of the dax inode flag"
* tag 'xfs-4.13-fixes-3' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
  xfs: Fix per-inode DAX flag inheritance
  xfs: Fix leak of discard bio
2017-08-07 | Merge tag 'mlx5-shared-2017-08-07' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux | David S. Miller
Saeed Mahameed says:
====================
mlx5-shared-2017-08-07
This series includes some mlx5 updates for both net-next and rdma trees.
From Saeed,
Core driver updates to allow selectively building the driver with or without some large driver components, such as
 - E-Switch (Ethernet SRIOV support).
 - Multi-Physical Function Switch (MPFs) support.
For that we split E-Switch and MPFs functionalities into separate files.
From Erez,
Delay mlx5_core events when mlx5 interfaces, namely mlx5_ib, registration is taking place and until it completes.
From Rabie,
Increase the maximum supported flow counters.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-07 | dlm: use sock_create_lite inside tcp_accept_from_sock | Guoqing Jiang
With commit 0ffdaf5b41cf ("net/sock: add WARN_ON(parent->sk) in sock_graft()"), a calltrace happened as follows:
[ 457.018340] WARNING: CPU: 0 PID: 15623 at ./include/net/sock.h:1703 inet_accept+0x135/0x140
...
[ 457.018381] RIP: 0010:inet_accept+0x135/0x140
[ 457.018381] RSP: 0018:ffffc90001727d18 EFLAGS: 00010286
[ 457.018383] RAX: 0000000000000001 RBX: ffff880012413000 RCX: 0000000000000001
[ 457.018384] RDX: 000000000000018a RSI: 00000000fffffe01 RDI: ffffffff8156fae8
[ 457.018384] RBP: ffffc90001727d38 R08: 0000000000000000 R09: 0000000000004305
[ 457.018385] R10: 0000000000000001 R11: 0000000000004304 R12: ffff880035ae7a00
[ 457.018386] R13: ffff88001282af10 R14: ffff880034e4e200 R15: 0000000000000000
[ 457.018387] FS: 0000000000000000(0000) GS:ffff88003fc00000(0000) knlGS:0000000000000000
[ 457.018388] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 457.018389] CR2: 00007fdec22f9000 CR3: 0000000002b5a000 CR4: 00000000000006f0
[ 457.018395] Call Trace:
[ 457.018402] tcp_accept_from_sock.part.8+0x12d/0x449 [dlm]
[ 457.018405] ? vprintk_emit+0x248/0x2d0
[ 457.018409] tcp_accept_from_sock+0x3f/0x50 [dlm]
[ 457.018413] process_recv_sockets+0x3b/0x50 [dlm]
[ 457.018415] process_one_work+0x138/0x370
[ 457.018417] worker_thread+0x4d/0x3b0
[ 457.018419] kthread+0x109/0x140
[ 457.018421] ? rescuer_thread+0x320/0x320
[ 457.018422] ? kthread_park+0x60/0x60
[ 457.018424] ret_from_fork+0x25/0x30
The new socket created by sock_create_kern sets its sock via the path:
sock_create_kern -> __sock_create -> pf->create => inet_create -> sock_init_data
The WARN_ON is then triggered by "con->sock->ops->accept => inet_accept -> sock_graft". It also means newsock->sk is leaked, since sock_graft will replace it with a new sk.
To resolve the issue, we need to use sock_create_lite instead of sock_create_kern, like commit 0933a578cd55 ("rds: tcp: use sock_create_lite() to create the accept socket") did.
Reported-by: Zhilong Liu <zlliu@suse.com>
Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
Signed-off-by: David Teigland <teigland@redhat.com>
2017-08-07 | dlm: avoid double-free on error path in dlm_device_{register,unregister} | Edwin Török
Can be reproduced when running dlm_controld (tested on 4.4.x, 4.12.4): # seq 1 100 | xargs -P0 -n1 dlm_tool join # seq 1 100 | xargs -P0 -n1 dlm_tool leave misc_register fails due to duplicate sysfs entry, which causes dlm_device_register to free ls->ls_device.name. In dlm_device_deregister the name was freed again, causing memory corruption. According to the comment in dlm_device_deregister the name should've been set to NULL when registration fails, so this patch does that. sysfs: cannot create duplicate filename '/dev/char/10:1' ------------[ cut here ]------------ warning: cpu: 1 pid: 4450 at fs/sysfs/dir.c:31 sysfs_warn_dup+0x56/0x70 modules linked in: msr rfcomm dlm ccm bnep dm_crypt uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videobuf2_core videodev btusb media btrtl btbcm btintel bluetooth ecdh_generic intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm snd_hda_codec_hdmi irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel thinkpad_acpi pcbc nvram snd_seq_midi snd_seq_midi_event aesni_intel snd_hda_codec_realtek snd_hda_codec_generic snd_rawmidi aes_x86_64 crypto_simd glue_helper snd_hda_intel snd_hda_codec cryptd intel_cstate arc4 snd_hda_core snd_seq snd_seq_device snd_hwdep iwldvm intel_rapl_perf mac80211 joydev input_leds iwlwifi serio_raw cfg80211 snd_pcm shpchp snd_timer snd mac_hid mei_me lpc_ich mei soundcore sunrpc parport_pc ppdev lp parport autofs4 i915 psmouse e1000e ahci libahci i2c_algo_bit sdhci_pci ptp drm_kms_helper sdhci pps_core syscopyarea sysfillrect sysimgblt fb_sys_fops drm wmi video cpu: 1 pid: 4450 comm: dlm_test.exe not tainted 4.12.4-041204-generic hardware name: lenovo 232425u/232425u, bios g2et82ww (2.02 ) 09/11/2012 task: ffff96b0cbabe140 task.stack: ffffb199027d0000 rip: 0010:sysfs_warn_dup+0x56/0x70 rsp: 0018:ffffb199027d3c58 eflags: 00010282 rax: 0000000000000038 rbx: ffff96b0e2c49158 rcx: 0000000000000006 rdx: 0000000000000000 rsi: 0000000000000086 rdi: ffff96b15e24dcc0 rbp: 
ffffb199027d3c70 r08: 0000000000000001 r09: 0000000000000721 r10: ffffb199027d3c00 r11: 0000000000000721 r12: ffffb199027d3cd1 r13: ffff96b1592088f0 r14: 0000000000000001 r15: ffffffffffffffef fs: 00007f78069c0700(0000) gs:ffff96b15e240000(0000) knlgs:0000000000000000 cs: 0010 ds: 0000 es: 0000 cr0: 0000000080050033 cr2: 000000178625ed28 cr3: 0000000091d3e000 cr4: 00000000001406e0 call trace: sysfs_do_create_link_sd.isra.2+0x9e/0xb0 sysfs_create_link+0x25/0x40 device_add+0x5a9/0x640 device_create_groups_vargs+0xe0/0xf0 device_create_with_groups+0x3f/0x60 ? snprintf+0x45/0x70 misc_register+0x140/0x180 device_write+0x6a8/0x790 [dlm] __vfs_write+0x37/0x160 ? apparmor_file_permission+0x1a/0x20 ? security_file_permission+0x3b/0xc0 vfs_write+0xb5/0x1a0 sys_write+0x55/0xc0 ? sys_fcntl+0x5d/0xb0 entry_syscall_64_fastpath+0x1e/0xa9 rip: 0033:0x7f78083454bd rsp: 002b:00007f78069bbd30 eflags: 00000293 orig_rax: 0000000000000001 rax: ffffffffffffffda rbx: 0000000000000006 rcx: 00007f78083454bd rdx: 000000000000009c rsi: 00007f78069bee00 rdi: 0000000000000005 rbp: 00007f77f8000a20 r08: 000000000000fcf0 r09: 0000000000000032 r10: 0000000000000024 r11: 0000000000000293 r12: 00007f78069bde00 r13: 00007f78069bee00 r14: 000000000000000a r15: 00007f78069bbd70 code: 85 c0 48 89 c3 74 12 b9 00 10 00 00 48 89 c2 31 f6 4c 89 ef e8 2c c8 ff ff 4c 89 e2 48 89 de 48 c7 c7 b0 8e 0c a8 e8 41 e8 ed ff <0f> ff 48 89 df e8 00 d5 f4 ff 5b 41 5c 41 5d 5d c3 66 0f 1f 84 ---[ end trace 40412246357cc9e0 ]--- dlm: 59f24629-ae39-44e2-9030-397ebc2eda26: leaving the lockspace group... 
bug: unable to handle kernel null pointer dereference at 0000000000000001 ip: [<ffffffff811a3b4a>] kmem_cache_alloc+0x7a/0x140 pgd 0 oops: 0000 [#1] smp modules linked in: dlm 8021q garp mrp stp llc openvswitch nf_defrag_ipv6 nf_conntrack libcrc32c iptable_filter dm_multipath crc32_pclmul dm_mod aesni_intel psmouse aes_x86_64 sg ablk_helper cryptd lrw gf128mul glue_helper i2c_piix4 nls_utf8 tpm_tis tpm isofs nfsd auth_rpcgss oid_registry nfs_acl lockd grace sunrpc xen_wdt ip_tables x_tables autofs4 hid_generic usbhid hid sr_mod cdrom sd_mod ata_generic pata_acpi 8139too serio_raw ata_piix 8139cp mii uhci_hcd ehci_pci ehci_hcd libata scsi_dh_rdac scsi_dh_hp_sw scsi_dh_emc scsi_dh_alua scsi_mod ipv6 cpu: 0 pid: 394 comm: systemd-udevd tainted: g w 4.4.0+0 #1 hardware name: xen hvm domu, bios 4.7.2-2.2 05/11/2017 task: ffff880002410000 ti: ffff88000243c000 task.ti: ffff88000243c000 rip: e030:[<ffffffff811a3b4a>] [<ffffffff811a3b4a>] kmem_cache_alloc+0x7a/0x140 rsp: e02b:ffff88000243fd90 eflags: 00010202 rax: 0000000000000000 rbx: ffff8800029864d0 rcx: 000000000007b36c rdx: 000000000007b36b rsi: 00000000024000c0 rdi: ffff880036801c00 rbp: ffff88000243fdc0 r08: 0000000000018880 r09: 0000000000000054 r10: 000000000000004a r11: ffff880034ace6c0 r12: 00000000024000c0 r13: ffff880036801c00 r14: 0000000000000001 r15: ffffffff8118dcc2 fs: 00007f0ab77548c0(0000) gs:ffff880036e00000(0000) knlgs:0000000000000000 cs: e033 ds: 0000 es: 0000 cr0: 0000000080050033 cr2: 0000000000000001 cr3: 000000000332d000 cr4: 0000000000040660 stack: ffffffff8118dc90 ffff8800029864d0 0000000000000000 ffff88003430b0b0 ffff880034b78320 ffff88003430b0b0 ffff88000243fdf8 ffffffff8118dcc2 ffff8800349c6700 ffff8800029864d0 000000000000000b 00007f0ab7754b90 call trace: [<ffffffff8118dc90>] ? 
anon_vma_fork+0x60/0x140 [<ffffffff8118dcc2>] anon_vma_fork+0x92/0x140 [<ffffffff8107033e>] copy_process+0xcae/0x1a80 [<ffffffff8107128b>] _do_fork+0x8b/0x2d0 [<ffffffff81071579>] sys_clone+0x19/0x20 [<ffffffff815a30ae>] entry_syscall_64_fastpath+0x12/0x71 ] code: f6 75 1c 4c 89 fa 44 89 e6 4c 89 ef e8 a7 e4 00 00 41 f7 c4 00 80 00 00 49 89 c6 74 47 eb 32 49 63 45 20 48 8d 4a 01 4d 8b 45 00 <49> 8b 1c 06 4c 89 f0 65 49 0f c7 08 0f 94 c0 84 c0 74 ac 49 63 rip [<ffffffff811a3b4a>] kmem_cache_alloc+0x7a/0x140 rsp <ffff88000243fd90> cr2: 0000000000000001 --[ end trace 70cb9fd1b164a0e8 ]-- CC: stable@vger.kernel.org Signed-off-by: Edwin Török <edvin.torok@citrix.com> Signed-off-by: David Teigland <teigland@redhat.com>
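The fix described above (clear the name pointer when registration fails, so the later deregister path does not free it a second time) can be sketched in userspace C. This is a minimal illustration and not the dlm code: `struct device_slot`, `fake_misc_register()`, `slot_register()` and `slot_unregister()` are hypothetical names, and `malloc`/`free` stand in for the kernel allocators.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the per-lockspace misc device state. */
struct device_slot {
	char *name;
};

/* Hypothetical stand-in for misc_register(); fails when asked to. */
int fake_misc_register(int should_fail)
{
	return should_fail ? -1 : 0;
}

/* Allocate the name and register; on failure free the name AND clear it. */
int slot_register(struct device_slot *slot, const char *name, int should_fail)
{
	size_t len = strlen(name) + 1;

	slot->name = malloc(len);
	if (!slot->name)
		return -1;
	memcpy(slot->name, name, len);

	if (fake_misc_register(should_fail) < 0) {
		free(slot->name);
		slot->name = NULL;	/* without this, unregister frees it again */
		return -1;
	}
	return 0;
}

/* Safe to call after a failed register, because name is NULL in that case. */
int slot_unregister(struct device_slot *slot)
{
	if (!slot->name)
		return 0;	/* registration never succeeded; nothing to free */
	free(slot->name);
	slot->name = NULL;
	return 0;
}
```

Calling `slot_unregister()` after a failed `slot_register()` is then a no-op instead of a second `free()` of the same pointer.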
2017-08-07 | dlm: constify kset_uevent_ops structure | Bhumika Goyal
Declare kset_uevent_ops structure as const as it is only passed as an argument to the function kset_create_and_add. This argument is of type const, so declare the structure as const.
Signed-off-by: Bhumika Goyal <bhumirks@gmail.com>
Signed-off-by: David Teigland <teigland@redhat.com>
2017-08-07 | dlm: print log message when cluster name is not set | Zhu Lingshan
Print a message when a cluster name is not specified by the caller. In this case the cluster name configured for the dlm is used without any validation that it is the cluster expected by the application.
Signed-off-by: Zhu Lingshan <lszhu@suse.com>
Signed-off-by: David Teigland <teigland@redhat.com>
2017-08-07 | dlm: Delete an unnecessary variable initialisation in dlm_ls_start() | Markus Elfring
The local variable "rv" is reassigned by a statement at the beginning. Thus omit the explicit initialisation.
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: David Teigland <teigland@redhat.com>
2017-08-07 | dlm: Improve a size determination in two functions | Markus Elfring
Replace the specification of two data structures by pointer dereferences as the parameter for the operator "sizeof" to make the corresponding size determination a bit safer according to the Linux coding style convention.
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: David Teigland <teigland@redhat.com>
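The "sizeof" convention mentioned above can be shown with a short userspace sketch. `struct waiter` and `waiter_alloc()` are hypothetical names chosen for illustration, and `calloc` stands in for the kernel allocator.

```c
#include <stdlib.h>

/* Hypothetical structure, for illustration only. */
struct waiter {
	int id;
	long deadline;
};

/*
 * The size comes from the pointer's target (sizeof(*w)), not from a
 * spelled-out type name, so it stays correct if the type of w changes.
 */
struct waiter *waiter_alloc(int id)
{
	struct waiter *w = calloc(1, sizeof(*w));

	if (w)
		w->id = id;
	return w;
}
```

Writing `sizeof(struct waiter)` would compile to the same thing today; the difference only shows up when someone later changes the declaration of `w` and forgets the allocation.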
2017-08-07 | dlm: Use kcalloc() in two functions | Markus Elfring
* Multiplications for the size determination of memory allocations indicated that array data structures should be processed. Thus reuse the corresponding function "kcalloc". This issue was detected by using the Coccinelle software.
* Replace the specification of data structures by pointer dereferences to make the corresponding size determinations a bit safer according to the Linux coding style convention.
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: David Teigland <teigland@redhat.com>
2017-08-07 | dlm: Use kmalloc_array() in make_member_array() | Markus Elfring
* A multiplication for the size determination of a memory allocation indicated that an array data structure should be processed. Thus use the corresponding function "kmalloc_array". This issue was detected by using the Coccinelle software.
* Replace the specification of a data type by a pointer dereference to make the corresponding size determination a bit safer according to the Linux coding style convention.
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: David Teigland <teigland@redhat.com>
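The practical benefit of array allocators such as kmalloc_array() and kcalloc() over an open-coded `n * size` is that the multiplication is checked for overflow. A minimal userspace analogue, assuming hypothetical helper names `array_alloc()` and `array_zalloc()` built on `malloc`, might look like this:

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Userspace analogue of kmalloc_array(): refuse if n * size would overflow. */
void *array_alloc(size_t n, size_t size)
{
	if (size != 0 && n > SIZE_MAX / size)
		return NULL;
	return malloc(n * size);
}

/* Userspace analogue of kcalloc(): same check, plus zeroed memory. */
void *array_zalloc(size_t n, size_t size)
{
	void *p = array_alloc(n, size);

	if (p)
		memset(p, 0, n * size);
	return p;
}
```

With an unchecked `malloc(n * size)`, an oversized `n` would wrap around and yield a buffer far smaller than the caller expects; here the request fails instead.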
2017-08-07 | dlm: Delete an error message for a failed memory allocation in dlm_recover_waiters_pre() | Markus Elfring
Omit an extra message for a memory allocation failure in this function.
Link: http://events.linuxfoundation.org/sites/events/files/slides/LCJ16-Refactor_Strings-WSang_0.pdf
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: David Teigland <teigland@redhat.com>
2017-08-07 | dlm: Improve a size determination in dlm_recover_waiters_pre() | Markus Elfring
Replace the specification of a data structure by a pointer dereference as the parameter for the operator "sizeof" to make the corresponding size determination a bit safer according to the Linux coding style convention.
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: David Teigland <teigland@redhat.com>
2017-08-07 | dlm: Use kcalloc() in dlm_scan_waiters() | Markus Elfring
A multiplication for the size determination of a memory allocation indicated that an array data structure should be processed. Thus use the corresponding function "kcalloc". This issue was detected by using the Coccinelle software.
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: David Teigland <teigland@redhat.com>
2017-08-07 | dlm: Improve a size determination in table_seq_start() | Markus Elfring
Replace the specification of a data structure by a pointer dereference as the parameter for the operator "sizeof" to make the corresponding size determination a bit safer according to the Linux coding style convention.
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: David Teigland <teigland@redhat.com>
2017-08-07 | dlm: Add spaces for better code readability | Markus Elfring
The script "checkpatch.pl" pointed information out like the following.
CHECK: spaces preferred around that '+' (ctx:VxV)
Thus fix the affected source code places.
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: David Teigland <teigland@redhat.com>
2017-08-07 | dlm: Replace six seq_puts() calls by seq_putc() | Markus Elfring
Six single characters (line breaks) should be put into a sequence. Thus use the corresponding function "seq_putc". This issue was detected by using the Coccinelle software.
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: David Teigland <teigland@redhat.com>
2017-08-07 | dlm: Make dismatch error message more clear | Gang He
This change makes the error message clearer. Upper-layer applications (e.g. ocfs2) call dlm_new_lockspace with a cluster name to create a new lockspace. dlm_new_lockspace sometimes fails because the two cluster names do not match, and the existing one-line error message does not make that obvious to the user.
Signed-off-by: Gang He <ghe@suse.com>
Signed-off-by: David Teigland <teigland@redhat.com>
2017-08-07 | dlm: Fix kernel memory disclosure | Vlad Tsyrklevich
Clear the 'unused' field and the uninitialized padding in 'lksb' to avoid leaking memory to userland in copy_result_to_user().
Signed-off-by: Vlad Tsyrklevich <vlad@tsyrklevich.net>
Signed-off-by: David Teigland <teigland@redhat.com>
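The general pattern behind this kind of fix is to zero the whole object before filling it, so that padding and unused fields carry no stale bytes when the object is copied out. The sketch below is a userspace illustration under assumptions: `struct result`, `result_fill()` and `result_stray_bytes()` are hypothetical and unrelated to the real 'lksb' layout, and it relies on the usual compiler behaviour of leaving padding untouched on member stores.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical record; the compiler typically pads after 'status'. */
struct result {
	char status;
	long id;
	char unused[8];
};

/*
 * Fill a record destined for an untrusted reader. Zeroing the whole
 * object first clears the padding and the 'unused' field.
 */
void result_fill(struct result *out, char status, long id)
{
	memset(out, 0, sizeof(*out));
	out->status = status;
	out->id = id;
}

/* Count non-zero bytes outside the two fields that were set on purpose. */
size_t result_stray_bytes(const struct result *r)
{
	const unsigned char *p = (const unsigned char *)r;
	size_t i, stray = 0;

	for (i = 0; i < sizeof(*r); i++) {
		int in_status = i >= offsetof(struct result, status) &&
				i < offsetof(struct result, status) + sizeof(r->status);
		int in_id = i >= offsetof(struct result, id) &&
			    i < offsetof(struct result, id) + sizeof(r->id);

		if (!in_status && !in_id && p[i] != 0)
			stray++;
	}
	return stray;
}
```

Without the `memset()`, whatever happened to be in the buffer beforehand would remain in the padding and in `unused`, and would be handed to the reader along with the real fields.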
2017-08-07 | quota: correct space limit check | zhangyi (F)
Currently we compare total space (curspace + rsvspace) with the space limit in quota-tools when setting the grace time and also in check_bdq(), but rsvspace is missing from the comparison elsewhere; correct those places. This patch also fixes an incorrect zero dqb_btime and a failure to update the grace time when rsvspace is in use (e.g. the ext4 dalloc feature).
Signed-off-by: zhangyi (F) <yi.zhang@huawei.com>
Signed-off-by: Jan Kara <jack@suse.cz>
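The comparison described above, where reserved space counts towards the limit together with allocated space, can be sketched as follows. `struct quota_usage` and `over_soft_limit()` are hypothetical, simplified names; the real quota code has more fields and more call sites.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified view of a disk quota record. */
struct quota_usage {
	uint64_t curspace;	/* bytes already allocated */
	uint64_t rsvspace;	/* bytes reserved, e.g. by delayed allocation */
	uint64_t softlimit;	/* 0 means "no limit" */
};

/* True when allocated plus reserved space exceeds the soft limit. */
bool over_soft_limit(const struct quota_usage *q)
{
	return q->softlimit != 0 && q->curspace + q->rsvspace > q->softlimit;
}
```

A check that looked at `curspace` alone would report the first case in the test below (900 allocated, 200 reserved, limit 1000) as under the limit.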
2017-08-07 | NFSv4: Cleanup setting of the migration flags. | Trond Myklebust
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2017-08-07 | NFSv4.1: Ensure we clear the SP4_MACH_CRED flags in nfs4_sp4_select_mode() | Trond Myklebust
If the server changes, so that it no longer supports SP4_MACH_CRED, or that it doesn't support the same set of SP4_MACH_CRED functionality, then we want to ensure that we clear the unsupported flags.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
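The idea of clearing stale capability bits before recording what the server supports now can be sketched with plain bit masks. The `CAP_*` bits and `renegotiate()` are hypothetical and are not the NFS client's flag names.

```c
#include <stdint.h>

/* Hypothetical capability bits, for illustration only. */
#define CAP_CLEANUP	(1u << 0)
#define CAP_SECINFO	(1u << 1)
#define CAP_COMMIT	(1u << 2)
#define CAP_ALL		(CAP_CLEANUP | CAP_SECINFO | CAP_COMMIT)

/*
 * Recompute the negotiated flags from what the server offers now.
 * Clearing CAP_ALL first drops any bit left over from an earlier
 * negotiation; bits outside CAP_ALL are left alone.
 */
uint32_t renegotiate(uint32_t flags, uint32_t server_offers)
{
	flags &= ~CAP_ALL;
	flags |= server_offers & CAP_ALL;
	return flags;
}
```

If the function only OR-ed in the new bits, a capability granted by a previous server would stay set after the server stopped offering it.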
2017-08-07 | NFSv4: Refactor _nfs4_proc_exchange_id() | Trond Myklebust
Tease apart the functionality in nfs4_exchange_id_done() so that it is easier to debug exchange id vs trunking issues by moving all the processing out of nfs4_exchange_id_done() and into the callers.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2017-08-06 | Merge tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 | Linus Torvalds
Pull ext4 fixes from Ted Ts'o:
"A large number of ext4 bug fixes and cleanups for v4.13"
* tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
  ext4: fix copy paste error in ext4_swap_extents()
  ext4: fix overflow caused by missing cast in ext4_resize_fs()
  ext4, project: expand inode extra size if possible
  ext4: cleanup ext4_expand_extra_isize_ea()
  ext4: restructure ext4_expand_extra_isize
  ext4: fix forgetten xattr lock protection in ext4_expand_extra_isize
  ext4: make xattr inode reads faster
  ext4: inplace xattr block update fails to deduplicate blocks
  ext4: remove unused mode parameter
  ext4: fix warning about stack corruption
  ext4: fix dir_nlink behaviour
  ext4: silence array overflow warning
  ext4: fix SEEK_HOLE/SEEK_DATA for blocksize < pagesize
  ext4: release discard bio after sending discard commands
  ext4: convert swap_inode_data() over to use swap() on most of the fields
  ext4: error should be cleared if ea_inode isn't added to the cache
  ext4: Don't clear SGID when inheriting ACLs
  ext4: preserve i_mode if __ext4_set_acl() fails
  ext4: remove unused metadata accounting variables
  ext4: correct comment references to ext4_ext_direct_IO()
2017-08-06 | ext4: fix copy paste error in ext4_swap_extents() | Maninder Singh
This bug was found by a static code checker tool for copy paste problems.
Signed-off-by: Maninder Singh <maninder1.s@samsung.com>
Signed-off-by: Vaneet Narang <v.narang@samsung.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2017-08-06 | ext4: fix overflow caused by missing cast in ext4_resize_fs() | Jerry Lee
On a 32-bit platform, the value of n_blocks_count may be wrong when the file system is resized to more than 2^32 blocks. This may cause the superblock to be corrupted with a zero block count.
Fixes: 1c6bd7173d66
Signed-off-by: Jerry Lee <jerrylee@qnap.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@vger.kernel.org # 3.7+
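The class of bug named in the title, a product computed in a 32-bit type before being stored in a 64-bit one, can be reproduced in a few lines. The function names below are hypothetical and the operands are illustrative; this is not the ext4_resize_fs() code.

```c
#include <stdint.h>

/* Product computed in 32 bits: it wraps before being widened to 64 bits. */
uint64_t blocks_without_cast(uint32_t groups, uint32_t blocks_per_group)
{
	return groups * blocks_per_group;
}

/* Widen one operand first so the multiplication itself is done in 64 bits. */
uint64_t blocks_with_cast(uint32_t groups, uint32_t blocks_per_group)
{
	return (uint64_t)groups * blocks_per_group;
}
```

With 131072 groups of 32768 blocks each, the true product is exactly 2^32. The first function returns 0, which matches the "zero block count" symptom described above, while the second returns 4294967296.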
2017-08-06 | ext4, project: expand inode extra size if possible | Miao Xie
When upgrading from the old format, the first attempt to set a project id on an old file returns EOVERFLOW, but once that file has been dirtied (touch etc.), changing the project id is allowed. This might be confusing for users, so we could try to expand @i_extra_isize here too.
Reported-by: Zhang Yi <yi.zhang@huawei.com>
Signed-off-by: Miao Xie <miaoxie@huawei.com>
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2017-08-06 | ext4: cleanup ext4_expand_extra_isize_ea() | Miao Xie
Clean up some goto statements to make ext4_expand_extra_isize_ea() clearer.
Signed-off-by: Miao Xie <miaoxie@huawei.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Wang Shilong <wshilong@ddn.com>
2017-08-06 | ext4: restructure ext4_expand_extra_isize | Miao Xie
The current ext4_expand_extra_isize just tries to expand the extra isize; if someone is holding the xattr lock or some check fails, it gives up. So rename it to ext4_try_to_expand_extra_isize. Besides that, clean up an unnecessary check and move some related checks into it.
Signed-off-by: Miao Xie <miaoxie@huawei.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Wang Shilong <wshilong@ddn.com>
2017-08-06 | ext4: fix forgetten xattr lock protection in ext4_expand_extra_isize | Miao Xie
We should avoid the contention between the i_extra_isize update and the inline data insertion, so move the xattr trylock in front of i_extra_isize update.
Signed-off-by: Miao Xie <miaoxie@huawei.com>
Reviewed-by: Wang Shilong <wshilong@ddn.com>
2017-08-06 | ext4: make xattr inode reads faster | Tahsin Erdogan
ext4_xattr_inode_read() currently reads each block sequentially, waiting for each io operation to complete before moving on to the next block. This prevents request merging in the block layer. Add an ext4_bread_batch() function that starts reads for all blocks and then optionally waits for them to complete. Similar logic is used in ext4_find_entry(), so update that code to use the new function.
Signed-off-by: Tahsin Erdogan <tahsin@google.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2017-08-05 | ext4: inplace xattr block update fails to deduplicate blocks | Tahsin Erdogan
When an xattr block has a single reference, block is updated inplace and it is reinserted to the cache. Later, a cache lookup is performed to see whether an existing block has the same contents. This cache lookup will most of the time return the just inserted entry so deduplication is not achieved.
Running the following test script will produce two xattr blocks which can be observed in "File ACL: " line of debugfs output:
  mke2fs -b 1024 -I 128 -F -O extent /dev/sdb 1G
  mount /dev/sdb /mnt/sdb
  touch /mnt/sdb/{x,y}
  setfattr -n user.1 -v aaa /mnt/sdb/x
  setfattr -n user.2 -v bbb /mnt/sdb/x
  setfattr -n user.1 -v aaa /mnt/sdb/y
  setfattr -n user.2 -v bbb /mnt/sdb/y
  debugfs -R 'stat x' /dev/sdb | cat
  debugfs -R 'stat y' /dev/sdb | cat
This patch defers the reinsertion to the cache so that we can locate other blocks with the same contents.
Signed-off-by: Tahsin Erdogan <tahsin@google.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Andreas Dilger <adilger@dilger.ca>
2017-08-05 | ext4: remove unused mode parameter | Tahsin Erdogan
ext4_alloc_file_blocks() does not use its mode parameter. Remove it.
Signed-off-by: Tahsin Erdogan <tahsin@google.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2017-08-05 | ext4: fix warning about stack corruption | Arnd Bergmann
After commit 62d1034f53e3 ("fortify: use WARN instead of BUG for now"), we get a warning about possible stack overflow from a memcpy that was not strictly bounded to the size of the local variable:
  inlined from 'ext4_mb_seq_groups_show' at fs/ext4/mballoc.c:2322:2:
  include/linux/string.h:309:9: error: '__builtin_memcpy': writing between 161 and 1116 bytes into a region of size 160 overflows the destination [-Werror=stringop-overflow=]
We actually had a bug here that would have been found by the warning, but it was already fixed last year in commit 30a9d7afe70e ("ext4: fix stack memory corruption with 64k block size").
This replaces the fixed-length structure on the stack with a variable-length structure, using the correct upper bound that tells the compiler that everything is really fine here. I also change the loop count to check for the same upper bound for consistency, but the existing code is already correct here.
Note that while clang won't allow certain kinds of variable-length arrays in structures, this particular instance is fine, as the array is at the end of the structure, and the size is strictly bounded.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2017-08-05 | ext4: fix dir_nlink behaviour | Andreas Dilger
The dir_nlink feature has been enabled by default for new ext4 filesystems since e2fsprogs-1.41 in 2008, and was automatically enabled by the kernel for older ext4 filesystems since the dir_nlink feature was added with ext4 in kernel 2.6.28+ when the subdirectory count exceeded EXT4_LINK_MAX-1.
Automatically adding file system features such as dir_nlink is generally frowned upon, since it could cause the file system to not be mountable on an older kernel, thus preventing the administrator from rolling back to an older kernel if necessary.
In this case, the administrator might also want to disable the feature because glibc's fts_read() function does not correctly optimize directory traversal for directories that use an st_nlinks field of 1 to indicate that the number of links in the directory is not tracked by the file system, and could fail to traverse the full directory hierarchy. Fortunately, in the past ten years very few users have complained about incomplete file system traversal by glibc's fts_read().
This commit also changes ext4_inc_count() to allow i_nlinks to reach the full EXT4_LINK_MAX links on the parent directory (including "." and "..") before changing i_links_count to be 1.
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=196405
Signed-off-by: Andreas Dilger <adilger@dilger.ca>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2017-08-05 | ext4: silence array overflow warning | Dan Carpenter
I get a static checker warning:
  fs/ext4/ext4.h:3091 ext4_set_de_type() error: buffer overflow 'ext4_type_by_mode' 15 <= 15
It seems unlikely that we would hit this read overflow in real life, but it's also simple enough to make the array 16 bytes instead of 15.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
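The "15 <= 15" warning is an off-by-one between the largest index a lookup can produce and the number of slots in the table. The sketch below illustrates that relationship with hypothetical names (`MODE_*`, `type_by_mode`, `type_of()`); the constants mirror the Linux values of S_IFMT, S_IFREG and S_IFDIR, but the table contents are made up.

```c
#define MODE_TYPE_MASK	0170000u	/* same value as S_IFMT on Linux */
#define MODE_TYPE_SHIFT	12
#define MODE_REG	0100000u	/* same value as S_IFREG on Linux */
#define MODE_DIR	0040000u	/* same value as S_IFDIR on Linux */

/* Largest index the lookup can produce: 0170000 >> 12 == 15. */
#define MODE_TYPE_MAX	(MODE_TYPE_MASK >> MODE_TYPE_SHIFT)

/* 16 slots for indexes 0..15; a table of 15 would be one short. */
static const unsigned char type_by_mode[MODE_TYPE_MAX + 1] = {
	[MODE_REG >> MODE_TYPE_SHIFT] = 1,
	[MODE_DIR >> MODE_TYPE_SHIFT] = 2,
};

/* Map the file-type bits of a mode to a small type code (0 = unknown). */
unsigned char type_of(unsigned int mode)
{
	return type_by_mode[(mode & MODE_TYPE_MASK) >> MODE_TYPE_SHIFT];
}
```

A mode with all four type bits set is not a valid file type, which is why the overflow is unlikely in practice, but with 16 slots the lookup stays in bounds even then.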
2017-08-05 | ext4: fix SEEK_HOLE/SEEK_DATA for blocksize < pagesize | Jan Kara
ext4_find_unwritten_pgoff() does not properly handle a situation when starting index is in the middle of a page and blocksize < pagesize. The following command shows the bug on filesystem with 1k blocksize:
  xfs_io -f -c "falloc 0 4k" \
            -c "pwrite 1k 1k" \
            -c "pwrite 3k 1k" \
            -c "seek -a -r 0" foo
In this example, neither lseek(fd, 1024, SEEK_HOLE) nor lseek(fd, 2048, SEEK_DATA) will return the correct result.
Fix the problem by neglecting buffers in a page before starting offset.
Reported-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Jan Kara <jack@suse.cz>
CC: stable@vger.kernel.org # 3.8+
2017-08-05 | ext4: release discard bio after sending discard commands | Daeho Jeong
We've changed the discard command handling to run in parallel. But in that change, I forgot to decrease the usage count of the bio that was used to send the discard request. I'm sorry about that.
Fixes: a015434480dc ("ext4: send parallel discards on commit completions")
Signed-off-by: Daeho Jeong <daeho.jeong@samsung.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Jan Kara <jack@suse.cz>
2017-08-04 | xfs: Fix per-inode DAX flag inheritance | Lukas Czerner
According to the commit that implemented per-inode DAX flag:
commit 58f88ca2df72 ("xfs: introduce per-inode DAX enablement")
the flag is supposed to act as "inherit flag".
Currently this only works in the situations where parent directory already has a flag in di_flags set, otherwise inheritance does not work. This is because setting the XFS_DIFLAG2_DAX flag is done in a wrong branch designated for di_flags, not di_flags2.
Fix this by moving the code to branch designated for setting di_flags2, which does test for flags in di_flags2.
Fixes: 58f88ca2df72 ("xfs: introduce per-inode DAX enablement")
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
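The shape of the fix described above is that inheritance of a bit stored in the second flag word must be guarded by a test on that second word, not by a test on the first. A minimal sketch, with hypothetical names (`struct iflags`, `F1_ANY`, `F2_DAX`, `inherit_flags()`) that only loosely model di_flags and di_flags2:

```c
#define F1_ANY	0x00ffu	/* hypothetical: inheritable bits kept in 'flags' */
#define F2_DAX	0x0001u	/* hypothetical: DAX bit kept in 'flags2' */

/* Hypothetical pair of flag words, loosely modelled on di_flags/di_flags2. */
struct iflags {
	unsigned int flags;
	unsigned long long flags2;
};

/* Copy inheritable bits from a parent directory to a new child. */
void inherit_flags(const struct iflags *parent, struct iflags *child)
{
	if (parent->flags & F1_ANY)
		child->flags |= parent->flags & F1_ANY;

	/* Separate test: a 'flags2' bit must not depend on 'flags' being set. */
	if (parent->flags2 & F2_DAX)
		child->flags2 |= F2_DAX;
}
```

If the DAX assignment sat inside the first branch, a parent with only the `flags2` bit set would never pass it on, which is the behaviour the message reports.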
2017-08-04 | xfs: Fix leak of discard bio | Jan Kara
The bio describing discard operation is allocated by __blkdev_issue_discard() which returns us a reference to it. That reference is never released and thus we leak this bio. Drop the bio reference once it completes in xlog_discard_endio().
CC: stable@vger.kernel.org
Fixes: 4560e78f40cb55bd2ea8f1ef4001c5baa88531c7
Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
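Both discard bio fixes in this log follow the same ownership rule: whoever is handed a reference must drop it, and the completion handler is the natural place. The toy reference counter below illustrates that rule in userspace; `struct request`, `request_issue()`, `request_put()` and `request_endio()` are hypothetical stand-ins and not the block layer API.

```c
#include <stdlib.h>

/* Hypothetical reference-counted request, standing in for a bio. */
struct request {
	int refs;
	int *freed;	/* set to 1 when the last reference is dropped */
};

/* Returns a request with one reference owned by the caller. */
struct request *request_issue(int *freed)
{
	struct request *rq = malloc(sizeof(*rq));

	if (!rq)
		return NULL;
	rq->refs = 1;
	rq->freed = freed;
	*freed = 0;
	return rq;
}

/* Drop one reference; free the request when none remain. */
void request_put(struct request *rq)
{
	if (--rq->refs == 0) {
		*rq->freed = 1;
		free(rq);
	}
}

/* Completion handler: the owner's reference must be dropped here. */
void request_endio(struct request *rq)
{
	/* ... completion work would go here ... */
	request_put(rq);	/* omitting this call leaks the request */
}
```

Without the `request_put()` in the completion handler, `refs` never reaches zero and the allocation is never released.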
2017-08-03 | f2fs: use printk_ratelimited for f2fs_msg | Jaegeuk Kim
This patch reduces contention of printks.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-08-03 | f2fs: expose features to sysfs entry | Jaegeuk Kim
This patch exposes what features are supported by current f2fs build to sysfs entry via:
  /sys/fs/f2fs/features/
  /sys/fs/f2fs/dev/features
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-08-03 | f2fs: support inode checksum | Chao Yu
This patch adds support for inode checksums in f2fs.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
[Jaegeuk Kim: fix verification flow]
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-08-03 | f2fs: return wrong error number on f2fs_quota_write | Jaegeuk Kim
This must return size, not error number.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-08-03 | f2fs: provide f2fs_balance_fs to __write_node_page | Yunlong Song
Let node writeback also do f2fs_balance_fs to ensure there are always enough free segments.
Signed-off-by: Yunlong Song <yunlong.song@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-08-03 | Merge branch 'akpm' (patches from Andrew) | Linus Torvalds
Merge misc fixes from Andrew Morton:
"15 fixes"
[ This does not merge the "fortify: use WARN instead of BUG for now" patch, which needs a bit of extra work to build cleanly with all configurations. Arnd is on it. - Linus ]
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  ocfs2: don't clear SGID when inheriting ACLs
  mm: allow page_cache_get_speculative in interrupt context
  userfaultfd: non-cooperative: flush event_wqh at release time
  ipc: add missing container_of()s for randstruct
  cpuset: fix a deadlock due to incomplete patching of cpusets_enabled()
  userfaultfd_zeropage: return -ENOSPC in case mm has gone
  mm: take memory hotplug lock within numa_zonelist_order_handler()
  mm/page_io.c: fix oops during block io poll in swapin path
  zram: do not free pool->size_class
  kthread: fix documentation build warning
  kasan: avoid -Wmaybe-uninitialized warning
  userfaultfd: non-cooperative: notify about unmap of destination during mremap
  mm, mprotect: flush TLB if potentially racing with a parallel reclaim leaving stale TLB entries
  pid: kill pidhash_size in pidhash_init()
  mm/hugetlb.c: __get_user_pages ignores certain follow_hugetlb_page errors
2017-08-03 | ecryptfs: convert to file_write_and_wait in ->fsync | Jeff Layton
This change is mainly for documentation/completeness, as ecryptfs never calls mapping_set_error, and so will never return a previous writeback error.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
2017-08-03 | fuse: Dont call set_page_dirty_lock() for ITER_BVEC pages for async_dio | Ashish Samant
Commit 8fba54aebbdf ("fuse: direct-io: don't dirty ITER_BVEC pages") fixes the ITER_BVEC page deadlock for direct io in fuse by checking in fuse_direct_io(), whether the page is a bvec page or not, before locking it. However, this check is missed when the "async_dio" mount option is enabled. In this case, set_page_dirty_lock() is called from the req->end callback in request_end(), when the fuse thread is returning from userspace to respond to the read request. This will cause the same deadlock because the bvec condition is not checked in this path.

Here is the stack of the deadlocked thread, while returning from userspace:

[13706.656686] INFO: task glusterfs:3006 blocked for more than 120 seconds.
[13706.657808] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[13706.658788] glusterfs D ffffffff816c80f0 0 3006 1 0x00000080
[13706.658797] ffff8800d6713a58 0000000000000086 ffff8800d9ad7000 ffff8800d9ad5400
[13706.658799] ffff88011ffd5cc0 ffff8800d6710008 ffff88011fd176c0 7fffffffffffffff
[13706.658801] 0000000000000002 ffffffff816c80f0 ffff8800d6713a78 ffffffff816c790e
[13706.658803] Call Trace:
[13706.658809] [<ffffffff816c80f0>] ? bit_wait_io_timeout+0x80/0x80
[13706.658811] [<ffffffff816c790e>] schedule+0x3e/0x90
[13706.658813] [<ffffffff816ca7e5>] schedule_timeout+0x1b5/0x210
[13706.658816] [<ffffffff81073ffb>] ? gup_pud_range+0x1db/0x1f0
[13706.658817] [<ffffffff810668fe>] ? kvm_clock_read+0x1e/0x20
[13706.658819] [<ffffffff81066909>] ? kvm_clock_get_cycles+0x9/0x10
[13706.658822] [<ffffffff810f5792>] ? ktime_get+0x52/0xc0
[13706.658824] [<ffffffff816c6f04>] io_schedule_timeout+0xa4/0x110
[13706.658826] [<ffffffff816c8126>] bit_wait_io+0x36/0x50
[13706.658828] [<ffffffff816c7d06>] __wait_on_bit_lock+0x76/0xb0
[13706.658831] [<ffffffffa0545636>] ? lock_request+0x46/0x70 [fuse]
[13706.658834] [<ffffffff8118800a>] __lock_page+0xaa/0xb0
[13706.658836] [<ffffffff810c8500>] ? wake_atomic_t_function+0x40/0x40
[13706.658838] [<ffffffff81194d08>] set_page_dirty_lock+0x58/0x60
[13706.658841] [<ffffffffa054d968>] fuse_release_user_pages+0x58/0x70 [fuse]
[13706.658844] [<ffffffffa0551430>] ? fuse_aio_complete+0x190/0x190 [fuse]
[13706.658847] [<ffffffffa0551459>] fuse_aio_complete_req+0x29/0x90 [fuse]
[13706.658849] [<ffffffffa05471e9>] request_end+0xd9/0x190 [fuse]
[13706.658852] [<ffffffffa0549126>] fuse_dev_do_write+0x336/0x490 [fuse]
[13706.658854] [<ffffffffa054963e>] fuse_dev_write+0x6e/0xa0 [fuse]
[13706.658857] [<ffffffff812a9ef3>] ? security_file_permission+0x23/0x90
[13706.658859] [<ffffffff81205300>] do_iter_readv_writev+0x60/0x90
[13706.658862] [<ffffffffa05495d0>] ? fuse_dev_splice_write+0x350/0x350 [fuse]
[13706.658863] [<ffffffff812062a1>] do_readv_writev+0x171/0x1f0
[13706.658866] [<ffffffff810b3d00>] ? try_to_wake_up+0x210/0x210
[13706.658868] [<ffffffff81206361>] vfs_writev+0x41/0x50
[13706.658870] [<ffffffff81206496>] SyS_writev+0x56/0xf0
[13706.658872] [<ffffffff810257a1>] ? syscall_trace_leave+0xf1/0x160
[13706.658874] [<ffffffff816cbb2e>] system_call_fastpath+0x12/0x71

Fix this by making should_dirty a fuse_io_priv parameter that can be checked in fuse_aio_complete_req().

Reported-by: Tiger Yang <tiger.yang@oracle.com>
Signed-off-by: Ashish Samant <ashish.samant@oracle.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
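The idea behind the fix can be illustrated with a small userspace model. This is only a sketch, not the kernel patch: every `model_*` name below is hypothetical, the page lock is not modelled at all, and the only thing taken from the commit message is that a should_dirty flag is carried in the per-I/O state so the asynchronous completion path can check it before dirtying pages.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a page; counts how often it was dirtied. */
struct model_page {
    int dirtied;
};

/* Hypothetical stand-in for the per-I/O state (fuse_io_priv in the commit). */
struct model_io_priv {
    bool should_dirty; /* assumed false when the pages came from a bvec iterator */
};

/* Hypothetical stand-in for a request that points back at its I/O state. */
struct model_req {
    struct model_io_priv *io;
    struct model_page *pages;
    size_t num_pages;
};

/* Models set_page_dirty_lock(); the real function takes the page lock. */
static void model_set_page_dirty_lock(struct model_page *page)
{
    page->dirtied++;
}

/* Releases the pages, dirtying them only when the caller asks for it. */
static void model_release_user_pages(struct model_req *req, bool should_dirty)
{
    for (size_t i = 0; i < req->num_pages; i++) {
        if (should_dirty)
            model_set_page_dirty_lock(&req->pages[i]);
    }
}

/* Models the async completion callback: it consults the flag stored in the
 * I/O state instead of dirtying unconditionally. */
static void model_aio_complete_req(struct model_req *req)
{
    model_release_user_pages(req, req->io->should_dirty);
}

/* Runs one completion over three pages and returns how many were dirtied. */
static int model_run_completion(bool should_dirty)
{
    struct model_page pages[3] = { {0}, {0}, {0} };
    struct model_io_priv io = { .should_dirty = should_dirty };
    struct model_req req = { .io = &io, .pages = pages, .num_pages = 3 };
    int total = 0;

    model_aio_complete_req(&req);
    for (size_t i = 0; i < 3; i++)
        total += pages[i].dirtied;
    assert(total == 0 || total == 3); /* all pages are treated alike */
    return total;
}
```

In this model, calling `model_run_completion(false)` never reaches `model_set_page_dirty_lock()`, which corresponds to the bvec case where taking the page lock would deadlock.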
2017-08-02Merge tag 'nfs-for-4.13-4' of git://git.linux-nfs.org/projects/anna/linux-nfsLinus Torvalds
Pull NFS client fixes from Anna Schumaker:
 "Two fixes from Trond this time, now that he's back from his vacation.
  The first is a stable fix for the EXCHANGE_ID issue on the mailing
  list, and the other fixes a double-free situation that he found at the
  same time.

  Stable fix:
   - Fix EXCHANGE_ID corrupt verifier issue

  Other fix:
   - Fix double frees in nfs4_test_session_trunk()"

* tag 'nfs-for-4.13-4' of git://git.linux-nfs.org/projects/anna/linux-nfs:
  NFSv4: Fix double frees in nfs4_test_session_trunk()
  NFSv4: Fix EXCHANGE_ID corrupt verifier issue
2017-08-02ocfs2: don't clear SGID when inheriting ACLsJan Kara
When a new directory 'DIR1' is created in a directory 'DIR0' with the SGID bit set, DIR1 is expected to have the SGID bit set (and its owning group equal to the owning group of 'DIR0'). However, when 'DIR0' also has some default ACLs that 'DIR1' inherits, setting these ACLs will result in the SGID bit on 'DIR1' being cleared if the user is not a member of the owning group.

Fix the problem by moving posix_acl_update_mode() out of ocfs2_set_acl() into ocfs2_iop_set_acl(). That way the function will not be called when inheriting ACLs, which is what we want: it prevents the SGID bit from being cleared, and the mode has been properly set by posix_acl_create() anyway. Also, posix_acl_chmod(), which calls ocfs2_set_acl(), takes care of updating the mode itself.

Fixes: 073931017b4 ("posix_acl: Clear SGID bit when setting file permissions")
Link: http://lkml.kernel.org/r/20170801141252.19675-3-jack@suse.cz
Signed-off-by: Jan Kara <jack@suse.cz>
Cc: Mark Fasheh <mfasheh@versity.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Joseph Qi <jiangqi903@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
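The restructuring described in this commit can be sketched as a userspace model. All `model_*` names are hypothetical and the group-membership check is reduced to a boolean (the kernel's check also considers capabilities). The sketch only shows the shape of the change: the mode update lives in the explicit "set ACL" entry point, so the inheritance path, which calls the low-level setter directly, leaves SGID alone.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_S_ISGID 02000u /* same value as the POSIX S_ISGID bit */

/* Hypothetical stand-in for an inode: mode bits plus a count of stored ACLs. */
struct model_inode {
    unsigned mode;
    int acls_stored;
};

/* Simplified model of the mode update: apply the permission bits and clear
 * SGID when the caller is not in the owning group. */
static void model_update_mode(struct model_inode *inode, unsigned perm,
                              bool caller_in_group)
{
    unsigned mode = (inode->mode & ~0777u) | (perm & 0777u);

    if (!caller_in_group)
        mode &= ~MODEL_S_ISGID;
    inode->mode = mode;
}

/* Low-level setter (the role of ocfs2_set_acl() after the fix): stores the
 * ACL and does not touch the mode. */
static void model_set_acl(struct model_inode *inode)
{
    inode->acls_stored++;
}

/* Explicit set-ACL entry point (the role of ocfs2_iop_set_acl()): updates
 * the mode first, then stores the ACL. */
static void model_iop_set_acl(struct model_inode *inode, unsigned perm,
                              bool caller_in_group)
{
    model_update_mode(inode, perm, caller_in_group);
    model_set_acl(inode);
}

/* Inheritance path at create time: the mode is assumed to be already
 * correct, so only the low-level setter runs. */
static void model_inherit_acl(struct model_inode *inode)
{
    model_set_acl(inode);
}

/* Returns true if a new SGID directory keeps SGID after inheriting ACLs. */
static bool model_inherit_keeps_sgid(void)
{
    struct model_inode dir1 = { .mode = MODEL_S_ISGID | 0755u, .acls_stored = 0 };

    model_inherit_acl(&dir1);
    assert(dir1.acls_stored == 1);
    return (dir1.mode & MODEL_S_ISGID) != 0;
}

/* Returns true if an explicit set by a non-member clears SGID. */
static bool model_explicit_set_clears_sgid(void)
{
    struct model_inode dir1 = { .mode = MODEL_S_ISGID | 0755u, .acls_stored = 0 };

    model_iop_set_acl(&dir1, 0750u, false);
    return (dir1.mode & MODEL_S_ISGID) == 0;
}
```

Under these assumptions, the inheritance helper reports that SGID survives, while an explicit set by a caller outside the owning group still clears it, which is the behaviour the commit message aims for.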
2017-08-02userfaultfd: non-cooperative: flush event_wqh at release timeMike Rapoport
There may still be threads waiting on event_wqh at the time the userfault file descriptor is closed. Flush the events wait-queue to prevent waiting threads from hanging.

Link: http://lkml.kernel.org/r/1501398127-30419-1-git-send-email-rppt@linux.vnet.ibm.com
Fixes: 9cd75c3cd4c3d ("userfaultfd: non-cooperative: add ability to report non-PF events from uffd descriptor")
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Cc: Pavel Emelyanov <xemul@virtuozzo.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>