summaryrefslogtreecommitdiff
path: root/fs
AgeCommit message (Collapse)Author
2018-12-26f2fs: fix to dirty inode synchronouslyChao Yu
If user change inode's i_flags via ioctl, let's add it into global dirty list, so that checkpoint can guarantee its persistence before fsync, it can make checkpoint keeping strong consistency. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2018-12-26f2fs: clean up structure extent_nodeChao Yu
The union in struct extent_node wass only to indicate below fields struct rb_node rb_node; union { struct { unsigned int fofs; unsigned int len; ... ... can be parsed as fields in struct rb_entry, but they were never be used explicitly before, so let's remove them for cleanup. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2018-12-26f2fs: fix block address for __check_sit_bitmapQiuyang Sun
Should use lstart (logical start address) instead of start (in dev) here. This fixes a bug in multi-device scenarios. Signed-off-by: Qiuyang Sun <sunqiuyang@huawei.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2018-12-26f2fs: fix sbi->extent_list corruption issueSahitya Tummala
When there is a failure in f2fs_fill_super() after/during the recovery of fsync'd nodes, it frees the current sbi and retries again. This time the mount is successful, but the files that got recovered before retry, still holds the extent tree, whose extent nodes list is corrupted since sbi and sbi->extent_list is freed up. The list_del corruption issue is observed when the file system is getting unmounted and when those recoverd files extent node is being freed up in the below context. list_del corruption. prev->next should be fffffff1e1ef5480, but was (null) <...> kernel BUG at kernel/msm-4.14/lib/list_debug.c:53! lr : __list_del_entry_valid+0x94/0xb4 pc : __list_del_entry_valid+0x94/0xb4 <...> Call trace: __list_del_entry_valid+0x94/0xb4 __release_extent_node+0xb0/0x114 __free_extent_tree+0x58/0x7c f2fs_shrink_extent_tree+0xdc/0x3b0 f2fs_leave_shrinker+0x28/0x7c f2fs_put_super+0xfc/0x1e0 generic_shutdown_super+0x70/0xf4 kill_block_super+0x2c/0x5c kill_f2fs_super+0x44/0x50 deactivate_locked_super+0x60/0x8c deactivate_super+0x68/0x74 cleanup_mnt+0x40/0x78 __cleanup_mnt+0x1c/0x28 task_work_run+0x48/0xd0 do_notify_resume+0x678/0xe98 work_pending+0x8/0x14 Fix this by not creating extents for those recovered files if shrinker is not registered yet. Once mount is successful and shrinker is registered, those files can have extents again. Signed-off-by: Sahitya Tummala <stummala@codeaurora.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2018-12-26f2fs: clean up checkpoint flowChao Yu
This patch cleans up checkpoint flow a bit: - remove unneeded circulation of flushing meta pages. - don't flush nat_bits pages in prior to other checkpoint pages. - add bug_on to check remained meta pages after flushing. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2018-12-26f2fs: flush stale issued discard candidatesJaegeuk Kim
Sometimes, I could observe # of issuing_discard to be 1 which blocks background jobs due to is_idle()=false. The only way to get out of it was to trigger gc_urgent. This patch avoids that by checking any candidates as done in the list. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2018-12-26f2fs: correct wrong spelling, issing_*Jaegeuk Kim
Let's use "queued" instead of "issuing". Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2018-12-26f2fs: use kvmalloc, if kmalloc is failedJaegeuk Kim
One report says memalloc failure during mount. (unwind_backtrace) from [<c010cd4c>] (show_stack+0x10/0x14) (show_stack) from [<c049c6b8>] (dump_stack+0x8c/0xa0) (dump_stack) from [<c024fcf0>] (warn_alloc+0xc4/0x160) (warn_alloc) from [<c0250218>] (__alloc_pages_nodemask+0x3f4/0x10d0) (__alloc_pages_nodemask) from [<c0270450>] (kmalloc_order_trace+0x2c/0x120) (kmalloc_order_trace) from [<c03fa748>] (build_node_manager+0x35c/0x688) (build_node_manager) from [<c03de494>] (f2fs_fill_super+0xf0c/0x16cc) (f2fs_fill_super) from [<c02a5864>] (mount_bdev+0x15c/0x188) (mount_bdev) from [<c03da624>] (f2fs_mount+0x18/0x20) (f2fs_mount) from [<c02a68b8>] (mount_fs+0x158/0x19c) (mount_fs) from [<c02c3c9c>] (vfs_kern_mount+0x78/0x134) (vfs_kern_mount) from [<c02c76ac>] (do_mount+0x474/0xca4) (do_mount) from [<c02c8264>] (SyS_mount+0x94/0xbc) (SyS_mount) from [<c0108180>] (ret_fast_syscall+0x0/0x48) Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2018-12-26f2fs: remove redundant comment of unused wio_mutexYunlong Song
Commit 089842de ("f2fs: remove codes of unused wio_mutex") removes codes of unused wio_mutex, but missing the comment, so delete it. Signed-off-by: Yunlong Song <yunlong.song@huawei.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2018-12-26Merge branch 'core-rcu-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull RCU updates from Ingo Molnar: "The biggest RCU changes in this cycle were: - Convert RCU's BUG_ON() and similar calls to WARN_ON() and similar. - Replace calls of RCU-bh and RCU-sched update-side functions to their vanilla RCU counterparts. This series is a step towards complete removal of the RCU-bh and RCU-sched update-side functions. ( Note that some of these conversions are going upstream via their respective maintainers. ) - Documentation updates, including a number of flavor-consolidation updates from Joel Fernandes. - Miscellaneous fixes. - Automate generation of the initrd filesystem used for rcutorture testing. - Convert spin_is_locked() assertions to instead use lockdep. ( Note that some of these conversions are going upstream via their respective maintainers. ) - SRCU updates, especially including a fix from Dennis Krein for a bag-on-head-class bug. - RCU torture-test updates" * 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (112 commits) rcutorture: Don't do busted forward-progress testing rcutorture: Use 100ms buckets for forward-progress callback histograms rcutorture: Recover from OOM during forward-progress tests rcutorture: Print forward-progress test age upon failure rcutorture: Print time since GP end upon forward-progress failure rcutorture: Print histogram of CB invocation at OOM time rcutorture: Print GP age upon forward-progress failure rcu: Print per-CPU callback counts for forward-progress failures rcu: Account for nocb-CPU callback counts in RCU CPU stall warnings rcutorture: Dump grace-period diagnostics upon forward-progress OOM rcutorture: Prepare for asynchronous access to rcu_fwd_startat torture: Remove unnecessary "ret" variables rcutorture: Affinity forward-progress test to avoid housekeeping CPUs rcutorture: Break up too-long rcu_torture_fwd_prog() function rcutorture: Remove cbflood facility torture: Bring any extra CPUs online during kernel startup rcutorture: Add call_rcu() flooding forward-progress tests rcutorture/formal: Replace synchronize_sched() with synchronize_rcu() tools/kernel.h: Replace synchronize_sched() with synchronize_rcu() net/decnet: Replace rcu_barrier_bh() with rcu_barrier() ...
2018-12-26Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-nextLinus Torvalds
Pull sparc updates from David Miller: - Automatic system call table generation, from Firoz Khan. - Clean up accesses to the OF device names by using full_name instead of path_component_name. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-next: ALSA: sparc: Use of_node_name_eq for node name comparisons sbus: Use of_node_name_eq for node name comparisons sparc: generate uapi header and system call table files sparc: add system call table generation support sparc: add __NR_syscalls along with NR_syscalls sparc: move __IGNORE* entries to non uapi header sparc: Use DT node full_name instead of name for resources sparc: Remove unused leon_trans_init sparc: Use device_type helpers to access the node type sparc: Use of_node_name_eq for node name comparisons sparc: Convert to using %pOFn instead of device_node.name sparc: prom: use property "name" directly to construct node names of: Drop full path from full_name for PDT systems sparc: Convert to using %pOF instead of full_name fs/openpromfs: Use of_node_name_eq for node name comparisons fs/openpromfs: use full_name instead of path_component_name
2018-12-26ceph: don't encode inode pathes into reconnect messageYan, Zheng
mds hasn't used inode pathes since introducing inode backtrace. Signed-off-by: "Yan, Zheng" <zyan@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2018-12-26ceph: update wanted caps after resuming stale sessionYan, Zheng
mds contains an optimization, it does not re-issue stale caps if client does not want any cap. A special case of the optimization is that client wants some caps, but skipped updating 'wanted'. For this case, client needs to update 'wanted' when stale session get renewed. Signed-off-by: "Yan, Zheng" <zyan@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2018-12-26ceph: skip updating 'wanted' caps if caps are already issuedYan, Zheng
When reading cached inode that already has Fscr caps, this can avoid two cap messages (one updats 'wanted' caps, one clears 'wanted' caps). Signed-off-by: "Yan, Zheng" <zyan@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2018-12-26ceph: don't request excl caps when mount is readonlyYan, Zheng
Signed-off-by: "Yan, Zheng" <zyan@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2018-12-26ceph: don't update importing cap's mseq when handing cap exportYan, Zheng
Updating mseq makes client think importer mds has accepted all prior cap messages and importer mds knows what caps client wants. Actually some cap messages may have been dropped because of mseq mismatch. If mseq is left untouched, importing cap's mds_wanted later will get reset by cap import message. Cc: stable@vger.kernel.org Signed-off-by: "Yan, Zheng" <zyan@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2018-12-26ceph: remove redundant assignmentChengguang Xu
There is redundant assighment of variable i in ceph_mdsmap_get_random_mds(), just remvoe it. Signed-off-by: Chengguang Xu <cgxu519@gmx.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2018-12-26ceph: cleanup splice_dentry()Yan, Zheng
splice_dentry() may drop the original dentry and return other dentry. It relies on its caller to update pointer that points to the dropped dentry. This is error-prone. Signed-off-by: "Yan, Zheng" <zyan@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2018-12-25Merge tag 'mtd/for-4.21' of git://git.infradead.org/linux-mtdLinus Torvalds
Pull mtd updates from Boris Brezillon: "SPI NOR Core changes: - Parse the 4BAIT SFDP section - Add a bunch of SPI NOR entries to the flash_info table - Add the concept of SFDP fixups and use it to fix a bug on MX25L25635F - A bunch of minor cleanups/comestic changes NAND core changes: - kernel-doc miscellaneous fixes. - Third batch of fixes/cleanup to the raw NAND core impacting various controller drivers (ams-delta, marvell, fsmc, denali, tegra, vf610): * Stop to pass mtd_info objects to internal functions * Reorganize code to avoid forward declarations * Drop useless test in nand_legacy_set_defaults() * Move nand_exec_op() to internal.h * Add nand_[de]select_target() helpers * Pass the CS line to be selected in struct nand_operation * Make ->select_chip() optional when ->exec_op() is implemented * Deprecate the ->select_chip() hook * Move the ->exec_op() method to nand_controller_ops * Move ->setup_data_interface() to nand_controller_ops * Deprecate the dummy_controller field * Fix JEDEC detection * Provide a helper for polling GPIO R/B pin Raw NAND chip drivers changes: - Macronix: * Flag 1.8V AC chips with a broken GET_FEATURES(TIMINGS) Raw NAND controllers drivers changes: - Ams-delta: * Fix the error path * SPDX tag added * May be compiled with COMPILE_TEST=y * Conversion to ->exec_op() interface * Drop .IOADDR_R/W use * Use GPIO API for data I/O - Denali: * Remove denali_reset_banks() * Remove ->dev_ready() hook * Include <linux/bits.h> instead of <linux/bitops.h> * Changes to comply with the above fixes/cleanup done in the core. - FSMC: * Add an SPDX tag to replace the license text * Make conversion from chip to fsmc consistent * Fix unchecked return value in fsmc_read_page_hwecc * Changes to comply with the above fixes/cleanup done in the core. - Marvell: * Prevent timeouts on a loaded machine (fix) * Changes to comply with the above fixes/cleanup done in the core. - OMAP2: * Pass the parent of pdev to dma_request_chan() (fix) - R852: * Use generic DMA API - sh_flctl: * Convert to SPDX identifiers - Sunxi: * Write pageprog related opcodes to the right register: WCMD_SET (fix) - Tegra: * Stop implementing ->select_chip() - VF610: * Add an SPDX tag to replace the license text * Changes to comply with the above fixes/cleanup done in the core. - Various trivial/spelling/coding style fixes. SPI-NAND drivers changes: - Remove the depreacated mt29f_spinand driver from staging. - Add support for: * Toshiba TC58CVG2S0H * GigaDevice GD5FxGQ4xA * Winbond W25N01GV JFFS2 changes: - Fix a lockdep issue MTD changes: - Rework the physmap driver to merge gpio-addr-flash and physmap_of in it - Add a new compatible for RedBoot partitions - Make sub-partitions RW if the parent partition was RO because of a mis-alignment - Add pinctrl support to the - Addition of /* fall-through */ comments where appropriate - Various minor fixes and cleanups Other changes: - Update my email address" * tag 'mtd/for-4.21' of git://git.infradead.org/linux-mtd: (108 commits) mtd: rawnand: sunxi: Write pageprog related opcodes to WCMD_SET MAINTAINERS: Update my email address mtd: rawnand: marvell: prevent timeouts on a loaded machine mtd: rawnand: omap2: Pass the parent of pdev to dma_request_chan() mtd: rawnand: Fix JEDEC detection mtd: spi-nor: Add support for is25lp016d mtd: spi-nor: parse SFDP 4-byte Address Instruction Table mtd: spi-nor: Add 4B_OPCODES flag to is25lp256 mtd: spi-nor: Add an SPDX tag to spi-nor.{c,h} mtd: spi-nor: Make the enable argument passed to set_byte() a bool mtd: spi-nor: Stop passing flash_info around mtd: spi-nor: Avoid forward declaration of internal functions mtd: spi-nor: Drop inline on all internal helpers mtd: spi-nor: Add a post BFPT fixup for MX25L25635E mtd: spi-nor: Add a post BFPT parsing fixup hook mtd: spi-nor: Add the SNOR_F_4B_OPCODES flag mtd: spi-nor: cast to u64 to avoid uint overflows mtd: spi-nor: Add support for IS25LP032/064 mtd: spi-nor: add entry for mt35xu512aba flash mtd: spi-nor: add macros related to MICRON flash ...
2018-12-25Merge tag 'drm-next-2018-12-14' of git://anongit.freedesktop.org/drm/drmLinus Torvalds
Pull drm updates from Dave Airlie: "Core: - shared fencing staging removal - drop transactional atomic helpers and move helpers to new location - DP/MST atomic cleanup - Leasing cleanups and drop EXPORT_SYMBOL - Convert drivers to atomic helpers and generic fbdev. - removed deprecated obj_ref/unref in favour of get/put - Improve dumb callback documentation - MODESET_LOCK_BEGIN/END helpers panels: - CDTech panels, Banana Pi Panel, DLC1010GIG, - Olimex LCD-O-LinuXino, Samsung S6D16D0, Truly NT35597 WQXGA, - Himax HX8357D, simulated RTSM AEMv8. - GPD Win2 panel - AUO G101EVN010 vgem: - render node support ttm: - move global init out of drivers - fix LRU handling for ghost objects - Support for simultaneous submissions to multiple engines scheduler: - timeout/fault handling changes to help GPU recovery - helpers for hw with preemption support i915: - Scaler/Watermark fixes - DP MST + powerwell fixes - PSR fixes - Break long get/put shmemfs pages - Icelake fixes - Icelake DSI video mode enablement - Engine workaround improvements amdgpu: - freesync support - GPU reset enabled on CI, VI, SOC15 dGPUs - ABM support in DC - KFD support for vega12/polaris12 - SDMA paging queue on vega - More amdkfd code sharing - DCC scanout on GFX9 - DC kerneldoc - Updated SMU firmware for GFX8 chips - XGMI PSP + hive reset support - GPU reset - DC trace support - Powerplay updates for newer Polaris - Cursor plane update fast path - kfd dma-buf support virtio-gpu: - add EDID support vmwgfx: - pageflip with damage support nouveau: - Initial Turing TU104/TU106 modesetting support msm: - a2xx gpu support for apq8060 and imx5 - a2xx gpummu support - mdp4 display support for apq8060 - DPU fixes and cleanups - enhanced profiling support - debug object naming interface - get_iova/page pinning decoupling tegra: - Tegra194 host1x, VIC and display support enabled - Audio over HDMI for Tegra186 and Tegra194 exynos: - DMA/IOMMU refactoring - plane alpha + blend mode support - Color format fixes for mixer driver rcar-du: - R8A7744 and R8A77470 support - R8A77965 LVDS support imx: - fbdev emulation fix - multi-tiled scalling fixes - SPDX identifiers rockchip - dw_hdmi support - dw-mipi-dsi + dual dsi support - mailbox read size fix qxl: - fix cursor pinning vc4: - YUV support (scaling + cursor) v3d: - enable TFU (Texture Formatting Unit) mali-dp: - add support for linear tiled formats sun4i: - Display Engine 3 support - H6 DE3 mixer 0 support - H6 display engine support - dw-hdmi support - H6 HDMI phy support - implicit fence waiting - BGRX8888 support meson: - Overlay plane support - implicit fence waiting - HDMI 1.4 4k modes bridge: - i2c fixes for sii902x" * tag 'drm-next-2018-12-14' of git://anongit.freedesktop.org/drm/drm: (1403 commits) drm/amd/display: Add fast path for cursor plane updates drm/amdgpu: Enable GPU recovery by default for CI drm/amd/display: Fix duplicating scaling/underscan connector state drm/amd/display: Fix unintialized max_bpc state values Revert "drm/amd/display: Set RMX_ASPECT as default" drm/amdgpu: Fix stub function name drm/msm/dpu: Fix clock issue after bind failure drm/msm/dpu: Clean up dpu_media_info.h static inline functions drm/msm/dpu: Further cleanups for static inline functions drm/msm/dpu: Cleanup the debugfs functions drm/msm/dpu: Remove dpu_irq and unused functions drm/msm: Make irq_postinstall optional drm/msm/dpu: Cleanup callers of dpu_hw_blk_init drm/msm/dpu: Remove unused functions drm/msm/dpu: Remove dpu_crtc_is_enabled() drm/msm/dpu: Remove dpu_crtc_get_mixer_height drm/msm/dpu: Remove dpu_dbg drm/msm: dpu: Remove crtc_lock drm/msm: dpu: Remove vblank_requested flag from dpu_crtc drm/msm: dpu: Separate crtc assignment from vblank enable ...
2018-12-25ext4: fix a potential fiemap/page fault deadlock w/ inline_dataTheodore Ts'o
The ext4_inline_data_fiemap() function calls fiemap_fill_next_extent() while still holding the xattr semaphore. This is not necessary and it triggers a circular lockdep warning. This is because fiemap_fill_next_extent() could trigger a page fault when it writes into page which triggers a page fault. If that page is mmaped from the inline file in question, this could very well result in a deadlock. This problem can be reproduced using generic/519 with a file system configuration which has the inline_data feature enabled. Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org
2018-12-24ext4: make sure enough credits are reserved for dioread_nolock writesTheodore Ts'o
There are enough credits reserved for most dioread_nolock writes; however, if the extent tree is sufficiently deep, and/or quota is enabled, the code was not allowing for all eventualities when reserving journal credits for the unwritten extent conversion. This problem can be seen using xfstests ext4/034: WARNING: CPU: 1 PID: 257 at fs/ext4/ext4_jbd2.c:271 __ext4_handle_dirty_metadata+0x10c/0x180 Workqueue: ext4-rsv-conversion ext4_end_io_rsv_work RIP: 0010:__ext4_handle_dirty_metadata+0x10c/0x180 ... EXT4-fs: ext4_free_blocks:4938: aborting transaction: error 28 in __ext4_handle_dirty_metadata EXT4: jbd2_journal_dirty_metadata failed: handle type 11 started at line 4921, credits 4/0, errcode -28 EXT4-fs error (device dm-1) in ext4_free_blocks:4950: error 28 Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org
2018-12-23cifs: Save TTL value when parsing DFS referralsPaulo Alcantara
This will be needed by DFS cache. Signed-off-by: Paulo Alcantara <palcantara@suse.de> Reviewed-by: Aurelien Aptel <aaptel@suse.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2018-12-23cifs: auto disable 'serverino' in dfs mountsAurelien Aptel
Different servers have different set of file ids. After failover, unique IDs will be different so we can't validate them. Signed-off-by: Aurelien Aptel <aaptel@suse.com> Reviewed-by: Paulo Alcantara <palcantara@suse.de> Signed-off-by: Steve French <stfrench@microsoft.com>
2018-12-23cifs: Make devname param optional in cifs_compose_mount_options()Paulo Alcantara
If we only want to get the mount options strings, do not return the devname. For DFS failover, we'll be passing the DFS full path down to cifs_mount() rather than the devname. Signed-off-by: Paulo Alcantara <palcantara@suse.de> Reviewed-by: Aurelien Aptel <aaptel@suse.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2018-12-23cifs: Skip any trailing backslashes from UNCPaulo Alcantara
When extracting hostname from UNC, check for leading backslashes before trying to remove them. Signed-off-by: Paulo Alcantara <palcantara@suse.de> Reviewed-by: Aurelien Aptel <aaptel@suse.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2018-12-23cifs: Refactor out cifs_mount()Paulo Alcantara
* Split and refactor the very large function cifs_mount() in multiple functions: - tcp, ses and tcon setup to mount_get_conns() - tcp, ses and tcon cleanup in mount_put_conns() - tcon tlink setup to mount_setup_tlink() - remote path checking to is_path_remote() * Implement 2 version of cifs_mount() for DFS-enabled builds and non-DFS-enabled builds (CONFIG_CIFS_DFS_UPCALL). In preparation for DFS failover support. Signed-off-by: Paulo Alcantara <palcantara@suse.de> Reviewed-by: Aurelien Aptel <aaptel@suse.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2018-12-23CIFS: Fix error mapping for SMB2_LOCK command which caused OFD lock problemGeorgy A Bystrenin
While resolving a bug with locks on samba shares found a strange behavior. When a file locked by one node and we trying to lock it from another node it fail with errno 5 (EIO) but in that case errno must be set to (EACCES | EAGAIN). This isn't happening when we try to lock file second time on same node. In this case it returns EACCES as expected. Also this issue not reproduces when we use SMB1 protocol (vers=1.0 in mount options). Further investigation showed that the mapping from status_to_posix_error is different for SMB1 and SMB2+ implementations. For SMB1 mapping is [NT_STATUS_LOCK_NOT_GRANTED to ERRlock] (See fs/cifs/netmisc.c line 66) but for SMB2+ mapping is [STATUS_LOCK_NOT_GRANTED to -EIO] (see fs/cifs/smb2maperror.c line 383) Quick changes in SMB2+ mapping from EIO to EACCES has fixed issue. BUG: https://bugzilla.kernel.org/show_bug.cgi?id=201971 Signed-off-by: Georgy A Bystrenin <gkot@altlinux.org> Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com> CC: Stable <stable@vger.kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2018-12-23CIFS: return correct errors when pinning memory failed for direct I/OLong Li
When pinning memory failed, we should return the correct error code and rewind the SMB credits. Reported-by: Murphy Zhou <jencce.kernel@gmail.com> Signed-off-by: Long Li <longli@microsoft.com> Cc: stable@vger.kernel.org Cc: Murphy Zhou <jencce.kernel@gmail.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2018-12-23CIFS: use the correct length when pinning memory for direct I/O for writeLong Li
The current code attempts to pin memory using the largest possible wsize based on the currect SMB credits. This doesn't cause kernel oops but this is not optimal as we may pin more pages then actually needed. Fix this by only pinning what are needed for doing this write I/O. Signed-off-by: Long Li <longli@microsoft.com> Cc: stable@vger.kernel.org Signed-off-by: Steve French <stfrench@microsoft.com> Reviewed-by: Joey Pabalinas <joeypabalinas@gmail.com>
2018-12-23cifs: check ntwrk_buf_start for NULL before dereferencing itRonnie Sahlberg
RHBZ: 1021460 There is an issue where when multiple threads open/close the same directory ntwrk_buf_start might end up being NULL, causing the call to smbCalcSize later to oops with a NULL deref. The real bug is why this happens and why this can become NULL for an open cfile, which should not be allowed. This patch tries to avoid a oops until the time when we fix the underlying issue. Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2018-12-23cifs: remove coverity warning in calc_lanman_hashRonnie Sahlberg
password_with_pad is a fixed size buffer of 16 bytes, it contains a password string, to be padded with \0 if shorter than 16 bytes but is just truncated if longer. It is not, and we do not depend on it to be, nul terminated. As such, do not use strncpy() to populate this buffer since the str* prefix suggests that this is a string, which it is not, and it also confuses coverity causing a false warning. Detected by CoverityScan CID#113743 ("Buffer not null terminated") Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2018-12-23cifs: remove set but not used variable 'smb_buf'YueHaibing
Fixes gcc '-Wunused-but-set-variable' warning: fs/cifs/sess.c: In function '_sess_auth_rawntlmssp_assemble_req': fs/cifs/sess.c:1157:18: warning: variable 'smb_buf' set but not used [-Wunused-but-set-variable] It never used since commit cc87c47d9d7a ("cifs: Separate rawntlmssp auth from CIFS_SessSetup()") Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2018-12-23cifs: suppress some implicit-fallthrough warningsGustavo A. R. Silva
To avoid the warning: warning: this statement may fall through [-Wimplicit-fallthrough=] Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Reviewed-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Steve French <stfrench@microsoft.com>
2018-12-23cifs: change smb2_query_eas to use the compound query-info helperRonnie Sahlberg
Reducing the number of network roundtrips improves the performance of query xattrs Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2018-12-23Add vers=3.0.2 as a valid option for SMBv3.0.2Kenneth D'souza
Technically 3.02 is not the dialect name although that is more familiar to many, so we should also accept the official dialect name (3.0.2 vs. 3.02) in vers= Signed-off-by: Kenneth D'souza <kdsouza@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2018-12-23cifs: create a helper function for compound query_infoRonnie Sahlberg
and convert statfs to use it. Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2018-12-23cifs: address trivial coverity warningSteve French
This is not actually a bug but as Coverity points out we shouldn't be doing an "|=" on a value which hasn't been set (although technically it was memset to zero so isn't a bug) and so might as well change "|=" to "=" in this line Detected by CoverityScan, CID#728535 ("Unitialized scalar variable") Signed-off-by: Steve French <stfrench@microsoft.com> Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
2018-12-23cifs: smb2 commands can not be negative, remove confusing checkSteve French
As Coverity points out le16_to_cpu(midEntry->Command) can not be less than zero. Detected by CoverityScan, CID#1438650 ("Macro compares unsigned to 0") Signed-off-by: Steve French <stfrench@microsoft.com> Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
2018-12-23cifs: use a compound for setting an xattrRonnie Sahlberg
Improve performance by reducing number of network round trips for set xattr. Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2018-12-23cifs: clean up indentation, replace spaces with tabColin Ian King
Trivial fix to clean up indentation, replace spaces with tab Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2018-12-23Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfsLinus Torvalds
Pull vfs fixes from Al Viro: "A couple of fixes - no common topic ;-)" [ The aio spectre patch also came in from Jens, so now we have that doubly fixed .. ] * 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: proc/sysctl: don't return ENOMEM on lookup when a table is unregistering aio: fix spectre gadget in lookup_ioctx
2018-12-22Revert "vfs: Allow userns root to call mknod on owned filesystems."Christian Brauner
This reverts commit 55956b59df336f6738da916dbb520b6e37df9fbd. commit 55956b59df33 ("vfs: Allow userns root to call mknod on owned filesystems.") enabled mknod() in user namespaces for userns root if CAP_MKNOD is available. However, these device nodes are useless since any filesystem mounted from a non-initial user namespace will set the SB_I_NODEV flag on the filesystem. Now, when a device node s created in a non-initial user namespace a call to open() on said device node will fail due to: bool may_open_dev(const struct path *path) { return !(path->mnt->mnt_flags & MNT_NODEV) && !(path->mnt->mnt_sb->s_iflags & SB_I_NODEV); } The problem with this is that as of the aforementioned commit mknod() creates partially functional device nodes in non-initial user namespaces. In particular, it has the consequence that as of the aforementioned commit open() will be more privileged with respect to device nodes than mknod(). Before it was the other way around. Specifically, if mknod() succeeded then it was transparent for any userspace application that a fatal error must have occured when open() failed. All of this breaks multiple userspace workloads and a widespread assumption about how to handle mknod(). Basically, all container runtimes and systemd live by the slogan "ask for forgiveness not permission" when running user namespace workloads. For mknod() the assumption is that if the syscall succeeds the device nodes are useable irrespective of whether it succeeds in a non-initial user namespace or not. This logic was chosen explicitly to allow for the glorious day when mknod() will actually be able to create fully functional device nodes in user namespaces. A specific problem people are already running into when running 4.18 rc kernels are failing systemd services. For any distro that is run in a container systemd services started with the PrivateDevices= property set will fail to start since the device nodes in question cannot be opened (cf. the arguments in [1]). Full disclosure, Seth made the very sound argument that it is already possible to end up with partially functional device nodes. Any filesystem mounted with MS_NODEV set will allow mknod() to succeed but will not allow open() to succeed. The difference to the case here is that the MS_NODEV case is transparent to userspace since it is an explicitly set mount option while the SB_I_NODEV case is an implicit property enforced by the kernel and hence opaque to userspace. [1]: https://github.com/systemd/systemd/pull/9483 Signed-off-by: Christian Brauner <christian@brauner.io> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Seth Forshee <seth.forshee@canonical.com> Cc: Serge Hallyn <serge@hallyn.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-12-21xfs: reallocate realtime summary cache on growfsOmar Sandoval
At mount time, we allocate m_rsum_cache with the number of realtime bitmap blocks. However, xfs_growfs_rt() can increase the number of realtime bitmap blocks. Using the cache after this happens may access out of the bounds of the cache. Fix it by reallocating the cache in this case. Fixes: 355e3532132b ("xfs: cache minimum realtime summary level") Signed-off-by: Omar Sandoval <osandov@fb.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2018-12-21dax: Use non-exclusive wait in wait_entry_unlocked()Dan Williams
get_unlocked_entry() uses an exclusive wait because it is guaranteed to eventually obtain the lock and follow on with an unlock+wakeup cycle. The wait_entry_unlocked() path does not have the same guarantee. Rather than open-code an extra wakeup, just switch to a non-exclusive wait. Cc: Jan Kara <jack@suse.cz> Cc: Matthew Wilcox <willy@infradead.org> Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2018-12-21NFS: nfs_compare_mount_options always compare auth flavors.Chris Perl
This patch removes the check from nfs_compare_mount_options to see if a `sec' option was passed for the current mount before comparing auth flavors and instead just always compares auth flavors. Consider the following scenario: You have a server with the address 192.168.1.1 and two exports /export/a and /export/b. The first export supports `sys' and `krb5' security, the second just `sys'. Assume you start with no mounts from the server. The following results in EIOs being returned as the kernel nfs client incorrectly thinks it can share the underlying `struct nfs_server's: $ mkdir /tmp/{a,b} $ sudo mount -t nfs -o vers=3,sec=krb5 192.168.1.1:/export/a /tmp/a $ sudo mount -t nfs -o vers=3 192.168.1.1:/export/b /tmp/b $ df >/dev/null df: ā€˜/tmp/b’: Input/output error Signed-off-by: Chris Perl <cperl@janestreet.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2018-12-21Merge tag '4.20-rc7-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6Linus Torvalds
Pull smb3 fix from Steve French: "An important smb3 fix for an regression to some servers introduced by compounding optimization to rmdir. This fix has been tested by multiple developers (including me) with the usual private xfstesting, but also by the new cifs/smb3 "buildbot" xfstest VMs (thank you Ronnie and Aurelien for good work on this automation). The automated testing has been updated so that it will catch problems like this in the future. Note that Pavel discovered (very recently) some unrelated but extremely important bugs in credit handling (smb3 flow control problem that can lead to disconnects/reconnects) when compounding, that I would have liked to send in ASAP but the complete testing of those two fixes may not be done in time and have to wait for 4.21" * tag '4.20-rc7-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6: smb3: Fix rmdir compounding regression to strict servers
2018-12-21mount_fs: suppress MAC on MS_SUBMOUNT as well as MS_KERNMOUNTAl Viro
Reviewed-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2018-12-21LSM: new method: ->sb_add_mnt_opt()Al Viro
Adding options to growing mnt_opts. NFS kludge with passing context= down into non-text-options mount switched to it, and with that the last use of ->sb_parse_opts_str() is gone. Reviewed-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2018-12-21LSM: hide struct security_mnt_opts from any generic codeAl Viro
Keep void * instead, allocate on demand (in parse_str_opts, at the moment). Eventually both selinux and smack will be better off with private structures with several strings in those, rather than this "counter and two pointers to dynamically allocated arrays" ugliness. This commit allows to do that at leisure, without disrupting anything outside of given module. Changes: * instead of struct security_mnt_opt use an opaque pointer initialized to NULL. * security_sb_eat_lsm_opts(), security_sb_parse_opts_str() and security_free_mnt_opts() take it as var argument (i.e. as void **); call sites are unchanged. * security_sb_set_mnt_opts() and security_sb_remount() take it by value (i.e. as void *). * new method: ->sb_free_mnt_opts(). Takes void *, does whatever freeing that needs to be done. * ->sb_set_mnt_opts() and ->sb_remount() might get NULL as mnt_opts argument, meaning "empty". Reviewed-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>