Age | Commit message | Author
2020-06-04 | afs: Show more a bit more server state in /proc/net/afs/servers | David Howells
Display more information about the state of a server record, including the flags, rtt and break counter plus the probe state for each server in /proc/net/afs/servers. Rearrange the server flags a bit to make them easier to read at a glance in the proc file. Signed-off-by: David Howells <dhowells@redhat.com>
2020-06-04 | afs: Don't use probe running state to make decisions outside probe code | David Howells
Don't use the running state for fileserver probes to make decisions about which server to use as the state is cleared at the start of a probe and also intermediate values might be misleading. Instead, add a separate 'latest known' rtt in the afs_server struct and a flag to indicate if the server is known to be responding and update these as and when we know what to change them to. Signed-off-by: David Howells <dhowells@redhat.com>
2020-06-04 | afs: Fix afs_statfs() to not let the values go below zero | David Howells
Fix afs_statfs() so that the values for f_bavail and f_bfree don't go "negative" if the number of blocks in use by a volume exceeds the max quota for that volume. Signed-off-by: David Howells <dhowells@redhat.com>
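A minimal sketch of the clamping idea, assuming the free count is derived by subtracting the blocks in use from the quota with unsigned arithmetic. The helper name quota_blocks_free() is hypothetical and is not taken from the afs_statfs() source:

```c
#include <stdint.h>

/*
 * Hypothetical helper (not the kernel's afs_statfs() code): report zero
 * free blocks when usage meets or exceeds the quota, instead of letting
 * the unsigned subtraction wrap around to a huge value.
 */
static uint64_t quota_blocks_free(uint64_t max_quota, uint64_t blocks_in_use)
{
	if (blocks_in_use >= max_quota)
		return 0;
	return max_quota - blocks_in_use;
}
```

Without the check, a volume that is over quota would appear to have an enormous amount of free space, because the subtraction wraps.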
2020-06-04 | afs: Fix the by-UUID server tree to allow servers with the same UUID | David Howells
Whilst it shouldn't happen, it is possible for multiple fileservers to share a UUID, particularly if an entire cell has been duplicated, UUIDs and all. In such a case, it's not necessarily possible to map the effect of the CB.InitCallBackState3 incoming RPC to a specific server unambiguously by UUID and thus to a specific cell. Indeed, there's a problem whereby multiple server records may need to occupy the same spot in the rb_tree rooted in the afs_net struct. Fix this by allowing servers to form a list, with the head of the list in the tree. When the front entry in the list is removed, the second in the list just replaces it. afs_init_callback_state() then just goes down the line, poking each server in the list. This means that some servers will be unnecessarily poked, unfortunately. An alternative would be to route by call parameters. Reported-by: Jeffrey Altman <jaltman@auristor.com> Signed-off-by: David Howells <dhowells@redhat.com> Fixes: d2ddc776a458 ("afs: Overhaul volume and server record caching and fileserver rotation")
2020-06-04 | afs: Reorganise volume and server trees to be rooted on the cell | David Howells
Reorganise afs_volume objects such that they're in a tree keyed on volume ID, rooted on an afs_cell object, rather than being in multiple trees, each of which is rooted on an afs_server object. afs_server structs become per-cell and acquire a pointer to the cell. The process of breaking a callback then starts with finding the server by its network address, following that to the cell and then looking up each volume ID in the volume tree. This is simpler than the afs_vol_interest/afs_cb_interest N:M mapping web and allows those structs and the code for maintaining them to be simplified or removed. It does make a couple of things a bit more tricky, though:
(1) Operations now start with a volume, not a server, so there can be more than one answer as to whether or not the server we'll end up using supports the FS.InlineBulkStatus RPC.
(2) CB RPC operations that specify the server UUID. There's still a tree of servers by UUID on the afs_net struct, but the UUIDs in it aren't guaranteed unique.
Signed-off-by: David Howells <dhowells@redhat.com>
2020-06-04 | afs: Add a tracepoint to track the lifetime of the afs_volume struct | David Howells
Add a tracepoint to track the lifetime of the afs_volume struct. Signed-off-by: David Howells <dhowells@redhat.com>
2020-06-04 | afs: Detect cell aliases 3 - YFS Cells with a canonical cell name op | David Howells
YFS Volume Location servers have an operation by which the cell name may be queried. Use this to find out what a YFS server thinks the canonical cell name should be. Signed-off-by: David Howells <dhowells@redhat.com>
2020-06-04 | afs: Detect cell aliases 2 - Cells with no root volumes | David Howells
Implement the second phase of cell alias detection. This part handles alias detection for cells that don't have root.cell volumes and so we have to find some other volume or fileserver to query. We take the first volume from each such cell and attempt to look it up in the new cell. If found, we compare the records; if they are the same, we judge the cell names to be aliases. Signed-off-by: David Howells <dhowells@redhat.com>
2020-06-04 | afs: Detect cell aliases 1 - Cells with root volumes | David Howells
Put in the first phase of cell alias detection. This part handles alias detection for cells that have root.cell volumes (which is expected to be likely). When a cell becomes newly active, it is probed for its root.cell volume, and if it has one, this volume is compared against other root.cell volumes to find out if the lists of fileserver UUIDs have any in common - and if that's the case, whether the address lists of those fileservers have any addresses in common. If they do, the new cell is adjudged to be an alias of the old cell and the old cell is used instead. Comparing is aided by the server list in struct afs_server_list being sorted in UUID order and the addresses in the fileserver address lists being sorted in address order. The cell then retains the afs_volume object for the root.cell volume, even if it's not mounted, for future alias checking. This is necessary because:
(1) Whilst fileservers have UUIDs that are meant to be globally unique, in practice they are not because cells get cloned without changing the UUIDs - so afs_server records need to be per cell.
(2) Sometimes the DNS is used to make cell aliases - but if we don't know they're the same, we may end up with multiple superblocks and multiple afs_server records for the same thing, impairing our ability to deliver callback notifications of third party changes.
(3) The fileserver RPC API doesn't contain the cell name, so it can't tell us which cell it's notifying and can't see that a change made to one cell should notify the same client that's also accessed as the other cell.
Reported-by: Jeffrey Altman <jaltman@auristor.com> Signed-off-by: David Howells <dhowells@redhat.com>
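Because both lists are kept sorted, checking whether two cells share a fileserver UUID can be done in a single forward pass. The following is only a sketch of that comparison under stated assumptions: struct demo_uuid and sorted_uuids_overlap() are hypothetical names, and ordering is assumed to be plain memcmp() order rather than whatever the afs code actually uses.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define DEMO_UUID_LEN 16

/* Hypothetical stand-in for a fileserver UUID; not the kernel's uuid_t. */
struct demo_uuid {
	unsigned char b[DEMO_UUID_LEN];
};

/*
 * Return true if two arrays, each sorted in ascending memcmp() order,
 * share at least one UUID. Both cursors only move forward, so the scan
 * costs at most na + nb comparisons.
 */
static bool sorted_uuids_overlap(const struct demo_uuid *a, size_t na,
				 const struct demo_uuid *b, size_t nb)
{
	size_t i = 0, j = 0;

	while (i < na && j < nb) {
		int diff = memcmp(a[i].b, b[j].b, DEMO_UUID_LEN);

		if (diff == 0)
			return true;	/* common UUID found */
		if (diff < 0)
			i++;		/* a[i] is smaller: advance a */
		else
			j++;		/* b[j] is smaller: advance b */
	}
	return false;
}
```

The same two-cursor walk would apply to the sorted address lists once a common UUID has been found.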
2020-06-04 | afs: Implement client support for the YFSVL.GetCellName RPC op | David Howells
Implement client support for the YFSVL.GetCellName RPC operation by which YFS permits the canonical cell name to be queried from a VL server. Signed-off-by: David Howells <dhowells@redhat.com>
2020-06-04 | afs: Retain more of the VLDB record for alias detection | David Howells
Save more bits from the volume location database record obtained for a server so that we can use this information in cell alias detection. Signed-off-by: David Howells <dhowells@redhat.com>
2020-06-04 | afs: Fix handling of CB.ProbeUuid cache manager op | David Howells
The AFS filesystem driver is handling the CB.ProbeUuid request incorrectly. The UUID presented in the request is that of the cache manager, not the fileserver, so afs_deliver_cb_probe_uuid() shouldn't be using that UUID to look up the server. Fix this by looking up the server by address instead. Signed-off-by: David Howells <dhowells@redhat.com>
2020-06-04 | afs: Don't get epoch from a server because it may be ambiguous | David Howells
Don't get the epoch from a server, particularly one that we're looking up by UUID, as UUIDs may be ambiguous and may map to more than one server - so we can't draw any conclusions from it. Reported-by: Jeffrey Altman <jaltman@auristor.com> Signed-off-by: David Howells <dhowells@redhat.com>
2020-06-04 | afs: Build an abstraction around an "operation" concept | David Howells
Turn the afs_operation struct into the main way that most fileserver operations are managed. Various things are added to the struct, including the following:
(1) All the parameters and results of the relevant operations are moved into it, removing corresponding fields from the afs_call struct. afs_call gets a pointer to the op.
(2) The target volume is made the main focus of the operation, rather than the target vnode(s), and a bunch of op->vnode->volume are made op->volume instead.
(3) Two vnode records are defined (op->file[]) for the vnode(s) involved in most operations. The vnode record (struct afs_vnode_param) contains:
    - The vnode pointer.
    - The fid of the vnode to be included in the parameters or that was returned in the reply (eg. FS.MakeDir).
    - The status and callback information that may be returned in the reply about the vnode.
    - Callback break and data version tracking for detecting simultaneous third-party changes.
(4) Pointers to dentries to be updated with new inodes.
(5) An operations table pointer. The table includes pointers to functions for issuing AFS and YFS-variant RPCs, handling the success and abort of an operation and handling post-I/O-lock local editing of a directory.
To make this work, the following function restructuring is made:
(A) The rotation loop that issues calls to fileservers that can be found in each function that wants to issue an RPC (such as afs_mkdir()) is extracted out into common code, in a new file called fs_operation.c.
(B) The rotation loops, such as the one in afs_mkdir(), are replaced with a much smaller piece of code that allocates an operation, sets the parameters and then calls out to the common code to do the actual work.
(C) The code for handling the success and failure of an operation is moved into operation functions (as (5) above) and these are called from the core code at appropriate times.
(D) The pseudo inode getting stuff used by the dynamic root code is moved over into dynroot.c.
(E) struct afs_iget_data is absorbed into the operation struct and afs_iget() expects to be given an op pointer and a vnode record.
(F) Point (E) doesn't work for the root dir of a volume, but we know the FID in advance (it's always vnode 1, unique 1), so a separate inode getter, afs_root_iget(), is provided to special-case that.
(G) The inode status init/update functions now also take an op and a vnode record.
(H) The RPC marshalling functions now, for the most part, just take an afs_operation struct as their only argument. All the data they need is held there. The result delivery functions write their answers there as well.
(I) The call is attached to the operation and then the operation core does the waiting.
And then the new operation code is, for the moment, made to just initialise the operation, get the appropriate vnode I/O locks and do the same rotation loop as before. This lays the foundation for the following changes in the future:
(*) Overhauling the rotation (again).
(*) Support for asynchronous I/O, where the fileserver rotation must be done asynchronously also.
Signed-off-by: David Howells <dhowells@redhat.com>
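The shape described in point (5) and steps (A) to (C) can be illustrated with a small user-space sketch. Everything here is hypothetical: the demo_op and demo_op_ops types, their fields and the fixed "try each server in turn" loop are stand-ins chosen for illustration, and do not reproduce the layout of struct afs_operation or the rotation logic in fs_operation.c.

```c
#include <stddef.h>

struct demo_op;

/* Hypothetical operations table, loosely modelled on point (5). */
struct demo_op_ops {
	int  (*issue_rpc)(struct demo_op *op, int server); /* 0 on success */
	void (*success)(struct demo_op *op);
	void (*aborted)(struct demo_op *op);
};

/* Hypothetical operation record: parameters and results live together. */
struct demo_op {
	const struct demo_op_ops *ops;
	int nr_servers;		/* candidate servers to rotate through */
	int used_server;	/* result: index that answered, or -1 */
	int error;		/* result: 0 or a negative error code */
};

/*
 * Common rotation loop: try each server in turn until one RPC succeeds,
 * then run the success or abort handler once.
 */
static int demo_do_operation(struct demo_op *op)
{
	op->used_server = -1;
	op->error = -1;

	for (int i = 0; i < op->nr_servers; i++) {
		op->error = op->ops->issue_rpc(op, i);
		if (op->error == 0) {
			op->used_server = i;
			break;
		}
	}

	if (op->error == 0) {
		if (op->ops->success)
			op->ops->success(op);
	} else if (op->ops->aborted) {
		op->ops->aborted(op);
	}
	return op->error;
}

/* Example RPC issuer: pretends that only server 2 answers. */
static int demo_mkdir_rpc(struct demo_op *op, int server)
{
	(void)op;
	return server == 2 ? 0 : -1;
}

static const struct demo_op_ops demo_mkdir_ops = {
	.issue_rpc = demo_mkdir_rpc,
};

/* Caller side, as in step (B): set up the op, hand it to common code. */
static int demo_mkdir(int nr_servers)
{
	struct demo_op op = { .ops = &demo_mkdir_ops, .nr_servers = nr_servers };

	demo_do_operation(&op);
	return op.used_server;
}
```

The point of the design is that each caller shrinks to "fill in an op, pick an ops table, call the core", while retry and completion handling live in one place.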
2020-06-04 | vfio-ccw: make vfio_ccw_regops variables declarations static | Vasily Gorbik
Fixes the following sparse warnings: drivers/s390/cio/vfio_ccw_chp.c:62:30: warning: symbol 'vfio_ccw_schib_region_ops' was not declared. Should it be static? drivers/s390/cio/vfio_ccw_chp.c:117:30: warning: symbol 'vfio_ccw_crw_region_ops' was not declared. Should it be static? Link: https://lkml.kernel.org/r/patch.git-a34be7aede18.your-ad-here.call-01591269421-ext-5655@work.hours Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2020-06-04 | Merge tag 'vfio-ccw-20200603-v2' of https://git.kernel.org/pub/scm/linux/kernel/git/kvms390/vfio-ccw into features | Vasily Gorbik
vfio-ccw updates:
- accept requests without the prefetch bit set
- enable path handling via two new regions
* tag 'vfio-ccw-20200603-v2' of https://git.kernel.org/pub/scm/linux/kernel/git/kvms390/vfio-ccw:
  vfio-ccw: Add trace for CRW event
  vfio-ccw: Wire up the CRW irq and CRW region
  vfio-ccw: Introduce a new CRW region
  vfio-ccw: Refactor IRQ handlers
  vfio-ccw: Introduce a new schib region
  vfio-ccw: Refactor the unregister of the async regions
  vfio-ccw: Register a chp_event callback for vfio-ccw
  vfio-ccw: Introduce new helper functions to free/destroy regions
  vfio-ccw: document possible errors
  vfio-ccw: Enable transparent CCW IPL from DASD
Link: https://lkml.kernel.org/r/20200603112716.332801-1-cohuck@redhat.com Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2020-06-04 | iommu: Check for deferred attach in iommu_group_do_dma_attach() | Joerg Roedel
The iommu_group_do_dma_attach() must not attach devices which have deferred_attach set. Otherwise devices could cause IOMMU faults when re-initialized in a kdump kernel. Fixes: deac0b3bed26 ("iommu: Split off default domain allocation from group assignment") Reported-by: Jerry Snitselaar <jsnitsel@redhat.com> Signed-off-by: Joerg Roedel <jroedel@suse.de> Tested-by: Jerry Snitselaar <jsnitsel@redhat.com> Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com> Link: https://lore.kernel.org/r/20200604091944.26402-1-joro@8bytes.org
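As a rough illustration of the rule stated above, the sketch below skips any device whose attachment has been deferred while attaching the rest of a group. The struct demo_dev type and demo_group_dma_attach() are hypothetical; the real check lives inside the IOMMU core and operates on its own data structures.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical device record, for illustration only. */
struct demo_dev {
	bool deferred_attach;	/* attachment postponed, e.g. in a kdump kernel */
	bool attached;
};

/*
 * Attach every device in the group except those whose attachment has
 * been deferred. Returns the number of devices actually attached.
 */
static size_t demo_group_dma_attach(struct demo_dev *devs, size_t n)
{
	size_t count = 0;

	for (size_t i = 0; i < n; i++) {
		if (devs[i].deferred_attach)
			continue;	/* leave it for a later, explicit attach */
		devs[i].attached = true;
		count++;
	}
	return count;
}
```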
2020-06-04 | PCI: uniphier: Add Socionext UniPhier Pro5 PCIe endpoint controller driver | Kunihiko Hayashi
Add driver for the Socionext UniPhier Pro5 SoC endpoint controller. This controller is based on the DesignWare PCIe core. Also add "host" to the existing controller descriptions for the host controller in Kconfig. Link: https://lore.kernel.org/r/1589457801-12796-3-git-send-email-hayashi.kunihiko@socionext.com Signed-off-by: Kunihiko Hayashi <hayashi.kunihiko@socionext.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Reviewed-by: Rob Herring <robh@kernel.org>
2020-06-04 | ovl: make oip->index bool | Miklos Szeredi
ovl_get_inode() uses oip->index as a bool value, not as a pointer. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2020-06-04 | ovl: only pass ->ki_flags to ovl_iocb_to_rwf() | Miklos Szeredi
Next patch will want to pass a modified set of flags, so... Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2020-06-04 | ovl: make private mounts longterm | Miklos Szeredi
Overlayfs is using clone_private_mount() to create internal mounts for underlying layers. These are used for operations requiring a path, such as dentry_open(). Since these private mounts are not in any namespace they are treated as short term, "detached" mounts and mntput() involves taking the global mount_lock, which can result in serious cacheline pingpong. Make these private mounts longterm instead, which trades the penalty on mntput() for a slightly longer shutdown time due to an added RCU grace period when putting these mounts. Introduce a new helper kern_unmount_many() that can take care of multiple longterm mounts with a single RCU grace period. Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2020-06-04 | ovl: get rid of redundant members in struct ovl_fs | Miklos Szeredi
ofs->upper_mnt is copied to ->layers[0].mnt and ->layers[0].trap could be used instead of a separate ->upperdir_trap. Split the lowerdir option early to get the number of layers, then allocate the ->layers array, and finally fill the upper and lower layers, as before. Get rid of path_put_init() in ovl_lower_dir(), since the only caller will take care of that. [Colin Ian King] Fix null pointer dereference on null stack pointer on error return found by Coverity. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2020-06-04 | ovl: add accessor for ofs->upper_mnt | Miklos Szeredi
Next patch will remove ofs->upper_mnt, so add an accessor function for this field. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2020-06-04 | ovl: initialize error in ovl_copy_xattr | Yuxuan Shui
In ovl_copy_xattr, if all the xattrs to be copied are overlayfs private xattrs, the copy loop will terminate without assigning anything to the error variable, thus returning an uninitialized value. If ovl_copy_xattr is called from ovl_clear_empty, this uninitialized error value is put into a pointer by ERR_PTR(), causing potential invalid memory accesses down the line. This commit initializes error with 0. This is the correct value because when there's no xattr to copy, because all xattrs are private, ovl_copy_xattr should succeed. This bug was discovered with the help of INIT_STACK_ALL and clang. Signed-off-by: Yuxuan Shui <yshuiv7@gmail.com> Link: https://bugs.chromium.org/p/chromium/issues/detail?id=1050405 Fixes: 0956254a2d5b ("ovl: don't copy up opaqueness") Cc: stable@vger.kernel.org # v4.8 Signed-off-by: Alexander Potapenko <glider@google.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
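The failure mode is easy to reproduce in miniature. In the sketch below, which is not the overlayfs code, a loop skips "private" names and only assigns the error variable when it actually copies something, so the initial value decides what an all-private list returns. The function names, the use of an empty name as a stand-in for a failing copy, and the exact prefix test are assumptions made for illustration.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Assumed prefix test; the real overlayfs check may differ. */
static bool demo_is_private_xattr(const char *name)
{
	static const char prefix[] = "trusted.overlay.";

	return strncmp(name, prefix, sizeof(prefix) - 1) == 0;
}

/*
 * "Copy" every non-private attribute and count the copies. error starts
 * at 0, so a list made up only of private attributes reports success
 * rather than whatever value happened to be on the stack.
 */
static int demo_copy_xattrs(const char *const *names, size_t n, size_t *copied)
{
	int error = 0;	/* the fix: initialise before the loop */

	*copied = 0;
	for (size_t i = 0; i < n; i++) {
		if (demo_is_private_xattr(names[i]))
			continue;	/* skipped entries never touch error */
		if (names[i][0] == '\0') {
			error = -EINVAL;	/* stand-in for a failed copy */
			break;
		}
		(*copied)++;
	}
	return error;
}
```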
2020-06-04 | smb3: fix incorrect number of credits when ioctl MaxOutputResponse > 64K | Steve French
We were not checking whether ioctl requests asked for more than 64K (i.e. when CIFSMaxBufSize was > 64K), so with a larger CIFSMaxBufSize ioctls would fail with invalid parameter errors. When a request asks for more than 64K in MaxOutputResponse, we need to ask for more than 1 credit. Signed-off-by: Steve French <stfrench@microsoft.com> CC: Stable <stable@vger.kernel.org> Reviewed-by: Aurelien Aptel <aaptel@suse.com>
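A minimal sketch of the arithmetic, assuming that one credit covers 64 KiB of payload and that the charge is based on the larger of the request and expected response sizes. The helper demo_credit_charge() and its constant are hypothetical names, not the cifs client's code:

```c
#include <stdint.h>

#define DEMO_CREDIT_UNIT 65536u	/* bytes assumed to be covered by one credit */

/*
 * Hypothetical helper: number of credits to charge for a request whose
 * input is in_len bytes and whose maximum response is max_out_len bytes.
 * Anything up to 64 KiB costs one credit; each further 64 KiB (or part
 * of it) costs one more.
 */
static uint16_t demo_credit_charge(uint32_t in_len, uint32_t max_out_len)
{
	uint32_t payload = in_len > max_out_len ? in_len : max_out_len;

	if (payload == 0)
		return 1;
	return (uint16_t)((payload - 1) / DEMO_CREDIT_UNIT + 1);
}
```

With this rule a 1 MiB maximum response needs 16 credits, which is why a fixed charge of 1 stops working once the buffer size is raised above 64K.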
2020-06-04 | smb3: default to minimum of two channels when multichannel specified | Steve French
When "multichannel" is specified on mount, make sure to default to at least two channels. Signed-off-by: Steve French <stfrench@microsoft.com> Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
2020-06-04 | drm/nouveau/kms/nv50-: clear SW state of disabled windows harder | Ben Skeggs
The most innocuous result of not having done this is that we end up sending unnecessary methods when we next enable the window. However, interactions with the code handling skipping disables when an update immediately follows, and window ownership assignment, can lead to upsetting the display hardware on Volta and newer. Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-06-04 | drm/nouveau: gr/gk20a: Use firmware version 0 | Thierry Reding
Tegra firmware doesn't actually use any version numbers and passing -1 causes the existing firmware binaries not to be found. Use version 0 to find the correct files. Fixes: ef16dc278ec2 ("drm/nouveau/gr/gf100-: select implementation based on available FW") Signed-off-by: Thierry Reding <treding@nvidia.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-06-04 | drm/nouveau/disp/gm200-: detect and potentially disable HDA support on some SORs | Ben Skeggs
Some HDA pin widgets may be disabled by BIOS, and unavailable from a SOR. Our SOR allocation policy uses this information to allocate an appropriate SOR when HDA is supported by a display. Thank you to NVIDIA for providing the information to determine this. Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-06-04 | drm/nouveau/disp/gp100: split SOR implementation from gm200 | Ben Skeggs
GP100 needs different HDA detection. Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-06-04 | drm/nouveau/disp: modify OR allocation policy to account for HDA requirements | Ben Skeggs
Since GM200, SORs are no longer tied to a specific connector, and we allocate them instead, with the assumption that all SORs are equally capable. However, there's a 1<->1 mapping between SOR and HDA pin widget, and it turns out that it's possible for some widgets to be disabled... In order to avoid picking a SOR without a valid pin widget, some new rules need to be added. Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-06-04 | drm/nouveau/disp: split part of OR allocation logic into a function | Ben Skeggs
No logical changes here, this is just moving the code to make the changes in the next commit more obvious. Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-06-04 | drm/nouveau/disp: provide hint to OR allocation about HDA requirements | Ben Skeggs
Will be used by a subsequent commit to influence SOR allocation policy. Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-06-03 | atomisp: avoid warning about unused function | Linus Torvalds
The atomisp_mrfld_power() function isn't actually ever called, because the two call-sites have commented out the use because it breaks on some platforms. That results in: drivers/staging/media/atomisp/pci/atomisp_v4l2.c:764:12: warning: ‘atomisp_mrfld_power’ defined but not used [-Wunused-function] 764 | static int atomisp_mrfld_power(struct atomisp_device *isp, bool enable) | ^~~~~~~~~~~~~~~~~~~ during the build. Rather than commenting out the use entirely, just disable it semantically instead (using a "0 &&" construct), leaving the call in place from a syntax standpoint, and avoiding the warning. I really don't want my builds to have any warnings that can then hide real issues. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
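The "0 &&" trick relies on short-circuit evaluation: the call stays in the source, so the compiler still sees the function as referenced, but it can never execute. A minimal sketch with hypothetical names (demo_power() and demo_resume() are not the atomisp functions):

```c
#include <stdbool.h>

static int demo_power_calls;	/* counts real invocations, for illustration */

/* Hypothetical stand-in for a helper that misbehaves on some platforms. */
static int demo_power(bool enable)
{
	demo_power_calls++;
	return enable ? 0 : -1;
}

static int demo_resume(void)
{
	/*
	 * "0 &&" short-circuits, so demo_power() is never executed, but
	 * the call is still compiled and type-checked, and the function
	 * no longer counts as unused.
	 */
	if (0 && demo_power(true) < 0)
		return -1;
	return 0;
}
```

Compared with commenting the call out, this also keeps the call site from bit-rotting if the function's signature changes.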
2020-06-03 | Merge tag 'media/v5.8-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media | Linus Torvalds
Pull media updates from Mauro Carvalho Chehab:
- Media documentation is now split into admin-guide, driver-api and userspace-api books (a longstanding request from Jon);
- The media Kconfig was reorganized, in order to make it easier to select drivers and their dependencies;
- The testing drivers now have a separate directory;
- Added a new driver for Rockchip Video Decoder IP;
- The atomisp staging driver was resurrected. It is meant to work with 4 generations of cameras on Atom-based laptops, tablets and cell phones, so it seems worth investing time to clean up this driver and get it into good shape;
- Added some V4L2 core ancillary routines to help with h264 codecs;
- Added an ov2740 image sensor driver;
- The si2157 gained support for Analog TV, which, in turn, added support for some cx231xx and cx23885 boards to also support analog standards;
- Added some V4L2 controls (V4L2_CID_CAMERA_ORIENTATION and V4L2_CID_CAMERA_SENSOR_ROTATION) to help identify where the camera is located on the device;
- VIDIOC_ENUM_FMT was extended to support MC-centric devices;
- Lots of driver improvements and cleanups.
* tag 'media/v5.8-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: (503 commits) media: Documentation: media: Refer to mbus format documentation from CSI-2 docs media: s5k5baf: Replace zero-length array with flexible-array media: i2c: imx219: Drop <linux/clk-provider.h> and <linux/clkdev.h> media: i2c: Add ov2740 image sensor driver media: ov8856: Implement sensor module revision identification media: ov8856: Add devicetree support media: dt-bindings: ov8856: Document YAML bindings media: dvb-usb: Add Cinergy S2 PCIe Dual Port support media: dvbdev: Fix tuner->demod media controller link media: dt-bindings: phy: phy-rockchip-dphy-rx0: move rockchip dphy rx0 bindings out of staging media: staging: dt-bindings: phy-rockchip-dphy-rx0: remove non-used reg property media: atomisp: unify the version for isp2401 a0 and b0 versions media: atomisp: update TODO with the current data media: atomisp: adjust some code at sh_css that could be broken media: atomisp: don't produce errs for ignored IRQs media: atomisp: print IRQ when debugging media: atomisp: isp_mmu: don't use kmem_cache media: atomisp: add a notice about possible leak resources media: atomisp: disable the dynamic and reserved pools media: atomisp: turn on camera before setting it ...
2020-06-03 | Merge branch 'akpm' (patches from Andrew) | Linus Torvalds
Merge more updates from Andrew Morton: "More mm/ work, plenty more to come Subsystems affected by this patch series: slub, memcg, gup, kasan, pagealloc, hugetlb, vmscan, tools, mempolicy, memblock, hugetlbfs, thp, mmap, kconfig" * akpm: (131 commits) arm64: mm: use ARCH_HAS_DEBUG_WX instead of arch defined x86: mm: use ARCH_HAS_DEBUG_WX instead of arch defined riscv: support DEBUG_WX mm: add DEBUG_WX support drivers/base/memory.c: cache memory blocks in xarray to accelerate lookup mm/thp: rename pmd_mknotpresent() as pmd_mkinvalid() powerpc/mm: drop platform defined pmd_mknotpresent() mm: thp: don't need to drain lru cache when splitting and mlocking THP hugetlbfs: get unmapped area below TASK_UNMAPPED_BASE for hugetlbfs sparc32: register memory occupied by kernel as memblock.memory include/linux/memblock.h: fix minor typo and unclear comment mm, mempolicy: fix up gup usage in lookup_node tools/vm/page_owner_sort.c: filter out unneeded line mm: swap: memcg: fix memcg stats for huge pages mm: swap: fix vmstats for huge pages mm: vmscan: limit the range of LRU type balancing mm: vmscan: reclaim writepage is IO cost mm: vmscan: determine anon/file pressure balance at the reclaim root mm: balance LRU lists based on relative thrashing mm: only count actual rotations as LRU reclaim cost ...
2020-06-03 | ext4: avoid unnecessary transaction starts during writeback | Jan Kara
ext4_writepages() currently works in a loop like:
    start a transaction
    scan inode for pages to write
    map and submit these pages
    stop the transaction
This loop results in starting a transaction once more than is needed, because in the last iteration we start a transaction only to scan the inode and find there are no pages to write. This can be a significant increase in the number of transaction starts for single-extent files or files that have all blocks already mapped. Furthermore, we already know from the previous iteration whether there are more pages to write or not. So propagate the information from mpage_prepare_extent_to_map() and avoid unnecessary looping in case there are no more pages to write. Signed-off-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20200525081215.29451-1-jack@suse.cz Signed-off-by: Theodore Ts'o <tytso@mit.edu>
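The cost of the extra pass can be seen in a toy model that only counts transaction starts. This is a sketch under simplifying assumptions: pages are written in fixed-size batches, and the helper names are hypothetical rather than taken from ext4.

```c
#include <stdbool.h>

/*
 * Hypothetical model of one pass: write up to `batch` pages and report
 * whether any pages are left. `batch` must be positive.
 */
static bool demo_write_batch(int *pages_left, int batch)
{
	int n = *pages_left < batch ? *pages_left : batch;

	*pages_left -= n;
	return *pages_left > 0;
}

/* Old shape: keep looping until a pass finds nothing to write. */
static int demo_starts_old(int pages, int batch)
{
	int starts = 0;
	bool found;

	do {
		starts++;				/* start a transaction */
		found = pages > 0;			/* scan inode for pages */
		demo_write_batch(&pages, batch);	/* map and submit */
	} while (found);				/* final pass does no work */
	return starts;
}

/* New shape: each pass reports whether another one is needed. */
static int demo_starts_new(int pages, int batch)
{
	int starts = 0;
	bool more;

	do {
		starts++;
		more = demo_write_batch(&pages, batch);
	} while (more);
	return starts;
}
```

For a file that fits in one batch, the old loop starts two transactions and the new one starts a single transaction, which is the single-extent case the commit message calls out.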
2020-06-03 | ext4: don't block for O_DIRECT if IOCB_NOWAIT is set | Jens Axboe
Running with some debug patches to detect illegal blocking triggered the extend/unaligned condition in ext4. If ext4 needs to extend the file (and hence go to buffered IO), or if the app is doing unaligned IO, then ext4 asks the iomap code to wait for IO completion. If the caller asked for no-wait semantics by setting IOCB_NOWAIT, then ext4 should return -EAGAIN instead. Signed-off-by: Jens Axboe <axboe@kernel.dk> Link: https://lore.kernel.org/r/76152096-2bbb-7682-8fce-4cb498bcd909@kernel.dk Signed-off-by: Theodore Ts'o <tytso@mit.edu>
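The decision described above reduces to a small predicate. The sketch below is only an illustration with a hypothetical helper name; it takes the three conditions as plain booleans instead of inspecting a kiocb as the real ext4 code would.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical decision helper: a direct write that extends the file or
 * is unaligned has to wait for I/O completion, so a caller that asked
 * for no-wait semantics gets -EAGAIN and can retry from a context that
 * is allowed to block.
 */
static int demo_dio_write_check(bool nowait, bool extends_file, bool unaligned)
{
	bool must_wait = extends_file || unaligned;

	if (must_wait && nowait)
		return -EAGAIN;
	return 0;
}
```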
2020-06-03 | ext4: remove the access_ok() check in ext4_ioctl_get_es_cache | Christoph Hellwig
access_ok just checks we are fed a proper user pointer. We also do that in copy_to_user itself, so no need to do this early. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Ritesh Harjani <riteshh@linux.ibm.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20200523073016.2944131-10-hch@lst.de Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-06-03 | fs: remove the access_ok() check in ioctl_fiemap | Christoph Hellwig
access_ok just checks we are fed a proper user pointer. We also do that in copy_to_user itself, so no need to do this early. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Ritesh Harjani <riteshh@linux.ibm.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Link: https://lore.kernel.org/r/20200523073016.2944131-9-hch@lst.de Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-06-03 | fs: handle FIEMAP_FLAG_SYNC in fiemap_prep | Christoph Hellwig
By moving FIEMAP_FLAG_SYNC handling to fiemap_prep we ensure it is handled once instead of duplicated, but can still be done under fs locks, like xfs/iomap intended with its duplicate handling. Also make sure the error value of filemap_write_and_wait is propagated to user space. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Amir Goldstein <amir73il@gmail.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Link: https://lore.kernel.org/r/20200523073016.2944131-8-hch@lst.de Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-06-03 | fs: move fiemap range validation into the file systems instances | Christoph Hellwig
Replace fiemap_check_flags with a fiemap_prep helper that also takes the inode and mapped range, and performs the sanity check and truncation previously done in fiemap_check_range. This way the validation is inside the file system itself and thus properly works for the stacked overlayfs case as well. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Amir Goldstein <amir73il@gmail.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Link: https://lore.kernel.org/r/20200523073016.2944131-7-hch@lst.de Signed-off-by: Theodore Ts'o <tytso@mit.edu>
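A sketch of the kind of sanity check and truncation this refers to, assuming the file system exposes a single maximum byte offset. The function demo_fiemap_range() and its parameter names are hypothetical and the real fiemap_prep() takes different arguments:

```c
#include <errno.h>
#include <stdint.h>

/*
 * Hypothetical range check: reject empty requests and requests that
 * start beyond the file system's limit, and shrink a request that would
 * run past the limit so that start + *len never exceeds maxbytes.
 */
static int demo_fiemap_range(uint64_t maxbytes, uint64_t start, uint64_t *len)
{
	if (*len == 0)
		return -EINVAL;
	if (start > maxbytes)
		return -EFBIG;

	/* The second test is written so that it cannot overflow. */
	if (*len > maxbytes || maxbytes - *len < start)
		*len = maxbytes - start;
	return 0;
}
```

Doing this inside each file system means a stacked file system that forwards the request still has the range checked by the layer that knows the real limit.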
2020-06-03 | iomap: fix the iomap_fiemap prototype | Christoph Hellwig
iomap_fiemap should take u64 start and len arguments, just like the ->fiemap prototype. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Ritesh Harjani <riteshh@linux.ibm.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Link: https://lore.kernel.org/r/20200523073016.2944131-6-hch@lst.de Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-06-03 | fs: move the fiemap definitions out of fs.h | Christoph Hellwig
No need to pull the fiemap definitions into almost every file in the kernel build. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Ritesh Harjani <riteshh@linux.ibm.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Link: https://lore.kernel.org/r/20200523073016.2944131-5-hch@lst.de Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-06-03 | fs: mark __generic_block_fiemap static | Christoph Hellwig
There is no caller left outside of ioctl.c. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Ritesh Harjani <riteshh@linux.ibm.com> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Link: https://lore.kernel.org/r/20200523073016.2944131-4-hch@lst.de Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-06-03 | ext4: remove the call to fiemap_check_flags in ext4_fiemap | Christoph Hellwig
iomap_fiemap already calls fiemap_check_flags first thing, so this additional check is redundant. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Ritesh Harjani <riteshh@linux.ibm.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20200523073016.2944131-3-hch@lst.de Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-06-03 | ext4: split _ext4_fiemap | Christoph Hellwig
The fiemap and EXT4_IOC_GET_ES_CACHE cases share almost no code, so split them into entirely separate functions. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Ritesh Harjani <riteshh@linux.ibm.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20200523073016.2944131-2-hch@lst.de Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-06-03 | ext4: fix fiemap size checks for bitmap files | Christoph Hellwig
Add an extra validation of the len parameter, as for ext4 some files might have smaller file size limits than others. This also means the redundant size check in ext4_ioctl_get_es_cache can go away, as all size checking is done in the shared fiemap handler. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Ritesh Harjani <riteshh@linux.ibm.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20200505154324.3226743-3-hch@lst.de Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-06-03 | ext4: fix EXT4_MAX_LOGICAL_BLOCK macro | Ritesh Harjani
ext4 supports a maximum of 0xffffffff logical blocks in a file (because ext4_extent's ee_block is __le32). This means that EXT4_MAX_LOGICAL_BLOCK should be 0xfffffffe (counting from logical offset 0). This patch fixes this. The issue was seen when ext4 moved to the iomap_fiemap API and overlayfs was mounted on top of ext4. Since overlayfs was missing fiemap_check_ranges(), it could pass an arbitrarily huge length, which led to overflow in the map.m_len logic. This patch fixes that. Fixes: d3b6f23f7167 ("ext4: move ext4_fiemap to use iomap framework") Reported-by: syzbot+77fa5bdb65cc39711820@syzkaller.appspotmail.com Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20200505154324.3226743-2-hch@lst.de Signed-off-by: Theodore Ts'o <tytso@mit.edu>
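Why the last valid block index has to be 0xfffffffe can be seen from a small clamp. This is an illustrative sketch, not ext4 code: demo_clamp_len() is a hypothetical helper, and it assumes the block count for a mapping is held in a 32-bit field.

```c
#include <stdint.h>

#define DEMO_MAX_LOGICAL_BLOCK 0xfffffffeu	/* last valid block index */

/*
 * Hypothetical clamp: limit a requested block count so that the range
 * starting at lblk never runs past DEMO_MAX_LOGICAL_BLOCK. With the
 * last index at 0xfffffffe the largest possible count is 0xffffffff,
 * which still fits in 32 bits; a last index of 0xffffffff would allow
 * a count of 0x100000000, which does not.
 */
static uint32_t demo_clamp_len(uint32_t lblk, uint64_t len)
{
	uint64_t room;

	if (lblk > DEMO_MAX_LOGICAL_BLOCK)
		return 0;	/* start is already out of range */
	room = (uint64_t)DEMO_MAX_LOGICAL_BLOCK - lblk + 1;
	return (uint32_t)(len < room ? len : room);
}
```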
2020-06-03 | add comment for ext4_dir_entry_2 file_type member | Jonathan Grant
Signed-off-by: Jonathan Grant <jg@jguk.org> Reviewed-by: Andreas Dilger <adilger@dilger.ca> Link: https://lore.kernel.org/r/ad3290d5-86af-99c1-f9d5-cd1bab710429@jguk.org Signed-off-by: Theodore Ts'o <tytso@mit.edu>