linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2021-08-23	via-velocity: Use of_device_get_match_data to simplify code	Tang Bin
	Retrieve OF match data, it's better and cleaner to use 'of_device_get_match_data' over 'of_match_device'. Signed-off-by: Tang Bin <tangbin@cmss.chinamobile.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-23	via-rhine: Use of_device_get_match_data to simplify code	Tang Bin
	Retrieve OF match data, it's better and cleaner to use 'of_device_get_match_data' over 'of_match_device'. Signed-off-by: Tang Bin <tangbin@cmss.chinamobile.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-23	Merge branch 'opp/fixes' of ↵	Rafael J. Wysocki
	git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm into pm-opp Pull regression fix for the operating performance points (OPP) framework for v5.15 from Viresh Kumar: "This fixes regression in the OPP core for a corner case." * 'opp/fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm: opp: core: Check for pending links before reading required_opp pointers
2021-08-23	Merge back cpufreq changes for v5.15.	Rafael J. Wysocki

2021-08-23	Merge branch 'asix-fixes'	David S. Miller
	Oleksij Rempel says: ==================== asix fixes changes v2: - rebase against current net - add one more fix for the ax88178 variant ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-23	net: usb: asix: do not call phy_disconnect() for ax88178	Oleksij Rempel
	Fix crash on reboot on a system with ASIX AX88178 USB adapter attached to it: \| asix 1-1.4:1.0 eth0: unregister 'asix' usb-ci_hdrc.0-1.4, ASIX AX88178 USB 2.0 Ethernet \| 8<--- cut here --- \| Unable to handle kernel NULL pointer dereference at virtual address 0000028c \| pgd = 5ec93aee \| [0000028c] pgd=00000000 \| Internal error: Oops: 5 [#1] PREEMPT SMP ARM \| Modules linked in: \| CPU: 1 PID: 1 Comm: systemd-shutdow Not tainted 5.14.0-rc1-20210811-1 #4 \| Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree) \| PC is at phy_disconnect+0x8/0x48 \| LR is at ax88772_unbind+0x14/0x20 \| [<80650d04>] (phy_disconnect) from [<80741aa4>] (ax88772_unbind+0x14/0x20) \| [<80741aa4>] (ax88772_unbind) from [<8074e250>] (usbnet_disconnect+0x48/0xd8) \| [<8074e250>] (usbnet_disconnect) from [<807655e0>] (usb_unbind_interface+0x78/0x25c) \| [<807655e0>] (usb_unbind_interface) from [<805b03a0>] (__device_release_driver+0x154/0x20c) \| [<805b03a0>] (__device_release_driver) from [<805b0478>] (device_release_driver+0x20/0x2c) \| [<805b0478>] (device_release_driver) from [<805af944>] (bus_remove_device+0xcc/0xf8) \| [<805af944>] (bus_remove_device) from [<805ab26c>] (device_del+0x178/0x4b0) \| [<805ab26c>] (device_del) from [<807634a4>] (usb_disable_device+0xcc/0x178) \| [<807634a4>] (usb_disable_device) from [<8075a060>] (usb_disconnect+0xd8/0x238) \| [<8075a060>] (usb_disconnect) from [<8075a02c>] (usb_disconnect+0xa4/0x238) \| [<8075a02c>] (usb_disconnect) from [<8075a02c>] (usb_disconnect+0xa4/0x238) \| [<8075a02c>] (usb_disconnect) from [<80af3520>] (usb_remove_hcd+0xa0/0x198) \| [<80af3520>] (usb_remove_hcd) from [<807902e0>] (host_stop+0x38/0xa8) \| [<807902e0>] (host_stop) from [<8078d9e4>] (ci_hdrc_remove+0x3c/0x118) \| [<8078d9e4>] (ci_hdrc_remove) from [<805b27ec>] (platform_remove+0x20/0x50) \| [<805b27ec>] (platform_remove) from [<805b03a0>] (__device_release_driver+0x154/0x20c) \| [<805b03a0>] (__device_release_driver) from [<805b0478>] (device_release_driver+0x20/0x2c) \| [<805b0478>] (device_release_driver) from [<805af944>] (bus_remove_device+0xcc/0xf8) \| [<805af944>] (bus_remove_device) from [<805ab26c>] (device_del+0x178/0x4b0) For this adapter we call ax88178_bind() and ax88772_unbind(), which is related to different chip version and different counter part bind() function. Since this chip is currently not ported to the PHYLIB, we do not need to call phy_disconnect() here. So, to fix this crash, we need to add ax88178_unbind(). Fixes: e532a096be0e ("net: usb: asix: ax88772: add phylib support") Reported-by: Robin van der Gracht <robin@protonic.nl> Tested-by: Robin van der Gracht <robin@protonic.nl> Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-23	net: usb: asix: ax88772: move embedded PHY detection as early as possible	Oleksij Rempel
	Some HW revisions need additional MAC configuration before the embedded PHY can be enabled. If this is not done, we won't be able to get response from the internal PHY. This issue was detected on chipcode == AX_AX88772_CHIPCODE variant, where ax88772_hw_reset() was executed with missing embd_phy flag. Fixes: e532a096be0e ("net: usb: asix: ax88772: add phylib support") Reported-by: Jarkko Nikula <jarkko.nikula@linux.intel.com> Tested-by: Jarkko Nikula <jarkko.nikula@linux.intel.com> Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-23	ASoC: intel: atom: Revert PCM buffer address setup workaround again	Takashi Iwai
	We worked around the breakage of PCM buffer setup by the commit 65ca89c2b12c ("ASoC: intel: atom: Fix breakage for PCM buffer address setup"), but this isn't necessary since the CONTINUOUS buffer type also sets runtime->dma_addr since commit f84ba106a018 ("ALSA: memalloc: Store snd_dma_buffer.addr for continuous pages, too"). Let's revert the change again. Link: https://lore.kernel.org/r/20210822072127.9786-1-tiwai@suse.de Signed-off-by: Takashi Iwai <tiwai@suse.de>
2021-08-23	Merge branch 'for-linus' into for-next	Takashi Iwai

2021-08-23	ALSA: doc: Fix indentation warning	Takashi Iwai
	Fix a trivial warning for the indentation by putting an empty line: Documentation/sound/alsa-configuration.rst:2258: WARNING: Unexpected indentation. Fixes: a39978ed6df1 ("ALSA: doc: Add the description of quirk_flags option for snd-usb-audio") Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Link: https://lore.kernel.org/r/20210823113518.30134-1-tiwai@suse.de Signed-off-by: Takashi Iwai <tiwai@suse.de>
2021-08-23	udf_get_extendedattr() had no boundary checks.	Stian Skjelstad
	When parsing the ExtendedAttr data, malicous or corrupt attribute length could cause kernel hangs and buffer overruns in some special cases. Link: https://lore.kernel.org/r/20210822093332.25234-1-stian.skjelstad@gmail.com Signed-off-by: Stian Skjelstad <stian.skjelstad@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz>
2021-08-23	m68k: Fix asm register constraints for atomic ops	Geert Uytterhoeven
	Depending on register assignment by the compiler: {standard input}:3084: Error: operands mismatch -- statement `andl %a1,%d1' ignored {standard input}:3145: Error: operands mismatch -- statement `orl %a1,%d1' ignored {standard input}:3195: Error: operands mismatch -- statement `eorl %a1,%d1' ignored Indeed, the first operand must not be an address register. However, it can be an immediate value. Fix this by adjusting the register constraint from "g" (general purpose register) to "di" (data register or immediate). Fixes: e39d88ea3ce4a471 ("locking/atomic, arch/m68k: Implement atomic_fetch_{add,sub,and,or,xor}()") Fixes: d839bae4269aea46 ("locking,arch,m68k: Fold atomic_ops") Fixes: 1da177e4c3f41524 ("Linux-2.6.12-rc2") Reported-by: kernel test robot <lkp@intel.com> Reported-by: Arnd Bergmann <arnd@arndb.de> Reported-by: Alexander Viro <viro@zeniv.linux.org.uk> Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Tested-by: Arnd Bergmann <arnd@arndb.de> Link: https://lore.kernel.org/r/20210809112903.3898660-1-geert@linux-m68k.org
2021-08-23	btrfs: zoned: fix ordered extent boundary calculation	Naohiro Aota
	btrfs_lookup_ordered_extent() is supposed to query the offset in a file instead of the logical address. Pass the file offset from submit_extent_page() to calc_bio_boundaries(). Also, calc_bio_boundaries() relies on the bio's operation flag, so move the call site after setting it. Fixes: 390ed29b817e ("btrfs: refactor submit_extent_page() to make bio and its flag tracing easier") Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2021-08-23	btrfs: do not do preemptive flushing if the majority is global rsv	Josef Bacik
	A common characteristic of the bug report where preemptive flushing was going full tilt was the fact that the vast majority of the free metadata space was used up by the global reserve. The hard 90% threshold would cover the majority of these cases, but to be even smarter we should take into account how much of the outstanding reservations are covered by the global block reserve. If the global block reserve accounts for the vast majority of outstanding reservations, skip preemptive flushing, as it will likely just cause churn and pain. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=212185 Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: David Sterba <dsterba@suse.com>
2021-08-23	btrfs: reduce the preemptive flushing threshold to 90%	Josef Bacik
	The preemptive flushing code was added in order to avoid needing to synchronously wait for ENOSPC flushing to recover space. Once we're almost full however we can essentially flush constantly. We were using 98% as a threshold to determine if we were simply full, however in practice this is a really high bar to hit. For example reports of systems running into this problem had around 94% usage and thus continued to flush. Fix this by lowering the threshold to 90%, which is a more sane value, especially for smaller file systems. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=212185 CC: stable@vger.kernel.org # 5.12+ Fixes: 576fa34830af ("btrfs: improve preemptive background space flushing") Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: David Sterba <dsterba@suse.com>
2021-08-23	btrfs: tree-log: check btrfs_lookup_data_extent return value	Marcos Paulo de Souza
	Function btrfs_lookup_data_extent calls btrfs_search_slot to verify if the EXTENT_ITEM exists in the extent tree. btrfs_search_slot can return values bellow zero if an error happened. Function replay_one_extent currently checks if the search found something (0 returned) and increments the reference, and if not, it seems to evaluate as 'not found'. Fix the condition by checking if the value was bellow zero and return early. Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Marcos Paulo de Souza <mpdesouza@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2021-08-23	btrfs: avoid unnecessarily logging directories that had no changes	Filipe Manana
	There are several cases where when logging an inode we need to log its parent directories or logging subdirectories when logging a directory. There are cases however where we end up logging a directory even if it was not changed in the current transaction, no dentries added or removed since the last transaction. While this is harmless from a functional point of view, it is a waste time as it brings no advantage. One example where this is triggered is the following: $ mkfs.btrfs -f /dev/sdc $ mount /dev/sdc /mnt $ mkdir /mnt/A $ mkdir /mnt/B $ mkdir /mnt/C $ touch /mnt/A/foo $ ln /mnt/A/foo /mnt/B/bar $ ln /mnt/A/foo /mnt/C/baz $ sync $ rm -f /mnt/A/foo $ xfs_io -c "fsync" /mnt/B/bar This last fsync ends up logging directories A, B and C, however we only need to log directory A, as B and C were not changed since the last transaction commit. So fix this by changing need_log_inode(), to return false in case the given inode is a directory and has a ->last_trans value smaller than the current transaction's ID. Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2021-08-23	btrfs: allow idmapped mount	Christian Brauner
	Now that we converted btrfs internally to account for idmapped mounts allow the creation of idmapped mounts on by setting the FS_ALLOW_IDMAP flag. We only need to raise this flag on the btrfs_root_fs_type filesystem since btrfs_mount_root() is ultimately responsible for allocating the superblock and is called into from btrfs_mount() associated with btrfs_fs_type. The conversion of the btrfs inode operations was straightforward. Regarding btrfs specific ioctls that perform checks based on inode permissions only those have been allowed that are not filesystem wide operations and hence can be reasonably charged against a specific mount. Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2021-08-23	btrfs: handle ACLs on idmapped mounts	Christian Brauner
	Make the ACL code idmapped mount aware. The POSIX default and POSIX access ACLs are the only ACLs other than some specific xattrs that take DAC permissions into account. On an idmapped mount they need to be translated according to the mount's userns. The main change is done to __btrfs_set_acl() which is responsible for translating POSIX ACLs to their final on-disk representation. The btrfs_init_acl() helper does not need to take the idmapped mount into account since it is called in the context of file creation operations (mknod, create, mkdir, symlink, tmpfile) and is used for btrfs_init_inode_security() to copy POSIX default and POSIX access permissions from the parent directory. These ACLs need to be inherited unmodified from the parent directory. This is identical to what we do for ext4 and xfs. Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2021-08-23	btrfs: allow idmapped INO_LOOKUP_USER ioctl	Christian Brauner
	The INO_LOOKUP_USER is an unprivileged version of the INO_LOOKUP ioctl and has the following restrictions. The main difference between the two is that INO_LOOKUP is filesystem wide operation wheres INO_LOOKUP_USER is scoped beneath the file descriptor passed with the ioctl. Specifically, INO_LOOKUP_USER must adhere to the following restrictions: - The caller must be privileged over each inode of each path component for the path they are trying to lookup. - The path for the subvolume the caller is trying to lookup must be reachable from the inode associated with the file descriptor passed with the ioctl. The second condition makes it possible to scope the lookup of the path to the mount identified by the file descriptor passed with the ioctl. This allows us to enable this ioctl on idmapped mounts. Specifically, this is possible because all child subvolumes of a parent subvolume are reachable when the parent subvolume is mounted. So if the user had access to open the parent subvolume or has been given the fd then they can lookup the path if they had access to it provided they were privileged over each path component. Note, the INO_LOOKUP_USER ioctl allows a user to learn the path and name of a subvolume even though they would otherwise be restricted from doing so via regular VFS-based lookup. So think about a parent subvolume with multiple child subvolumes. Someone could mount he parent subvolume and restrict access to the child subvolumes by overmounting them with empty directories. At this point the user can't traverse the child subvolumes and they can't open files in the child subvolumes. However, they can still learn the path of child subvolumes as long as they have access to the parent subvolume by using the INO_LOOKUP_USER ioctl. The underlying assumption here is that it's ok that the lookup ioctls can't really take mounts into account other than the original mount the fd belongs to during lookup. Since this assumption is baked into the original INO_LOOKUP_USER ioctl we can extend it to idmapped mounts. Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2021-08-23	btrfs: allow idmapped SUBVOL_SETFLAGS ioctl	Christian Brauner
	Setting flags on subvolumes or snapshots are core features of btrfs. The SUBVOL_SETFLAGS ioctl is especially important as it allows to make subvolumes and snapshots read-only or read-write. Allow setting flags on btrfs subvolumes and snapshots on idmapped mounts. This is a fairly straightforward operation since all the permission checking helpers are already capable of handling idmapped mounts. So we just need to pass down the mount's userns. Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2021-08-23	btrfs: allow idmapped SET_RECEIVED_SUBVOL ioctls	Christian Brauner
	The SET_RECEIVED_SUBVOL ioctls are used to set information about a received subvolume. Make it possible to set information about a received subvolume on idmapped mounts. This is a fairly straightforward operation since all the permission checking helpers are already capable of handling idmapped mounts. So we just need to pass down the mount's userns. Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2021-08-23	btrfs: relax restrictions for SNAP_DESTROY_V2 with subvolids	Christian Brauner
	So far we prevented the deletion of subvolumes and snapshots using subvolume ids possible with the BTRFS_SUBVOL_SPEC_BY_ID flag. This restriction is necessary on idmapped mounts as this allows filesystem wide subvolume and snapshot deletions and thus can escape the scope of what's exposed under the mount identified by the fd passed with the ioctl. Deletion by subvolume id works by looking for an alias of the parent of the subvolume or snapshot to be deleted. The parent alias can be anywhere in the filesystem. However, as long as the alias of the parent that is found is the same as the one identified by the file descriptor passed through the ioctl we can allow the deletion. Reviewed-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2021-08-23	btrfs: allow idmapped SNAP_DESTROY ioctls	Christian Brauner
	Destroying subvolumes and snapshots are important features of btrfs. Both operations are available to unprivileged users if the filesystem has been mounted with the "user_subvol_rm_allowed" mount option. Allow subvolume and snapshot deletion on idmapped mounts. This is a fairly straightforward operation since all the permission checking helpers are already capable of handling idmapped mounts. So we just need to pass down the mount's userns. Subvolumes and snapshots can either be deleted by specifying their name or - if BTRFS_IOC_SNAP_DESTROY_V2 is used - by their subvolume or snapshot id if the BTRFS_SUBVOL_SPEC_BY_ID is set. This feature is blocked on idmapped mounts as this allows filesystem wide subvolume deletions and thus can escape the scope of what's exposed under the mount identified by the fd passed with the ioctl. This means that even the root or CAP_SYS_ADMIN capable user can't delete a subvolume via BTRFS_SUBVOL_SPEC_BY_ID. This is intentional. The root user is currently already subject to permission checks in btrfs_may_delete() including whether the inode's i_uid/i_gid of the directory the subvolume is located in have a mapping in the caller's idmapping. For this to fail isn't currently possible since a btrfs filesystem can't be mounted with a non-initial idmapping but it shows that even the root user would fail to delete a subvolume if the relevant inode isn't mapped in their idmapping. The idmapped mount case is the same in principle. This isn't a huge problem a root user wanting to delete arbitrary subvolumes can just always create another (even detached) mount without an idmapping attached. In addition, we will allow BTRFS_SUBVOL_SPEC_BY_ID for cases where the subvolume to delete is directly located under inode referenced by the fd passed for the ioctl() in a follow-up commit. Here is an example where a btrfs subvolume is deleted through a subvolume mount that does not expose the subvolume to be delete but it can still be deleted by using the subvolume id: /* Compile the following program as "delete_by_spec". / #define _GNU_SOURCE #include <fcntl.h> #include <inttypes.h> #include <linux/btrfs.h> #include <stdio.h> #include <stdlib.h> #include <sys/ioctl.h> #include <sys/stat.h> #include <sys/types.h> #include <unistd.h> static int rm_subvolume_by_id(int fd, uint64_t subvolid) { struct btrfs_ioctl_vol_args_v2 args = {}; int ret; args.flags = BTRFS_SUBVOL_SPEC_BY_ID; args.subvolid = subvolid; ret = ioctl(fd, BTRFS_IOC_SNAP_DESTROY_V2, &args); if (ret < 0) return -1; return 0; } int main(int argc, char argv[]) { int subvolid = 0; if (argc < 3) exit(1); fprintf(stderr, "Opening %s\n", argv[1]); int fd = open(argv[1], O_CLOEXEC \| O_DIRECTORY); if (fd < 0) exit(2); subvolid = atoi(argv[2]); fprintf(stderr, "Deleting subvolume with subvolid %d\n", subvolid); int ret = rm_subvolume_by_id(fd, subvolid); if (ret < 0) exit(3); exit(0); } #include <stdio.h>" #include <stdlib.h>" #include <linux/btrfs.h" truncate -s 10G btrfs.img mkfs.btrfs btrfs.img export LOOPDEV=$(sudo losetup -f --show btrfs.img) mount ${LOOPDEV} /mnt sudo chown $(id -u):$(id -g) /mnt btrfs subvolume create /mnt/A btrfs subvolume create /mnt/B/C # Get subvolume id via: sudo btrfs subvolume show /mnt/A # Save subvolid SUBVOLID=<nr> sudo umount /mnt sudo mount ${LOOPDEV} -o subvol=B/C,user_subvol_rm_allowed /mnt ./delete_by_spec /mnt ${SUBVOLID} With idmapped mounts this can potentially be used by users to delete subvolumes/snapshots they would otherwise not have access to as the idmapping would be applied to an inode that is not exposed in the mount of the subvolume. The fact that this is a filesystem wide operation suggests it might be a good idea to expose this under a separate ioctl that clearly indicates this. In essence, the file descriptor passed with the ioctl is merely used to identify the filesystem on which to operate when BTRFS_SUBVOL_SPEC_BY_ID is used. Reviewed-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2021-08-23	btrfs: allow idmapped SNAP_CREATE/SUBVOL_CREATE ioctls	Christian Brauner
	Creating subvolumes and snapshots is one of the core features of btrfs and is even available to unprivileged users. Make it possible to use subvolume and snapshot creation on idmapped mounts. This is a fairly straightforward operation since all the permission checking helpers are already capable of handling idmapped mounts. So we just need to pass down the mount's userns. Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2021-08-23	btrfs: check whether fsgid/fsuid are mapped during subvolume creation	Christian Brauner
	When a new subvolume is created btrfs currently doesn't check whether the fsgid/fsuid of the caller actually have a mapping in the user namespace attached to the filesystem. The VFS always checks this to make sure that the caller's fsgid/fsuid can be represented on-disk. This is most relevant for filesystems that can be mounted inside user namespaces but it is in general a good hardening measure to prevent unrepresentable gid/uid from being written to disk. Since we want to support idmapped mounts for btrfs ioctls to create subvolumes in follow-up patches this becomes important since we want to make sure the fsgid/fsuid of the caller as mapped according to the idmapped mount can be represented on-disk. Simply add the missing fsuidgid_has_mapping() line from the VFS may_create() version to btrfs_may_create(). Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2021-08-23	btrfs: allow idmapped permission inode op	Christian Brauner
	Enable btrfs_permission() to handle idmapped mounts. This is just a matter of passing down the mount's userns. Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2021-08-23	btrfs: allow idmapped setattr inode op	Christian Brauner
	Enable btrfs_setattr() to handle idmapped mounts. This is just a matter of passing down the mount's userns. Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2021-08-23	btrfs: allow idmapped tmpfile inode op	Christian Brauner
	Enable btrfs_tmpfile() to handle idmapped mounts. This is just a matter of passing down the mount's userns. Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2021-08-23	btrfs: allow idmapped symlink inode op	Christian Brauner
	Enable btrfs_symlink() to handle idmapped mounts. This is just a matter of passing down the mount's userns. Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2021-08-23	btrfs: allow idmapped mkdir inode op	Christian Brauner
	Enable btrfs_mkdir() to handle idmapped mounts. This is just a matter of passing down the mount's userns. Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2021-08-23	btrfs: allow idmapped create inode op	Christian Brauner
	Enable btrfs_create() to handle idmapped mounts. This is just a matter of passing down the mount's userns. Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2021-08-23	btrfs: allow idmapped mknod inode op	Christian Brauner
	Enable btrfs_mknod() to handle idmapped mounts. This is just a matter of passing down the mount's userns. Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2021-08-23	btrfs: allow idmapped getattr inode op	Christian Brauner
	Enable btrfs_getattr() to handle idmapped mounts. This is just a matter of passing down the mount's userns. Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2021-08-23	btrfs: allow idmapped rename inode op	Christian Brauner
	Enable btrfs_rename() to handle idmapped mounts. This is just a matter of passing down the mount's userns. Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2021-08-23	btrfs: handle idmaps in btrfs_new_inode()	Christian Brauner
	Extend btrfs_new_inode() to take the idmapped mount into account when initializing a new inode. This is just a matter of passing down the mount's userns. The rest is taken care of in inode_init_owner(). This is a preliminary patch to make the individual btrfs inode operations idmapped mount aware. Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2021-08-23	namei: add mapping aware lookup helper	Christian Brauner
	Various filesystems rely on the lookup_one_len() helper to lookup a single path component relative to a well-known starting point. Allow such filesystems to support idmapped mounts by adding a version of this helper to take the idmap into account when calling inode_permission(). This change is a required to let btrfs (and other filesystems) support idmapped mounts. Cc: Christoph Hellwig <hch@infradead.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: linux-fsdevel@vger.kernel.org Reviewed-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com> Signed-off-by: David Sterba <dsterba@suse.com>
2021-08-23	btrfs: sysfs: document structures and their associated files	Anand Jain
	Sysfs file has grown big. It takes some time to locate the correct struct attribute to add new files. Create a table and map the struct attribute to its sysfs path. Also, fix the comment about the debug sysfs path. And add the comments to the attributes instead of attribute group, where sysfs file names are defined. Signed-off-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2021-08-23	btrfs: fix NULL pointer dereference when deleting device by invalid id	Qu Wenruo
	[BUG] It's easy to trigger NULL pointer dereference, just by removing a non-existing device id: # mkfs.btrfs -f -m single -d single /dev/test/scratch1 \ /dev/test/scratch2 # mount /dev/test/scratch1 /mnt/btrfs # btrfs device remove 3 /mnt/btrfs Then we have the following kernel NULL pointer dereference: BUG: kernel NULL pointer dereference, address: 0000000000000000 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 0 P4D 0 Oops: 0000 [#1] PREEMPT SMP NOPTI CPU: 9 PID: 649 Comm: btrfs Not tainted 5.14.0-rc3-custom+ #35 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015 RIP: 0010:btrfs_rm_device+0x4de/0x6b0 [btrfs] btrfs_ioctl+0x18bb/0x3190 [btrfs] ? lock_is_held_type+0xa5/0x120 ? find_held_lock.constprop.0+0x2b/0x80 ? do_user_addr_fault+0x201/0x6a0 ? lock_release+0xd2/0x2d0 ? __x64_sys_ioctl+0x83/0xb0 __x64_sys_ioctl+0x83/0xb0 do_syscall_64+0x3b/0x90 entry_SYSCALL_64_after_hwframe+0x44/0xae [CAUSE] Commit a27a94c2b0c7 ("btrfs: Make btrfs_find_device_by_devspec return btrfs_device directly") moves the "missing" device path check into btrfs_rm_device(). But btrfs_rm_device() itself can have case where it only receives @devid, with NULL as @device_path. In that case, calling strcmp() on NULL will trigger the NULL pointer dereference. Before that commit, we handle the "missing" case inside btrfs_find_device_by_devspec(), which will not check @device_path at all if @devid is provided, thus no way to trigger the bug. [FIX] Before calling strcmp(), also make sure @device_path is not NULL. Fixes: a27a94c2b0c7 ("btrfs: Make btrfs_find_device_by_devspec return btrfs_device directly") CC: stable@vger.kernel.org # 5.4+ Reported-by: butt3rflyh4ck <butterflyhuangxx@gmail.com> Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2021-08-23	btrfs: zoned: add asserts on splitting extent_map	Naohiro Aota
	We call split_zoned_em() on an extent_map on submitting a bio for it. Thus, we can assume the extent_map is PINNED, not LOGGING, and in the modified list. Add ASSERT()s to ensure the extent_maps after the split also has the proper flags set and are in the modified list. Suggested-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com> Signed-off-by: David Sterba <dsterba@suse.com>
2021-08-23	btrfs: zoned: fix block group alloc_offset calculation	Naohiro Aota
	alloc_offset is offset from the start of a block group and @offset is actually an address in logical space. Thus, we need to consider block_group->start when calculating them. Fixes: 011b41bffa3d ("btrfs: zoned: advance allocation pointer after tree log node") CC: stable@vger.kernel.org # 5.12+ Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com> Signed-off-by: David Sterba <dsterba@suse.com>
2021-08-23	btrfs: zoned: suppress reclaim error message on EAGAIN	Naohiro Aota
	btrfs_relocate_chunk() can fail with -EAGAIN when e.g. send operations are running. The message can fail btrfs/187 and it's unnecessary because we anyway add it back to the reclaim list. btrfs_reclaim_bgs_work() `-> btrfs_relocate_chunk() `-> btrfs_relocate_block_group() `-> reloc_chunk_start() `-> if (fs_info->send_in_progress) `-> return -EAGAIN CC: stable@vger.kernel.org # 5.13+ Fixes: 18bb8bbf13c1 ("btrfs: zoned: automatically reclaim zones") Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2021-08-23	btrfs: zoned: allow disabling of zone auto reclaim	Johannes Thumshirn
	Automatically reclaiming dirty zones might not always be desired for all workloads, especially as there are currently still some rough edges with the relocation code on zoned filesystems. Allow disabling zone auto reclaim on a per filesystem basis by writing 0 as the threshold value. Reviewed-by: Naohiro Aota <naohiro.aota@wdc.com> Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2021-08-23	btrfs: update comment at log_conflicting_inodes()	Filipe Manana
	A comment at log_conflicting_inodes() mentions that we check the inode's logged_trans field instead of using btrfs_inode_in_log() because the field last_log_commit is not updated when we log that an inode exists and the inode has the full sync flag (BTRFS_INODE_NEEDS_FULL_SYNC) set. The part about the full sync flag is not true anymore since commit 9acc8103ab594f ("btrfs: fix unpersisted i_size on fsync after expanding truncate"), so update the comment to not mention that part anymore. Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2021-08-23	btrfs: remove no longer needed full sync flag check at inode_logged()	Filipe Manana
	Now that we are checking if the inode's logged_trans is 0 to detect the possibility of the inode having been evicted and reloaded, the test for the full sync flag (BTRFS_INODE_NEEDS_FULL_SYNC) is no longer needed at tree-log.c:inode_logged(). Its purpose was to detect the possibility of a previous eviction as well, since when an inode is loaded the full sync flag is always set on it (and only cleared after the inode is logged). So just remove the check and update the comment. The check for the inode's logged_trans being 0 was added recently by the patch with the subject "btrfs: eliminate some false positives when checking if inode was logged". Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2021-08-23	btrfs: remove unnecessary NULL check for the new inode during rename exchange	Filipe Manana
	At the very end of btrfs_rename_exchange(), in case an error happened, we are checking if 'new_inode' is NULL, but that is not needed since during a rename exchange, unlike regular renames, 'new_inode' can never be NULL, and if it were, we would have a crashed much earlier when we dereference it multiple times. So remove the check because it is not necessary and because it is causing static checkers to emit a warning. I probably introduced the check by copy-pasting similar code from btrfs_rename(), where 'new_inode' can be NULL, in commit 86e8aa0e772cab ("Btrfs: unpin logs if rename exchange operation fails"). Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2021-08-23	btrfs: allocate backref_ctx on stack in find_extent_clone	Goldwyn Rodrigues
	Instead of using kmalloc() to allocate backref_ctx, allocate backref_ctx on stack. The size is reasonably small. sizeof(backref_ctx) = 48 Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2021-08-23	btrfs: allocate btrfs_ioctl_defrag_range_args on stack	Goldwyn Rodrigues
	Instead of using kmalloc() to allocate btrfs_ioctl_defrag_range_args, allocate btrfs_ioctl_defrag_range_args on stack, the size is reasonably small and ioctls are called in process context. sizeof(btrfs_ioctl_defrag_range_args) = 48 Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2021-08-23	btrfs: allocate btrfs_ioctl_quota_rescan_args on stack	Goldwyn Rodrigues
	Instead of using kmalloc() to allocate btrfs_ioctl_quota_rescan_args, allocate btrfs_ioctl_quota_rescan_args on stack, the size is reasonably small and ioctls are called in process context. sizeof(btrfs_ioctl_quota_rescan_args) = 64 Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2021-08-23	btrfs: allocate file_ra_state on stack in readahead_cache	Goldwyn Rodrigues
	Instead of allocating file_ra_state using kmalloc, allocate on stack. sizeof(struct readahead) = 32 bytes. Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>