linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2018-01-22	be2net: restore properly promisc mode after queues reconfiguration	Ivan Vecera
	The commit 622190669403 ("be2net: Request RSS capability of Rx interface depending on number of Rx rings") modified be_update_queues() so the IFACE (HW representation of the netdevice) is destroyed and then re-created. This causes a regression because potential promiscuous mode is not restored properly during be_open() because the driver thinks that the HW has promiscuous mode already enabled. Note that Lancer is not affected by this bug because RX-filter flags are disabled during be_close() for this chipset. Cc: Sathya Perla <sathya.perla@broadcom.com> Cc: Ajit Khaparde <ajit.khaparde@broadcom.com> Cc: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com> Cc: Somnath Kotur <somnath.kotur@broadcom.com> Fixes: 622190669403 ("be2net: Request RSS capability of Rx interface depending on number of Rx rings") Signed-off-by: Ivan Vecera <ivecera@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-22	net: igmp: fix source address check for IGMPv3 reports	Felix Fietkau
	Commit "net: igmp: Use correct source address on IGMPv3 reports" introduced a check to validate the source address of locally generated IGMPv3 packets. Instead of checking the local interface address directly, it uses inet_ifa_match(fl4->saddr, ifa), which checks if the address is on the local subnet (or equal to the point-to-point address if used). This breaks for point-to-point interfaces, so check against ifa->ifa_local directly. Cc: Kevin Cernekee <cernekee@chromium.org> Fixes: a46182b00290 ("net: igmp: Use correct source address on IGMPv3 reports") Reported-by: Sebastian Gottschall <s.gottschall@dd-wrt.com> Signed-off-by: Felix Fietkau <nbd@nbd.name> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-22	gso: validate gso_type in GSO handlers	Willem de Bruijn
	Validate gso_type during segmentation as SKB_GSO_DODGY sources may pass packets where the gso_type does not match the contents. Syzkaller was able to enter the SCTP gso handler with a packet of gso_type SKB_GSO_TCPV4. On entry of transport layer gso handlers, verify that the gso_type matches the transport protocol. Fixes: 90017accff61 ("sctp: Add GSO support") Link: http://lkml.kernel.org/r/<001a1137452496ffc305617e5fe0@google.com> Reported-by: syzbot+fee64147a25aecd48055@syzkaller.appspotmail.com Signed-off-by: Willem de Bruijn <willemb@google.com> Acked-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-22	net: qdisc_pkt_len_init() should be more robust	Eric Dumazet
	Without proper validation of DODGY packets, we might very well feed qdisc_pkt_len_init() with invalid GSO packets. tcp_hdrlen() might access out-of-bound data, so let's use skb_header_pointer() and proper checks. Whole story is described in commit d0c081b49137 ("flow_dissector: properly cap thoff field") We have the goal of validating DODGY packets earlier in the stack, so we might very well revert this fix in the future. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Willem de Bruijn <willemb@google.com> Cc: Jason Wang <jasowang@redhat.com> Reported-by: syzbot+9da69ebac7dddd804552@syzkaller.appspotmail.com Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-22	Merge branch 'ibmvnic-reset-behavior-fixes'	David S. Miller
	John Allen says: ==================== ibmvnic: Reset behavior fixes This patchset fixes a number of issues related to ibmvnic reset uncovered from testing new Power9 machines with Everglades adapters and the new functionality to change mtu and other parameters in the driver. Changes since v1: -In patch 1/3, added the line to free the long term buffers before allocating a new one. This change inadvertently uncovered the problem that the number of queues can change after a failover as well. To fix this, we check whether or not the number of queues has changed in do_reset and if they have, we do a full release and init of the queues. -In patch 1/3, added variables to the adapter struct to track how many rx/tx pools have actually been allocated and modify the release pools routines to use these values rather than the possibly incorrect req_rx/tx_queues values. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-22	ibmvnic: Allocate and request vpd in init_resources	John Allen
	In reset events in which our memory allocations need to be reallocated, VPD data is being freed, but never reallocated. This can cause issues if we later attempt to access that memory or reset and attempt to free the memory. This patch moves the allocation of the VPD data to init_resources so that it will be symmetrically freed during release resources. Signed-off-by: John Allen <jallen@linux.vnet.ibm.com> Reviewed-by: Nathan Fontenot <nfont@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-22	ibmvnic: Revert to previous mtu when unsupported value requested	John Allen
	If we request an unsupported mtu value, the vnic server will suggest a different value. Currently we take the suggested value without question and login with that value. However, the behavior doesn't seem completely sane as attempting to change the mtu to some specific value will change the mtu to some completely different value most of the time. This patch fixes the issue by logging in with the previously used mtu value and printing an error message saying that the given mtu is unsupported. Signed-off-by: John Allen <jallen@linux.vnet.ibm.com> Reviewed-by: Nathan Fontenot <nfont@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-22	ibmvnic: Modify buffer size and number of queues on failover	John Allen
	Using newer backing devices can cause the required padding at the end of buffer as well as the number of queues to change after a failover. Since we currently assume that these values never change, after a failover to a backing device with different capabilities, we can get errors from the vnic server, attempt to free long term buffers that are no longer there, or not free long term buffers that should be freed. This patch resolves the issue by checking whether any of these values change, and if so perform the necessary re-allocations. Signed-off-by: John Allen <jallen@linux.vnet.ibm.com> Reviewed-by: Nathan Fontenot <nfont@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-22	rds: tcp: compute m_ack_seq as offset from ->write_seq	Sowmini Varadhan
	rds-tcp uses m_ack_seq to track the tcp ack# that indicates that the peer has received a rds_message. The m_ack_seq is used in rds_tcp_is_acked() to figure out when it is safe to drop the rds_message from the RDS retransmit queue. The m_ack_seq must be calculated as an offset from the right edge of the in-flight tcp buffer, i.e., it should be based on the ->write_seq, not the ->snd_nxt. Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-22	usbnet: silence an unnecessary warning	Oliver Neukum
	That a kevent could not be scheduled is not an error. Such handlers must be able to deal with multiple events anyway. As the successful scheduling of a work is a debug event, make the failure debug priority, too. V2: coding style Signed-off-by: Oliver Neukum <oneukum@suse.com> Reported-by: Cristian Caravena <caravena@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-22	Merge branch 'cxgb4-tc-flower-offload-fixes'	David S. Miller
	Daniel Borkmann says: ==================== pull-request: bpf 2018-01-18 The following pull-request contains BPF updates for your net tree. The main changes are: 1) Fix a divide by zero due to wrong if (src_reg == 0) check in 64-bit mode. Properly handle this in interpreter and mask it also generically in verifier to guard against similar checks in JITs, from Eric and Alexei. 2) Fix a bug in arm64 JIT when tail calls are involved and progs have different stack sizes, from Daniel. 3) Reject stores into BPF context that are not expected BPF_STX \| BPF_MEM variant, from Daniel. 4) Mark dst reg as unknown on {s,u}bounds adjustments when the src reg has derived bounds from dead branches, from Daniel. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-22	cxgb4: fix endianness for vlan value in cxgb4_tc_flower	Kumar Sanghvi
	Don't change endianness when assigning vlan value in cxgb4_tc_flower code when processing flow match parameters. The value gets converted to network order as part of filtering code in set_filter_wr. Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com> Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com> Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-22	cxgb4: set filter type to 1 for ETH_P_IPV6	Kumar Sanghvi
	For ethtype_key = ETH_P_IPV6, set filter type as 1 in cxgb4_tc_flower code when processing flow match parameters. Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com> Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com> Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-22	mm, page_vma_mapped: Introduce pfn_in_hpage()	Kirill A. Shutemov
	The new helper would check if the pfn belongs to the page. For huge pages it checks if the PFN is within range covered by the huge page. The helper is used in check_pte(). The original code the helper replaces had two call to page_to_pfn(). page_to_pfn() is relatively costly. Although current GCC is able to optimize code to have one call, it's better to do this explicitly. Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-01-22	Input: xpad - add support for PDP Xbox One controllers	Mark Furneaux
	Adds support for the current lineup of Xbox One controllers from PDP (Performance Designed Products). These controllers are very picky with their initialization sequence and require an additional 2 packets before they send any input reports. Signed-off-by: Mark Furneaux <mark@furneaux.ca> Reviewed-by: Cameron Gutman <aicommander@gmail.com> Cc: stable@vger.kernel.org Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2018-01-22	Input: stmfts,s6sy671 - add SPDX identifier	Andi Shyti
	Replace the original license statement with the SPDX identifier. Update also the copyright owner adding myself as co-owner of the copyright. Signed-off-by: Andi Shyti <andi.shyti@samsung.com> Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2018-01-22	btrfs: set the total_devices in device_list_add()	Anand Jain
	There is no other parent for device_list_add() except for btrfs_scan_one_device(), which would set btrfs_fs_devices::total_devices if device_list_add is successful and this can be done with in device_list_add() itself. Signed-off-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: Josef Bacik <jbacik@fb.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-01-22	btrfs: move pr_info into device_list_add	Anand Jain
	Commit 60999ca4b403 ("btrfs: make device scan less noisy") adds return value 1 to device_list_add(), so that parent function can call pr_info only when new device is added. Move the pr_info() part into device_list_add() so that this function can be kept simple. Signed-off-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: Josef Bacik <jbacik@fb.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-01-22	btrfs: make btrfs_free_stale_devices() to match the path	Anand Jain
	The btrfs_free_stale_devices() is updated to match for the given device path and delete it. (It searches for only unmounted list of devices.) Also drop the comment about different path being used for the same device, since now we will have cli to clean any device that's not a concern any more. Signed-off-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: Josef Bacik <jbacik@fb.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-01-22	btrfs: rename btrfs_free_stale_devices() arg to skip_dev	Anand Jain
	No functional changes. Rename btrfs_free_stale_devices() arg to skip_dev, so that it reflects what that arg for. Signed-off-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: Josef Bacik <jbacik@fb.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-01-22	btrfs: make btrfs_free_stale_devices() argument optional	Anand Jain
	This updates btrfs_free_stale_devices() helper function to delete all unmouted devices, when arg is NULL. Signed-off-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: Josef Bacik <jbacik@fb.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-01-22	btrfs: make btrfs_free_stale_device() to iterate all stales	Anand Jain
	Let the list iterator iterate further and find other stale devices and delete it. This is in preparation to add support for user land request-able stale devices cleanup. Also rename btrfs_free_stale_device() to btrfs_free_stale_devices(). Signed-off-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: Josef Bacik <jbacik@fb.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-01-22	btrfs: no need to check for btrfs_fs_devices::seeding	Anand Jain
	There is no need to check for btrfs_fs_devices::seeding when we have checked for btrfs_fs_devices::opened, because we can't sprout without its seed FS being opened. Signed-off-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: Josef Bacik <jbacik@fb.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-01-22	RDMA/cma: Update RoCE multicast routines to use net namespace	Parav Pandit
	rdma_dev_addr contains the net namespace pointer, while referring bound_dev_if of the rdma_dev_addr, refer to the net namespace of rdma_cm_id stored in rdma_dev_addr. Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-01-22	RDMA/cma: Update cma_validate_port to honor net namespace	Parav Pandit
	cma_validate_port uses rdma_dev_addr to validate the port of the cm_id. It needs to honor the net namespace which is setup during cm_id creation when finding netdevice. Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-01-22	RDMA/cma: Refactor to access multiple fields of rdma_dev_addr	Parav Pandit
	Pass the rdma_cm_id so that multiple fields of the rdma_dev_addr structure can be accessed, instead of passing each individual fields. This is needed to access some additional fields in followup patches. Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-01-22	RDMA/cma: Check existence of netdevice during port validation	Parav Pandit
	If valid netdevice is not found for RoCE, GID table should not be searched with NULL netdevice. Doing so causes the search routines to ignore the netdev argument and may match the wrong GID table entry if the netdev is deleted. Fixes: abae1b71dd37 ("IB/cma: cma_validate_port should verify the port and netdevice") Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-01-22	btrfs: Use IS_ALIGNED in btrfs_truncate_block instead of opencoding it	Nikolay Borisov
	No functional changes, just makes the code more readable Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-01-22	Btrfs: noinline merge_extent_mapping	Liu Bo
	In order to debug subtle bugs around merge_extent_mapping(), perf probe can be used to check the arguments, but sometimes merge_extent_mapping() got inlined by compiler and couldn't be probed. This is adding noinline attribute to merge_extent_mapping(). Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Reviewed-by: Josef Bacik <jbacik@fb.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-01-22	Btrfs: add WARN_ONCE to detect unexpected error from merge_extent_mapping	Liu Bo
	This is a subtle case, so in order to understand the problem, it'd be good to know the content of existing and em when any error occurs. Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Reviewed-by: Josef Bacik <jbacik@fb.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-01-22	Btrfs: extent map selftest: dio write vs dio read	Liu Bo
	This test case simulates the racy situation of dio write vs dio read, and see if btrfs_get_extent() would return -EEXIST. Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Reviewed-by: Josef Bacik <jbacik@fb.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-01-22	Btrfs: extent map selftest: buffered write vs dio read	Liu Bo
	This test case simulates the racy situation of buffered write vs dio read, and see if btrfs_get_extent() would return -EEXIST. Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Reviewed-by: Josef Bacik <jbacik@fb.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-01-22	Btrfs: add extent map selftests	Liu Bo
	We've observed that btrfs_get_extent() and merge_extent_mapping() could return -EEXIST in several cases, and they are caused by some racy condition, e.g dio read vs dio write, which makes the problem very tricky to reproduce. This adds extent map selftests in order to simulate those racy situations. Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Reviewed-by: Josef Bacik <jbacik@fb.com> [ minor string adjustments ] Signed-off-by: David Sterba <dsterba@suse.com>
2018-01-22	Btrfs: move extent map specific code to extent_map.c	Liu Bo
	These helpers are extent map specific, move them to extent_map.c. Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Reviewed-by: Josef Bacik <jbacik@fb.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-01-22	Btrfs: add helper for em merge logic	Liu Bo
	This is a prepare work for the following extent map selftest, which runs tests against em merge logic. Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Reviewed-by: Josef Bacik <jbacik@fb.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-01-22	Btrfs: fix unexpected EEXIST from btrfs_get_extent	Liu Bo
	This fixes a corner case that is caused by a race of dio write vs dio read/write. Here is how the race could happen. Suppose that no extent map has been loaded into memory yet. There is a file extent [0, 32K), two jobs are running concurrently against it, t1 is doing dio write to [8K, 32K) and t2 is doing dio read from [0, 4K) or [4K, 8K). t1 goes ahead of t2 and splits em [0, 32K) to em [0K, 8K) and [8K 32K). ------------------------------------------------------ t1 t2 btrfs_get_blocks_direct() btrfs_get_blocks_direct() -> btrfs_get_extent() -> btrfs_get_extent() -> lookup_extent_mapping() -> add_extent_mapping() -> lookup_extent_mapping() # load [0, 32K) -> btrfs_new_extent_direct() -> btrfs_drop_extent_cache() # split [0, 32K) and # drop [8K, 32K) -> add_extent_mapping() # add [8K, 32K) -> add_extent_mapping() # handle -EEXIST when adding # [0, 32K) ------------------------------------------------------ About how t2(dio read/write) runs into -EEXIST: a) add_extent_mapping() gets -EEXIST for adding em [0, 32k), b) search_extent_mapping() then returns [0, 8k) as the existing em, even though start == existing->start, em is [0, 32k) so that extent_map_end(em) > extent_map_end(existing), i.e. 32k > 8k, c) then it goes thru merge_extent_mapping() which tries to add a [8k, 8k) (with a length 0) and returns -EEXIST as [8k, 32k) is already in tree, d) so btrfs_get_extent() ends up returning -EEXIST to dio read/write, which is confusing applications. Here I conclude all the possible situations, 1) start < existing->start +-----------+em+-----------+ +--prev---+ \| +-------------+ \| \| \| \| \| \| \| +---------+ + +---+existing++ ++ + \| + start 2) start == existing->start +------------em------------+ \| +-------------+ \| \| \| \| \| + +----existing-+ + \| \| + start 3) start > existing->start && start < (existing->start + existing->len) +------------em------------+ \| +-------------+ \| \| \| \| \| + +----existing-+ + \| \| + start 4) start >= (existing->start + existing->len) +-----------+em+-----------+ \| +-------------+ \| +--next---+ \| \| \| \| \| \| + +---+existing++ + +---------+ + \| + start As we can see, it turns out that if start is within existing em (front inclusive), then the existing em should be returned as is, otherwise, we try our best to merge candidate em with sibling ems to form a larger em (in order to reduce the total number of em). Reported-by: David Vallender <david.vallender@landmark.co.uk> Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Reviewed-by: Josef Bacik <jbacik@fb.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-01-22	Btrfs: fix incorrect block_len in merge_extent_mapping	Liu Bo
	%block_len could be checked on deciding if two em are mergeable. merge_extent_mapping() has only added the front pad if the front part of em gets truncated, but it's possible that the end part gets truncated. For both compressed extent and inline extent, em->block_len is not adjusted accordingly, and for regular extent, em->block_len always equals to em->len, hence this sets em->block_len with em->len. Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Reviewed-by: Josef Bacik <jbacik@fb.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-01-22	btrfs: Remove unused readahead spinlock	Matthew Wilcox
	The reada_lock in struct btrfs_device was only initialised, and not actually used. That's good because there's another lock also called reada_lock in the btrfs_fs_info that was quite heavily used. Remove this one. Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-01-22	Btrfs: raid56: fix race between merge_bio and rbio_orig_end_io	Liu Bo
	Before rbio_orig_end_io() goes to free rbio, rbio may get merged with more bios from other rbios and rbio->bio_list becomes non-empty, in that case, these newly merged bios don't end properly. Once unlock_stripe() is done, rbio->bio_list will not be updated any more and we can call bio_endio() on all queued bios. It should only happen in error-out cases, the normal path of recover and full stripe write have already set RBIO_RMW_LOCKED_BIT to disable merge before doing IO, so rbio_orig_end_io() called by them doesn't have the above issue. Reported-by: Jérôme Carretero <cJ-ko@zougloub.eu> Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-01-22	Btrfs: do not cache rbio pages if using raid6 recover	Liu Bo
	Since raid6 recover tries all possible combinations of failed stripes, - when raid6 rebuild algorithm is used, i.e. raid6_datap_recov() and raid6_2data_recov(), it may change the in-memory content of failed stripes, if such a raid bio is cached, a later raid write rmw or recover can steal @stripe_pages from it instead of reading from disks, such that it carries the wrong content to do write rmw or recovery and ends up with corruption or recovery failures. - when raid5 rebuild algorithm is used, i.e. xor, raid bio can be cached because the only failed stripe which contains @rbio->bio_pages gets modified, others remain the same so that their in-memory content is consistent with their on-disk content. This adds a check to skip caching rbio if using raid6 recover. Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-01-22	Btrfs: raid56: iterate raid56 internal bio with bio_for_each_segment_all	Liu Bo
	Bio iterated by set_bio_pages_uptodate() is raid56 internal one, so it will never be a BIO_CLONED bio, and since this is called by end_io functions, bio->bi_iter.bi_size is zero, we mustn't use bio_for_each_segment() as that is a no-op if bi_size is zero. Fixes: 6592e58c6b68e61f003a01ba29a3716e7e2e9484 ("Btrfs: fix write corruption due to bio cloning on raid5/6") Cc: <stable@vger.kernel.org> # v4.12-rc6+ Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-01-22	btrfs: correct wrong comment about magic number of index_cnt	Su Yue
	There is no function named btrfs_get_inode_index_count. Explanation for magic number index_cnt=2 in btrfs_new_inode() is actually located in btrfs_set_inode_index_count(). So replace 'btrfs_get_inode_index_count' in the comment by 'btrfs_set_inode_index_count'. Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-01-22	btrfs: Make btrfs_inode_rsv_release static	Nikolay Borisov
	It's not used outside of extent-tree so there is no reason to not be static. Signed-off-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-01-22	btrfs: cleanup btrfs_free_stale_device() usage	Anand Jain
	We call btrfs_free_stale_device() only when we alloc a new struct btrfs_device (ret=1), so move it closer to where we alloc the new device. Also drop the comments. Signed-off-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: Josef Bacik <jbacik@fb.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-01-22	btrfs: tree-check: reduce stack consumption in check_dir_item	David Sterba
	I've noticed that the updated item checker stack consumption increased dramatically in 542f5385e20cf97447 ("btrfs: tree-checker: Add checker for dir item") tree-checker.c:check_leaf +552 (176 -> 728) The array is 255 bytes long, dynamic allocation would slow down the sanity checks so it's more reasonable to keep it on-stack. Moving the variable to the scope of use reduces the stack usage again tree-checker.c:check_leaf -264 (728 -> 464) Reviewed-by: Josef Bacik <jbacik@fb.com> Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-01-22	btrfs: use correct string length in DEV_INFO ioctl	Xiongfeng Wang
	gcc-8 reports: fs/btrfs/ioctl.c: In function 'btrfs_ioctl': ./include/linux/string.h:245:9: warning: '__builtin_strncpy' specified bound 1024 equals destination size [-Wstringop-truncation] We need one less byte or call strlcpy() to make it a nul-terminated string. This is done on the next line anyway, but we want to avoid the warning. Signed-off-by: Xiongfeng Wang <xiongfeng.wang@linaro.org> Reviewed-by: David Sterba <dsterba@suse.com> [ update changelog ] Signed-off-by: David Sterba <dsterba@suse.com>
2018-01-22	btrfs: fail mount when sb flag is not in BTRFS_SUPER_FLAG_SUPP	Anand Jain
	It appears from the original commit [1] that there isn't any design specific reason not to fail the mount instead of just warning. This patch will change it to fail. [1] commit 319e4d0661e5323c9f9945f0f8fb5905e5fe74c3 btrfs: Enhance super validation check Fixes: 319e4d0661e5323 ("btrfs: Enhance super validation check") Signed-off-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-01-22	btrfs: add support for SUPER_FLAG_CHANGING_FSID	Anand Jain
	The UUID change by btrfstune sets SUPER_FLAG_CHANGING_FSID and resets it only when changing fsid is complete. Its not a good idea to mount the device anything in between, reading metadata blocks would fail with UUID mismatch. This patch doesn't add SUPER_FLAG_CHANGING_FSID into BTRFS_SUPER_FLAG_SUPP list, so mount will fail (along with the fix in the next patch) when SUPER_FLAG_CHANGING_FSID is set. Signed-off-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> [ update changelog ] Signed-off-by: David Sterba <dsterba@suse.com>
2018-01-22	btrfs: define SUPER_FLAG_METADUMP_V2	Anand Jain
	btrfs-progs uses super flag bit BTRFS_SUPER_FLAG_METADUMP_V2 (1ULL << 34). So just define that in kernel so that we know its been used. Signed-off-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-01-22	Btrfs: avoid losing data raid profile when deleting a device	Liu Bo
	We've avoided data losing raid profile when doing balance, but it turns out that deleting a device could also result in the same problem. Say we have 3 disks, and they're created with '-d raid1' profile. - We have chunk P (the only data chunk on the empty btrfs). - Suppose that chunk P's two raid1 copies reside in disk A and disk B. - Now, 'btrfs device remove disk B' btrfs_rm_device() -> btrfs_shrink_device() -> btrfs_relocate_chunk() #relocate any chunk on disk B to other places. - Chunk P will be removed and a new chunk will be created to hold those data, but as chunk P is the only one holding raid1 profile, after it goes away, the new chunk will be created as single profile which is our default profile. This fixes the problem by creating an empty data chunk before relocating the data chunk. Metadata/System chunk are supposed to have non-zero bytes all the time so their raid profile is preserved. Reported-by: James Alandt <James.Alandt@wdc.com> Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>