2019-02-27  scsi: csiostor: drop serial_number usage  (Hannes Reinecke)
Use request tag instead of the serial number when printing out logging messages. Signed-off-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-27  scsi: mvumi: use request tag instead of serial_number  (Hannes Reinecke)
Use the request tag for logging instead of the scsi command serial number. Signed-off-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-27  scsi: dpt_i2o: remove serial number usage  (Hannes Reinecke)
Drop references to scsi_cmnd->serial_number. Signed-off-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-27  scsi: st: osst: Remove negative constant left-shifts  (Iustin Pop)
Negative constant left-shift is undefined behaviour in the C standard, and as such newer versions of clang (at least) warn against it. GCC has supported it for a long time, but it would be better to remove it and rely on defined behaviour. My understanding is that "~(-1 << N)" in 2's complement is intended to generate a bit pattern of zeroes ending with N '1' bits. The same can be achieved by "(1 << N) - 1" in a well-defined way, so switch to it to remove the warning. Tested: building a kernel with generic SCSI tape, and checking basic operations (mt status, mt eject) on a real LTO unit. Cannot test the osst driver. Signed-off-by: Iustin Pop <iustin@k1024.org> Reviewed-by: Lee Duncan <lduncan@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
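For illustration, the equivalence relied on above (the macro names here are made up; only the two expressions come from the patch description):

    /* Old form: left-shifting a negative constant is undefined behaviour. */
    #define LOW_BITS_UB(n)   (~(-1 << (n)))

    /* New form: well-defined, sets the n low-order bits. */
    #define LOW_BITS_OK(n)   ((1 << (n)) - 1)

    /* Example: for n = 3 both evaluate to 0x7 on 2's-complement machines,
     * but only the second form is defined by the C standard. */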
2019-02-27  mmc: cqhci: Fix a tiny potential memory leak on error condition  (Alamy Liu)
Free up the allocated memory in the case of an error return. In that case the value of mmc_host->cqe_enabled stays 'false', so cqhci_disable (mmc_cqe_ops->cqe_disable) won't be called to free the memory. Also, cqhci_disable() seems to be designed to disable and free all resources, and is not suitable for handling this corner case. Fixes: a4080225f51d ("mmc: cqhci: support for command queue enabled host") Signed-off-by: Alamy Liu <alamy.liu@gmail.com> Acked-by: Adrian Hunter <adrian.hunter@intel.com> Cc: stable@vger.kernel.org Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2019-02-27  mmc: cqhci: fix space allocated for transfer descriptor  (Alamy Liu)
There is not enough space being allocated when DCMD is disabled. CQE_DCMD does not have to be enabled when CQE is enabled (software could halt CQE to send a command). In the case that CQE_DCMD is not enabled, space for data transfer still needs to be allocated for every slot. For instance:

    CQE_DCMD is enabled:  31 slots of space (one slot used by DCMD)
    CQE_DCMD is disabled: 32 slots of space

Fixes: a4080225f51d ("mmc: cqhci: support for command queue enabled host") Signed-off-by: Alamy Liu <alamy.liu@gmail.com> Acked-by: Adrian Hunter <adrian.hunter@intel.com> Cc: stable@vger.kernel.org Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
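Schematically, the sizing rule described above looks like the sketch below (variable names are illustrative assumptions, not the actual cqhci.c code): the number of slots needing transfer-descriptor space depends on whether DCMD occupies a slot.

    /* Hypothetical sketch, not the actual cqhci code. */
    unsigned int num_slots = 32;
    unsigned int data_slots = dcmd_enabled ? num_slots - 1 : num_slots;

    /* Transfer-descriptor space must cover every slot that can carry data:
     * 31 slots with DCMD enabled, all 32 slots with DCMD disabled. */
    size_t tdl_size = trans_desc_len * data_slots;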
2019-02-27  scsi: ufs-bsg: Allow reading descriptors  (Avri Altman)
Add this functionality, placing the descriptor being read in the actual data buffer in the bio. That is, for both read and write descriptor query upiu, we are using the job's request_payload. This, in turn, is mapped back in user land to the applicable sg_io_v4 xferp: dout_xferp for a write descriptor, and din_xferp for a read descriptor. Signed-off-by: Avri Altman <avri.altman@wdc.com> Reviewed-by: Evan Green <evgreen@chromium.org> Reviewed-by: Bean Huo <beanhuo@micron.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-27  scsi: ufs: Allow reading descriptor via raw upiu  (Avri Altman)
Allow reading descriptors via raw upiu. This was in fact forbidden only as a precaution, since ufs-bsg enforces which functionality is supported. Signed-off-by: Avri Altman <avri.altman@wdc.com> Reviewed-by: Evan Green <evgreen@chromium.org> Reviewed-by: Bean Huo <beanhuo@micron.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-27  scsi: ufs-bsg: Change the calling convention for write descriptor  (Avri Altman)
When we had a write descriptor query upiu, we appended the descriptor right after the bsg request. This was fine, as the bsg driver allows allocating whatever buffer we need in its job request. Still, the proper way to deliver a payload, however small (we only write config descriptors of 144 bytes), is via the job request's payload data buffer. So change this ABI now, while ufs-bsg is still new and nobody is actually using it. Signed-off-by: Avri Altman <avri.altman@wdc.com> Reviewed-by: Evan Green <evgreen@chromium.org> Reviewed-by: Bean Huo <beanhuo@micron.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
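A hedged user-space sketch of the new convention (only struct sg_io_v4 and its fields are taken from the bsg ABI; the request buffer layout and sizes are placeholders): the descriptor travels in the data-out payload referenced by dout_xferp instead of being appended to the request.

    #include <linux/bsg.h>
    #include <stdint.h>
    #include <string.h>

    struct sg_io_v4 io;
    unsigned char req[64];          /* placeholder for the ufs-bsg request */
    unsigned char desc[144];        /* config descriptor to be written */

    memset(&io, 0, sizeof(io));
    io.guard = 'Q';
    io.protocol = BSG_PROTOCOL_SCSI;
    io.subprotocol = BSG_SUB_PROTOCOL_SCSI_TRANSPORT;
    io.request = (uintptr_t)req;            /* query upiu goes here */
    io.request_len = sizeof(req);
    io.dout_xferp = (uintptr_t)desc;        /* descriptor as payload */
    io.dout_xfer_len = sizeof(desc);
    /* ioctl(bsg_fd, SG_IO, &io); */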
2019-02-27  scsi: ufs: Remove unused device quirks  (Marc Gonzalez)
The UFSHC driver defines a few quirks that are not used anywhere:

    UFS_DEVICE_QUIRK_BROKEN_LCC
    UFS_DEVICE_NO_VCCQ
    UFS_DEVICE_QUIRK_NO_LINK_OFF
    UFS_DEVICE_NO_FASTAUTO

Let's remove them. Acked-by: Avri Altman <avri.altman@wdc.com> Acked-by: Alim Akhtar <alim.akhtar@samsung.com> Reviewed-by: Evan Green <evgreen@chromium.org> Signed-off-by: Marc Gonzalez <marc.w.gonzalez@free.fr> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-27Revert "scsi: ufs: disable vccq if it's not needed by UFS device"Marc Gonzalez
This reverts commit 60f0187031c05e04cbadffb62f557d0ff3564490.

There was one conflict in drivers/scsi/ufs/ufshcd.c:

    <<<<<<< HEAD
        /* Init check for device descriptor sizes */
        ufshcd_init_desc_sizes(hba);

        ret = ufs_get_device_desc(hba, &card);
        if (ret) {
            dev_err(hba->dev, "%s: Failed getting device info. err = %d\n",
                __func__, ret);
            goto out;
        }

        ufs_fixup_device_setup(hba, &card);
        ufshcd_tune_unipro_params(hba);

        ret = ufshcd_set_vccq_rail_unused(hba,
            (hba->dev_quirks & UFS_DEVICE_NO_VCCQ) ? true : false);
        if (ret)
            goto out;
    =======
        ufs_advertise_fixup_device(hba);
    >>>>>>> parent of 60f0187031c0... scsi: ufs: disable vccq if it's not needed by UFS device

Resolution: keep HEAD, and delete the ufshcd_set_vccq_rail_unused() call and corresponding error-handling code. Clean up loose ends in a follow-up patch.

60f0187031c0 introduced a small power optimization: ignore the vccq load specified in the UFSHC DT node when said host controller is connected to specific Flash chips (currently, Samsung and Hynix). Unfortunately, this optimization breaks UFS on systems where vccq powers not only the Flash chip, but the host controller as well, such as APQ8098 MEDIABOX or MTP8998:

    [ 3.929877] ufshcd-qcom 1da4000.ufshc: ufshcd_query_attr: opcode 0x04 for idn 13 failed, index 0, err = -11
    [ 5.433815] ufshcd-qcom 1da4000.ufshc: ufshcd_query_attr: opcode 0x04 for idn 13 failed, index 0, err = -11
    [ 6.937771] ufshcd-qcom 1da4000.ufshc: ufshcd_query_attr: opcode 0x04 for idn 13 failed, index 0, err = -11
    [ 6.937866] ufshcd-qcom 1da4000.ufshc: ufshcd_query_attr_retry: query attribute, idn 13, failed with error -11 after 3 retires
    [ 6.946412] ufshcd-qcom 1da4000.ufshc: ufshcd_disable_auto_bkops: failed to enable exception event -11
    [ 6.957972] ufshcd-qcom 1da4000.ufshc: dme-peer-get: attr-id 0x1587 failed 3 retries
    [ 6.967181] ufshcd-qcom 1da4000.ufshc: dme-peer-get: attr-id 0x1586 failed 3 retries
    [ 6.975025] ufshcd-qcom 1da4000.ufshc: ufshcd_get_max_pwr_mode: invalid max pwm tx gear read = 0
    [ 6.982755] ufshcd-qcom 1da4000.ufshc: ufshcd_probe_hba: Failed getting max supported power mode
    [ 8.505770] ufshcd-qcom 1da4000.ufshc: ufshcd_query_flag: Sending flag query for idn 3 failed, err = -11
    [ 10.009807] ufshcd-qcom 1da4000.ufshc: ufshcd_query_flag: Sending flag query for idn 3 failed, err = -11
    [ 11.513766] ufshcd-qcom 1da4000.ufshc: ufshcd_query_flag: Sending flag query for idn 3 failed, err = -11
    [ 11.513861] ufshcd-qcom 1da4000.ufshc: ufshcd_query_flag_retry: query attribute, opcode 5, idn 3, failed with error -11 after 3 retires
    [ 13.049807] ufshcd-qcom 1da4000.ufshc: __ufshcd_query_descriptor: opcode 0x01 for idn 8 failed, index 0, err = -11
    [ 14.553768] ufshcd-qcom 1da4000.ufshc: __ufshcd_query_descriptor: opcode 0x01 for idn 8 failed, index 0, err = -11
    [ 16.057767] ufshcd-qcom 1da4000.ufshc: __ufshcd_query_descriptor: opcode 0x01 for idn 8 failed, index 0, err = -11
    [ 16.057872] ufshcd-qcom 1da4000.ufshc: ufshcd_read_desc_param: Failed reading descriptor. desc_id 8, desc_index 0, param_offset 0, ret -11
    [ 16.067109] ufshcd-qcom 1da4000.ufshc: ufshcd_init_icc_levels: Failed reading power descriptor.len = 98 ret = -11
    [ 37.073787] ufshcd-qcom 1da4000.ufshc: link startup failed 1

In my opinion, the rationale for the original patch is questionable. If neither the UFSHC, nor the Flash chip, require any load from vccq, then that power rail should simply not be specified at all in the DT. Working around that fact in the driver is detrimental, as evidenced by the failure to initialize the host controller on MSM8998.
Acked-by: Avri Altman <avri.altman@wdc.com> Acked-by: Alim Akhtar <alim.akhtar@samsung.com> Signed-off-by: Marc Gonzalez <marc.w.gonzalez@free.fr> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-27  scsi: core: Avoid that system resume triggers a kernel warning  (Bart Van Assche)
scsi_device_quiesce() and scsi_device_resume() are called during system-wide suspend and resume. scsi_device_quiesce() only succeeds for SCSI devices that are in one of the RUNNING, OFFLINE or TRANSPORT_OFFLINE states (see also scsi_device_set_state()). This patch avoids that the following warning is triggered when resuming a system for which quiescing a SCSI device failed:

    WARNING: CPU: 2 PID: 11303 at drivers/scsi/scsi_lib.c:2600 scsi_device_resume+0x4f/0x58
    CPU: 2 PID: 11303 Comm: kworker/u8:70 Not tainted 5.0.0-rc1+ #50
    Hardware name: LENOVO 80E3/Lancer 5B2, BIOS A2CN45WW(V2.13) 08/04/2016
    Workqueue: events_unbound async_run_entry_fn
    Call Trace:
     scsi_dev_type_resume+0x2e/0x60
     async_run_entry_fn+0x32/0xd8
     process_one_work+0x1f4/0x420
     worker_thread+0x28/0x3c0
     kthread+0x118/0x130
     ret_from_fork+0x22/0x40

Cc: Przemek Socha <soprwa@gmail.com> Reported-by: Przemek Socha <soprwa@gmail.com> Fixes: 3a0a529971ec ("block, scsi: Make SCSI quiesce and resume work reliably") # v4.15 Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
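A hedged sketch of the kind of check this implies in scsi_device_resume() (not the literal patch): only perform the QUIESCE -> RUNNING transition, and hence the WARN-able state change, when the device really was quiesced.

    /* Sketch: skip the state transition for devices that were never
     * successfully quiesced, so resume does not trip the warning. */
    if (sdev->sdev_state == SDEV_QUIESCE)
        scsi_device_set_state(sdev, SDEV_RUNNING);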
2019-02-27kbuild: move ".config not found!" message from Kconfig to MakefileMasahiro Yamada
If you run "make" in a pristine source tree, currently Kbuild will start to build Kconfig to let it show the error message. It would be more straightforward to check it in Makefile and let it fail immediately. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2019-02-27  kbuild: invoke syncconfig if include/config/auto.conf.cmd is missing  (Masahiro Yamada)
If include/config/auto.conf.cmd is lost for some reason, it is not self-healing, so the top Makefile fails to run syncconfig. Move include/config/auto.conf.cmd to the target side. I used a pattern rule instead of a normal rule here, although it is a bit gross. If the rule were written as a normal rule like this:

    include/config/auto.conf \
    include/config/auto.conf.cmd \
    include/config/tristate.conf: $(KCONFIG_CONFIG)
        $(Q)$(MAKE) -f $(srctree)/Makefile syncconfig

... syncconfig would be executed per target. Using a pattern rule makes sure that syncconfig is executed just once because Make assumes the recipe will create all of the targets. Here is a quote from the GNU Make manual [1]: "Pattern rules may have more than one target. Unlike normal rules, this does not act as many different rules with the same prerequisites and recipe. If a pattern rule has multiple targets, make knows that the rule's recipe is responsible for making all of the targets. The recipe is executed only once to make all the targets. When searching for a pattern rule to match a target, the target patterns of a rule other than the one that matches the target in need of a rule are incidental: make worries only about giving a recipe and prerequisites to the file presently in question. However, when this file's recipe is run, the other targets are marked as having been updated themselves." [1]: https://www.gnu.org/software/make/manual/html_node/Pattern-Intro.html Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2019-02-27  block: optimize blk_bio_segment_split for single-page bvec  (Ming Lei)
Introduce a fast path for single-page bvec IO so that we can avoid calling bvec_split_segs() unnecessarily. Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
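An illustrative sketch of such a fast path (an assumption about its shape, not the literal block-layer code): a bvec that does not cross a page boundary is a single segment by construction, so the generic splitting walk can be skipped.

    /* Hypothetical single-page fast path. */
    if (bv.bv_offset + bv.bv_len <= PAGE_SIZE) {
        nsegs++;                    /* exactly one segment */
        sectors += bv.bv_len >> 9;
    } else {
        /* multi-page bvec: fall back to bvec_split_segs() */
    }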
2019-02-27  block: optimize __blk_segment_map_sg() for single-page bvec  (Ming Lei)
Introduce a fast path for single-page bvec IO so that blk_bvec_map_sg() can be avoided. Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-02-27  block: introduce bvec_nth_page()  (Ming Lei)
Single-page bvecs are common in small block-size workloads, so introduce bvec_nth_page() to avoid calling nth_page() unnecessarily, since nth_page() does not look cheap. Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
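A plausible shape for the helper, based only on the description above (a sketch, not necessarily the exact code):

    static inline struct page *bvec_nth_page(struct page *page, int idx)
    {
        return idx == 0 ? page : nth_page(page, idx);
    }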
2019-02-27  btrfs: move ulist allocation out of transaction in quota enable  (David Sterba)
The allocation happens with GFP_KERNEL after a transaction has been started; this can potentially cause a deadlock if reclaim tries to get the memory by flushing filesystem data. The fs_info::qgroup_ulist is not used during transaction start when quotas are not enabled. The status bit BTRFS_FS_QUOTA_ENABLED is set later in btrfs_quota_enable, so it's safe to move the allocation before the transaction start. Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
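A hedged sketch of the reordering (argument values and surrounding error handling are assumptions): the GFP_KERNEL allocation is done before the transaction is started, so reclaim triggered by the allocation cannot stall against the running transaction.

    /* Sketch only: allocate before starting the transaction. */
    ulist = ulist_alloc(GFP_KERNEL);
    if (!ulist)
        return -ENOMEM;

    trans = btrfs_start_transaction(fs_info->tree_root, 2);
    if (IS_ERR(trans)) {
        ulist_free(ulist);
        return PTR_ERR(trans);
    }

    fs_info->qgroup_ulist = ulist;      /* published only after this point */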
2019-02-27  btrfs: save drop_progress if we drop refs at all  (Josef Bacik)
Previously we only updated the drop_progress key if we were in the DROP_REFERENCE stage of snapshot deletion. This is because the UPDATE_BACKREF stage checks the flags of the blocks it's converting to FULL_BACKREF, so if we go over a block we processed before it doesn't matter, we just don't do anything.

The problem is that in do_walk_down() we will go ahead and drop the root's reference to any blocks that we know we won't need to walk into.

Given subvolume A and snapshot B. The root of B points to all of the nodes that belong to A, so all of those nodes have a refcnt > 1. If B did not modify those blocks it'll hit this condition in do_walk_down

    if (!wc->update_ref ||
        generation <= root->root_key.offset)
        goto skip;

and in "goto skip" we simply do a btrfs_free_extent() for that bytenr that we point at.

Now assume we modified some data in B, and then took a snapshot of B and call it C. C points to all the nodes in B, making every node the root of B points to have a refcnt > 1. This assumes the root level is 2 or higher.

We delete snapshot B, which does the above work in do_walk_down, free'ing our ref for nodes we share with A that we didn't modify. Now we hit a node we _did_ modify, thus we own. We need to walk down into this node and we set wc->stage == UPDATE_BACKREF. We walk down to level 0 which we also own because we modified data. We can't walk any further down and thus now need to walk up and start the next part of the deletion. Now walk_up_proc is supposed to put us back into DROP_REFERENCE, but there's an exception to this

    if (level < wc->shared_level)
        goto out;

we are at level == 0, and our shared_level == 1. We skip out of this one and go up to level 1. Since path->slots[1] < nritems we path->slots[1]++ and break out of walk_up_tree to stop our transaction and loop back around. Now in btrfs_drop_snapshot we have this snippet

    if (wc->stage == DROP_REFERENCE) {
        level = wc->level;
        btrfs_node_key(path->nodes[level],
                       &root_item->drop_progress,
                       path->slots[level]);
        root_item->drop_level = level;
    }

our stage == UPDATE_BACKREF still, so we don't update the drop_progress key. This is a problem because we would have done btrfs_free_extent() for the nodes leading up to our current position. If we crash or unmount here and go to remount we'll start over where we were before and try to free our ref for blocks we've already freed, and thus abort() out.

Fix this by keeping track of the last place we dropped a reference for our block in do_walk_down. Then if wc->stage == UPDATE_BACKREF we know we'll start over from a place we meant to, and otherwise things continue to work as they did before.

I have a complicated reproducer for this problem. Without this patch we'll fail to fsck the fs when replaying the log-writes log. With this patch we can replay the whole log without any fsck or mount failures. The steps to reproduce this easily are sort of tricky, I had to add a couple of debug patches to the kernel in order to make it easy, basically I just needed to make sure we did actually commit the transaction every time we finished a walk_down_tree/walk_up_tree combo.

The reproducer:

1) Creates a base subvolume.
2) Creates 100k files in the subvolume.
3) Snapshots the base subvolume (snap1).
4) Touches files 5000-6000 in snap1.
5) Snapshots snap1 (snap2).
6) Deletes snap1.

I do this with dm-log-writes, and then replay to every FUA in the log and fsck the fs.
Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> [ copy reproducer steps ] Signed-off-by: David Sterba <dsterba@suse.com>
2019-02-27  btrfs: check for refs on snapshot delete resume  (Josef Bacik)
There's a bug in snapshot deletion where we won't update the drop_progress key if we're in the UPDATE_BACKREF stage. This is a problem because we could drop refs for blocks we know don't belong to us. If we crash or umount at the right time we could experience messages such as the following when snapshot deletion resumes:

    BTRFS error (device dm-3): unable to find ref byte nr 66797568 parent 0 root 258 owner 1 offset 0
    ------------[ cut here ]------------
    WARNING: CPU: 3 PID: 16052 at fs/btrfs/extent-tree.c:7108 __btrfs_free_extent.isra.78+0x62c/0xb30 [btrfs]
    CPU: 3 PID: 16052 Comm: umount Tainted: G W OE 5.0.0-rc4+ #147
    Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./890FX Deluxe5, BIOS P1.40 05/03/2011
    RIP: 0010:__btrfs_free_extent.isra.78+0x62c/0xb30 [btrfs]
    RSP: 0018:ffffc90005cd7b18 EFLAGS: 00010286
    RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000000
    RDX: ffff88842fade680 RSI: ffff88842fad6b18 RDI: ffff88842fad6b18
    RBP: ffffc90005cd7bc8 R08: 0000000000000000 R09: 0000000000000001
    R10: 0000000000000001 R11: ffffffff822696b8 R12: 0000000003fb4000
    R13: 0000000000000001 R14: 0000000000000102 R15: ffff88819c9d67e0
    FS: 00007f08bb138fc0(0000) GS:ffff88842fac0000(0000) knlGS:0000000000000000
    CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 00007f8f5d861ea0 CR3: 00000003e99fe000 CR4: 00000000000006e0
    Call Trace:
     ? _raw_spin_unlock+0x27/0x40
     ? btrfs_merge_delayed_refs+0x356/0x3e0 [btrfs]
     __btrfs_run_delayed_refs+0x75a/0x13c0 [btrfs]
     ? join_transaction+0x2b/0x460 [btrfs]
     btrfs_run_delayed_refs+0xf3/0x1c0 [btrfs]
     btrfs_commit_transaction+0x52/0xa50 [btrfs]
     ? start_transaction+0xa6/0x510 [btrfs]
     btrfs_sync_fs+0x79/0x1c0 [btrfs]
     sync_filesystem+0x70/0x90
     generic_shutdown_super+0x27/0x120
     kill_anon_super+0x12/0x30
     btrfs_kill_super+0x16/0xa0 [btrfs]
     deactivate_locked_super+0x43/0x70
     deactivate_super+0x40/0x60
     cleanup_mnt+0x3f/0x80
     __cleanup_mnt+0x12/0x20
     task_work_run+0x8b/0xc0
     exit_to_usermode_loop+0xce/0xd0
     do_syscall_64+0x20b/0x210
     entry_SYSCALL_64_after_hwframe+0x49/0xbe

To fix this simply mark dead roots we read from disk as DEAD and then set the walk_control->restarted flag so we know we have a restarted deletion. From here whenever we try to drop refs for blocks we check to verify our ref is set on them, and if it is not we skip it. Once we find a ref that is set we unset walk_control->restarted since the tree should be in a normal state from then on, and any problems we run into from there are different issues. I tested this with an existing broken fs and my reproducer that creates a broken fs and it fixed both file systems. Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: David Sterba <dsterba@suse.com>
2019-02-27  kbuild: simplify single target rules  (Masahiro Yamada)
The dependency will be checked anyway after Kbuild descends into a sub-directory. Skip object/source dependency checks in the top Makefile. VPATH can be simpler since the top Makefile no longer checks the presence of the source file, which is located in the external module directory. One good thing is that it can now compile an object from a generated source file:

    $ make crypto/rsapubkey.asn1.o
      ...
      ASN.1   crypto/rsapubkey.asn1.c
      CC      crypto/rsapubkey.asn1.o

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2019-02-27  kbuild: remove empty rules for makefiles  (Masahiro Yamada)
The previous commit made 'MAKEFLAGS += -rR' effective in the top Makefile regardless of the O= option or the GNU Make version. The top Makefile does not need to cancel implicit rules for makefiles. There is still one place where an empty rule is useful: since -rR is effective only after sub-make, GNU Make would try implicit rules to update the top Makefile. Although it is not a big overhead, cancel it just in case. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2019-02-27  kbuild: make -r/-R effective in top Makefile for old Make versions  (Masahiro Yamada)
Adding -rR to MAKEFLAGS is important because we do not want to be bothered by built-in implicit rules or variables. One problem that used to exist in older GNU Make versions is that

    MAKEFLAGS += -rR
    ...

does not become effective in the current Makefile. When you are building with the O= option, it becomes effective in the top Makefile since it recurses via the 'sub-make' target. Otherwise, the top Makefile tries implicit rules. That is why we explicitly add empty rules for Makefiles, but we often forget to do that. In fact, adding the -d option to older GNU Make versions shows it is trying a bunch of implicit pattern rules:

    Considering target file `scripts/Makefile.kcov'.
     Looking for an implicit rule for `scripts/Makefile.kcov'.
     Trying pattern rule with stem `Makefile.kcov'.
     Trying implicit prerequisite `scripts/Makefile.kcov.o'.
     Trying pattern rule with stem `Makefile.kcov'.
     Trying implicit prerequisite `scripts/Makefile.kcov.c'.
     Trying pattern rule with stem `Makefile.kcov'.
     Trying implicit prerequisite `scripts/Makefile.kcov.cc'.
     Trying pattern rule with stem `Makefile.kcov'.
     Trying implicit prerequisite `scripts/Makefile.kcov.C'.
     ...

This issue was fixed by GNU Make commit 58dae243526b ("[Savannah #20501] Handle adding -r/-R to MAKEFLAGS in the makefile"). So, it is no longer a problem if you use GNU Make 4.0 or later. However, older versions are still widely used. So, I decided to patch the kernel Makefile to invoke sub-make regardless of the O= option. This will allow further cleanups. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2019-02-27  kbuild: move tools_silent to a more relevant place  (Masahiro Yamada)
This would get in the way of the changes to the sub-make part. Move it near the tools/ target. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2019-02-27  kbuild: compute false-positive -Wmaybe-uninitialized cases in Kconfig  (Masahiro Yamada)
Since -Wmaybe-uninitialized was introduced by GCC 4.7, we have patched various false positives:

 - commit e74fc973b6e5 ("Turn off -Wmaybe-uninitialized when building with -Os") turned off this option for -Os.
 - commit 815eb71e7149 ("Kbuild: disable 'maybe-uninitialized' warning for CONFIG_PROFILE_ALL_BRANCHES") turned off this option for CONFIG_PROFILE_ALL_BRANCHES
 - commit a76bcf557ef4 ("Kbuild: enable -Wmaybe-uninitialized warning for "make W=1"") turned off this option for GCC < 4.9

Arnd provided more explanation in https://lkml.org/lkml/2017/3/14/903

I think this looks better by shifting the logic from Makefile to Kconfig. Link: https://github.com/ClangBuiltLinux/linux/issues/350 Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Reviewed-by: Nathan Chancellor <natechancellor@gmail.com> Tested-by: Nick Desaulniers <ndesaulniers@google.com>
2019-02-27  kbuild: refactor cc-cross-prefix implementation  (Masahiro Yamada)
- $(word 1, <text>) is equivalent to $(firstword <text>)
- hardcode "gcc" instead of $(CC)
- minimize the shell script part

A few more notes in case $(filter-out -%, ...) is not clear. arch/mips/Makefile passes prefixes depending on the configuration:

    CROSS_COMPILE := $(call cc-cross-prefix, $(tool-archpref)-linux- \
                       $(tool-archpref)-linux-gnu- $(tool-archpref)-unknown-linux-gnu-)

In the Kconfig stage (e.g. when you run 'make defconfig'), neither CONFIG_32BIT nor CONFIG_64BIT is defined. So, $(tool-archpref) is empty. As a result, "-linux -linux-gnu- -unknown-linux-gnu" is passed into cc-cross-prefix. The command 'which' assumes arguments starting with a hyphen to be command options, and then emits the following messages:

    Illegal option -l
    Illegal option -l
    Illegal option -u

I think it is strange to define CROSS_COMPILE depending on the CONFIG options since you need to feed $(CC) to Kconfig, but it is how the MIPS Makefile currently works. Anyway, it would not hurt to filter out invalid strings beforehand. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2019-02-27  kbuild: hardcode genksyms path and remove GENKSYMS variable  (Masahiro Yamada)
The genksyms source was integrated into the kernel tree in 2003. I do not expect anybody to still be using the external /sbin/genksyms. Kbuild does not need to provide the ability to override GENKSYMS. Let's remove the GENKSYMS variable, and use the hardcoded path. Since it occurred in the pre-git era, I attached the commit message in case somebody is interested in the historical background:

    | Author: Kai Germaschewski <kai@tp1.ruhr-uni-bochum.de>
    | Date:   Wed Feb 19 04:17:28 2003 -0600
    |
    |     kbuild: [PATCH] put genksyms in scripts dir
    |
    |     This puts genksyms into scripts/genksyms/.
    |
    |     genksyms used to be maintained externally, though the only possible user
    |     was the kernel build. Moving it into the kernel sources makes it easier to
    |     keep it uptodate, like for example updating it to generate linker scripts
    |     directly instead of postprocessing the generated header file fragments
    |     with sed, as we do currently.
    |
    |     Also, genksyms does not handle __typeof__, which needs to be fixed since
    |     some of the exported symbol in the kernel are defined using __typeof__.
    |
    |     (Rusty Russell/me)

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2019-02-27  scripts/gdb: refactor rules for symlink creation  (Masahiro Yamada)
gdb-scripts is not a real object, but (ab)used like a phony target. Rewrite the code in a more Kbuild-ish way. Add symlinks to extra-y and use if_changed. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Reviewed-by: Kieran Bingham <kieran.bingham@ideasonboard.com>
2019-02-27  kbuild: create symlink to vmlinux-gdb.py in scripts_gdb target  (Masahiro Yamada)
It is weird to create gdb stuff as a side-effect of vmlinux. Move it to a more relevant place. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Reviewed-by: Kieran Bingham <kieran.bingham@ideasonboard.com>
2019-02-27  scripts/gdb: do not descend into scripts/gdb from scripts  (Masahiro Yamada)
Currently, Kbuild descends from scripts/Makefile to scripts/gdb/Makefile just for creating symbolic links, but it does not need to do it so early. Merge the two descending paths to simplify the code. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Reviewed-by: Kieran Bingham <kieran.bingham@ideasonboard.com>
2019-02-27  kbuild: remove unimportant comments from ./Kbuild  (Masahiro Yamada)
Every time we add/remove a target, we need to touch the header part, including renumbering. This information is not so important. Numbering targets is rather misleading because they are not necessarily generated in this order. For example, 1) and 2) can be executed simultaneously when the -j option is given. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Reviewed-by: Kieran Bingham <kieran.bingham@ideasonboard.com>
2019-02-27  scripts/gdb: delay generation of gdb constants.py  (Masahiro Yamada)
scripts/gdb/linux/constants.py is never used in the kernel build process. There is no good reason to create it so early. Get it out of the 'prepare' stage. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Reviewed-by: Kieran Bingham <kieran.bingham@ideasonboard.com>
2019-02-27  Merge tag 'vfio-ccw-20190227' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/vfio-ccw into features  (Martin Schwidefsky)
Pull vfio-ccw from Cornelia Huck with the following changes: - Further fixes in TIC handling.
2019-02-27  powerpc/fsl: Fix the flush of branch predictor.  (Christophe Leroy)
The commit identified below adds the MC_BTB_FLUSH macro only when CONFIG_PPC_FSL_BOOK3E is defined. This results in the following error on some configs (seen several times with kisskb randconfig_defconfig):

    arch/powerpc/kernel/exceptions-64e.S:576: Error: Unrecognized opcode: `mc_btb_flush'
    make[3]: *** [scripts/Makefile.build:367: arch/powerpc/kernel/exceptions-64e.o] Error 1
    make[2]: *** [scripts/Makefile.build:492: arch/powerpc/kernel] Error 2
    make[1]: *** [Makefile:1043: arch/powerpc] Error 2
    make: *** [Makefile:152: sub-make] Error 2

This patch adds a blank definition of MC_BTB_FLUSH for other cases. Fixes: 10c5e83afd4a ("powerpc/fsl: Flush the branch predictor at each kernel entry (64bit)") Cc: Diana Craciun <diana.craciun@nxp.com> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Reviewed-by: Daniel Axtens <dja@axtens.net> Reviewed-by: Diana Craciun <diana.craciun@nxp.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
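The fix amounts to providing an empty fallback definition, roughly like the sketch below (the FSL-case body is abbreviated here, not the real flush sequence):

    #ifdef CONFIG_PPC_FSL_BOOK3E
    #define MC_BTB_FLUSH(reg)   /* real branch-predictor flush sequence */
    #else
    #define MC_BTB_FLUSH(reg)   /* blank: nothing to flush on other configs */
    #endif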
2019-02-27  Btrfs: fix deadlock between clone/dedupe and rename  (Filipe Manana)
Reflinking (clone/dedupe) and rename are operations that operate on two inodes and therefore need to lock them in the same order to avoid ABBA deadlocks. It happens that Btrfs' reflink implementation always locked them in a different order from VFS's lock_two_nondirectories() helper, which is used by the rename code in VFS, resulting in ABBA type deadlocks.

Btrfs' locking order:

    static void btrfs_double_inode_lock(struct inode *inode1, struct inode *inode2)
    {
        if (inode1 < inode2)
            swap(inode1, inode2);

        inode_lock_nested(inode1, I_MUTEX_PARENT);
        inode_lock_nested(inode2, I_MUTEX_CHILD);
    }

VFS's locking order:

    void lock_two_nondirectories(struct inode *inode1, struct inode *inode2)
    {
        if (inode1 > inode2)
            swap(inode1, inode2);

        if (inode1 && !S_ISDIR(inode1->i_mode))
            inode_lock(inode1);
        if (inode2 && !S_ISDIR(inode2->i_mode) && inode2 != inode1)
            inode_lock_nested(inode2, I_MUTEX_NONDIR2);
    }

Fix this by killing the btrfs helper function that does the double inode locking and replace it with VFS's helper lock_two_nondirectories(). Reported-by: Zygo Blaxell <ce3g8jdj@umail.furryterror.org> Fixes: 416161db9b63e3 ("btrfs: offline dedupe") CC: stable@vger.kernel.org # 4.4+ Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2019-02-27  Btrfs: fix corruption reading shared and compressed extents after hole punching  (Filipe Manana)
In the past we had data corruption when reading compressed extents that are shared within the same file and are consecutive. This got fixed by commit 005efedf2c7d0 ("Btrfs: fix read corruption of compressed and shared extents") and by commit 808f80b46790f ("Btrfs: update fix for read corruption of compressed and shared extents"). However there was a case missing in those fixes, which is when the shared and compressed extents are referenced with a non-zero offset. The following shell script creates a reproducer for this issue:

    #!/bin/bash

    mkfs.btrfs -f /dev/sdc &> /dev/null
    mount -o compress /dev/sdc /mnt/sdc

    # Create a file with 3 consecutive compressed extents, each has an
    # uncompressed size of 128Kb and a compressed size of 4Kb.
    for ((i = 1; i <= 3; i++)); do
        head -c 4096 /dev/zero
        for ((j = 1; j <= 31; j++)); do
            head -c 4096 /dev/zero | tr '\0' "\377"
        done
    done > /mnt/sdc/foobar
    sync

    echo "Digest after file creation: $(md5sum /mnt/sdc/foobar)"

    # Clone the first extent into offsets 128K and 256K.
    xfs_io -c "reflink /mnt/sdc/foobar 0 128K 128K" /mnt/sdc/foobar
    xfs_io -c "reflink /mnt/sdc/foobar 0 256K 128K" /mnt/sdc/foobar
    sync

    echo "Digest after cloning: $(md5sum /mnt/sdc/foobar)"

    # Punch holes into the regions that are already full of zeroes.
    xfs_io -c "fpunch 0 4K" /mnt/sdc/foobar
    xfs_io -c "fpunch 128K 4K" /mnt/sdc/foobar
    xfs_io -c "fpunch 256K 4K" /mnt/sdc/foobar
    sync

    echo "Digest after hole punching: $(md5sum /mnt/sdc/foobar)"

    echo "Dropping page cache..."
    sysctl -q vm.drop_caches=1
    echo "Digest after hole punching: $(md5sum /mnt/sdc/foobar)"

    umount /dev/sdc

When running the script we get the following output:

    Digest after file creation: 5a0888d80d7ab1fd31c229f83a3bbcc8 /mnt/sdc/foobar
    linked 131072/131072 bytes at offset 131072
    128 KiB, 1 ops; 0.0033 sec (36.960 MiB/sec and 295.6830 ops/sec)
    linked 131072/131072 bytes at offset 262144
    128 KiB, 1 ops; 0.0015 sec (78.567 MiB/sec and 628.5355 ops/sec)
    Digest after cloning: 5a0888d80d7ab1fd31c229f83a3bbcc8 /mnt/sdc/foobar
    Digest after hole punching: 5a0888d80d7ab1fd31c229f83a3bbcc8 /mnt/sdc/foobar
    Dropping page cache...
    Digest after hole punching: fba694ae8664ed0c2e9ff8937e7f1484 /mnt/sdc/foobar

This happens because after reading all the pages of the extent in the range from 128K to 256K for example, we read the hole at offset 256K and then when reading the page at offset 260K we don't submit the existing bio, which is responsible for filling all the pages in the range 128K to 256K only, therefore adding the pages from range 260K to 384K to the existing bio and submitting it after iterating over the entire range. Once the bio completes, the uncompressed data fills only the pages in the range 128K to 256K because there's no more data read from disk, leaving the pages in the range 260K to 384K unfilled. It is just a slightly different variant of what was solved by commit 005efedf2c7d0 ("Btrfs: fix read corruption of compressed and shared extents").

Fix this by forcing a bio submit, during readpages(), whenever we find a compressed extent map for a page that is different from the extent map for the previous page or has a different starting offset (in case it's the same compressed extent), instead of the extent map's original start offset. A test case for fstests follows soon.
Reported-by: Zygo Blaxell <ce3g8jdj@umail.furryterror.org> Fixes: 808f80b46790f ("Btrfs: update fix for read corruption of compressed and shared extents") Fixes: 005efedf2c7d0 ("Btrfs: fix read corruption of compressed and shared extents") Cc: stable@vger.kernel.org # 4.3+ Tested-by: Zygo Blaxell <ce3g8jdj@umail.furryterror.org> Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2019-02-27  powerpc/powernv: Make opal log only readable by root  (Jordan Niethe)
Currently the opal log is globally readable. It is kernel policy to limit the visibility of physical addresses / kernel pointers to root. Given this, and the fact that the opal log may contain such information, it would be better to limit its readability to root. Fixes: bfc36894a48b ("powerpc/powernv: Add OPAL message log interface") Cc: stable@vger.kernel.org # v3.15+ Signed-off-by: Jordan Niethe <jniethe5@gmail.com> Reviewed-by: Stewart Smith <stewart@linux.ibm.com> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
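In sysfs terms this comes down to dropping the group/other read bits on the msglog attribute, roughly as below (a sketch; the attribute and read-handler names are assumptions):

    static struct bin_attribute opal_msglog_attr = {
        .attr = {.name = "msglog", .mode = 0400},   /* root-only, was 0444 */
        .read = opal_msglog_read,
    };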
2019-02-27  netfilter: nft_set_hash: remove nft_hash_key()  (Pablo Neira Ayuso)
The hashtable is never used for 2-byte keys, so remove nft_hash_key(). Fixes: e240cd0df481 ("netfilter: nf_tables: place all set backends in one single module") Reported-by: Florian Westphal <fw@strlen.de> Tested-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-02-27  netfilter: nft_set_hash: bogus element self comparison from deactivation path  (Pablo Neira Ayuso)
Use the element from the loop iteration, not the same element we want to deactivate; otherwise this branch always evaluates to true. Fixes: 6c03ae210ce3 ("netfilter: nft_set_hash: add non-resizable hashtable implementation") Reported-by: Florian Westphal <fw@strlen.de> Tested-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
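The bug pattern being fixed, in schematic form (the identifiers below are illustrative, not the actual nft_set_hash code):

    /* Broken: compares the element against itself, so the branch is
     * always taken regardless of which entry the loop is visiting. */
    hlist_for_each_entry(he, head, node) {
        if (keys_equal(this, this))     /* BUG */
            return he;
    }

    /* Fixed: compare against the candidate from the loop iteration. */
    hlist_for_each_entry(he, head, node) {
        if (keys_equal(this, he))
            return he;
    }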
2019-02-27  netfilter: nft_set_hash: fix lookups with fixed size hash on big endian  (Pablo Neira Ayuso)
Call jhash_1word() for the 4-byte key case from the insertion and deactivation paths, otherwise set lookups fail on big endian architectures. Fixes: 446a8268b7f5 ("netfilter: nft_set_hash: add lookup variant for fixed size hashtable") Reported-by: Florian Westphal <fw@strlen.de> Tested-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
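Schematically, the insert/deactivate paths must hash a 4-byte key the same way the lookup path does; a hedged sketch (not the actual code):

    u32 hash;

    if (set->klen == 4)
        hash = jhash_1word(*(const u32 *)key, seed);
    else
        hash = jhash(key, set->klen, seed);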
2019-02-27  netfilter: remove unneeded switch fall-through  (Li RongQing)
An empty case is fine and does not need an explicit fall-through. Signed-off-by: Li RongQing <lirongqing@baidu.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-02-27  netfilter: conntrack: avoid same-timeout update  (Florian Westphal)
No need to dirty a cache line if the timeout is unchanged. Also, WARN() is useless here: we crash on the 'skb->len' access if skb is NULL. Lastly, ct->timeout is u32, not 'unsigned long', so adapt the function prototype accordingly. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
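The cache-line point fits in a two-line sketch (hedged; not the literal patch, and new_timeout is a placeholder name):

    /* Only touch ct->timeout when the value actually changes. */
    if (READ_ONCE(ct->timeout) != new_timeout)
        WRITE_ONCE(ct->timeout, new_timeout);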
2019-02-27  netfilter: nat: remove nf_nat_l3proto.h and nf_nat_core.h  (Florian Westphal)
The l3proto name is gone; its header file is the last trace. While at it, also remove nf_nat_core.h, it's very small and all users include nf_nat.h too.

before:
       text    data     bss     dec     hex filename
      22948    1612    4136   28696    7018 nf_nat.ko

after removal of l3proto register/unregister functions:
       text    data     bss     dec     hex filename
      22196    1516    4136   27848    6cc8 nf_nat.ko

checkpatch complains about overly long lines, but line breaks do not make things more readable and the line length gets smaller here, not larger. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-02-27  netfilter: nat: remove l3proto struct  (Florian Westphal)
All l3proto function pointers have been removed. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-02-27  netfilter: nat: remove csum_recalc hook  (Florian Westphal)
We can now use direct calls. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-02-27  netfilter: nat: remove csum_update hook  (Florian Westphal)
We can now use direct calls. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-02-27  netfilter: nat: remove l3 manip_pkt hook  (Florian Westphal)
We can now use direct calls. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-02-27  netfilter: nat: remove nf_nat_l4proto.h  (Florian Westphal)
After the ipv4/6 nat tracker merge, there are no external callers, so make the last function static and remove the header. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-02-27  netfilter: nat: merge nf_nat_ipv4,6 into nat core  (Florian Westphal)
before:
       text    data     bss     dec     hex filename
      16566    1576    4136   22278    5706 nf_nat.ko
       3598     844       0    4442    115a nf_nat_ipv6.ko
       3187     844       0    4031     fbf nf_nat_ipv4.ko

after:
       text    data     bss     dec     hex filename
      22948    1612    4136   28696    7018 nf_nat.ko

... with ipv4/v6 nat now provided directly via nf_nat.ko.

Also changes

    ret = nf_nat_ipv4_fn(priv, skb, state);
    if (ret != NF_DROP && ret != NF_STOLEN &&

into

    if (ret != NF_ACCEPT)
        return ret;

everywhere. The nat hooks should never return anything other than ACCEPT or DROP (and the latter only in rare error cases).

The original code uses multi-line ANDing including assignment-in-if:

    if (ret != NF_DROP && ret != NF_STOLEN &&
        !(IPCB(skb)->flags & IPSKB_XFRM_TRANSFORMED) &&
        (ct = nf_ct_get(skb, &ctinfo)) != NULL) {

I removed this while moving, breaking those into separate conditionals and moving the assignments onto extra lines.

checkpatch still generates some warnings:
 1. Overly long lines (of moved code). Breaking them is even more ugly, so I kept this as-is.
 2. Use of extern function declarations in a .c file. This is a necessary evil; we must call nf_nat_l3proto_register() from the nat core now. All l3proto related functions are removed later in this series, and those prototypes are then removed as well.

v2: keep empty nf_nat_ipv6_csum_update stub for CONFIG_IPV6=n case.
v3: remove IS_ENABLED(NF_NAT_IPV4/6) tests, NF_NAT_IPVx toggles are removed here.
v4: also get rid of the assignments in conditionals.

Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-02-27  netfilter: nat: move nlattr parse and xfrm session decode to core  (Florian Westphal)
None of these functions calls any external functions, so moving them avoids both the indirection and the need to export these symbols. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>