linux-arm.git - Russell King's ARM Linux kernel tree

Age	Commit message (Collapse)	Author
2019-04-25	crypto: ccree - fix mem leak on error path	Gilad Ben-Yossef
	Fix a memory leak on the error path of IV generation code. Signed-off-by: Gilad Ben-Yossef <gilad@benyossef.com> Cc: stable@vger.kernel.org # v4.19+ Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-04-25	crypto: ccree - fix typo in debugfs error path	Gilad Ben-Yossef
	Fix a typo in debugfs interface error path which can result in a panic following a memory allocation failure. Signed-off-by: Gilad Ben-Yossef <gilad@benyossef.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-04-25	crypto: ccree - remove special handling of chained sg	Gilad Ben-Yossef
	We were handling chained scattergather lists with specialized code needlessly as the regular sg APIs handle them just fine. The code handling this also had an (unused) code path with a use-before-init error, flagged by Coverity. Remove all special handling of chained sg and leave their handling to the regular sg APIs. Signed-off-by: Gilad Ben-Yossef <gilad@benyossef.com> Cc: stable@vger.kernel.org # v4.19+ Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-04-25	crypto: ccree - use proper callback completion api	Gilad Ben-Yossef
	Use proper hash callback completion API instead of open coding it. Signed-off-by: Gilad Ben-Yossef <gilad@benyossef.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-04-25	crypto: ccree - fix backlog notifications	Gilad Ben-Yossef
	We were doing backlog notification callbacks via a cipher/hash/aead request structure cast to the base structure, which may or may not work based on how the structure is laid in memory and is not safe. Fix it by delegating the backlog notification to the appropriate internal callbacks which are type aware. Signed-off-by: Gilad Ben-Yossef <gilad@benyossef.com> Cc: stable@vger.kernel.org # v4.19+ Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-04-25	crypto: ccree - add CID and PID support	Gilad Ben-Yossef
	The new HW uses a new standard product and component ID registers replacing the old ad-hoc version and signature gister schemes. Update the driver to support the new HW ID registers. Signed-off-by: Gilad Ben-Yossef <gilad@benyossef.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-04-25	crypto: ccree - read next IV from HW	Gilad Ben-Yossef
	We were computing the next IV in software instead of reading it from HW on the premise that this can be quicker due to the small size of IVs but this proved to be much more hassle and bug ridden than expected. Move to reading the next IV as computed by the HW. This fixes a number of issue with next IV being wrong for OFB, CTS-CBC and probably most of the other ciphers as well. Signed-off-by: Gilad Ben-Yossef <gilad@benyossef.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-04-25	crypto: ccree - adapt CPP descriptor to new HW	Gilad Ben-Yossef
	Adapt the CPP descriptor to new HW interface. Signed-off-by: Gilad Ben-Yossef <gilad@benyossef.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-04-25	crypto: ccree - add SM4 protected keys support	Gilad Ben-Yossef
	Add the registration for the SM4 based policy protected keys ciphers. Signed-off-by: Gilad Ben-Yossef <gilad@benyossef.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-04-25	crypto: ccree - add remaining logic for CPP	Gilad Ben-Yossef
	Add the missing logic to set usage policy protections for keys. This enables key policy protection for AES. Signed-off-by: Gilad Ben-Yossef <gilad@benyossef.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-04-25	crypto: ccree - add CPP completion handling	Gilad Ben-Yossef
	Add the logic needed to track and report CPP operation rejection. The new logic will be used by the CPP feature introduced later. Signed-off-by: Gilad Ben-Yossef <gilad@benyossef.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-04-25	crypto: ccree - add support for sec disabled mode	Gilad Ben-Yossef
	Add support for the Security Disabled mode under which only pure cryptographic functionality is enabled and protected keys services are unavailable. Signed-off-by: Gilad Ben-Yossef <gilad@benyossef.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-04-25	crypto: ccree - move MLLI desc. before key load	Gilad Ben-Yossef
	Refactor to move the descriptor copying the MLLI line to SRAM to before the key loading descriptor in preparation to the introduction of CPP later on. Signed-off-by: Gilad Ben-Yossef <gilad@benyossef.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-04-25	crypto: ccree - move key load desc. before flow desc.	Gilad Ben-Yossef
	Refactor the descriptor setup code in order to move the key loading descriptor to one before last position. This has no effect on current functionality but is needed for later support of Content Protection Policy keys. Signed-off-by: Gilad Ben-Yossef <gilad@benyossef.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-04-25	crypto: sun4i-ss - fallback when length is not multiple of blocksize	Corentin Labbe
	sun4i-ss does not handle requests when length are not a multiple of blocksize. This patch adds a fallback for that case. Fixes: 6298e948215f ("crypto: sunxi-ss - Add Allwinner Security System crypto accelerator") Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-04-25	crypto: sun4i-ss - Fix invalid calculation of hash end	Corentin Labbe
	When nbytes < 4, end is wronlgy set to a negative value which, due to uint, is then interpreted to a large value leading to a deadlock in the following code. This patch fix this problem. Fixes: 6298e948215f ("crypto: sunxi-ss - Add Allwinner Security System crypto accelerator") Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-04-25	crypto: sun4i-ss - remove ivsize from ECB	Corentin Labbe
	ECB algos does not need IV. Fixes: 6298e948215f ("crypto: sunxi-ss - Add Allwinner Security System crypto accelerator") Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-04-25	crypto: sun4i-ss - Handle better absence/presence of IV	Corentin Labbe
	This patch remove the test against areq->info since sun4i-ss could work without it (ECB). Fixes: 6298e948215f ("crypto: sunxi-ss - Add Allwinner Security System crypto accelerator") Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-04-25	crypto: cavium/nitrox - Fix HW family part name format	Nagadheeraj Rottela
	This patch fixes the NITROX-V family part name format. The fix includes ZIP core performance (feature option) in the part name to differentiate various HW devices. The complete HW part name format is mentioned below Part name: CNN55<core option>-<freq>BG676-<feature option>-<rev> Signed-off-by: Nagadheeraj Rottela <rnagadheeraj@marvell.com> Reviewed-by: Srikanth Jampala <jsrikanth@marvell.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-04-25	crypto: caam/jr - update gcm detection logic	Horia Geantă
	GCM detection logic has to change for two reasons: -some CAAM instantiations with Era < 10, even though they have AES LP, they now support GCM mode -Era 10 upwards, there is a dedicated bit in AESA_VERSION[AESA_MISC] field for GCM support For Era 9 and earlier, all AES accelerator versions support GCM, except for AES LP (CHAVID_LS[AESVID]=3) with revision CRNR[AESRN] < 8. For Era 10 and later, bit 9 of the AESA_VERSION register should be used to detect GCM support in AES accelerator. Note: caam/qi and caam/qi2 are drivers for QI (Queue Interface), which is used in DPAA-based SoCs; for now, we rely on CAAM having an AES HP and this AES accelerator having support for GCM. Signed-off-by: Horia Geantă <horia.geanta@nxp.com> Reviewed-by: Iuliana Prodan <iuliana.prodan@nxp.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-04-25	crypto: caam - fix spelling mistake "cannote" -> "cannot"	Colin Ian King
	There is a spelling mistake in an error message in the qi_error_list array. Fix this. Signed-off-by: Colin Ian King <colin.king@canonical.com> Reviewed-by: Mukesh Ojha <mojha@codeaurora.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-04-25	crypto: shash - remove shash_desc::flags	Eric Biggers
	The flags field in 'struct shash_desc' never actually does anything. The only ostensibly supported flag is CRYPTO_TFM_REQ_MAY_SLEEP. However, no shash algorithm ever sleeps, making this flag a no-op. With this being the case, inevitably some users who can't sleep wrongly pass MAY_SLEEP. These would all need to be fixed if any shash algorithm actually started sleeping. For example, the shash_ahash_() functions, which wrap a shash algorithm with the ahash API, pass through MAY_SLEEP from the ahash API to the shash API. However, the shash functions are called under kmap_atomic(), so actually they're assumed to never sleep. Even if it turns out that some users do need preemption points while hashing large buffers, we could easily provide a helper function crypto_shash_update_large() which divides the data into smaller chunks and calls crypto_shash_update() and cond_resched() for each chunk. It's not necessary to have a flag in 'struct shash_desc', nor is it necessary to make individual shash algorithms aware of this at all. Therefore, remove shash_desc::flags, and document that the crypto_shash_() functions can be called from any context. Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-04-25	crypto: nx - don't abuse shash MAY_SLEEP flag	Eric Biggers
	The nx driver uses the MAY_SLEEP flag in shash_desc::flags as an indicator to not retry sending the operation to the hardware as many times before returning -EBUSY. This is bogus because (1) that's not what the MAY_SLEEP flag is for, and (2) the shash API doesn't allow failing if the hardware is busy anyway. For now, just make it always retry the larger number of times. This doesn't actually fix this driver, but it at least makes it not use the shash_desc::flags field anymore. Then this field can be removed, as no other drivers use it. Cc: linuxppc-dev@lists.ozlabs.org Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-04-25	PM / Domains: Allow OF lookup for multi PM domain case from ->attach_dev()	Ulf Hansson
	A genpd provider that uses the ->attach_dev() callback to look up resources for a device fails to do so when the device has multiple PM domains attached. That is because when genpd invokes the ->attach_dev() callback, it passes the allocated virtual device as the in-parameter. To address this problem, simply assign the dev->of_node for the virtual device, based upon the original device's OF node. Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Niklas Cassel <niklas.cassel@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2019-04-25	PM / Domains: Don't kfree() the virtual device in the error path	Ulf Hansson
	It's not correct to call kfree(dev) when device_register(dev) has failed. Fix this by calling put_device(dev) instead. Fixes: 3c095f32a92b ("PM / Domains: Add support for multi PM domains per device to genpd") Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Niklas Cassel <niklas.cassel@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2019-04-25	gpio: Fix gpiochip_add_data_with_key() error path	Geert Uytterhoeven
	The err_remove_chip block is too coarse, and may perform cleanup that must not be done. E.g. if of_gpiochip_add() fails, of_gpiochip_remove() is still called, causing: OF: ERROR: Bad of_node_put() on /soc/gpio@e6050000 CPU: 1 PID: 20 Comm: kworker/1:1 Not tainted 5.1.0-rc2-koelsch+ #407 Hardware name: Generic R-Car Gen2 (Flattened Device Tree) Workqueue: events deferred_probe_work_func [<c020ec74>] (unwind_backtrace) from [<c020ae58>] (show_stack+0x10/0x14) [<c020ae58>] (show_stack) from [<c07c1224>] (dump_stack+0x7c/0x9c) [<c07c1224>] (dump_stack) from [<c07c5a80>] (kobject_put+0x94/0xbc) [<c07c5a80>] (kobject_put) from [<c0470420>] (gpiochip_add_data_with_key+0x8d8/0xa3c) [<c0470420>] (gpiochip_add_data_with_key) from [<c0473738>] (gpio_rcar_probe+0x1d4/0x314) [<c0473738>] (gpio_rcar_probe) from [<c052fca8>] (platform_drv_probe+0x48/0x94) and later, if a GPIO consumer tries to use a GPIO from a failed controller: WARNING: CPU: 0 PID: 1 at lib/refcount.c:156 kobject_get+0x38/0x4c refcount_t: increment on 0; use-after-free. Modules linked in: CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.1.0-rc2-koelsch+ #407 Hardware name: Generic R-Car Gen2 (Flattened Device Tree) [<c020ec74>] (unwind_backtrace) from [<c020ae58>] (show_stack+0x10/0x14) [<c020ae58>] (show_stack) from [<c07c1224>] (dump_stack+0x7c/0x9c) [<c07c1224>] (dump_stack) from [<c0221580>] (__warn+0xd0/0xec) [<c0221580>] (__warn) from [<c02215e0>] (warn_slowpath_fmt+0x44/0x6c) [<c02215e0>] (warn_slowpath_fmt) from [<c07c58fc>] (kobject_get+0x38/0x4c) [<c07c58fc>] (kobject_get) from [<c068b3ec>] (of_node_get+0x14/0x1c) [<c068b3ec>] (of_node_get) from [<c0686f24>] (of_find_node_by_phandle+0xc0/0xf0) [<c0686f24>] (of_find_node_by_phandle) from [<c0686fbc>] (of_phandle_iterator_next+0x68/0x154) [<c0686fbc>] (of_phandle_iterator_next) from [<c0687fe4>] (__of_parse_phandle_with_args+0x40/0xd0) [<c0687fe4>] (__of_parse_phandle_with_args) from [<c0688204>] (of_parse_phandle_with_args_map+0x100/0x3ac) [<c0688204>] (of_parse_phandle_with_args_map) from [<c0471240>] (of_get_named_gpiod_flags+0x38/0x380) [<c0471240>] (of_get_named_gpiod_flags) from [<c046f864>] (gpiod_get_from_of_node+0x24/0xd8) [<c046f864>] (gpiod_get_from_of_node) from [<c0470aa4>] (devm_fwnode_get_index_gpiod_from_child+0xa0/0x144) [<c0470aa4>] (devm_fwnode_get_index_gpiod_from_child) from [<c05f425c>] (gpio_keys_probe+0x418/0x7bc) [<c05f425c>] (gpio_keys_probe) from [<c052fca8>] (platform_drv_probe+0x48/0x94) Fix this by splitting the cleanup block, and adding a missing call to gpiochip_irqchip_remove(). Fixes: 28355f81969962cf ("gpio: defer probe if pinctrl cannot be found") Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: Mukesh Ojha <mojha@codeaurora.org> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2019-04-25	drm/vmwgfx: Fix dma API layer violation	Thomas Hellstrom
	Remove the check for IOMMU presence since it was considered a layer violation. This means we have no reliable way to destinguish between coherent hardware IOMMU DMA address translations and incoherent SWIOTLB DMA address translations, which we can't handle. So always presume the former. This means that if anybody forces SWIOTLB without also setting the vmw_force_coherent=1 vmwgfx option, driver operation will fail, like it will on most other graphics drivers. Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2019-04-24	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	Linus Torvalds
	Pull networking fixes from David Miller: "Just the usual assortment of small'ish fixes: 1) Conntrack timeout is sometimes not initialized properly, from Alexander Potapenko. 2) Add a reasonable range limit to tcp_min_rtt_wlen to avoid undefined behavior. From ZhangXiaoxu. 3) des1 field of descriptor in stmmac driver is initialized with the wrong variable. From Yue Haibing. 4) Increase mlxsw pci sw reset timeout a little bit more, from Ido Schimmel. 5) Match IOT2000 stmmac devices more accurately, from Su Bao Cheng. 6) Fallback refcount fix in TLS code, from Jakub Kicinski. 7) Fix max MTU check when using XDP in mlx5, from Maxim Mikityanskiy. 8) Fix recursive locking in team driver, from Hangbin Liu. 9) Fix tls_set_device_offload_Rx() deadlock, from Jakub Kicinski. 10) Don't use napi_alloc_frag() outside of softiq context of socionext driver, from Ilias Apalodimas. 11) MAC address increment overflow in ncsi, from Tao Ren. 12) Fix a regression in 8K/1M pool switching of RDS, from Zhu Yanjun. 13) ipv4_link_failure has to validate the headers that are actually there because RAW sockets can pass in arbitrary garbage, from Eric Dumazet" * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (43 commits) ipv4: add sanity checks in ipv4_link_failure() net/rose: fix unbound loop in rose_loopback_timer() rxrpc: fix race condition in rxrpc_input_packet() net: rds: exchange of 8K and 1M pool net: vrf: Fix operation not supported when set vrf mac net/ncsi: handle overflow when incrementing mac address net: socionext: replace napi_alloc_frag with the netdev variant on init net: atheros: fix spelling mistake "underun" -> "underrun" spi: ST ST95HF NFC: declare missing of table spi: Micrel eth switch: declare missing of table net: stmmac: move stmmac_check_ether_addr() to driver probe netfilter: fix nf_l4proto_log_invalid to log invalid packets netfilter: never get/set skb->tstamp netfilter: ebtables: CONFIG_COMPAT: drop a bogus WARN_ON Documentation: decnet: remove reference to CONFIG_DECNET_ROUTE_FWMARK dt-bindings: add an explanation for internal phy-mode net/tls: don't leak IV and record seq when offload fails net/tls: avoid potential deadlock in tls_set_device_offload_rx() selftests/net: correct the return value for run_afpackettests team: fix possible recursive locking when add slaves ...
2019-04-24	net: vrf: Fix operation not supported when set vrf mac	Miaohe Lin
	Vrf device is not able to change mac address now because lack of ndo_set_mac_address. Complete this in case some apps need to do this. Reported-by: Hui Wang <wanghui104@huawei.com> Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-24	net: ieee802154: fix missing checks for regmap_update_bits	Kangjie Lu
	regmap_update_bits could fail and deserves a check. The patch adds the checks and if it fails, returns its error code upstream. Signed-off-by: Kangjie Lu <kjlu@umn.edu> Reviewed-by: Mukesh Ojha <mojha@codeaurora.org> Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org>
2019-04-24	bcache: avoid potential memleak of list of journal_replay(s) in the ↵	Shenghui Wang
	CACHE_SYNC branch of run_cache_set In the CACHE_SYNC branch of run_cache_set(), LIST_HEAD(journal) is used to collect journal_replay(s) and filled by bch_journal_read(). If all goes well, bch_journal_replay() will release the list of jounal_replay(s) at the end of the branch. If something goes wrong, code flow will jump to the label "err:" and leave the list unreleased. This patch will release the list of journal_replay(s) in the case of error detected. v1 -> v2: * Move the release code to the location after label 'err:' to simply the change. Signed-off-by: Shenghui Wang <shhuiw@foxmail.com> Signed-off-by: Coly Li <colyli@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-04-24	bcache: fix wrong usage use-after-freed on keylist in out_nocoalesce branch ↵	Shenghui Wang
	of btree_gc_coalesce Elements of keylist should be accessed before the list is freed. Move bch_keylist_free() calling after the while loop to avoid wrong content accessed. Signed-off-by: Shenghui Wang <shhuiw@foxmail.com> Signed-off-by: Coly Li <colyli@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-04-24	bcache: fix failure in journal relplay	Tang Junhui
	journal replay failed with messages: Sep 10 19:10:43 ceph kernel: bcache: error on bb379a64-e44e-4812-b91d-a5599871a3b1: bcache: journal entries 2057493-2057567 missing! (replaying 2057493-2076601), disabling caching The reason is in journal_reclaim(), when discard is enabled, we send discard command and reclaim those journal buckets whose seq is old than the last_seq_now, but before we write a journal with last_seq_now, the machine is restarted, so the journal with the last_seq_now is not written to the journal bucket, and the last_seq_wrote in the newest journal is old than last_seq_now which we expect to be, so when we doing replay, journals from last_seq_wrote to last_seq_now are missing. It's hard to write a journal immediately after journal_reclaim(), and it harmless if those missed journal are caused by discarding since those journals are already wrote to btree node. So, if miss seqs are started from the beginning journal, we treat it as normal, and only print a message to show the miss journal, and point out it maybe caused by discarding. Patch v2 add a judgement condition to ignore the missed journal only when discard enabled as Coly suggested. (Coly Li: rebase the patch with other changes in bch_journal_replay()) Signed-off-by: Tang Junhui <tang.junhui.linux@gmail.com> Tested-by: Dennis Schridde <devurandom@gmx.net> Signed-off-by: Coly Li <colyli@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-04-24	bcache: improve bcache_reboot()	Coly Li
	This patch tries to release mutex bch_register_lock early, to give chance to stop cache set and bcache device early. This patch also expends time out of stopping all bcache device from 2 seconds to 10 seconds, because stopping writeback rate update worker may delay for 5 seconds, 2 seconds is not enough. After this patch applied, stopping bcache devices during system reboot or shutdown is very hard to be observed any more. Signed-off-by: Coly Li <colyli@suse.de> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-04-24	bcache: add comments for closure_fn to be called in closure_queue()	Coly Li
	Add code comments to explain which call back function might be called for the closure_queue(). This is an effort to make code to be more understandable for readers. Signed-off-by: Coly Li <colyli@suse.de> Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-04-24	bcache: Add comments for blkdev_put() in registration code path	Coly Li
	Add comments to explain why in register_bcache() blkdev_put() won't be called in two location. Add comments to explain why blkdev_put() must be called in register_cache() when cache_alloc() failed. Signed-off-by: Coly Li <colyli@suse.de> Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-04-24	bcache: add error check for calling register_bdev()	Coly Li
	This patch adds return value to register_bdev(). Then if failure happens inside register_bdev(), its caller register_bcache() may detect and handle the failure more properly. Signed-off-by: Coly Li <colyli@suse.de> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-04-24	bcache: return error immediately in bch_journal_replay()	Coly Li
	When failure happens inside bch_journal_replay(), calling cache_set_err_on() and handling the failure in async way is not a good idea. Because after bch_journal_replay() returns, registering code will continue to execute following steps, and unregistering code triggered by cache_set_err_on() is running in same time. First it is unnecessary to handle failure and unregister cache set in an async way, second there might be potential race condition to run register and unregister code for same cache set. So in this patch, if failure happens in bch_journal_replay(), we don't call cache_set_err_on(), and just print out the same error message to kernel message buffer, then return -EIO immediately caller. Then caller can detect such failure and handle it in synchrnozied way. Signed-off-by: Coly Li <colyli@suse.de> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-04-24	bcache: add comments for kobj release callback routine	Coly Li
	Bcache has several routines to release resources in implicit way, they are called when the associated kobj released. This patch adds code comments to notice when and which release callback will be called, - When dc->disk.kobj released: void bch_cached_dev_release(struct kobject kobj) - When d->kobj released: void bch_flash_dev_release(struct kobject kobj) - When c->kobj released: void bch_cache_set_release(struct kobject kobj) - When ca->kobj released void bch_cache_release(struct kobject kobj) Signed-off-by: Coly Li <colyli@suse.de> Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-04-24	bcache: add failure check to run_cache_set() for journal replay	Coly Li
	Currently run_cache_set() has no return value, if there is failure in bch_journal_replay(), the caller of run_cache_set() has no idea about such failure and just continue to execute following code after run_cache_set(). The internal failure is triggered inside bch_journal_replay() and being handled in async way. This behavior is inefficient, while failure handling inside bch_journal_replay(), cache register code is still running to start the cache set. Registering and unregistering code running as same time may introduce some rare race condition, and make the code to be more hard to be understood. This patch adds return value to run_cache_set(), and returns -EIO if bch_journal_rreplay() fails. Then caller of run_cache_set() may detect such failure and stop registering code flow immedidately inside register_cache_set(). If journal replay fails, run_cache_set() can report error immediately to register_cache_set(). This patch makes the failure handling for bch_journal_replay() be in synchronized way, easier to understand and debug, and avoid poetential race condition for register-and-unregister in same time. Signed-off-by: Coly Li <colyli@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-04-24	bcache: never set KEY_PTRS of journal key to 0 in journal_reclaim()	Coly Li
	In journal_reclaim() ja->cur_idx of each cache will be update to reclaim available journal buckets. Variable 'int n' is used to count how many cache is successfully reclaimed, then n is set to c->journal.key by SET_KEY_PTRS(). Later in journal_write_unlocked(), a for_each_cache() loop will write the jset data onto each cache. The problem is, if all jouranl buckets on each cache is full, the following code in journal_reclaim(), 529 for_each_cache(ca, c, iter) { 530 struct journal_device ja = &ca->journal; 531 unsigned int next = (ja->cur_idx + 1) % ca->sb.njournal_buckets; 532 533 / No space available on this device */ 534 if (next == ja->discard_idx) 535 continue; 536 537 ja->cur_idx = next; 538 k->ptr[n++] = MAKE_PTR(0, 539 bucket_to_sector(c, ca->sb.d[ja->cur_idx]), 540 ca->sb.nr_this_dev); 541 } 542 543 bkey_init(k); 544 SET_KEY_PTRS(k, n); If there is no available bucket to reclaim, the if() condition at line 534 will always true, and n remains 0. Then at line 544, SET_KEY_PTRS() will set KEY_PTRS field of c->journal.key to 0. Setting KEY_PTRS field of c->journal.key to 0 is wrong. Because in journal_write_unlocked() the journal data is written in following loop, 649 for (i = 0; i < KEY_PTRS(k); i++) { 650-671 submit journal data to cache device 672 } If KEY_PTRS field is set to 0 in jouranl_reclaim(), the journal data won't be written to cache device here. If system crahed or rebooted before bkeys of the lost journal entries written into btree nodes, data corruption will be reported during bcache reload after rebooting the system. Indeed there is only one cache in a cache set, there is no need to set KEY_PTRS field in journal_reclaim() at all. But in order to keep the for_each_cache() logic consistent for now, this patch fixes the above problem by not setting 0 KEY_PTRS of journal key, if there is no bucket available to reclaim. Signed-off-by: Coly Li <colyli@suse.de> Reviewed-by: Hannes Reinecke <hare@suse.com> Cc: stable@vger.kernel.org Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-04-24	bcache: move definition of 'int ret' out of macro read_bucket()	Coly Li
	'int ret' is defined as a local variable inside macro read_bucket(). Since this macro is called multiple times, and following patches will use a 'int ret' variable in bch_journal_read(), this patch moves definition of 'int ret' from macro read_bucket() to range of function bch_journal_read(). Signed-off-by: Coly Li <colyli@suse.de> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-04-24	bcache: fix a race between cache register and cacheset unregister	Liang Chen
	There is a race between cache device register and cache set unregister. For an already registered cache device, register_bcache will call bch_is_open to iterate through all cachesets and check every cache there. The race occurs if cache_set_free executes at the same time and clears the caches right before ca is dereferenced in bch_is_open_cache. To close the race, let's make sure the clean up work is protected by the bch_register_lock as well. This issue can be reproduced as follows, while true; do echo /dev/XXX> /sys/fs/bcache/register ; done& while true; do echo 1> /sys/block/XXX/bcache/set/unregister ; done & and results in the following oops, [ +0.000053] BUG: unable to handle kernel NULL pointer dereference at 0000000000000998 [ +0.000457] #PF error: [normal kernel read fault] [ +0.000464] PGD 800000003ca9d067 P4D 800000003ca9d067 PUD 3ca9c067 PMD 0 [ +0.000388] Oops: 0000 [#1] SMP PTI [ +0.000269] CPU: 1 PID: 3266 Comm: bash Not tainted 5.0.0+ #6 [ +0.000346] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-2.fc28 04/01/2014 [ +0.000472] RIP: 0010:register_bcache+0x1829/0x1990 [bcache] [ +0.000344] Code: b0 48 83 e8 50 48 81 fa e0 e1 10 c0 0f 84 a9 00 00 00 48 89 c6 48 89 ca 0f b7 ba 54 04 00 00 4c 8b 82 60 0c 00 00 85 ff 74 2f <49> 3b a8 98 09 00 00 74 4e 44 8d 47 ff 31 ff 49 c1 e0 03 eb 0d [ +0.000839] RSP: 0018:ffff92ee804cbd88 EFLAGS: 00010202 [ +0.000328] RAX: ffffffffc010e190 RBX: ffff918b5c6b5000 RCX: ffff918b7d8e0000 [ +0.000399] RDX: ffff918b7d8e0000 RSI: ffffffffc010e190 RDI: 0000000000000001 [ +0.000398] RBP: ffff918b7d318340 R08: 0000000000000000 R09: ffffffffb9bd2d7a [ +0.000385] R10: ffff918b7eb253c0 R11: ffffb95980f51200 R12: ffffffffc010e1a0 [ +0.000411] R13: fffffffffffffff2 R14: 000000000000000b R15: ffff918b7e232620 [ +0.000384] FS: 00007f955bec2740(0000) GS:ffff918b7eb00000(0000) knlGS:0000000000000000 [ +0.000420] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ +0.000801] CR2: 0000000000000998 CR3: 000000003cad6000 CR4: 00000000001406e0 [ +0.000837] Call Trace: [ +0.000682] ? _cond_resched+0x10/0x20 [ +0.000691] ? __kmalloc+0x131/0x1b0 [ +0.000710] kernfs_fop_write+0xfa/0x170 [ +0.000733] __vfs_write+0x2e/0x190 [ +0.000688] ? inode_security+0x10/0x30 [ +0.000698] ? selinux_file_permission+0xd2/0x120 [ +0.000752] ? security_file_permission+0x2b/0x100 [ +0.000753] vfs_write+0xa8/0x1a0 [ +0.000676] ksys_write+0x4d/0xb0 [ +0.000699] do_syscall_64+0x3a/0xf0 [ +0.000692] entry_SYSCALL_64_after_hwframe+0x44/0xa9 Signed-off-by: Liang Chen <liangchen.linux@gmail.com> Cc: stable@vger.kernel.org Signed-off-by: Coly Li <colyli@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-04-24	bcache: Clean up bch_get_congested()	George Spelvin
	There are a few nits in this function. They could in theory all be separate patches, but that's probably taking small commits too far. 1) I added a brief comment saying what it does. 2) I like to declare pointer parameters "const" where possible for documentation reasons. 3) It uses bitmap_weight(&rand, BITS_PER_LONG) to compute the Hamming weight of a 32-bit random number (giving a random integer with mean 16 and variance 8). Passing by reference in a 64-bit variable is silly; just use hweight32(). 4) Its helper function fract_exp_two is unnecessarily tangled. Gcc can optimize the multiply by (1 << x) to a shift, but it can be written in a much more straightforward way at the cost of one more bit of internal precision. Some analysis reveals that this bit is always available. This shrinks the object code for fract_exp_two(x, 6) from 23 bytes: 0000000000000000 <foo1>: 0: 89 f9 mov %edi,%ecx 2: c1 e9 06 shr $0x6,%ecx 5: b8 01 00 00 00 mov $0x1,%eax a: d3 e0 shl %cl,%eax c: 83 e7 3f and $0x3f,%edi f: d3 e7 shl %cl,%edi 11: c1 ef 06 shr $0x6,%edi 14: 01 f8 add %edi,%eax 16: c3 retq To 19: 0000000000000017 <foo2>: 17: 89 f8 mov %edi,%eax 19: 83 e0 3f and $0x3f,%eax 1c: 83 c0 40 add $0x40,%eax 1f: 89 f9 mov %edi,%ecx 21: c1 e9 06 shr $0x6,%ecx 24: d3 e0 shl %cl,%eax 26: c1 e8 06 shr $0x6,%eax 29: c3 retq (Verified with 0 <= frac_bits <= 8, 0 <= x < 16<<frac_bits; both versions produce the same output.) 5) And finally, the call to bch_get_congested() in check_should_bypass() is separated from the use of the value by multiple tests which could moot the need to compute it. Move the computation down to where it's needed. This also saves a local register to hold the computed value. Signed-off-by: George Spelvin <lkml@sdf.org> Signed-off-by: Coly Li <colyli@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-04-24	bcache: use kmemdup_nul for CACHED_LABEL buffer	Geliang Tang
	This patch uses kmemdup_nul to create a NUL-terminated string from dc->sb.label. This is better than open coding it. With this, we can move env[2] initialization into env[] array to make code more elegant. Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: Coly Li <colyli@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-04-24	bcache: avoid clang -Wunintialized warning	Arnd Bergmann
	clang has identified a code path in which it thinks a variable may be unused: drivers/md/bcache/alloc.c:333:4: error: variable 'bucket' is used uninitialized whenever 'if' condition is false [-Werror,-Wsometimes-uninitialized] fifo_pop(&ca->free_inc, bucket); ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ drivers/md/bcache/util.h:219:27: note: expanded from macro 'fifo_pop' #define fifo_pop(fifo, i) fifo_pop_front(fifo, (i)) ^~~~~~~~~~~~~~~~~~~~~~~~~ drivers/md/bcache/util.h:189:6: note: expanded from macro 'fifo_pop_front' if (_r) { \ ^~ drivers/md/bcache/alloc.c:343:46: note: uninitialized use occurs here allocator_wait(ca, bch_allocator_push(ca, bucket)); ^~~~~~ drivers/md/bcache/alloc.c:287:7: note: expanded from macro 'allocator_wait' if (cond) \ ^~~~ drivers/md/bcache/alloc.c:333:4: note: remove the 'if' if its condition is always true fifo_pop(&ca->free_inc, bucket); ^ drivers/md/bcache/util.h:219:27: note: expanded from macro 'fifo_pop' #define fifo_pop(fifo, i) fifo_pop_front(fifo, (i)) ^ drivers/md/bcache/util.h:189:2: note: expanded from macro 'fifo_pop_front' if (_r) { \ ^ drivers/md/bcache/alloc.c:331:15: note: initialize the variable 'bucket' to silence this warning long bucket; ^ This cannot happen in practice because we only enter the loop if there is at least one element in the list. Slightly rearranging the code makes this clearer to both the reader and the compiler, which avoids the warning. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Nathan Chancellor <natechancellor@gmail.com> Signed-off-by: Coly Li <colyli@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-04-24	bcache: fix inaccurate result of unused buckets	Guoju Fang
	To get the amount of unused buckets in sysfs_priority_stats, the code count the buckets which GC_SECTORS_USED is zero. It's correct and should not be overwritten by the count of buckets which prio is zero. Signed-off-by: Guoju Fang <fangguoju@gmail.com> Signed-off-by: Coly Li <colyli@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-04-24	bcache: fix crashes stopping bcache device before read miss done	Guoju Fang
	The bio from upper layer is considered completed when bio_complete() returns. In most scenarios bio_complete() is called in search_free(), but when read miss happens, the bio_compete() is called when backing device reading completed, while the struct search is still in use until cache inserting finished. If someone stops the bcache device just then, the device may be closed and released, but after cache inserting finished the struct search will access a freed struct cached_dev. This patch add the reference of bcache device before bio_complete() when read miss happens, and put it after the search is not used. Signed-off-by: Guoju Fang <fangguoju@gmail.com> Signed-off-by: Coly Li <colyli@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-04-24	Input: synaptics-rmi4 - fix possible double free	Pan Bian
	The RMI4 function structure has been released in rmi_register_function if error occurs. However, it will be released again in the function rmi_create_function, which may result in a double-free bug. Signed-off-by: Pan Bian <bianpan2016@163.com> Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2019-04-24	RDMA/ucontext: Fix regression with disassociate	Jason Gunthorpe
	When this code was consolidated the intention was that the VMA would become backed by anonymous zero pages after the zap_vma_pte - however this very subtly relied on setting the vm_ops = NULL and clearing the VM_SHARED bits to transform the VMA into an anonymous VMA. Since the vm_ops was removed this broke. Now userspace gets a SIGBUS if it touches the vma after disassociation. Instead of converting the VMA to anonymous provide a fault handler that puts a zero'd page into the VMA when user-space touches it after disassociation. Cc: stable@vger.kernel.org Suggested-by: Andrea Arcangeli <aarcange@redhat.com> Fixes: 5f9794dc94f5 ("RDMA/ucontext: Add a core API for mmaping driver IO memory") Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>