summaryrefslogtreecommitdiff
path: root/drivers
AgeCommit message (Collapse)Author
2018-12-13acpi/nfit, libnvdimm: Store dimm id as a member to struct nvdimmDave Jiang
The generated dimm id is needed for the sysfs attribute as well as being used as the identifier/description for the security key. Since it's constant and should never change, store it as a member of struct nvdimm. As nvdimm_create() continues to grow parameters relative to NFIT driver requirements, do not require other implementations to keep pace. Introduce __nvdimm_create() to carry the new parameters and keep nvdimm_create() with the long standing default api. Signed-off-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2018-12-13ACPI / tables: add DSDT AmlCode new declaration name supportWang Dongsheng
A new naming rule was added in ACPICA version 20180427 changing the DSDT AML code name from "AmlCode" to "dsdt_aml_code". That change was made by commit 83b2fa943ba8 "ACPICA: iASL: Enhance the -tc option (create AML hex file in C)". Tested: ACPICA release version 20180427+. ARM64: QCOM QDF2400 GCC: 4.8.5 20150623 Signed-off-by: Wang Dongsheng <dongsheng.wang@hxt-semitech.com> [ rjw: Changelog ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-12-13ACPICA: change coding style to match ACPICA, no functional changeErik Schmauss
This commit alters the coding style of the following commit to match ACPICA to keep divergences between Linux and ACPICA at a minimum. This is not intended to result in functional changes. ae6b3e54aa52cd29965b8e4e47000ed2c5d78eb8 Author: Hans de Goede <hdegoede@redhat.com> Date: Sun Nov 18 20:25:35 2018 +0100 ACPICA: Fix handling of buffer-size in acpi_ex_write_data_to_field() Signed-off-by: Erik Schmauss <erik.schmauss@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-12-13ACPICA: Debug output: Add option to display method/object evaluationBob Moore
Adds entry/exit messages for all objects that are evaluated. Works for the kernel-level code as well as acpiexec. The "-eo" flag enables acpiexec to display these messages. The messages are very useful when debugging the flow of table initialization. Signed-off-by: Bob Moore <robert.moore@intel.com> Signed-off-by: Erik Schmauss <erik.schmauss@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-12-13ACPICA: disassembler: disassemble OEMx tables as AMLErik Schmauss
Signed-off-by: Erik Schmauss <erik.schmauss@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-12-13ACPICA: Add "Windows 2018.2" string in the _OSI supportJung-uk Kim
Signed-off-by: Jung-uk Kim <jkim@free_BSD.org> Signed-off-by: Erik Schmauss <erik.schmauss@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-12-13ACPICA: Expressions in package elements are not supportedBob Moore
Return AE_SUPPORT if encountered, fixes a previous fault if encountered. Note: Other ACPI implementations do not support this type of construct. Signed-off-by: Bob Moore <robert.moore@intel.com> Signed-off-by: Erik Schmauss <erik.schmauss@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-12-13ACPICA: Update buffer-to-string conversionsBob Moore
Add "0x" prefix for hex values. Provides compatibility with other ACPI implementations. Signed-off-by: Bob Moore <robert.moore@intel.com> Signed-off-by: Erik Schmauss <erik.schmauss@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-12-13ACPICA: add comments, no functional changeBob Moore
Signed-off-by: Bob Moore <robert.moore@intel.com> Signed-off-by: Erik Schmauss <erik.schmauss@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-12-13ACPICA: Remove defines that use deprecated flagErik Schmauss
This commit removes the use of ACPI_NO_METHOD_EXECUTE flag Signed-off-by: Erik Schmauss <erik.schmauss@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-12-13ACPICA: Add "Windows 2018" string in the _OSI supportBob Moore
Latest windows string. Signed-off-by: Bob Moore <robert.moore@intel.com> Signed-off-by: Erik Schmauss <erik.schmauss@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-12-13ubi: Do not drop UBI device reference before usingPan Bian
The UBI device reference is dropped but then the device is used as a parameter of ubi_err. The bug is introduced in changing ubi_err's behavior. The old ubi_err does not require a UBI device as its first parameter, but the new one does. Fixes: 32608703310 ("UBI: Extend UBI layer debug/messaging capabilities") Signed-off-by: Pan Bian <bianpan2016@163.com> Reviewed-by: Boris Brezillon <boris.brezillon@bootlin.com> Signed-off-by: Richard Weinberger <richard@nod.at>
2018-12-13ubi: Put MTD device after it is not usedPan Bian
The MTD device reference is dropped via put_mtd_device, however its field ->index is read and passed to ubi_msg. To fix this, the patch moves the reference dropping after calling ubi_msg. Signed-off-by: Pan Bian <bianpan2016@163.com> Reviewed-by: Boris Brezillon <boris.brezillon@bootlin.com> Signed-off-by: Richard Weinberger <richard@nod.at>
2018-12-13nvme: fix kernel paging oopsSagi Grimberg
free the controller discard_page correctly. Fixes: cb5b7262b011 ("nvme: provide fallback for discard alloc failure") Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-12-13dma-mapping: bypass indirect calls for dma-directChristoph Hellwig
Avoid expensive indirect calls in the fast path DMA mapping operations by directly calling the dma_direct_* ops if we are using the directly mapped DMA operations. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Tested-by: Jesper Dangaard Brouer <brouer@redhat.com> Tested-by: Tony Luck <tony.luck@intel.com>
2018-12-13vmd: use the proper dma_* APIs instead of direct methods callsChristoph Hellwig
With the bypass support for the direct mapping we might not always have methods to call, so use the proper APIs instead. The only downside is that we will create two dma-debug entries for each mapping if CONFIG_DMA_DEBUG is enabled. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Tested-by: Jesper Dangaard Brouer <brouer@redhat.com> Tested-by: Tony Luck <tony.luck@intel.com>
2018-12-13swiotlb: remove dma_mark_cleanChristoph Hellwig
Instead of providing a special dma_mark_clean hook just for ia64, switch ia64 to use the normal arch_sync_dma_for_cpu hooks instead. This means that we now also set the PG_arch_1 bit for pages in the swiotlb buffer, which isn't stricly needed as we will never execute code out of the swiotlb buffer, but otherwise harmless. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Tested-by: Jesper Dangaard Brouer <brouer@redhat.com> Tested-by: Tony Luck <tony.luck@intel.com>
2018-12-13swiotlb: remove SWIOTLB_MAP_ERRORChristoph Hellwig
We can use DMA_MAPPING_ERROR instead, which already maps to the same value. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Tested-by: Jesper Dangaard Brouer <brouer@redhat.com> Tested-by: Tony Luck <tony.luck@intel.com>
2018-12-13ACPI / scan: Refactor _CCA enforcementRobin Murphy
Rather than checking the DMA attribute at each callsite, just pass it through for acpi_dma_configure() to handle directly. That can then deal with the relatively exceptional DEV_DMA_NOT_SUPPORTED case by explicitly installing dummy DMA ops instead of just skipping setup entirely. This will then free up the dev->dma_ops == NULL case for some valuable fastpath optimisations. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Tested-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Tested-by: Tony Luck <tony.luck@intel.com>
2018-12-13dma-mapping: move dma_get_required_mask to kernel/dmaChristoph Hellwig
dma_get_required_mask should really be with the rest of the DMA mapping implementation instead of in drivers/base as a lone outlier. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Tested-by: Jesper Dangaard Brouer <brouer@redhat.com> Tested-by: Tony Luck <tony.luck@intel.com>
2018-12-13Merge branch 'fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/evalenti/linux-soc-thermal Pull thermal fixes from Eduardo Valentin: "Fixes for STM and HISI thermal drivers" * 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/evalenti/linux-soc-thermal: thermal: stm32: Fix stm_thermal_read_factory_settings thermal: stm32: read factory settings inside stm_thermal_prepare thermal/drivers/hisi: Fix number of sensors on hi3660 thermal/drivers/hisi: Fix wrong platform_get_irq_byname()
2018-12-13Merge branch 'nvme-4.21' of git://git.infradead.org/nvme into for-4.21/blockJens Axboe
Pull NVMe updates from Christoph: "Here is the second large chunk of nvme updates for 4.21: - host and target support for NVMe over TCP (Sagi Grimberg, Roy Shterman, Solganik Alexander) - error log page support in target (Chaitanya Kulkarni) plus small fixes and improvements from Jens Axboe and Chengguang Xu." * 'nvme-4.21' of git://git.infradead.org/nvme: (33 commits) nvme-rdma: support separate queue maps for read and write nvme-tcp: support separate queue maps for read and write nvme-fabrics: allow user to set nr_write_queues for separate queue maps nvme-fabrics: add missing nvmf_ctrl_options documentation blk-mq-rdma: pass in queue map to blk_mq_rdma_map_queues nvmet: update smart log with num err log entries nvmet: add error log page cmd handler nvmet: add error log support for file backend nvmet: add error log support for bdev backend nvmet: add error log support for admin-cmd nvmet: add error log support for rdma backend nvmet: add error log support for fabrics-cmd nvmet: add error log support in the core nvmet: add interface to update error-log page nvmet: add error-log definitions nvme: add error log page slot definition nvme: remove nvme_common command cdw10 array nvmet: remove unused variable nvme: provide fallback for discard alloc failure nvme: add __exit annotation ...
2018-12-13Drivers: hv: vmbus: Return -EINVAL for the sys files for unopened channelsDexuan Cui
Before 98f4c651762c, we returned zeros for unopened channels. With 98f4c651762c, we started to return random on-stack values. We'd better return -EINVAL instead. Fixes: 98f4c651762c ("hv: move ringbuffer bus attributes to dev_groups") Cc: stable@vger.kernel.org Cc: K. Y. Srinivasan <kys@microsoft.com> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: Dexuan Cui <decui@microsoft.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2018-12-13x86, hyperv: remove PCI dependencySinan Kaya
Need to be able to boot without PCI devices present. Signed-off-by: Sinan Kaya <okaya@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2018-12-14Merge branch 'vmwgfx-fixes-4.20' of ↵Dave Airlie
git://people.freedesktop.org/~thomash/linux into drm-fixes One regression fix for avoiding kernel OOM, one cleanup return fix. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Thomas Hellstrom <thellstrom@vmware.com> Link: https://patchwork.freedesktop.org/patch/msgid/20181213122815.10581-1-thellstrom@vmware.com
2018-12-13USB: serial: option: add Telit LN940 seriesJörgen Storvist
Added USB serial option driver support for Telit LN940 series cellular modules. Covering both QMI and MBIM modes. usb-devices output (0x1900): T: Bus=01 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#= 21 Spd=480 MxCh= 0 D: Ver= 2.00 Cls=ef(misc ) Sub=02 Prot=01 MxPS=64 #Cfgs= 1 P: Vendor=1bc7 ProdID=1900 Rev=03.10 S: Manufacturer=Telit S: Product=Telit LN940 Mobile Broadband S: SerialNumber=0123456789ABCDEF C: #Ifs= 5 Cfg#= 1 Atr=a0 MxPwr=500mA I: If#= 0 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=ff Driver=option I: If#= 1 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=qmi_wwan I: If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option I: If#= 3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option I: If#= 4 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option usb-devices output (0x1901): T: Bus=01 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#= 20 Spd=480 MxCh= 0 D: Ver= 2.00 Cls=ef(misc ) Sub=02 Prot=01 MxPS=64 #Cfgs= 1 P: Vendor=1bc7 ProdID=1901 Rev=03.10 S: Manufacturer=Telit S: Product=Telit LN940 Mobile Broadband S: SerialNumber=0123456789ABCDEF C: #Ifs= 6 Cfg#= 1 Atr=a0 MxPwr=500mA I: If#= 0 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=ff Driver=option I: If#= 1 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option I: If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option I: If#= 3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option I: If#= 4 Alt= 0 #EPs= 1 Cls=02(commc) Sub=0e Prot=00 Driver=cdc_mbim I: If#= 5 Alt= 1 #EPs= 2 Cls=0a(data ) Sub=00 Prot=02 Driver=cdc_mbim Signed-off-by: Jörgen Storvist <jorgen.storvist@gmail.com> Cc: stable <stable@vger.kernel.org> Signed-off-by: Johan Hovold <johan@kernel.org>
2018-12-13Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdmaLinus Torvalds
Pull rdma fixes from Doug Ledford: "We have 5 small fixes for this pull request. One is a performance regression, so not necessarily strictly a fix, but it was small and reasonable and claimed to avoid thrashing in the scheduler, so I took it. The remaining are all legitimate fixes that match the "we take fixes any time" criteria. Summary: - One performance regression for hfi1 - One kasan fix for hfi1 - A couple mlx5 fixes - A core oops fix" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: IB/core: Fix oops in netdev_next_upper_dev_rcu() IB/mlx5: Block DEVX umem from the non applicable cases IB/mlx5: Fix implicit ODP interrupted page fault IB/hfi1: Fix an out-of-bounds access in get_hw_stats IB/hfi1: Fix a latency issue for small messages
2018-12-13Merge tag 'mmc-v4.20-rc5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc Pull mmc fixes from Ulf Hansson: "MMC core: - Fixup RPMB requests to use mrq->sbc when sending CMD23 MMC host: - omap: Fix broken MMC/SD on OMAP15XX/OMAP5910/OMAP310 - sdhci-omap: Fix DCRC error handling during tuning - sdhci: Fixup the timeout check window for clock and reset" * tag 'mmc-v4.20-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc: mmc: sdhci: fix the timeout check window for clock and reset mmc: sdhci-omap: Fix DCRC error handling during tuning MMC: OMAP: fix broken MMC on OMAP15XX/OMAP5910/OMAP310 mmc: core: use mrq->sbc when sending CMD23 for RPMB
2018-12-14Merge tag 'vmwgfx-next-2018-12-13' of ↵Dave Airlie
git://people.freedesktop.org/~thomash/linux into drm-next Pull request of 2018-12-13 Two minor fixes for next pull. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Thomas Hellstrom <thellstrom@vmware.com> Link: https://patchwork.freedesktop.org/patch/msgid/20181213130848.3080-1-thellstrom@vmware.com
2018-12-13irqchip/gic-v3: Add quirk for msm8996 broken registersSrinivas Kandagatla
Access to GICR_WAKER is restricted on msm8996 SoC in Hypervisor. Its been more than 2+ years of wait for this to be fixed, which has no hopes to be fixed. This change was introduced for the "lead device" on msm8996 platform. It looks like all publicly available msm8996 and other Qualcomm SoCs have this implementation. So add a quirk to not access this register on msm8996. With this quirk MSM8996 can at least boot out of mainline, which can help community to work with boards based on MSM8996 and other SoCs with have this restrictions. This Quirk is based on device tree compatible string. Without this patch Qualcomm DB820c board reboots when GICR_WAKER is accessed. Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-12-13irqchip/gic: Add support to device tree based quirksSrinivas Kandagatla
This patch adds support to device tree based quirks based on device tree compatible string. Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-12-13Merge branch 'yaml-bindings-for-v4.21' into dt/nextRob Herring
2018-12-13regmap: irq: handle HW using separate rising/falling edge interruptsBartosz Golaszewski
Some interrupt controllers use separate bits for controlling rising and falling edge interrupts in the mask register i.e. they have one interrupt for rising edge and one for falling. We already handle the case where we have a single interrupt in the mask register and a separate type configuration register. Add a new switch to regmap_irq_chip which tells the framework to use the mask_base address for configuring the edge of the interrupts that define type_falling/rising_mask values. For such interrupts we never update the type_base bits. For interrupts that don't define type masks or their regmap irq chip doesn't set the type_in_mask to true everything stays the same. Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com> Signed-off-by: Mark Brown <broonie@kernel.org>
2018-12-13mfd: axp20x: use explicit bit definesOlliver Schinagl
The AXP20X_OFF define is an actual specific bit, define it as such. Signed-off-by: Olliver Schinagl <oliver@schinagl.nl> Signed-off-by: Priit Laes <plaes@plaes.org> Acked-by: Lee Jones <lee.jones@linaro.org> Signed-off-by: Mark Brown <broonie@kernel.org>
2018-12-13mfd: axp20x: Clean up included headersOlliver Schinagl
Add the bitops.h header as we need it, alphabetize header order. Signed-off-by: Olliver Schinagl <oliver@schinagl.nl> Signed-off-by: Priit Laes <plaes@plaes.org> Acked-by: Lee Jones <lee.jones@linaro.org> Signed-off-by: Mark Brown <broonie@kernel.org>
2018-12-13regulator: axp20x: add software based soft_start for AXP209 LDO3Olliver Schinagl
In the past, there have been words on various lists that if LDO3 is disabled in u-boot, but enabled in the DTS, the axp209 driver would fail to continue/hang. Several enable/disable patches have been issues to devicetree's in both the kernel and u-boot to address this issue. What really happened however, was that the AXP209 shuts down without a notice and without setting an interrupt. This is caused when LDO3 gets overloaded, for example with large capacitors on the LDO3 output. Normally, we would expect that AXP209 would source 200 mA as per datasheet and set and trigger an interrupt when being overloaded. For some reason however, this does not happen. As a work-around, we use the soft-start constraint of the regulator node to first bring up the LDO3 to the lowest possible voltage and then enable the LDO. After that, we can set the requested voltage as usual. Combining this setting with the regulator-ramp-delay allows LDO3 to enable voltage slowly and staggered, potentially reducing overall inrush current. Signed-off-by: Olliver Schinagl <oliver@schinagl.nl> Signed-off-by: Priit Laes <plaes@plaes.org> Signed-off-by: Mark Brown <broonie@kernel.org>
2018-12-13regulator: axp20x: add support for set_ramp_delay for AXP209Olliver Schinagl
The AXP209 supports ramping up voltages on several regulators such as DCDC2 and LDO3. This patch adds preliminary support for the regulator-ramp-delay property for these 2 regulators. Note that the voltage ramp only works when regulator is already enabled. E.g. when going from say 0.7 V to 3.6 V. When turning on the regulator, no voltage ramp is performed in hardware. What this means, is that if the bootloader brings up the voltage at 0.7 V, the ramp delay property is properly applied. If however, the bootloader leaves the power off, no ramp delay is applied when the power is enabled by the regulator framework. Signed-off-by: Olliver Schinagl <oliver@schinagl.nl> Signed-off-by: Priit Laes <plaes@plaes.org> Signed-off-by: Mark Brown <broonie@kernel.org>
2018-12-13Merge branch 'topic/axp20x' of ↵Mark Brown
https://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator into regulator-4.21
2018-12-13bcache: print number of keys in trace_bcache_journal_writeGuoju Fang
Sometimes flush journal may be very frequent, so it's useful to dump number of keys every time write journal. Signed-off-by: Guoju Fang <fangguoju@gmail.com> Signed-off-by: Coly Li <colyli@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-12-13bcache: set writeback_percent in a flexible rangeColy Li
Because CUTOFF_WRITEBACK is defined as 40, so before the changes of dynamic cutoff writeback values, writeback_percent is limited to [0, CUTOFF_WRITEBACK]. Any value larger than CUTOFF_WRITEBACK will be fixed up to 40. Now cutof writeback limit is a dynamic value bch_cutoff_writeback, so the range of writeback_percent can be a more flexible range as [0, bch_cutoff_writeback]. The flexibility is, it can be expended to a larger or smaller range than [0, 40], depends on how value bch_cutoff_writeback is specified. The default value is still strongly recommended to most of users for most of workloads. But for people who want to do research on bcache writeback perforamnce tuning, they may have chance to specify more flexible writeback_percent in range [0, 70]. Signed-off-by: Coly Li <colyli@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-12-13bcache: make cutoff_writeback and cutoff_writeback_sync tunableColy Li
Currently the cutoff writeback and cutoff writeback sync thresholds are defined by CUTOFF_WRITEBACK (40) and CUTOFF_WRITEBACK_SYNC (70) as static values. Most of time these they work fine, but when people want to do research on bcache writeback mode performance tuning, there is no chance to modify the soft and hard cutoff writeback values. This patch introduces two module parameters bch_cutoff_writeback_sync and bch_cutoff_writeback which permit people to tune the values when loading bcache.ko. If they are not specified by module loading, current values CUTOFF_WRITEBACK_SYNC and CUTOFF_WRITEBACK will be used as default and nothing changes. When people want to tune this two values, - cutoff_writeback can be set in range [1, 70] - cutoff_writeback_sync can be set in range [1, 90] - cutoff_writeback always <= cutoff_writeback_sync The default values are strongly recommended to most of users for most of workloads. Anyway, if people wants to take their own risk to do research on new writeback cutoff tuning for their own workload, now they can make it. Signed-off-by: Coly Li <colyli@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-12-13bcache: add MODULE_DESCRIPTION informationColy Li
This patch moves MODULE_AUTHOR and MODULE_LICENSE to end of super.c, and add MODULE_DESCRIPTION("Bcache: a Linux block layer cache"). This is preparation for adding module parameters. Signed-off-by: Coly Li <colyli@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-12-13bcache: option to automatically run gc thread after writebackColy Li
The option gc_after_writeback is disabled by default, because garbage collection will discard SSD data which drops cached data. Echo 1 into /sys/fs/bcache/<UUID>/internal/gc_after_writeback will enable this option, which wakes up gc thread when writeback accomplished and all cached data is clean. This option is helpful for people who cares writing performance more. In heavy writing workload, all cached data can be clean only happens when writeback thread cleans all cached data in I/O idle time. In such situation a following gc running may help to shrink bcache B+ tree and discard more clean data, which may be helpful for future writing requests. If you are not sure whether this is helpful for your own workload, please leave it as disabled by default. Signed-off-by: Coly Li <colyli@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-12-13bcache: introduce force_wake_up_gc()Coly Li
Garbage collection thread starts to work when c->sectors_to_gc is negative value, otherwise nothing will happen even the gc thread is woken up by wake_up_gc(). force_wake_up_gc() sets c->sectors_to_gc to -1 before calling wake_up_gc(), then gc thread may have chance to run if no one else sets c->sectors_to_gc to a positive value before gc_should_run(). This routine can be called where the gc thread is woken up and required to run in force. Signed-off-by: Coly Li <colyli@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-12-13bcache: cannot set writeback_running via sysfs if no writeback kthread createdShenghui Wang
"echo 1 > writeback_running" marks writeback_running even if no writeback kthread created as "d_strtoul(writeback_running)" will simply set dc-> writeback_running without checking the existence of dc->writeback_thread. Add check for setting writeback_running via sysfs: if no writeback kthread available, reject setting to 1. v2 -> v3: * Make message on wrong assignment more clear. * Print name of bcache device instead of name of backing device. Signed-off-by: Shenghui Wang <shhuiw@foxmail.com> Signed-off-by: Coly Li <colyli@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-12-13bcache: do not mark writeback_running too earlyShenghui Wang
A fresh backing device is not attached to any cache_set, and has no writeback kthread created until first attached to some cache_set. But bch_cached_dev_writeback_init run " dc->writeback_running = true; WARN_ON(test_and_clear_bit(BCACHE_DEV_WB_RUNNING, &dc->disk.flags)); " for any newly formatted backing devices. For a fresh standalone backing device, we can get something like following even if no writeback kthread created: ------------------------ /sys/block/bcache0/bcache# cat writeback_running 1 /sys/block/bcache0/bcache# cat writeback_rate_debug rate: 512.0k/sec dirty: 0.0k target: 0.0k proportional: 0.0k integral: 0.0k change: 0.0k/sec next io: -15427384ms The none ZERO fields are misleading as no alive writeback kthread yet. Set dc->writeback_running false as no writeback thread created in bch_cached_dev_writeback_init(). We have writeback thread created and woken up in bch_cached_dev_writeback _start(). Set dc->writeback_running true before bch_writeback_queue() called, as a writeback thread will check if dc->writeback_running is true before writing back dirty data, and hung if false detected. After the change, we can get the following output for a fresh standalone backing device: ----------------------- /sys/block/bcache0/bcache$ cat writeback_running 0 /sys/block/bcache0/bcache# cat writeback_rate_debug rate: 0.0k/sec dirty: 0.0k target: 0.0k proportional: 0.0k integral: 0.0k change: 0.0k/sec next io: 0ms v1 -> v2: Set dc->writeback_running before bch_writeback_queue() called, Signed-off-by: Shenghui Wang <shhuiw@foxmail.com> Signed-off-by: Coly Li <colyli@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-12-13bcache: update comment in sysfs.cShenghui Wang
We have struct cached_dev allocated by kzalloc in register_bcache(), which initializes all the fields of cached_dev with 0s. And commit ce4c3e19e520 ("bcache: Replace bch_read_string_list() by __sysfs_match_string()") has remove the string "default". Update the comment. Signed-off-by: Shenghui Wang <shhuiw@foxmail.com> Signed-off-by: Coly Li <colyli@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-12-13bcache: update comment for bch_data_insertShenghui Wang
commit 220bb38c21b8 ("bcache: Break up struct search") introduced changes to struct search and s->iop. bypass/bio are fields of struct data_insert_op now. Update the comment. Signed-off-by: Shenghui Wang <shhuiw@foxmail.com> Signed-off-by: Coly Li <colyli@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-12-13bcache: do not check if debug dentry is ERR or NULL explicitly on removeShenghui Wang
debugfs_remove and debugfs_remove_recursive will check if the dentry pointer is NULL or ERR, and will do nothing in that case. Remove the check in cache_set_free and bch_debug_init. Signed-off-by: Shenghui Wang <shhuiw@foxmail.com> Signed-off-by: Coly Li <colyli@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-12-13bcache: add comment for cache_set->fill_iterShenghui Wang
We have the following define for btree iterator: struct btree_iter { size_t size, used; #ifdef CONFIG_BCACHE_DEBUG struct btree_keys *b; #endif struct btree_iter_set { struct bkey *k, *end; } data[MAX_BSETS]; }; We can see that the length of data[] field is static MAX_BSETS, which is defined as 4 currently. But a btree node on disk could have too many bsets for an iterator to fit on the stack - maybe far more that MAX_BSETS. Have to dynamically allocate space to host more btree_iter_sets. bch_cache_set_alloc() will make sure the pool cache_set->fill_iter can allocate an iterator equipped with enough room that can host (sb.bucket_size / sb.block_size) btree_iter_sets, which is more than static MAX_BSETS. bch_btree_node_read_done() will use that pool to allocate one iterator, to host many bsets in one btree node. Add more comment around cache_set->fill_iter to make code less confusing. Signed-off-by: Shenghui Wang <shhuiw@foxmail.com> Signed-off-by: Coly Li <colyli@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>