summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2020-08-05watchdog: Use fallthrough pseudo-keywordGustavo A. R. Silva
Replace the existing /* fall through */ comments and its variants with the new pseudo-keyword macro fallthrough[1]. Also, remove unnecessary fall-through markings when it is the case. [1] https://www.kernel.org/doc/html/latest/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Link: https://lore.kernel.org/r/20200707171121.GA13472@embeddedor Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>
2020-08-05watchdog: f71808e_wdt: do stricter parameter validationAhmad Fatoum
We check the f71862fg_pin module parameter every time a watchdog device for the f71862fg is opened, but the parameter can't change at runtime. If we move the check to the start of init: - We catch userspace passing invalid, but unused, values - We check the condition only once - We simplify the code Do so. Signed-off-by: Ahmad Fatoum <a.fatoum@pengutronix.de> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Link: https://lore.kernel.org/r/20200611191750.28096-6-a.fatoum@pengutronix.de Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>
2020-08-05watchdog: f71808e_wdt: clear watchdog timeout occurred flagAhmad Fatoum
The flag indicating a watchdog timeout having occurred normally persists till Power-On Reset of the Fintek Super I/O chip. The user can clear it by writing a `1' to the bit. The driver doesn't offer a restart method, so regular system reboot might not reset the Super I/O and if the watchdog isn't enabled, we won't touch the register containing the bit on the next boot. In this case all subsequent regular reboots will be wrongly flagged by the driver as being caused by the watchdog. Fix this by having the flag cleared after read. This is also done by other drivers like those for the i6300esb and mpc8xxx_wdt. Fixes: b97cb21a4634 ("watchdog: f71808e_wdt: Fix WDTMOUT_STS register read") Cc: stable@vger.kernel.org Signed-off-by: Ahmad Fatoum <a.fatoum@pengutronix.de> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Link: https://lore.kernel.org/r/20200611191750.28096-5-a.fatoum@pengutronix.de Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>
2020-08-05watchdog: f71808e_wdt: remove use of wrong watchdog_info optionAhmad Fatoum
The flags that should be or-ed into the watchdog_info.options by drivers all start with WDIOF_, e.g. WDIOF_SETTIMEOUT, which indicates that the driver's watchdog_ops has a usable set_timeout. WDIOC_SETTIMEOUT was used instead, which expands to 0xc0045706, which equals: WDIOF_FANFAULT | WDIOF_EXTERN1 | WDIOF_PRETIMEOUT | WDIOF_ALARMONLY | WDIOF_MAGICCLOSE | 0xc0045000 These were so far indicated to userspace on WDIOC_GETSUPPORT. As the driver has not yet been migrated to the new watchdog kernel API, the constant can just be dropped without substitute. Fixes: 96cb4eb019ce ("watchdog: f71808e_wdt: new watchdog driver for Fintek F71808E and F71882FG") Cc: stable@vger.kernel.org Signed-off-by: Ahmad Fatoum <a.fatoum@pengutronix.de> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Link: https://lore.kernel.org/r/20200611191750.28096-4-a.fatoum@pengutronix.de Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>
2020-08-05watchdog: f71808e_wdt: indicate WDIOF_CARDRESET support in watchdog_info.optionsAhmad Fatoum
The driver supports populating bootstatus with WDIOF_CARDRESET, but so far userspace couldn't portably determine whether absence of this flag meant no watchdog reset or no driver support. Or-in the bit to fix this. Fixes: b97cb21a4634 ("watchdog: f71808e_wdt: Fix WDTMOUT_STS register read") Cc: stable@vger.kernel.org Signed-off-by: Ahmad Fatoum <a.fatoum@pengutronix.de> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Link: https://lore.kernel.org/r/20200611191750.28096-3-a.fatoum@pengutronix.de Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>
2020-08-05docs: watchdog: codify ident.options as superset of possible status flagsAhmad Fatoum
The FIXME comment has been in-tree since the very first git commit. The described behavior has been since relied on by some userspace, e.g. the util-linux wdctl command and has been ignored by some kernelspace, like the f71808e_wdt driver. The functionality is useful to have to be able to differentiate between a driver that doesn't support WDIOF_CARDRESET and one that does, but hasn't had a watchdog reset, thus drop the FIXME to encourage drivers adopting this convention. Signed-off-by: Ahmad Fatoum <a.fatoum@pengutronix.de> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Link: https://lore.kernel.org/r/20200611191750.28096-2-a.fatoum@pengutronix.de Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>
2020-08-05dt-bindings: watchdog: Add compatible for QCS404, SC7180, SDM845, SM8150Sai Prakash Ranjan
Add missing compatible for watchdog timer on QCS404, SC7180, SDM845 and SM8150 SoCs. Signed-off-by: Sai Prakash Ranjan <saiprakash.ranjan@codeaurora.org> Reviewed-by: Stephen Boyd <swboyd@chromium.org> Acked-by: Rob Herring <robh@kernel.org> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Link: https://lore.kernel.org/r/09da1ba319dc4a27ef4e4e177e67e68f1cb4f35b.1593112534.git.saiprakash.ranjan@codeaurora.org Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>
2020-08-05dt-bindings: watchdog: Convert QCOM watchdog timer bindings to YAMLSai Prakash Ranjan
Convert QCOM watchdog timer bindings to DT schema format using json-schema. Signed-off-by: Sai Prakash Ranjan <saiprakash.ranjan@codeaurora.org> Reviewed-by: Stephen Boyd <swboyd@chromium.org> Reviewed-by: Rob Herring <robh@kernel.org> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Link: https://lore.kernel.org/r/77b47aad9d17cb4ec8359bdbbb30c33d818117e6.1593112534.git.saiprakash.ranjan@codeaurora.org Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>
2020-08-05watchdog: dw_wdt: Add DebugFS filesSerge Semin
For the sake of the easier device-driver debug procedure, we added a DebugFS file with the controller registers state. It's available only if kernel is configured with DebugFS support. Signed-off-by: Serge Semin <Sergey.Semin@baikalelectronics.ru> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Cc: Alexey Malahov <Alexey.Malahov@baikalelectronics.ru> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Rob Herring <robh+dt@kernel.org> Cc: linux-mips@vger.kernel.org Cc: devicetree@vger.kernel.org Link: https://lore.kernel.org/r/20200530073557.22661-8-Sergey.Semin@baikalelectronics.ru Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>
2020-08-05watchdog: dw_wdt: Add pre-timeouts supportSerge Semin
DW Watchdog can rise an interrupt in case if IRQ request mode is enabled and timer reaches the zero value. In this case the IRQ lane is left pending until either the next watchdog kick event (watchdog restart) or until the WDT_EOI register is read or the device/system reset. This interface can be used to implement the pre-timeout functionality optionally provided by the Linux kernel watchdog devices. IRQ mode provides a two stages timeout interface. It means the IRQ is raised when the counter reaches zero, while the system reset occurs only after subsequent timeout if the timer restart is not performed. Due to this peculiarity the pre-timeout value is actually set to the achieved hardware timeout, while the real watchdog timeout is considered to be twice as much of it. This applies a significant limitation on the pre-timeout values, so current implementation supports either zero value, which disables the pre-timeout events, or non-zero values, which imply the pre-timeout to be at least half of the current watchdog timeout. Note that we ask the interrupt controller to detect the rising-edge pre-timeout interrupts to prevent the high-level-IRQs flood, since if the pre-timeout happens, the IRQ lane will be left pending until it's cleared by the timer restart. Signed-off-by: Serge Semin <Sergey.Semin@baikalelectronics.ru> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Cc: Alexey Malahov <Alexey.Malahov@baikalelectronics.ru> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Rob Herring <robh+dt@kernel.org> Cc: linux-mips@vger.kernel.org Cc: devicetree@vger.kernel.org Link: https://lore.kernel.org/r/20200530073557.22661-7-Sergey.Semin@baikalelectronics.ru Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>
2020-08-05watchdog: dw_wdt: Support devices with asynch clocksSerge Semin
DW Watchdog IP core can be synthesised with asynchronous timer/APB clocks support (WDT_ASYNC_CLK_MODE_ENABLE == 1). In this case separate clock signals are supposed to be used to feed watchdog timer and APB interface of the device. Currently the driver supports the synchronous mode only. Since there is no way to determine which mode was actually activated for device from its registers, we have to rely on the platform device configuration data. If optional "pclk" clock source is supplied, we consider the device working in asynchronous mode, otherwise the driver falls back to the synchronous configuration. Signed-off-by: Serge Semin <Sergey.Semin@baikalelectronics.ru> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Cc: Alexey Malahov <Alexey.Malahov@baikalelectronics.ru> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Rob Herring <robh+dt@kernel.org> Cc: linux-mips@vger.kernel.org Cc: devicetree@vger.kernel.org Link: https://lore.kernel.org/r/20200530073557.22661-6-Sergey.Semin@baikalelectronics.ru Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>
2020-08-05watchdog: dw_wdt: Support devices with non-fixed TOP valuesSerge Semin
In case if the DW Watchdog IP core is synthesised with WDT_USE_FIX_TOP == false, the TOP interval indexes make the device to load a custom periods to the counter. These periods are hardwired at the IP synthesis stage and can be within [2^8, 2^(WDT_CNT_WIDTH - 1)]. Alas their values can't be detected at runtime and must be somehow supplied to the driver so one could properly determine the watchdog timeout intervals. For this purpose we suggest to have a vendor- specific dts property "snps,watchdog-tops" utilized, which would provide an array of sixteen counter values. At device probe stage they will be used to initialize the watchdog device timeouts determined from the array values and current clocks source rate. In order to have custom TOP values supported the driver must be altered in the following way. First of all the fixed-top values ready-to-use array must be determined for compatibility with currently supported devices, which were synthesised with WDT_USE_FIX_TOP == true. It will be used if either fixed TOP feature is detected being enabled or no custom TOPs are fetched from the device dt node. Secondly at the probe stage we must initialize an array of the watchdog timeouts corresponding to the detected TOPs list and the reference clock rate. For generality the procedure of initialization is designed in a way to support the TOPs array with no limitations on the items order or value. Finally the watchdog period search methods should be altered to support the new timeouts data structure. Signed-off-by: Serge Semin <Sergey.Semin@baikalelectronics.ru> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Cc: Alexey Malahov <Alexey.Malahov@baikalelectronics.ru> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Rob Herring <robh+dt@kernel.org> Cc: linux-mips@vger.kernel.org Cc: devicetree@vger.kernel.org Link: https://lore.kernel.org/r/20200530073557.22661-5-Sergey.Semin@baikalelectronics.ru Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>
2020-08-05dt-bindings: watchdog: dw-wdt: Add watchdog TOPs array propertySerge Semin
In case if DW Watchdog IP core is built with WDT_USE_FIX_TOP == false, a custom timeout periods are used to preset the timer counter. In this case that periods should be specified in a new "snps,watchdog-tops" property of the DW watchdog dts node. Signed-off-by: Serge Semin <Sergey.Semin@baikalelectronics.ru> Reviewed-by: Rob Herring <robh@kernel.org> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Cc: Alexey Malahov <Alexey.Malahov@baikalelectronics.ru> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Arnd Bergmann <arnd@arndb.de> Cc: linux-mips@vger.kernel.org Link: https://lore.kernel.org/r/20200530073557.22661-4-Sergey.Semin@baikalelectronics.ru Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>
2020-08-05dt-bindings: watchdog: dw-wdt: Support devices with asynch clocksSerge Semin
DW Watchdog IP core can be synthesised with asynchronous timer/APB clocks support (WDT_ASYNC_CLK_MODE_ENABLE == 1). In this case separate clock signals are supposed to be used to feed watchdog timer and APB interface of the device. Let's update the DW Watchdog DT node schema so it would support the optional APB3 bus clock specified along with the mandatory watchdog timer reference clock. Signed-off-by: Serge Semin <Sergey.Semin@baikalelectronics.ru> Reviewed-by: Rob Herring <robh@kernel.org> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Cc: Alexey Malahov <Alexey.Malahov@baikalelectronics.ru> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Arnd Bergmann <arnd@arndb.de> Cc: linux-mips@vger.kernel.org Link: https://lore.kernel.org/r/20200530073557.22661-3-Sergey.Semin@baikalelectronics.ru Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>
2020-08-05dt-bindings: watchdog: Convert DW WDT binding to DT schemaSerge Semin
Modern device tree bindings are supposed to be created as YAML-files in accordance with dt-schema. This commit replaces the DW Watchdog legacy bare text bindings with YAML file. As before the binding states that the corresponding dts node is supposed to have a registers range, a watchdog timer references clock source, optional reset line and pre-timeout interrupt. Signed-off-by: Serge Semin <Sergey.Semin@baikalelectronics.ru> Reviewed-by: Rob Herring <robh@kernel.org> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Cc: Alexey Malahov <Alexey.Malahov@baikalelectronics.ru> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Arnd Bergmann <arnd@arndb.de> Cc: linux-mips@vger.kernel.org Link: https://lore.kernel.org/r/20200530073557.22661-2-Sergey.Semin@baikalelectronics.ru Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>
2020-08-05MAINTAINERS: rectify entry in ARM SMC WATCHDOG DRIVERLukas Bulwahn
Commit 5c24a28b4eb8 ("dt-bindings: watchdog: Add ARM smc wdt for mt8173 watchdog") added the new ARM SMC WATCHDOG DRIVER entry in MAINTAINERS, but slipped in a minor mistake. Luckily, ./scripts/get_maintainer.pl --self-test=patterns complains: warning: no file matches F: devicetree/bindings/watchdog/arm-smc-wdt.yaml Update file entry to intended file location. Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Reviewed-by: Julius Werner <jwerner@chromium.org> Reviewed-by: Evan Benn <evanbenn@chromium.org> Link: https://lore.kernel.org/r/20200602052104.7795-1-lukas.bulwahn@gmail.com Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>
2020-08-05watchdog: Use kobj_to_dev() APIWang Qing
Use kobj_to_dev() API instead of container_of(). Signed-off-by: Wang Qing <wangqing@vivo.com> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Link: https://lore.kernel.org/r/1591945384-14587-1-git-send-email-wangqing@vivo.com Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>
2020-08-05watchdog: bcm_kona_wdt: Use correct return value for bcm_kona_wdt_probe()Tiezhu Yang
When call function devm_platform_ioremap_resource(), we should use IS_ERR() to check the return value and return PTR_ERR() if failed. Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Link: https://lore.kernel.org/r/1590391864-308-1-git-send-email-yangtiezhu@loongson.cn Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>
2020-08-05watchdog: sunxi_wdt: fix improper error exit codeMartin Wu
sunxi_wdt_probe() should return -ENOMEM when devm_kzalloc() fails. Signed-off-by: Martin Wu <wuyan@allwinnertech.com> Signed-off-by: Frank Lee <frank@allwinnertech.com> Acked-by: Maxime Ripard <mripard@kernel.org> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Link: https://lore.kernel.org/r/20200529094514.26374-1-frank@allwinnertech.com Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>
2020-08-05watchdog: test_bit() => watchdog_active()Bumsik Kim
Use the dedicated function watchdog_active() instead of the generic test_bit() function. It is done using the following Coccinelle script: @@ identifier wdd; @@ - test_bit(WDOG_ACTIVE, &wdd->status) + watchdog_active(wdd) Signed-off-by: Bumsik Kim <k.bumsik@gmail.com> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Link: https://lore.kernel.org/r/20200529012428.84684-1-k.bumsik@gmail.com Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>
2020-08-05docs: watchdog: mlx-wdt: Add description of new watchdog type 3Michael Shych
Add documentation with details of new type of Mellanox watchdog driver. Signed-off-by: Michael Shych <michaelsh@mellanox.com> Reviewed-by: Vadim Pasternak <vadimp@mellanox.com> Acked-by: Guenter Roeck <linux@roeck-us.net> Link: https://lore.kernel.org/r/20200504141427.17685-5-michaelsh@mellanox.com Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>
2020-08-05watchdog: mlx-wdt: support new watchdog type with longer timeout periodMichael Shych
New programmable logic device can have watchdog type 3 implementation. It's same as Type 2 with extended maximum timeout period. Maximum timeout is up-to 65535 sec. Type 3 HW watchdog implementation can exist on all Mellanox systems. It is differentiated by WD capability bit. Signed-off-by: Michael Shych <michaelsh@mellanox.com> Reviewed-by: Vadim Pasternak <vadimp@mellanox.com> Acked-by: Guenter Roeck <linux@roeck-us.net> Link: https://lore.kernel.org/r/20200504141427.17685-4-michaelsh@mellanox.com Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>
2020-08-05platform/x86: mlx-platform: support new watchdog type with longer timeoutMichael Shych
Add verification of WD capability in order to distinguish between the existing WD types and new type, implemented in CPLD. Add configuration for a new WD type. Change access mode for watchdog registers. Signed-off-by: Michael Shych <michaelsh@mellanox.com> Reviewed-by: Vadim Pasternak <vadimp@mellanox.com> Acked-by: Andy Shevchenko <andy.shevchenko@gmail.com> Link: https://lore.kernel.org/r/20200504141427.17685-3-michaelsh@mellanox.com Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>
2020-08-05platform_data/mlxreg: support new watchdog type with longer timeout periodMichael Shych
Add new watchdog type 3 with longer timeout period. Extend size of health_cntr field that that can be used to init watchdog timeout period. Signed-off-by: Michael Shych <michaelsh@mellanox.com> Reviewed-by: Vadim Pasternak <vadimp@mellanox.com> Acked-by: Andy Shevchenko <andy.shevchenko@gmail.com> Link: https://lore.kernel.org/r/20200504141427.17685-2-michaelsh@mellanox.com Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>
2020-08-05iomap: fall back to buffered writes for invalidation failuresChristoph Hellwig
Failing to invalid the page cache means data in incoherent, which is a very bad state for the system. Always fall back to buffered I/O through the page cache if we can't invalidate mappings. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Goldwyn Rodrigues <rgoldwyn@suse.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Acked-by: Bob Peterson <rpeterso@redhat.com> Acked-by: Damien Le Moal <damien.lemoal@wdc.com> Reviewed-by: Theodore Ts'o <tytso@mit.edu> # for ext4 Reviewed-by: Andreas Gruenbacher <agruenba@redhat.com> # for gfs2 Reviewed-by: Ritesh Harjani <riteshh@linux.ibm.com>
2020-08-05xfs: use ENOTBLK for direct I/O to buffered I/O fallbackChristoph Hellwig
This is what the classic fs/direct-io.c implementation and thuse other file systems use. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-08-05iomap: Only invalidate page cache pages on direct IO writesDave Chinner
The historic requirement for XFS to invalidate cached pages on direct IO reads has been lost in the twisty pages of history - it was inherited from Irix, which implemented page cache invalidation on read as a method of working around problems synchronising page cache state with uncached IO. XFS has carried this ever since. In the initial linux ports it was necessary to get mmap and DIO to play "ok" together and not immediately corrupt data. This was the state of play until the linux kernel had infrastructure to track unwritten extents and synchronise page faults with allocations and unwritten extent conversions (->page_mkwrite infrastructure). IOws, the page cache invalidation on DIO read was necessary to prevent trivial data corruptions. This didn't solve all the problems, though. There were peformance problems if we didn't invalidate the entire page cache over the file on read - we couldn't easily determine if the cached pages were over the range of the IO, and invalidation required taking a serialising lock (i_mutex) on the inode. This serialising lock was an issue for XFS, as it was the only exclusive lock in the direct Io read path. Hence if there were any cached pages, we'd just invalidate the entire file in one go so that subsequent IOs didn't need to take the serialising lock. This was a problem that prevented ranged invalidation from being particularly useful for avoiding the remaining coherency issues. This was solved with the conversion of i_mutex to i_rwsem and the conversion of the XFS inode IO lock to use i_rwsem. Hence we could now just do ranged invalidation and the performance problem went away. However, page cache invalidation was still needed to serialise sub-page/sub-block zeroing via direct IO against buffered IO because bufferhead state attached to the cached page could get out of whack when direct IOs were issued. We've removed bufferheads from the XFS code, and we don't carry any extent state on the cached pages anymore, and so this problem has gone away, too. IOWs, it would appear that we don't have any good reason to be invalidating the page cache on DIO reads anymore. Hence remove the invalidation on read because it is unnecessary overhead, not needed to maintain coherency between mmap/buffered access and direct IO anymore, and prevents anyone from using direct IO reads from intentionally invalidating the page cache of a file. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-08-05xfs: delete duplicated words + other fixesRandy Dunlap
Delete repeated words in fs/xfs/. {we, that, the, a, to, fork} Change "it it" to "it is" in one location. Signed-off-by: Randy Dunlap <rdunlap@infradead.org> To: linux-fsdevel@vger.kernel.org Cc: Darrick J. Wong <darrick.wong@oracle.com> Cc: linux-xfs@vger.kernel.org Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-08-05ceph: handle zero-length feature mask in session messagesJeff Layton
Most session messages contain a feature mask, but the MDS will routinely send a REJECT message with one that is zero-length. Commit 0fa8263367db ("ceph: fix endianness bug when handling MDS session feature bits") fixed the decoding of the feature mask, but failed to account for the MDS sending a zero-length feature mask. This causes REJECT message decoding to fail. Skip trying to decode a feature mask if the word count is zero. Cc: stable@vger.kernel.org URL: https://tracker.ceph.com/issues/46823 Fixes: 0fa8263367db ("ceph: fix endianness bug when handling MDS session feature bits") Signed-off-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Ilya Dryomov <idryomov@gmail.com> Tested-by: Patrick Donnelly <pdonnell@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2020-08-05vDPA: dont change vq irq after DRIVER_OKZhu Lingshan
IRQ of a vq is not expected to be changed in a DRIVER_OK ~ !DRIVER_OK period for irq offloading purposes. Place this comment at the side of bus ops get_vq_irq than in set_status in vhost_vdpa. Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com> Link: https://lore.kernel.org/r/20200804102123.69978-1-lingshan.zhu@intel.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-08-05virtio_pci_modern: Fix the comment of virtio_pci_find_capability()Liao Pingfang
Fix the comment of virtio_pci_find_capability() by adding missing comment for the last parameter: bars. Fixes: 59a5b0f7bf74 ("virtio-pci: alloc only resources actually used.") Signed-off-by: Liao Pingfang <liao.pingfang@zte.com.cn> Signed-off-by: Yi Wang <wang.yi59@zte.com.cn> Link: https://lore.kernel.org/r/1596455545-43556-1-git-send-email-wang.yi59@zte.com.cn Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>
2020-08-05irqbypass: do not start cons/prod when failed connectZhu Lingshan
If failed to connect, there is no need to start consumer nor producer. Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com> Suggested-by: Jason Wang <jasowang@redhat.com> Link: https://lore.kernel.org/r/20200731065533.4144-7-lingshan.zhu@intel.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-08-05ifcvf: implement vdpa_config_ops.get_vq_irq()Zhu Lingshan
This commit implemented vdpa_config_ops.get_vq_irq() in ifcvf, and initialized vq irq to -EINVAL. So that ifcvf can report irq number of a vq, or -EINVAL if the vq is not assigned an irq number. Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com> Suggested-by: Jason Wang <jasowang@redhat.com> Link: https://lore.kernel.org/r/20200731065533.4144-6-lingshan.zhu@intel.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-08-05vhost_vdpa: implement IRQ offloading in vhost_vdpaZhu Lingshan
This patch introduce a set of functions for setup/unsetup and update irq offloading respectively by register/unregister and re-register the irq_bypass_producer. With these functions, this commit can setup/unsetup irq offloading through setting DRIVER_OK/!DRIVER_OK, and update irq offloading through SET_VRING_CALL. Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com> Suggested-by: Jason Wang <jasowang@redhat.com> Link: https://lore.kernel.org/r/20200731065533.4144-5-lingshan.zhu@intel.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-08-05vDPA: add get_vq_irq() in vdpa_config_opsZhu Lingshan
This commit adds a new function get_vq_irq() in struct vdpa_config_ops, which will return the irq number of a virtqueue. Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com> Suggested-by: Jason Wang <jasowang@redhat.com> Link: https://lore.kernel.org/r/20200731065533.4144-4-lingshan.zhu@intel.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-08-05kvm: detect assigned device via irqbypass managerZhu Lingshan
vDPA devices has dedicated backed hardware like passthrough-ed devices. Then it is possible to setup irq offloading to vCPU for vDPA devices. Thus this patch tries to manipulated assigned device counters by kvm_arch_start/end_assignment() in irqbypass manager, so that assigned devices could be detected in update_pi_irte() We will increase/decrease the assigned device counter in kvm/x86. Both vDPA and VFIO would go through this code path. Only X86 uses these counters and kvm_arch_start/end_assignment(), so this code path only affect x86 for now. Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com> Suggested-by: Jason Wang <jasowang@redhat.com> Link: https://lore.kernel.org/r/20200731065533.4144-3-lingshan.zhu@intel.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-08-05vhost: introduce vhost_vring_callZhu Lingshan
This commit introduces struct vhost_vring_call which replaced raw struct eventfd_ctx *call_ctx in struct vhost_virtqueue. Besides eventfd_ctx, it contains a spin lock and an irq_bypass_producer in its structure. Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com> Suggested-by: Jason Wang <jasowang@redhat.com> Link: https://lore.kernel.org/r/20200731065533.4144-2-lingshan.zhu@intel.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-08-05vhost: Use flex_array_size() helper in copy_from_user()Gustavo A. R. Silva
Make use of the flex_array_size() helper to calculate the size of a flexible array member within an enclosing structure. This helper offers defense-in-depth against potential integer overflows, while at the same time makes it explicitly clear that we are dealing with a flexible array member. Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Link: https://lore.kernel.org/r/20200731130956.GA30525@embeddedor Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-08-05vdpasim: protect concurrent access to iommu iotlbMax Gurtovoy
Iommu iotlb can be accessed by different cores for performing IO using multiple virt queues. Add a spinlock to synchronize iotlb accesses. This could be easily reproduced when using more than 1 pktgen threads to inject traffic to vdpa simulator. Fixes: 2c53d0f64c06f("vdpasim: vDPA device simulator") Cc: stable@vger.kernel.org Signed-off-by: Max Gurtovoy <maxg@mellanox.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Link: https://lore.kernel.org/r/20200731073822.13326-1-jasowang@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-08-05vhost: vdpa: remove per device feature whitelistJason Wang
We used to have a per device feature whitelist to filter out the unsupported virtio features. But this seems unnecessary since: - the main idea behind feature whitelist is to block control vq feature until we finalize the control virtqueue API. But the current vhost-vDPA uAPI is sufficient to support control virtqueue. For device that has hardware control virtqueue, the vDPA device driver can just setup the hardware virtqueue and let userspace to use hardware virtqueue directly. For device that doesn't have a control virtqueue, the vDPA device driver need to use e.g vringh to emulate a software control virtqueue. - we don't do it in virtio-vDPA driver So remove this limitation. Signed-off-by: Jason Wang <jasowang@redhat.com> Link: https://lore.kernel.org/r/20200720085043.16485-1-jasowang@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-08-05virtio_ring: Avoid loop when vq is broken in virtqueue_pollMao Wenan
The loop may exist if vq->broken is true, virtqueue_get_buf_ctx_packed or virtqueue_get_buf_ctx_split will return NULL, so virtnet_poll will reschedule napi to receive packet, it will lead cpu usage(si) to 100%. call trace as below: virtnet_poll virtnet_receive virtqueue_get_buf_ctx virtqueue_get_buf_ctx_packed virtqueue_get_buf_ctx_split virtqueue_napi_complete virtqueue_poll //return true virtqueue_napi_schedule //it will reschedule napi to fix this, return false if vq is broken in virtqueue_poll. Signed-off-by: Mao Wenan <wenan.mao@linux.alibaba.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Link: https://lore.kernel.org/r/1596354249-96204-1-git-send-email-wenan.mao@linux.alibaba.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>
2020-08-05virtio_net: use LE accessors for speed/duplexMichael S. Tsirkin
Speed and duplex config fields depend on VIRTIO_NET_F_SPEED_DUPLEX which being 63>31 depends on VIRTIO_F_VERSION_1. Accordingly, use LE accessors for these fields. Reported-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-08-05virtio_config: drop LE option from config spaceMichael S. Tsirkin
All drivers now use virtio_cread/write_le for LE config space fields. Drop LE option from virtio_cread/write, only leaving the option to access transitional fields. Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-08-05virtio-iommu: convert to LE accessorsMichael S. Tsirkin
Virtio iommu is modern-only. Use LE accessors for config space. Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-08-05virtio_mem: convert to LE accessorsMichael S. Tsirkin
Virtio mem is modern-only. Use LE accessors for config space. Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-08-05drm/virtio: convert to LE accessorsMichael S. Tsirkin
Virtgpu is modern-only. Use LE accessors for config space. Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-08-05virtio_pmem: convert to LE accessorsMichael S. Tsirkin
Virtio pmem is modern-only. Use LE accessors for config space. Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-08-05virtio_crypto: convert to LE accessorsMichael S. Tsirkin
Virtio crypto is modern-only. Use LE accessors for config space. Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-08-05virtio_fs: convert to LE accessorsMichael S. Tsirkin
Virtio fs is modern-only. Use LE accessors for config space. Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-08-05virtio_input: convert to LE accessorsMichael S. Tsirkin
Virtio input is modern-only. Use LE accessors for config space. Signed-off-by: Michael S. Tsirkin <mst@redhat.com>