path: root/include/linux
Age Commit message (Author)
2018-07-19 PCI: Unify try slot and bus reset API (Sinan Kaya)
Drivers are expected to call pci_try_reset_slot() or pci_try_reset_bus() depending on whether the system supports hotplug. A survey showed that most drivers don't do this and we are leaking hotplug capability to the user. Hide pci_try_reset_slot() from drivers and embed it into pci_try_reset_bus(). Change the pci_try_reset_bus() parameter from struct pci_bus to struct pci_dev. Signed-off-by: Sinan Kaya <okaya@codeaurora.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2018-07-19 PCI: Hide pci_reset_bridge_secondary_bus() from drivers (Sinan Kaya)
Rename pci_reset_bridge_secondary_bus() to pci_bridge_secondary_bus_reset() and move the declaration from linux/pci.h to drivers/pci.h to be used internally in PCI directory only. Signed-off-by: Sinan Kaya <okaya@codeaurora.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2018-07-20 sched/clock: Move sched clock initialization and merge with generic clock (Pavel Tatashin)
sched_clock_postinit() initializes a generic clock on systems where no other clock is provided. This function may be called only after timekeeping_init(). Rename sched_clock_postinit() to generic_sched_clock_init() and call it from sched_clock_init(). Move the call for sched_clock_init() until after time_init(). Suggested-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: steven.sistare@oracle.com Cc: daniel.m.jordan@oracle.com Cc: linux@armlinux.org.uk Cc: schwidefsky@de.ibm.com Cc: heiko.carstens@de.ibm.com Cc: john.stultz@linaro.org Cc: sboyd@codeaurora.org Cc: hpa@zytor.com Cc: douly.fnst@cn.fujitsu.com Cc: prarit@redhat.com Cc: feng.tang@intel.com Cc: pmladek@suse.com Cc: gnomes@lxorguk.ukuu.org.uk Cc: linux-s390@vger.kernel.org Cc: boris.ostrovsky@oracle.com Cc: jgross@suse.com Cc: pbonzini@redhat.com Link: https://lkml.kernel.org/r/20180719205545.16512-23-pasha.tatashin@oracle.com
2018-07-20 timekeeping: Replace read_boot_clock64() with read_persistent_wall_and_boot_offset() (Pavel Tatashin)
If an architecture does not support an exact boot time, it is challenging to estimate the boot time without having a reference to the current persistent clock value. Yet, it cannot read the persistent clock time again, because this may lead to math discrepancies with the caller of read_boot_clock64(), which has read the persistent clock at a different time. This is why it is better to provide two values simultaneously: the persistent clock value and the boot time. Replace read_boot_clock64() with read_persistent_wall_and_boot_offset(wall_time, boot_offset), where wall_time is returned by read_persistent_clock() and boot_offset is wall_time - boot time, which defaults to 0. Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: steven.sistare@oracle.com Cc: daniel.m.jordan@oracle.com Cc: linux@armlinux.org.uk Cc: schwidefsky@de.ibm.com Cc: heiko.carstens@de.ibm.com Cc: john.stultz@linaro.org Cc: sboyd@codeaurora.org Cc: hpa@zytor.com Cc: douly.fnst@cn.fujitsu.com Cc: peterz@infradead.org Cc: prarit@redhat.com Cc: feng.tang@intel.com Cc: pmladek@suse.com Cc: gnomes@lxorguk.ukuu.org.uk Cc: linux-s390@vger.kernel.org Cc: boris.ostrovsky@oracle.com Cc: jgross@suse.com Cc: pbonzini@redhat.com Link: https://lkml.kernel.org/r/20180719205545.16512-16-pasha.tatashin@oracle.com
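The relationship between the two returned values can be sketched in userspace C. This is only an illustration under stated assumptions: `struct ts64`, `ts64_sub()` and `sketch_wall_and_boot_offset()` are hypothetical names, not the kernel's timespec64 API, and the boot time is passed in as an optional argument purely to keep the sketch self-contained. The point it shows is that both outputs derive from a single persistent clock reading, so they cannot disagree with each other.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's struct timespec64. */
struct ts64 {
    int64_t sec;
    long nsec;
};

#define SKETCH_NSEC_PER_SEC 1000000000L

/* a - b, normalised so that 0 <= nsec < SKETCH_NSEC_PER_SEC. */
static struct ts64 ts64_sub(struct ts64 a, struct ts64 b)
{
    struct ts64 r = { a.sec - b.sec, a.nsec - b.nsec };

    if (r.nsec < 0) {
        r.sec -= 1;
        r.nsec += SKETCH_NSEC_PER_SEC;
    }
    return r;
}

/*
 * Sketch of the combined interface: one persistent clock reading goes in,
 * and both wall_time and boot_offset come out of that same reading.
 * boot_time may be NULL when the boot time is not known; boot_offset then
 * takes its default of 0.
 */
static void sketch_wall_and_boot_offset(struct ts64 persistent_now,
                                        const struct ts64 *boot_time,
                                        struct ts64 *wall_time,
                                        struct ts64 *boot_offset)
{
    assert(wall_time != NULL && boot_offset != NULL);

    *wall_time = persistent_now;
    if (boot_time != NULL)
        *boot_offset = ts64_sub(persistent_now, *boot_time);
    else
        *boot_offset = (struct ts64){ 0, 0 };
}
```

Returning both values from one call avoids the second clock read that the commit message identifies as the source of the discrepancy.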
2018-07-19 PCI/AER: Define aer_stats structure for AER capable devices (Rajat Jain)
Define a structure to hold the AER statistics. There are 2 groups of statistics: dev_* counters that are to be collected for all AER capable devices and rootport_* counters that are collected for all (AER capable) rootports only. Allocate and free this structure when device is added or released (thus counters survive the lifetime of the device). Signed-off-by: Rajat Jain <rajatja@google.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2018-07-19 PCI/AER: Move internal declarations to drivers/pci/pci.h (Rajat Jain)
Since pci_aer_init() and pci_no_aer() are used only internally, move their declarations to the PCI internal header file. Also, no one cares about return value of pci_aer_init(), so make it void. Signed-off-by: Rajat Jain <rajatja@google.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2018-07-19 mtd: rawnand: Expose _notsupp() helpers for raw page accessors (Boris Brezillon)
Some implementations simply can't disable their ECC engine. Expose helpers returning -ENOTSUPP so that the caller knows that raw accesses are not supported instead of silently falling back to non-raw accessors. Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
2018-07-19 mtd: adapt misleading comment in mtd_oob_ops structure (Miquel Raynal)
A comment in the kernel doc of the mtd_oob_ops structure says that it is not possible to write more than one page with OOB. This is actually true for only a few MTD devices like 'onenand', but it is definitely not a general limitation. While this would be better handled elsewhere, either by the MTD layer or by the limited drivers, let's update this comment to reflect reality. Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com>
2018-07-19 Merge tag 'pci-v4.18-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci (Linus Torvalds)
Pull PCI fixes from Bjorn Helgaas:
 - Fix crashes that happen when PHY drivers are left disabled in the V3 Semiconductor, MediaTek, Faraday, Aardvark, DesignWare, Versatile, and X-Gene host controller drivers (Sergei Shtylyov)
 - Fix a NULL pointer dereference in the endpoint library configfs support (Kishon Vijay Abraham I)
 - Fix a race condition in Hyper-V IRQ handling (Dexuan Cui)
* tag 'pci-v4.18-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
  PCI: v3-semi: Fix I/O space page leak
  PCI: mediatek: Fix I/O space page leak
  PCI: faraday: Fix I/O space page leak
  PCI: aardvark: Fix I/O space page leak
  PCI: designware: Fix I/O space page leak
  PCI: versatile: Fix I/O space page leak
  PCI: xgene: Fix I/O space page leak
  PCI: OF: Fix I/O space page leak
  PCI: endpoint: Fix NULL pointer dereference error when CONFIGFS is disabled
  PCI: hv: Disable/enable IRQs rather than BH in hv_compose_msi_msg()
2018-07-19 pci-epf-test/pci_endpoint_test: Add MSI-X support (Gustavo Pimentel)
Add MSI-X support and update driver documentation accordingly. Signed-off-by: Gustavo Pimentel <gustavo.pimentel@synopsys.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Acked-by: Kishon Vijay Abraham I <kishon@ti.com>
2018-07-19 PCI: Update xxx_pcie_ep_raise_irq() and pci_epc_raise_irq() signatures (Gustavo Pimentel)
Change {cdns, dra7xx, artpec6, dw, rockchip}_pcie_ep_raise_irq() and pci_epc_raise_irq() signature, namely the interrupt_num variable type from u8 to u16 to accommodate 2048 maximum MSI-X interrupts. Signed-off-by: Gustavo Pimentel <gustavo.pimentel@synopsys.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Acked-by: Alan Douglas <adouglas@cadence.com> Acked-by: Shawn Lin <shawn.lin@rock-chips.com> Acked-by: Jesper Nilsson <jesper.nilsson@axis.com> Acked-by: Joao Pinto <jpinto@synopsys.com> Acked-by: Kishon Vijay Abraham I <kishon@ti.com>
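Why an 8-bit interrupt number is too narrow can be shown with a small userspace sketch. The function names below are hypothetical and are not the endpoint controller callbacks themselves; the sketch only demonstrates the integer truncation that the type change avoids.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_MSIX_MAX 2048u   /* maximum number of MSI-X interrupts */

/* Old-style parameter: an 8-bit interrupt number (hypothetical helper). */
static unsigned int sketch_raise_irq_u8(uint8_t interrupt_num)
{
    return interrupt_num;
}

/* New-style parameter: a 16-bit interrupt number (hypothetical helper). */
static unsigned int sketch_raise_irq_u16(uint16_t interrupt_num)
{
    assert(interrupt_num <= SKETCH_MSIX_MAX);
    return interrupt_num;
}
```

With a u8 parameter, any value above 255 wraps modulo 256, so interrupt 2048 would silently become 0 and interrupt 300 would become 44. A u16 holds the whole range.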
2018-07-19 PCI: endpoint: Add MSI-X interfaces (Gustavo Pimentel)
Add PCI_EPC_IRQ_MSIX type. Add MSI-X callbacks signatures to the ops structure. Add sysfs interface for set/get MSI-X capability maximum number. Update documentation accordingly. Signed-off-by: Gustavo Pimentel <gustavo.pimentel@synopsys.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Acked-by: Kishon Vijay Abraham I <kishon@ti.com>
2018-07-18 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net (Linus Torvalds)
Pull networking fixes from David Miller: "Lots of fixes, here goes:
 1) NULL deref in qtnfmac, from Gustavo A. R. Silva.
 2) Kernel oops when fw download fails in rtlwifi, from Ping-Ke Shih.
 3) Lost completion messages in AF_XDP, from Magnus Karlsson.
 4) Correct bogus self-assignment in rhashtable, from Rishabh Bhatnagar.
 5) Fix regression in ipv6 route append handling, from David Ahern.
 6) Fix masking in __set_phy_supported(), from Heiner Kallweit.
 7) Missing module owner set in x_tables icmp, from Florian Westphal.
 8) liquidio's timeouts are HZ dependent, fix from Nicholas Mc Guire.
 9) Link setting fixes for sh_eth and ravb, from Vladimir Zapolskiy.
 10) Fix NULL deref when using chains in act_csum, from Davide Caratti.
 11) XDP_REDIRECT needs to check if the interface is up and whether the MTU is sufficient. From Toshiaki Makita.
 12) Net diag can do a double free when killing TCP_NEW_SYN_RECV connections, from Lorenzo Colitti.
 13) nf_defrag in ipv6 can unnecessarily hold onto dst entries for a full minute, delaying device unregister. From Eric Dumazet.
 14) Update MAC entries in the correct order in ixgbe, from Alexander Duyck.
 15) Don't leave a partially mangled bpf program in jit_subprogs, from Daniel Borkmann.
 16) Fix pfmemalloc SKB state propagation, from Stefano Brivio.
 17) Fix ACK handling in DCTCP congestion control, from Yuchung Cheng.
 18) Use after free in tun XDP_TX, from Toshiaki Makita.
 19) Stale ipv6 header pointer in ipv6 gre code, from Prashant Bhole.
 20) Don't reuse remainder of RX page when XDP is set in mlx4, from Saeed Mahameed.
 21) Fix window probe handling of TCP repair sockets, from Stefan Baranoff.
 22) Missing socket locking in smc_ioctl(), from Ursula Braun.
 23) IPV6_ILA needs DST_CACHE, from Arnd Bergmann.
 24) Spectre v1 fix in cxgb3, from Gustavo A. R. Silva.
 25) Two spots in ipv6 do a rol32() on a hash value but ignore the result. Fixes from Colin Ian King"
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (176 commits)
  tcp: identify cryptic messages as TCP seq # bugs
  ptp: fix missing break in switch
  hv_netvsc: Fix napi reschedule while receive completion is busy
  MAINTAINERS: Drop inactive Vitaly Bordug's email
  net: cavium: Add fine-granular dependencies on PCI
  net: qca_spi: Fix log level if probe fails
  net: qca_spi: Make sure the QCA7000 reset is triggered
  net: qca_spi: Avoid packet drop during initial sync
  ipv6: fix useless rol32 call on hash
  ipv6: sr: fix useless rol32 call on hash
  net: sched: Using NULL instead of plain integer
  net: usb: asix: replace mii_nway_restart in resume path
  net: cxgb3_main: fix potential Spectre v1
  lib/rhashtable: consider param->min_size when setting initial table size
  net/smc: reset recv timeout after clc handshake
  net/smc: add error handling for get_user()
  net/smc: optimize consumer cursor updates
  net/nfc: Avoid stalls when nfc_alloc_send_skb() returned NULL.
  ipv6: ila: select CONFIG_DST_CACHE
  net: usb: rtl8150: demote allmulti message to dev_dbg()
  ...
2018-07-18 net/mlx5: Fix QP fragmented buffer allocation (Tariq Toukan)
Fix bad alignment of the SQ buffer in fragmented QP allocation. It should start directly after the RQ buffer ends. Take special care of the end case where the RQ buffer does not occupy a whole page. The RQ size is a power of two, so this is the case only for small RQ sizes (RQ size < PAGE_SIZE). Fix wrong assignments for sqb->size (mistakenly assigned the RQ size), and for the npages value of RQ and SQ. Fixes: 3a2f70331226 ("net/mlx5: Use order-0 allocations for all WQ types") Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-07-18 net/mlx5: Better return types for CQE API (Tariq Toukan)
Reduce sizes of return types. Use bool for binary indication. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-07-18 net/mlx5: Add core support for double vlan push/pop steering action (Jianbo Liu)
As newer firmware supports double push/pop in a single FTE, we add core bits and extend vlan action logic for it. Signed-off-by: Jianbo Liu <jianbol@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-07-18 net/mlx5: Expose MPEGC (Management PCIe General Configuration) structures (Eran Ben Elisha)
This patch exposes PRM layout for handling MPEGC (Management PCIe General Configuration). This will be used in the downstream patch for configuring MPEGC via the driver. Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Reviewed-by: Moshe Shemesh <moshe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-07-18 net/mlx5: FW tracer, add hardware structures (Feras Daoud)
This change adds the infrastructure to mlx5 core fw tracer. It introduces the following 4 new registers:
  MLX5_REG_MTRC_CAP - Used to read tracer capabilities
  MLX5_REG_MTRC_CONF - Used to set tracer configurations
  MLX5_REG_MTRC_STDB - Used to query tracer strings database
  MLX5_REG_MTRC_CTRL - Used to control the tracer
The capability of the tracing can be checked using mcam access register, therefore, the mcam access register interface will expose the tracer register. Signed-off-by: Feras Daoud <ferasda@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-07-18 net: Move skb decrypted field, avoid explicit copy (Stefano Brivio)
Commit 784abe24c903 ("net: Add decrypted field to skb") introduced a 'decrypted' field that is explicitly copied on skb copy and clone. Move it between headers_start[0] and headers_end[0], so that we don't need to copy it explicitly as it's copied by the memcpy() in __copy_skb_header(). While at it, drop the assignment in __skb_clone(), it was already redundant. This doesn't change the size of sk_buff or cacheline boundaries. The 15-bits hole before tc_index becomes a 14-bits hole, and will be again a 15-bits hole when this change is merged with commit 8b7008620b84 ("net: Don't copy pfmemalloc flag in __copy_skb_header()"). v2: as reported by kbuild test robot (oops, I forgot to build with CONFIG_TLS_DEVICE it seems), we can't use CHECK_SKB_FIELD() on a bit-field member. Just drop the check for the moment being, perhaps we could think of some magic to also check bit-field members one day. Fixes: 784abe24c903 ("net: Add decrypted field to skb") Signed-off-by: Stefano Brivio <sbrivio@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
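The idea of placing a field inside a range that is copied with one memcpy() can be sketched outside the kernel. The `struct pkt` below is a hypothetical miniature, not the real sk_buff, and the `range_end` marker member is an assumption standing in for the headers_start[0]/headers_end[0] markers named above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical miniature of a packet descriptor, not the real sk_buff. */
struct pkt {
    int users;                  /* per-object state, never copied */

    uint16_t protocol;          /* first member of the copied range */
    unsigned int decrypted : 1; /* bit-field placed inside the range */
    uint16_t tc_index;
    char range_end[1];          /* marker: first byte past the range */

    void *data;                 /* per-object state, never copied */
};

/* Copy every member between 'protocol' and 'range_end' with one memcpy(). */
static void pkt_copy_header(struct pkt *dst, const struct pkt *src)
{
    const size_t start = offsetof(struct pkt, protocol);
    const size_t len = offsetof(struct pkt, range_end) - start;

    assert(dst != NULL && src != NULL);
    memcpy((char *)dst + start, (const char *)src + start, len);
}
```

Because `decrypted` sits between the two ends of the range, it is copied without being named in the copy routine. Note that the range is delimited with ordinary members: C does not allow offsetof() on a bit-field, which is consistent with the v2 note about not being able to check a bit-field member.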
2018-07-18 PCI: OF: Fix I/O space page leak (Sergei Shtylyov)
When testing the R-Car PCIe driver on the Condor board, if the PCIe PHY driver was left disabled, the kernel crashed with this BUG:
  kernel BUG at lib/ioremap.c:72!
  Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
  Modules linked in:
  CPU: 0 PID: 39 Comm: kworker/0:1 Not tainted 4.17.0-dirty #1092
  Hardware name: Renesas Condor board based on r8a77980 (DT)
  Workqueue: events deferred_probe_work_func
  pstate: 80000005 (Nzcv daif -PAN -UAO)
  pc : ioremap_page_range+0x370/0x3c8
  lr : ioremap_page_range+0x40/0x3c8
  sp : ffff000008da39e0
  x29: ffff000008da39e0 x28: 00e8000000000f07
  x27: ffff7dfffee00000 x26: 0140000000000000
  x25: ffff7dfffef00000 x24: 00000000000fe100
  x23: ffff80007b906000 x22: ffff000008ab8000
  x21: ffff000008bb1d58 x20: ffff7dfffef00000
  x19: ffff800009c30fb8 x18: 0000000000000001
  x17: 00000000000152d0 x16: 00000000014012d0
  x15: 0000000000000000 x14: 0720072007200720
  x13: 0720072007200720 x12: 0720072007200720
  x11: 0720072007300730 x10: 00000000000000ae
  x9 : 0000000000000000 x8 : ffff7dffff000000
  x7 : 0000000000000000 x6 : 0000000000000100
  x5 : 0000000000000000 x4 : 000000007b906000
  x3 : ffff80007c61a880 x2 : ffff7dfffeefffff
  x1 : 0000000040000000 x0 : 00e80000fe100f07
  Process kworker/0:1 (pid: 39, stack limit = 0x (ptrval))
  Call trace:
   ioremap_page_range+0x370/0x3c8
   pci_remap_iospace+0x7c/0xac
   pci_parse_request_of_pci_ranges+0x13c/0x190
   rcar_pcie_probe+0x4c/0xb04
   platform_drv_probe+0x50/0xbc
   driver_probe_device+0x21c/0x308
   __device_attach_driver+0x98/0xc8
   bus_for_each_drv+0x54/0x94
   __device_attach+0xc4/0x12c
   device_initial_probe+0x10/0x18
   bus_probe_device+0x90/0x98
   deferred_probe_work_func+0xb0/0x150
   process_one_work+0x12c/0x29c
   worker_thread+0x200/0x3fc
   kthread+0x108/0x134
   ret_from_fork+0x10/0x18
  Code: f9004ba2 54000080 aa0003fb 17ffff48 (d4210000)
It turned out that pci_remap_iospace() wasn't undone when the driver's probe failed, and since devm_phy_optional_get() returned -EPROBE_DEFER, the probe was retried, finally causing the BUG due to trying to remap already remapped pages. Introduce the devm_pci_remap_iospace() managed API and replace the pci_remap_iospace() call with it to fix the bug. Fixes: dbf9826d5797 ("PCI: generic: Convert to DT resource parsing API") Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> [lorenzo.pieralisi@arm.com: split commit/updated the commit log] Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
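The failure mode described above, and why a managed ("devm") variant avoids it, can be modelled in a few lines of userspace C. Everything here is a hypothetical simulation: the `sketch_*` names, the fixed-size release list and the boolean "already mapped" state are assumptions made for illustration and are not the kernel's devres implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_EPROBE_DEFER 517  /* numeric value of the kernel's EPROBE_DEFER */

static bool io_mapped;           /* models "these pages are already mapped" */

/* Unmanaged mapping: fails if the range is mapped already. */
static int sketch_remap(void)
{
    if (io_mapped)
        return -1;               /* the real kernel hits BUG() here instead */
    io_mapped = true;
    return 0;
}

static void sketch_unmap(void)
{
    io_mapped = false;
}

/* Tiny devres-like list: release callbacks run when the probe fails. */
struct sketch_dev {
    void (*release[4])(void);
    size_t nr_release;
};

/* Managed mapping: same as sketch_remap(), plus a registered undo action. */
static int sketch_devm_remap(struct sketch_dev *dev)
{
    int ret = sketch_remap();

    if (ret)
        return ret;
    assert(dev->nr_release < 4);
    dev->release[dev->nr_release++] = sketch_unmap;
    return 0;
}

/* Run the registered undo actions in reverse order of registration. */
static void sketch_release_all(struct sketch_dev *dev)
{
    while (dev->nr_release > 0)
        dev->release[--dev->nr_release]();
}

/* Probe: map I/O space, then look up a PHY that may not be ready yet. */
static int sketch_probe(struct sketch_dev *dev, bool managed, bool phy_ready)
{
    int ret = managed ? sketch_devm_remap(dev) : sketch_remap();

    if (ret)
        return ret;
    if (!phy_ready) {
        sketch_release_all(dev);  /* stands in for driver-core cleanup */
        return -SKETCH_EPROBE_DEFER;
    }
    return 0;
}
```

In the unmanaged path, the deferred probe leaves the mapping in place, so the retry fails on the second remap. In the managed path, the undo action runs before the retry and the second attempt succeeds.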
2018-07-18 Input: gpio_keys - add missing include to gpio_keys.h (Matti Vaittinen)
gpio_keys.h uses the 'bool' type, which is defined in linux/types.h. Include this header. Signed-off-by: Matti Vaittinen <matti.vaittinen@fi.rohmeurope.com> Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2018-07-18 Merge branch 'topic/vga_switcheroo' into for-next (Takashi Iwai)
Pull the vga_switcheroo audio client fix. Signed-off-by: Takashi Iwai <tiwai@suse.de>
2018-07-18 Merge tag 'mvebu-arm-4.19-1' of git://git.infradead.org/linux-mvebu into next/soc (Olof Johansson)
mvebu arm for 4.19 (part 1)
 - remove potential call from invalid context in boot_secondary
 - allow using CONFIG_FORTIFY_SOURCE in pmsu.c
* tag 'mvebu-arm-4.19-1' of git://git.infradead.org/linux-mvebu:
  ARM: mvebu: convert secondary CPU clock sync to hotplug state
  ARM: mvebu: declare asm symbols as character arrays in pmsu.c
Signed-off-by: Olof Johansson <olof@lixom.net>
2018-07-18 spi: spi-bitbang: change flags from u8 to u16 (David Lechner)
This changes the data type of the flags field in struct spi_bitbang from u8 to u16. This matches the size of the mode field of struct spi_device where these flags are also used. This is done in preparation of adding a new SPI mode flag that will be used with this field that would otherwise not fit in 8 bits. Signed-off-by: David Lechner <david@lechnology.com> Signed-off-by: Mark Brown <broonie@kernel.org>
2018-07-18 Merge tag 'pxa-for-4.19-v2' of https://github.com/rjarzmik/linux into next/soc (Olof Johansson)
These are the pxa changes for the 4.19 cycle:
 - the pxa architecture is ported to dma slavemap
 - some minor AC97 fixes
 - some minor board fixes
* tag 'pxa-for-4.19-v2' of https://github.com/rjarzmik/linux:
  net: smc91x: remove the dmaengine compat need
  net: smc911x: remove the dmaengine compat need
  ARM: pxa: zylonite: use the new ac97 bus support
  ARM: pxa: add the missing AC97 clocks
  ARM: pxa: mioa701 convert to the new AC97 bus
  ARM: pxa: hx4700: fix the usb client
  ARM: pxa: change SSP DMA channels allocation
  ARM: pxa: remove the DMA IO resources
  dmaengine: pxa: document pxad_param
  ata: pata_pxa: remove the dmaengine compat need
  mtd: rawnand: marvell: remove the dmaengine compat need
  media: pxa_camera: remove the dmaengine compat need
  mmc: pxamci: remove the dmaengine compat need
  dmaengine: pxa: add a default requestor policy
  ARM: pxa: add dma slave map
  dmaengine: pxa: use a dma slave map
Signed-off-by: Olof Johansson <olof@lixom.net>
2018-07-18 regmap: add SCCB support (Akinobu Mita)
This adds Serial Camera Control Bus (SCCB) support to the regmap API, intended for use by some Omnivision sensor drivers. The ov772x and ov9650 drivers are going to use this SCCB regmap API. The ov772x driver previously only worked with i2c controller drivers that support I2C_FUNC_PROTOCOL_MANGLING, because the ov772x device doesn't support repeated starts. After commit 0b964d183cbf ("media: ov772x: allow i2c controllers without I2C_FUNC_PROTOCOL_MANGLING"), reading an ov772x register is done by issuing two separate i2c messages in order to avoid a repeated start. Using this SCCB regmap hides that implementation detail. The ov9650 driver also issues two separate i2c messages to read its registers, as the device doesn't support repeated starts, so it can make use of this SCCB regmap as well. Cc: Mark Brown <broonie@kernel.org> Cc: Peter Rosin <peda@axentia.se> Cc: Sebastian Reichel <sebastian.reichel@collabora.co.uk> Cc: Wolfram Sang <wsa@the-dreams.de> Cc: Sylwester Nawrocki <s.nawrocki@samsung.com> Cc: Jacopo Mondi <jacopo+renesas@jmondi.org> Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Cc: Hans Verkuil <hans.verkuil@cisco.com> Cc: Sakari Ailus <sakari.ailus@linux.intel.com> Cc: Mauro Carvalho Chehab <mchehab@s-opensource.com> Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Signed-off-by: Mark Brown <broonie@kernel.org>
2018-07-18 blkcg: Track DISCARD statistics and output them in cgroup io.stat (Tejun Heo)
Add tracking of REQ_OP_DISCARD ios to the per-cgroup io.stat. Two fields, dbytes and dios, to respectively count the total bytes and number of discards are added. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Andy Newell <newella@fb.com> Cc: Michael Callahan <michaelcallahan@fb.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-07-18 block: Track DISCARD statistics and output them in stat and diskstat (Michael Callahan)
Add tracking of REQ_OP_DISCARD ios to the partition statistics and append them to the various stat files in /sys as well as /proc/diskstats. These are tracked with the same four stats as reads and writes:
  Number of discard ios completed.
  Number of discard ios merged
  Number of discard sectors completed
  Milliseconds spent on discard requests
This is done via adding a new STAT_DISCARD define to genhd.h and then using it to index that stat field for discard requests. tj: Refreshed on top of v4.17 and other previous updates. Signed-off-by: Michael Callahan <michaelcallahan@fb.com> Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Andy Newell <newella@fb.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-07-18 block: Add and use op_stat_group() for indexing disk_stat fields. (Michael Callahan)
Add and use a new op_stat_group() function for indexing partition stat fields rather than indexing them by rq_data_dir() or bio_data_dir(). This function works similarly to op_is_sync() in that it takes the request::cmd_flags or bio::bi_opf flags and determines which stats should get updated. In addition, the second parameter to generic_start_io_acct() and generic_end_io_acct() is now a REQ_OP rather than simply a read or write bit, and it uses op_stat_group() on the parameter to determine the stat group. Note that the partition in_flight counts are not part of the per-cpu statistics and as such are not indexed via this function. They are now indexed by op_is_write(). tj: Refreshed on top of v4.17. Updated to pass around REQ_OP. Signed-off-by: Michael Callahan <michaelcallahan@fb.com> Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Minchan Kim <minchan@kernel.org> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Joshua Morris <josh.h.morris@us.ibm.com> Cc: Philipp Reisner <philipp.reisner@linbit.com> Cc: Matias Bjorling <mb@lightnvm.io> Cc: Kent Overstreet <kent.overstreet@gmail.com> Cc: Alasdair Kergon <agk@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
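A minimal userspace sketch of mapping a request opcode to a stat group follows. The opcode values and all `SKETCH_*` names are assumptions made for illustration, and the discard group is included on the assumption that it is layered on top by the discard-tracking change in this series; this is not a copy of the kernel's definitions.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative request opcodes; the numeric values are assumptions. */
enum sketch_req_op {
    SKETCH_OP_READ = 0,
    SKETCH_OP_WRITE = 1,
    SKETCH_OP_DISCARD = 3,
};

/* Stat groups used to index per-partition counters. */
enum sketch_stat_group {
    SKETCH_STAT_READ = 0,
    SKETCH_STAT_WRITE = 1,
    SKETCH_STAT_DISCARD = 2,
    SKETCH_NR_STAT_GROUPS
};

/* In this encoding, opcodes that modify data have the low bit set. */
static bool sketch_op_is_write(unsigned int op)
{
    return (op & 1) != 0;
}

/* Map an opcode to the stat group whose counters should be updated. */
static int sketch_op_stat_group(unsigned int op)
{
    assert(op <= SKETCH_OP_DISCARD);
    if (op == SKETCH_OP_DISCARD)
        return SKETCH_STAT_DISCARD;
    return sketch_op_is_write(op) ? SKETCH_STAT_WRITE : SKETCH_STAT_READ;
}
```

In this encoding a discard also counts as a write for `sketch_op_is_write()`, which is why the discard check comes first. It also illustrates how a count indexed by the write test alone, such as in_flight in the message above, can keep two slots while the per-group statistics have three.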
2018-07-18 block: Define and use STAT_READ and STAT_WRITE (Michael Callahan)
Add defines for STAT_READ and STAT_WRITE for indexing the partition stat entries. This clarifies some fs/ code which has hardcoded 1 for STAT_WRITE and will make it easier to extend the stats with additional fields. tj: Refreshed on top of v4.17. Signed-off-by: Michael Callahan <michaelcallahan@fb.com> Signed-off-by: Tejun Heo <tj@kernel.org> Cc: "Theodore Ts'o" <tytso@mit.edu> Cc: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-07-18 block: Add part_stat_read_accum to read across field entries. (Michael Callahan)
Add a part_stat_read_accum macro to genhd.h to read and sum across field entries, for example to sum up the number of read and write sectors completed. In addition to being a reasonable cleanup by itself, this will make it easier to add new stat fields in the future. tj: Refreshed on top of v4.17. Signed-off-by: Michael Callahan <michaelcallahan@fb.com> Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
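The shape of such an accumulating macro can be sketched as follows. The `demo_*` and `DEMO_*` names and the counter struct are hypothetical; the real macro operates on the kernel's per-cpu partition statistics, which this sketch replaces with a plain struct.

```c
#include <assert.h>
#include <stddef.h>

enum { DEMO_STAT_READ, DEMO_STAT_WRITE, DEMO_NR_STAT_GROUPS };

/* Hypothetical per-partition counters, one slot per stat group. */
struct demo_part_stats {
    unsigned long sectors[DEMO_NR_STAT_GROUPS];
    unsigned long ios[DEMO_NR_STAT_GROUPS];
};

/* Read one counter slot, e.g. demo_stat_read(p, sectors[DEMO_STAT_READ]). */
#define demo_stat_read(part, field) ((part)->field)

/* Sum one counter across all of its stat groups. */
#define demo_stat_read_accum(part, field)              \
    (demo_stat_read(part, field[DEMO_STAT_READ]) +     \
     demo_stat_read(part, field[DEMO_STAT_WRITE]))

/* Example caller: total sectors completed, reads plus writes. */
static unsigned long demo_total_sectors(const struct demo_part_stats *p)
{
    assert(p != NULL);
    return demo_stat_read_accum(p, sectors);
}
```

Callers name only the field, so adding a further stat group later means extending the macro in one place rather than touching every call site.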
2018-07-18 block: make bdev_ops->rw_page() take a REQ_OP instead of bool (Tejun Heo)
c11f0c0b5bb9 ("block/mm: make bdev_ops->rw_page() take a bool for read/write") replaced @op with boolean @is_write, which limited the amount of information going into ->rw_page() and more importantly page_endio(), which removed the need to expose block internals to mm. Unfortunately, we want to track discards separately and @is_write isn't enough information. This patch updates bdev_ops->rw_page() to take REQ_OP instead but leaves page_endio() to take bool @is_write. This allows the block part of operations to have enough information while not leaking it to mm. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Mike Christie <mchristi@redhat.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-07-18 vfs: remove open_flags from d_real() (Miklos Szeredi)
Opening regular files on overlayfs is now handled via ovl_open(). Remove the now unused "open_flags" argument from d_op->d_real() and the d_real() helper. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2018-07-18 Revert "fsnotify: support overlayfs" (Miklos Szeredi)
This reverts commit f3fbbb079263bd29ae592478de6808db7e708267. Overlayfs now works correctly without adding hacks to fsnotify. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2018-07-18 Partially revert "locks: fix file locking on overlayfs" (Miklos Szeredi)
This partially reverts commit c568d68341be7030f5647def68851e469b21ca11. Overlayfs files will now automatically get the correct locks, no need to hack overlay support in VFS. It is a partial revert, because it leaves the locks_inode() calls in place and defines locks_inode() to file_inode(). We could revert those as well, but it would be unnecessary code churn and it makes sense to document that we are getting the inode for locking purposes. Don't revert MS_NOREMOTELOCK yet since that has been part of the userspace API for some time (though not in a useful way). Will try to remove internal flags later when the dust around the new mount API settles. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Acked-by: Jeff Layton <jlayton@kernel.org>
2018-07-18 Revert "vfs: add flags to d_real()" (Miklos Szeredi)
This reverts commit 495e642939114478a5237a7d91661ba93b76f15a. No users of the "flags" argument of d_real() remain. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2018-07-18 Revert "vfs: update ovl inode before relatime check" (Miklos Szeredi)
This reverts commit 598e3c8f72f5b77c84d2cb26cfd936ffb3cfdbaa. Overlayfs no longer relies on the vfs for correct atime handling. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2018-07-18 Revert "ovl: fix relatime for directories" (Miklos Szeredi)
This reverts commit cd91304e7190b4c4802f8e413ab2214b233e0260. Overlayfs no longer relies on the vfs for correct atime handling. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2018-07-18 vfs: export vfs_dedupe_file_range_one() to modules (Miklos Szeredi)
This is needed by the stacked dedupe implementation in overlayfs. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2018-07-18 vfs: export vfs_ioctl() to modules (Miklos Szeredi)
This is needed by the stacked ioctl implementation in overlayfs. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2018-07-18 vfs: make open_with_fake_path() not contribute to nr_files (Miklos Szeredi)
Stacking file operations in overlay will store an extra open file for each overlay file opened. The overhead is just that of "struct file", which is about 256 bytes, because overlay already pins an extra dentry and inode when the file is open, which add up to a much larger overhead. For fear of breaking working setups, don't start accounting the extra file. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2018-07-18 Merge branch 'dedupe-cleanup' into overlayfs-next (Miklos Szeredi)
Following series for stacking overlay files depends on this mini series.
2018-07-18 bpf: offload: allow program and map sharing per-ASIC (Jakub Kicinski)
Allow programs and maps to be re-used across different netdevs, as long as they belong to the same struct bpf_offload_dev. Update the bpf_offload_prog_map_match() helper for the verifier and export a new helper for the drivers to use when checking programs at attachment time. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-07-18 bpf: offload: keep the offload state per-ASIC (Jakub Kicinski)
Create a higher-level entity to represent a device/ASIC to allow programs and maps to be shared between device ports. The extra work is required to make sure we don't destroy BPF objects as soon as the netdev for which they were loaded gets destroyed, as other ports may still be using them. When netdev goes away all of its BPF objects will be moved to other netdevs of the device, and only destroyed when last netdev is unregistered. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-07-18 bpf: offload: aggregate offloads per-device (Jakub Kicinski)
Currently we have two lists of offloaded objects - programs and maps. The netdevice unregister notifier scans those lists to orphan objects associated with the device being unregistered. This puts an unnecessary (even if negligible) burden on all netdev unregister calls in a BPF-enabled kernel. The lists of objects may potentially get long, making the linear scan even more problematic. There haven't been complaints about this mechanism so far, but it is suboptimal. Instead of relying on notifiers, make the few BPF-capable drivers register explicitly for BPF offloads. The programs and maps will now be collected per-device, not on a global list, and only scanned for removal when the driver unregisters from BPF offloads. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-07-18 bpf: offload: rename bpf_offload_dev_match() to bpf_offload_prog_map_match() (Jakub Kicinski)
A set of new API functions exported for the drivers will soon use 'bpf_offload_dev_' as a prefix. Rename the bpf_offload_dev_match() which is internal to the core (used by the verifier) to avoid any confusion. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-07-18 bpf: bpf_prog_array_alloc() should return a generic non-rcu pointer (Roman Gushchin)
Currently the return type of bpf_prog_array_alloc() is struct bpf_prog_array __rcu *, which is not quite correct. Obviously, the returned pointer is a generic pointer, which is valid for an indefinite amount of time and is not shared with anyone else, so there is no sense in marking it as __rcu. This change eliminates the following sparse warnings:
  kernel/bpf/core.c:1544:31: warning: incorrect type in return expression (different address spaces)
  kernel/bpf/core.c:1544:31: expected struct bpf_prog_array [noderef] <asn:4>*
  kernel/bpf/core.c:1544:31: got void *
  kernel/bpf/core.c:1548:17: warning: incorrect type in return expression (different address spaces)
  kernel/bpf/core.c:1548:17: expected struct bpf_prog_array [noderef] <asn:4>*
  kernel/bpf/core.c:1548:17: got struct bpf_prog_array *<noident>
  kernel/bpf/core.c:1681:15: warning: incorrect type in assignment (different address spaces)
  kernel/bpf/core.c:1681:15: expected struct bpf_prog_array *array
  kernel/bpf/core.c:1681:15: got struct bpf_prog_array [noderef] <asn:4>*
Fixes: 324bda9e6c5a ("bpf: multi program support for cgroup+bpf") Signed-off-by: Roman Gushchin <guro@fb.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-07-18 integrity: prevent deadlock during digsig verification. (Mikhail Kurinnoi)
This patch aims to prevent a deadlock during digsig verification. The issue is that the user space utility modprobe and/or its dependencies (ld-*.so, libz.so.*, libc-*.so and /lib/modules/ files) could be used to load kernel modules during digsig verification and could themselves be signed by digsig at the same time. First of all, look at how crypto_alloc_tfm() works: crypto_alloc_tfm() will first attempt to locate an already loaded algorithm. If that fails and the kernel supports dynamically loadable modules, it will then attempt to load a module of the same name or alias. If that fails it will send a query to any loaded crypto manager to construct an algorithm on the fly. We have a situation where public_key_verify_signature(), in the RSA case, uses alg_name to store internal information in order to construct an algorithm on the fly, but crypto_larval_lookup() will try to use alg_name to load a kernel module with the same name. 1) We can't change how the crypto module works, since it is designed to work exactly this way; 2) we can't globally filter module requests for modprobe, since it is designed to work with any request. In this patch, I propose adding an exception for "crypto-pkcs1pad(rsa,*)" module requests only when integrity asymmetric keys support is enabled. Since we certainly don't have any real "crypto-pkcs1pad(rsa,*)" kernel modules, it is safe to fail such module requests from crypto_larval_lookup(). In this way we prevent modprobe execution during digsig verification and avoid a possible deadlock if modprobe and/or its dependencies are also signed with digsig. The requested "crypto-pkcs1pad(rsa,*)" kernel module name is formed by: 1) "pkcs1pad(rsa,%s)" in public_key_verify_signature(); 2) "crypto-%s" / "crypto-%s-all" in crypto_larval_lookup(). The "crypto-pkcs1pad(rsa," part of the request is constant and unique and can be used as a filter. Signed-off-by: Mikhail Kurinnoi <viewizard@viewizard.com> Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
  include/linux/integrity.h | 13 +++++++++++++
  security/integrity/digsig_asymmetric.c | 23 +++++++++++++++++++++++
  security/security.c | 7 ++++++-
  3 files changed, 42 insertions(+), 1 deletion(-)
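The prefix filter described above can be sketched as a plain string comparison. The function name and the choice of -EINVAL as the failure code are assumptions made for illustration; the commit message only states that such requests should fail.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Constant prefix produced by "crypto-%s" wrapped around "pkcs1pad(rsa,%s)". */
#define SKETCH_PKCS1PAD_PREFIX "crypto-pkcs1pad(rsa,"

/*
 * Hypothetical filter: return a negative error for module names that can
 * never correspond to a real module, or 0 to let the request proceed.
 */
static int sketch_filter_module_request(const char *kmod_name)
{
    assert(kmod_name != NULL);
    if (strncmp(kmod_name, SKETCH_PKCS1PAD_PREFIX,
                strlen(SKETCH_PKCS1PAD_PREFIX)) == 0)
        return -EINVAL;   /* assumed error code for this sketch */
    return 0;
}
```

Matching on the prefix alone covers both the "crypto-%s" and "crypto-%s-all" forms of the request, since they differ only after the hash name.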
2018-07-18 evm: Don't deadlock if a crypto algorithm is unavailable (Matthew Garrett)
When EVM attempts to appraise a file signed with a crypto algorithm the kernel doesn't have support for, it will cause the kernel to trigger a module load. If the EVM policy includes appraisal of kernel modules this will in turn call back into EVM - since EVM is holding a lock until the crypto initialisation is complete, this triggers a deadlock. Add a CRYPTO_NOLOAD flag and skip module loading if it's set, and add that flag in the EVM case in order to fail gracefully with an error message instead of deadlocking. Signed-off-by: Matthew Garrett <mjg59@google.com> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
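The effect of a "no load" flag on an algorithm lookup can be modelled in userspace. All names here, the flag value, and the pretend list of available algorithms are hypothetical; the sketch only shows the control flow of failing early instead of requesting a module.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_NOLOAD 0x1u        /* hypothetical flag value */

static int sketch_module_loads;   /* counts simulated module-load requests */

/* Pretend that only "sha256" is already available. */
static bool sketch_is_loaded(const char *name)
{
    return strcmp(name, "sha256") == 0;
}

/* Returns 0 if the algorithm is usable, -1 if it is unavailable. */
static int sketch_alg_lookup(const char *name, unsigned int flags)
{
    assert(name != NULL);
    if (sketch_is_loaded(name))
        return 0;
    if (flags & SKETCH_NOLOAD)
        return -1;               /* fail gracefully, no module load */
    sketch_module_loads++;       /* would call back into the appraisal path */
    return -1;
}
```

A caller that holds a lock during the lookup passes the flag, so an unknown algorithm produces an error rather than a module load that would re-enter the locked path.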
2018-07-18 netfilter: nf_tables: take module reference when starting a batch (Florian Westphal)
Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>