summaryrefslogtreecommitdiff
path: root/include
AgeCommit message (Collapse)Author
2019-06-27xsk: Add API to check for available entries in FQMaxim Mikityanskiy
Add a function that checks whether the Fill Ring has the specified amount of descriptors available. It will be useful for mlx5e that wants to check in advance, whether it can allocate a bulk of RX descriptors, to get the best performance. Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Acked-by: Saeed Mahameed <saeedm@mellanox.com> Acked-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-06-27Merge tag 'blk-dim-v2' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mamameed says: ==================== Generic DIM From: Tal Gilboa and Yamin Fridman Implement net DIM over a generic DIM library, add RDMA DIM dim.h lib exposes an implementation of the DIM algorithm for dynamically-tuned interrupt moderation for networking interfaces. We want a similar functionality for other protocols, which might need to optimize interrupts differently. Main motivation here is DIM for NVMf storage protocol. Current DIM implementation prioritizes reducing interrupt overhead over latency. Also, in order to reduce DIM's own overhead, the algorithm might take some time to identify it needs to change profiles. While this is acceptable for networking, it might not work well on other scenarios. Here we propose a new structure to DIM. The idea is to allow a slightly modified functionality without the risk of breaking Net DIM behavior for netdev. We verified there are no degradations in current DIM behavior with the modified solution. Suggested solution: - Common logic is implemented in lib/dim/dim.c - Net DIM (existing) logic is implemented in lib/dim/net_dim.c, which uses the common logic in dim.c - Any new DIM logic will be implemented in "lib/dim/new_dim.c". This new implementation will expose modified versions of profiles, dim_step() and dim_decision(). - DIM API is declared in include/linux/dim.h for all implementations. Pros for this solution are: - Zero impact on existing net_dim implementation and usage - Relatively more code reuse (compared to two separate solutions) - Increased extensibility ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-27mtd: spinand: Add initial support for Paragon PN26G0xAJeff Kletsky
Add initial support for Paragon Technology PN26G01Axxxxx and PN26G02Axxxxx SPI NAND Datasheets available at http://www.xtxtech.com/upfile/2016082517274590.pdf http://www.xtxtech.com/upfile/2016082517282329.pdf Signed-off-by: Jeff Kletsky <git-commits@allycomm.com> Reviewed-by: Frieder Schrempf <frieder.schrempf@kontron.de> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
2019-06-27mtd: Add flag to indicate panic_writeKamal Dasu
Added a flag to indicate a panic_write so that low level drivers can use it to take required action where applicable, to ensure oops data gets written to assigned mtd device. Signed-off-by: Kamal Dasu <kdasu.kdev@gmail.com> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
2019-06-27mtd: spinand: Add support for two-byte device IDsJeff Kletsky
The GigaDevice GD5F1GQ4UFxxG SPI NAND utilizes two-byte device IDs. http://www.gigadevice.com/datasheet/gd5f1gq4xfxxg/ Signed-off-by: Jeff Kletsky <git-commits@allycomm.com> Reviewed-by: Frieder Schrempf <frieder.schrempf@kontron.de> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
2019-06-27mtd: spinand: Define macros for page-read ops with three-byte addressesJeff Kletsky
The GigaDevice GD5F1GQ4UFxxG SPI NAND utilizes three-byte addresses for its page-read ops. http://www.gigadevice.com/datasheet/gd5f1gq4xfxxg/ Signed-off-by: Jeff Kletsky <git-commits@allycomm.com> Reviewed-by: Frieder Schrempf <frieder.schrempf@kontron.de> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
2019-06-27mtd: rawnand: gpmi: Implement exec_opSascha Hauer
The gpmi driver performance suffers from NAND operations being split in multiple small DMA transfers. This has been forced by the NAND layer in the former days, but now with exec_op we can use the controller as intended. With this patch gpmi_nfc_exec_op becomes the main entry point to NAND operations. Here all instructions are collected and chained as separate DMA transfers. In the end whole chain is fired and waited to be finished. gpmi_nfc_exec_op only does the hardware operations, bad block marker swapping and buffer scrambling is done by the callers. It's worth noting that the nand_*_op functions always take the buffer lengths for the data that the NAND chip actually transfers. When doing BCH we have to calculate the net data size from the raw data size in some places. This patch has been tested with 2048/64 and 2048/128 byte NAND on i.MX6q. mtd_oobtest, mtd_subpagetest and mtd_speedtest run without errors. nandbiterrs, nandpagetest and nandsubpagetest userspace tests from mtdutils run without errors and UBIFS can successfully be mounted. Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
2019-06-27dmaengine: mxs: rename custom flagSascha Hauer
The mxs dma driver uses the flags parameter in dmaengine_prep_slave_sg() for custom flags, but still uses the dmaengine specific names of the flags. Do a little bit better and at least give the flag a custom name. Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de> Acked-by: Vinod Koul <vkoul@kernel.org> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
2019-06-27dmaengine: mxs: Add header file to be shared with gpmi nand driverSascha Hauer
The mxs dma driver can do PIO transfers. A pointer to the PIO words to transfer is passed in the struct scatterlist * argument of dmaengine_prep_slave_sg(). It's quite ugly and non obvious to cast u32 * to struct scatterlist * each time when calling dmaengine_prep_slave_sg(), so add a static inline wrapper function to be called by the user along with a description what is going on. Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de> Acked-by: Vinod Koul <vkoul@kernel.org> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
2019-06-27mtd: rawnand: export NAND operation tracerSascha Hauer
The NAND core has a NAND operation tracing function, but it can only be used by drivers using the generic option parser from the NAND core. Export the tracing function as a static inline function in rawnand.h so that drivers implementing exec_op directly do not have to write their own operation tracing. Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de> Reviewed-by: Miquel Raynal <miquel.raynal@bootlin.com> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
2019-06-27mtd: onenand: Add support for 8Gb datasize onenandJonathan Bakker
Used in several S5PV210-based Galaxy S devices, among them SGH-T959V, SGH-T959P, SGH-T839, and SPH-D700. Signed-off-by: Jonathan Bakker <xc-racer2@live.ca> Signed-off-by: Paweł Chmiel <pawel.mikolaj.chmiel@gmail.com> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
2019-06-27Merge branch '40GbE' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue Jeff Kirsher says: ==================== Intel Wired LAN Driver Updates 2019-06-26 This series contains updates to ixgbe and i40e only. Mauro S. M. Rodrigues update the ixgbe driver to handle transceivers who comply with SFF-8472 but do not implement the Digital Diagnostic Monitoring (DOM) interface. Update the driver to check the necessary bits to see if DOM is implemented before trying to read the additional 256 bytes in the EEPROM for DOM data. Young Xiao fixes a potential divide by zero issue in ixgbe driver. Aleksandr fixes i40e to recognize 2.5 and 5.0 GbE link speeds so that it is not reported as "Unknown bps". Fixes the driver to read the firmware LLDP agent status during DCB initialization, and to properly log the LLDP agent status to help with debugging when DCB fails to initialize. Martyna fixes i40e for the missing supported and advertised link modes information in ethtool. Jake fixes a function header comment that was incorrect for a PTP function in i40e. Maciej fixes an issue for i40e when a XDP program is loaded the descriptor count gets reset to the default value, resolve the issue by making the current descriptor count persistent across resets. Alice corrects a copyright date which she found to be incorrect. Piotr adds a log entry when the traffic class 0 is added or deleted, which was not being logged previously. Gustavo A. R. Silva updates i40e to use struct_size() where possible. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-27mtd: Add support for HyperBus memory devicesVignesh Raghavendra
Cypress' HyperBus is Low Signal Count, High Performance Double Data Rate Bus interface between a host system master and one or more slave interfaces. HyperBus is used to connect microprocessor, microcontroller, or ASIC devices with random access NOR flash memory (called HyperFlash) or self refresh DRAM (called HyperRAM). Its a 8-bit data bus (DQ[7:0]) with Read-Write Data Strobe (RWDS) signal and either Single-ended clock(3.0V parts) or Differential clock (1.8V parts). It uses ChipSelect lines to select b/w multiple slaves. At bus level, it follows a separate protocol described in HyperBus specification[1]. HyperFlash follows CFI AMD/Fujitsu Extended Command Set (0x0002) similar to that of existing parallel NORs. Since HyperBus is x8 DDR bus, its equivalent to x16 parallel NOR flash with respect to bits per clock cycle. But HyperBus operates at >166MHz frequencies. HyperRAM provides direct random read/write access to flash memory array. But, HyperBus memory controllers seem to abstract implementation details and expose a simple MMIO interface to access connected flash. Add support for registering HyperFlash devices with MTD framework. MTD maps framework along with CFI chip support framework are used to support communicating with flash. Framework is modelled along the lines of spi-nor framework. HyperBus memory controller (HBMC) drivers calls hyperbus_register_device() to register a single HyperFlash device. HyperFlash core parses MMIO access information from DT, sets up the map_info struct, probes CFI flash and registers it with MTD framework. Some HBMC masters need calibration/training sequence[3] to be carried out, in order for DLL inside the controller to lock, by reading a known string/pattern. This is done by repeatedly reading CFI Query Identification String. Calibration needs to be done before trying to detect flash as part of CFI flash probe. HyperRAM is not supported at the moment. HyperBus specification can be found at[1] HyperFlash datasheet can be found at[2] [1] https://www.cypress.com/file/213356/download [2] https://www.cypress.com/file/213346/download [3] http://www.ti.com/lit/ug/spruid7b/spruid7b.pdf Table 12-5741. HyperFlash Access Sequence Signed-off-by: Vignesh Raghavendra <vigneshr@ti.com> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
2019-06-27mtd: cfi_cmdset_0002: Add support for polling status registerVignesh Raghavendra
HyperFlash devices are compliant with CFI AMD/Fujitsu Extended Command Set (0x0002) for flash operations, therefore drivers/mtd/chips/cfi_cmdset_0002.c can be used as is. But these devices do not support DQ polling method of determining chip ready/good status. These flashes provide Status Register whose bits can be polled to know status of flash operation. Cypress HyperFlash datasheet here[1], talks about CFI Amd/Fujitsu Extended Query version 1.5. Bit 0 of "Software Features supported" field of CFI Primary Vendor-Specific Extended Query table indicates presence/absence of status register and Bit 1 indicates whether or not DQ polling is supported. Using these bits, its possible to determine whether flash supports DQ polling or need to use Status Register. Add support for polling Status Register to know device ready/status of erase/write operations when DQ polling is not supported. Print error messages on erase/program failure by looking at related Status Register bits. [1] https://www.cypress.com/file/213346/download Signed-off-by: Vignesh Raghavendra <vigneshr@ti.com> Reviewed-by: Tokunori Ikegami <ikegami.t@gmail.com> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
2019-06-27batman-adv: mcast: detect, distribute and maintain multicast router presenceLinus Lüssing
To be able to apply our group aware multicast optimizations to packets with a scope greater than link-local we need to not only keep track of multicast listeners but also multicast routers. With this patch a node detects the presence of multicast routers on its segment by checking if /proc/sys/net/ipv{4,6}/conf/<bat0|br0(bat)>/mc_forwarding is set for one thing. This option is enabled by multicast routing daemons and needed for the kernel's multicast routing tables to receive and route packets. For another thing if a bridge is configured on top of bat0 then the presence of an IPv6 multicast router behind this bridge is currently detected by checking for an IPv6 multicast "All Routers Address" (ff02::2). This should later be replaced by querying the bridge, which performs proper, RFC4286 compliant Multicast Router Discovery (our simplified approach includes more hosts than necessary, most notably not just multicast routers but also unicast ones and is not applicable for IPv4). If no multicast router is detected then this is signalized via the new BATADV_MCAST_WANT_NO_RTR4 and BATADV_MCAST_WANT_NO_RTR6 multicast tvlv flags. Signed-off-by: Linus Lüssing <linus.luessing@c0d3.blue> Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
2019-06-27mm/hmm: Fix error flows in hmm_invalidate_range_startJason Gunthorpe
If the trylock on the hmm->mirrors_sem fails the function will return without decrementing the notifiers that were previously incremented. Since the caller will not call invalidate_range_end() on EAGAIN this will result in notifiers becoming permanently incremented and deadlock. If the sync_cpu_device_pagetables() required blocking the function will not return EAGAIN even though the device continues to touch the pages. This is a violation of the mmu notifier contract. Switch, and rename, the ranges_lock to a spin lock so we can reliably obtain it without blocking during error unwind. The error unwind is necessary since the notifiers count must be held incremented across the call to sync_cpu_device_pagetables() as we cannot allow the range to become marked valid by a parallel invalidate_start/end() pair while doing sync_cpu_device_pagetables(). Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Reviewed-by: Ralph Campbell <rcampbell@nvidia.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Tested-by: Philip Yang <Philip.Yang@amd.com>
2019-06-27arm_pmu: acpi: spe: Add initial MADT/SPE probingJeremy Linton
ACPI 6.3 adds additional fields to the MADT GICC structure to describe SPE PPI's. We pick these out of the cached reference to the madt_gicc structure similarly to the core PMU code. We then create a platform device referring to the IRQ and let the user/module loader decide whether to load the SPE driver. Tested-by: Hanjun Guo <hanjun.guo@linaro.org> Reviewed-by: Sudeep Holla <sudeep.holla@arm.com> Reviewed-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Signed-off-by: Jeremy Linton <jeremy.linton@arm.com> Signed-off-by: Will Deacon <will@kernel.org>
2019-06-27ACPI/PPTT: Add function to return ACPI 6.3 Identical tokensJeremy Linton
ACPI 6.3 adds a flag to indicate that child nodes are all identical cores. This is useful to authoritatively determine if a set of (possibly offline) cores are identical or not. Since the flag doesn't give us a unique id we can generate one and use it to create bitmaps of sibling nodes, or simply in a loop to determine if a subset of cores are identical. Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Tested-by: Hanjun Guo <hanjun.guo@linaro.org> Reviewed-by: Sudeep Holla <sudeep.holla@arm.com> Signed-off-by: Jeremy Linton <jeremy.linton@arm.com> Signed-off-by: Will Deacon <will@kernel.org>
2019-06-27siox: Add helper macro to simplify driver registrationEnrico Weigelt
Add more helper macros for trivial driver init cases, similar to the already existing module_platform_driver() or module_i2c_driver(). This helps to reduce driver init boilerplate. Signed-off-by: Enrico Weigelt <info@metux.net> Acked-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2019-06-27gpio: Add comments on #if/#else/#endifEnrico Weigelt
Improve readability a bit by commenting #if/#else/#endif statements with the checked preprocessor symbols. Signed-off-by: Enrico Weigelt <info@metux.net> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2019-06-27regulator: rk808: Add RK809 and RK817 support.Heiko Stuebner
Add support for the rk809 and rk817 regulator driver. Their specifications are as follows: 1. The RK809 and RK809 consist of 5 DCDCs, 9 LDOs and have the same registers for these components except dcdc5. 2. The dcdc5 is a boost dcdc for RK817 and is a buck for RK809. 3. The RK817 has one switch but The Rk809 has two. The output voltages are configurable and are meant to supply power to the main processor and other components. Signed-off-by: Tony Xie <tony.xie@rock-chips.com> Acked-by: Mark Brown <broonie@kernel.org> [rebased on top of 5.2-rc1] Signed-off-by: Heiko Stuebner <heiko@sntech.de> Signed-off-by: Lee Jones <lee.jones@linaro.org>
2019-06-27mfd: rk808: Add RK817 and RK809 supportTony Xie
The RK809 and RK817 are a Power Management IC (PMIC) for multimedia and handheld devices. They contains the following components: - Regulators - RTC - Clocking Both RK809 and RK817 chips are using a similar register map, so we can reuse the RTC and Clocking functionality. Most of regulators have a some implementation also. Signed-off-by: Tony Xie <tony.xie@rock-chips.com> Acked-by: Stephen Boyd <sboyd@kernel.org> Signed-off-by: Lee Jones <lee.jones@linaro.org>
2019-06-27block: Remove unused codeDamien Le Moal
bio_flush_dcache_pages() is unused. Remove it. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-06-27docs: thermal: convert to ReSTMauro Carvalho Chehab
Rename the thermal documentation files to ReST, add an index for them and adjust in order to produce a nice html output via the Sphinx build system. At its new index.rst, let's add a :orphan: while this is not linked to the main index.rst file, in order to avoid build warnings. Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org> Acked-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2019-06-27thermal/drivers/core: Add init section table for self-encapsulationDaniel Lezcano
Currently the governors are declared in their respective files but they export their [un]register functions which in turn call the [un]register governors core's functions. That implies a cyclic dependency which is not desirable. There is a way to self-encapsulate the governors by letting them to declare themselves in a __init section table. Define the table in the asm generic linker description like the other tables and provide the specific macros to deal with. Reviewed-by: Amit Kucheria <amit.kucheria@linaro.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2019-06-27media: cec-notifier: add new notifier functionsHans Verkuil
In order to support multiple CEC devices for an HDMI connector, and to support cec_connector_info, drivers should use either a cec_notifier_conn_(un)register pair of functions (HDMI drivers) or a cec_notifier_cec_adap_(un)register pair (CEC adapter drivers). This replaces cec_notifier_get_conn/cec_notifier_put. For CEC adapters it is also no longer needed to call cec_notifier_register, cec_register_cec_notifier and cec_notifier_unregister. This is now all handled internally by the new functions. Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl> Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
2019-06-27media: cec: add struct cec_connector_info supportDariusz Marcinkiewicz
Define struct cec_connector_info in media/cec.h and define CEC_CAP_CONNECTOR_INFO. In a later patch this will be moved to uapi/linux/cec.h. The CEC_CAP_CONNECTOR_INFO capability can be set by drivers, but cec_allocate_adapter() will remove it again until the public API for this can be enabled once all drm drivers wire this up correctly. Also add the cec_fill_conn_info_from_drm and cec_s_conn_info functions, which are needed by drm drivers to fill in the cec_connector info based on a drm_connector. The cec_notifier_(un)register and cec_register_cec_notifier prototypes were moved from cec-notifier.h to cec.h since cec.h no longer includes cec-notifier.h. These headers included each other before, which caused various problems. Due to these changes the seco-cec driver was changed as well: it should include cec-notifier.h, not cec.h. Signed-off-by: Dariusz Marcinkiewicz <darekm@google.com> Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl> Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
2019-06-27ACPI / PM: Introduce concept of a _PR0 dependent deviceMika Westerberg
If there are shared power resources between otherwise unrelated devices turning them on causes the other devices sharing them to be powered up as well. In case of PCI devices go into D0uninitialized state meaning that if they were configured to trigger wake that configuration is lost at this point. For this reason introduce a concept of "_PR0 dependent device" that can be added to any ACPI device that has power resources. The dependent device will be included in a list of dependent devices for all power resources returned by the ACPI device's _PR0 (assuming it has one). Whenever a power resource having dependent devices is turned physically on (its _ON method is called) we runtime resume all of them to allow their driver or in case of PCI the PCI core to re-initialize the device and its wake configuration. This adds two functions that can be used to add and remove these dependent devices. Note the dependent device does not necessary need share power resources so this functionality can be used to add "software dependencies" as well if needed. Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2019-06-27mfd: bd70528: Support ROHM bd70528 PMIC coreMatti Vaittinen
ROHM BD70528MWV is an ultra-low quiescent current general purpose single-chip power management IC for battery-powered portable devices. Add MFD core which enables chip access for following subdevices: - regulators/LED drivers - battery-charger - gpios - 32.768kHz clk - RTC - watchdog Signed-off-by: Matti Vaittinen <matti.vaittinen@fi.rohmeurope.com> Signed-off-by: Lee Jones <lee.jones@linaro.org>
2019-06-27mfd: regulator: clk: Split rohm-bd718x7.hMatti Vaittinen
Split the bd718x7.h to ROHM common and bd718x7 specific parts so that we do not need to add same things in every new ROHM PMIC header. Please note that this change requires changes also in bd718x7 sub-device drivers for regulators and clk. Signed-off-by: Matti Vaittinen <matti.vaittinen@fi.rohmeurope.com> Acked-by: Mark Brown <broonie@kernel.org> Acked-by: Stephen Boyd <sboyd@kernel.org> Signed-off-by: Lee Jones <lee.jones@linaro.org>
2019-06-27clk: rockchip: add clock id for hdmi_phy special clock on rk3228Heiko Stuebner
Add the needed clock id to enable clock settings from devicetree. Signed-off-by: Heiko Stuebner <heiko@sntech.de> Tested-by: Justin Swartz <justin.swartz@risingedge.co.za>
2019-06-27clk: rockchip: add clock id for watchdog pclk on rk3328Heiko Stuebner
Needed to export that added clock. Signed-off-by: Heiko Stuebner <heiko@sntech.de>
2019-06-27Merge branch '5.3/scsi-sg' into scsi-nextMartin K. Petersen
2019-06-26fs/adfs: correct disc record structureRussell King
Fill in some padding in the disc record structure, and add GCC packed and aligned attributes to ensure that it is correctly laid out. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-06-26dt-bindings: clock: sifive: add MIT license as an option for the header filePaul Walmsley
At Bin Meng's request, add the MIT license as an option for the SiFive FU540 PRCI header file. Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com> Cc: Bin Meng <bmeng.cn@gmail.com>
2019-06-26PCI: PM: Avoid skipping bus-level PM on platforms without ACPIRafael J. Wysocki
There are platforms that do not call pm_set_suspend_via_firmware(), so pm_suspend_via_firmware() returns 'false' on them, but the power states of PCI devices (PCIe ports in particular) are changed as a result of powering down core platform components during system-wide suspend. Thus the pm_suspend_via_firmware() checks in pci_pm_suspend_noirq() and pci_pm_resume_noirq() introduced by commit 3e26c5feed2a ("PCI: PM: Skip devices in D0 for suspend-to- idle") are not sufficient to determine that devices left in D0 during suspend will remain in D0 during resume and so the bus-level power management can be skipped for them. For this reason, introduce a new global suspend flag, PM_SUSPEND_FLAG_NO_PLATFORM, set it for suspend-to-idle only and replace the pm_suspend_via_firmware() checks mentioned above with checks against this flag. Fixes: 3e26c5feed2a ("PCI: PM: Skip devices in D0 for suspend-to-idle") Reported-by: Jon Hunter <jonathanh@nvidia.com> Tested-by: Jon Hunter <jonathanh@nvidia.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Tested-by: Mika Westerberg <mika.westerberg@linux.intel.com> Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
2019-06-26ipv6: constify rt6_nexthop()Nicolas Dichtel
There is no functional change in this patch, it only prepares the next one. rt6_nexthop() will be used by ip6_dst_lookup_neigh(), which uses const variables. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Reported-by: kbuild test robot <lkp@intel.com> Acked-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-26Allow 0.0.0.0/8 as a valid address rangeDave Taht
The longstanding prohibition against using 0.0.0.0/8 dates back to two issues with the early internet. There was an interoperability problem with BSD 4.2 in 1984, fixed in BSD 4.3 in 1986. BSD 4.2 has long since been retired. Secondly, addresses of the form 0.x.y.z were initially defined only as a source address in an ICMP datagram, indicating "node number x.y.z on this IPv4 network", by nodes that know their address on their local network, but do not yet know their network prefix, in RFC0792 (page 19). This usage of 0.x.y.z was later repealed in RFC1122 (section 3.2.2.7), because the original ICMP-based mechanism for learning the network prefix was unworkable on many networks such as Ethernet (which have longer addresses that would not fit into the 24 "node number" bits). Modern networks use reverse ARP (RFC0903) or BOOTP (RFC0951) or DHCP (RFC2131) to find their full 32-bit address and CIDR netmask (and other parameters such as default gateways). 0.x.y.z has had 16,777,215 addresses in 0.0.0.0/8 space left unused and reserved for future use, since 1989. This patch allows for these 16m new IPv4 addresses to appear within a box or on the wire. Layer 2 switches don't care. 0.0.0.0/32 is still prohibited, of course. Signed-off-by: Dave Taht <dave.taht@gmail.com> Signed-off-by: John Gilmore <gnu@toad.com> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-26keys: Network namespace domain tagDavid Howells
Create key domain tags for network namespaces and make it possible to automatically tag keys that are used by networked services (e.g. AF_RXRPC, AFS, DNS) with the default network namespace if not set by the caller. This allows keys with the same description but in different namespaces to coexist within a keyring. Signed-off-by: David Howells <dhowells@redhat.com> cc: netdev@vger.kernel.org cc: linux-nfs@vger.kernel.org cc: linux-cifs@vger.kernel.org cc: linux-afs@lists.infradead.org
2019-06-26keys: Garbage collect keys for which the domain has been removedDavid Howells
If a key operation domain (such as a network namespace) has been removed then attempt to garbage collect all the keys that use it. Signed-off-by: David Howells <dhowells@redhat.com>
2019-06-26keys: Include target namespace in match criteriaDavid Howells
Currently a key has a standard matching criteria of { type, description } and this is used to only allow keys with unique criteria in a keyring. This means, however, that you cannot have keys with the same type and description but a different target namespace in the same keyring. This is a potential problem for a containerised environment where, say, a container is made up of some parts of its mount space involving netfs superblocks from two different network namespaces. This is also a problem for shared system management keyrings such as the DNS records keyring or the NFS idmapper keyring that might contain keys from different network namespaces. Fix this by including a namespace component in a key's matching criteria. Keyring types are marked to indicate which, if any, namespace is relevant to keys of that type, and that namespace is set when the key is created from the current task's namespace set. The capability bit KEYCTL_CAPS1_NS_KEY_TAG is set if the kernel is employing this feature. Signed-off-by: David Howells <dhowells@redhat.com>
2019-06-26keys: Move the user and user-session keyrings to the user_namespaceDavid Howells
Move the user and user-session keyrings to the user_namespace struct rather than pinning them from the user_struct struct. This prevents these keyrings from propagating across user-namespaces boundaries with regard to the KEY_SPEC_* flags, thereby making them more useful in a containerised environment. The issue is that a single user_struct may be represent UIDs in several different namespaces. The way the patch does this is by attaching a 'register keyring' in each user_namespace and then sticking the user and user-session keyrings into that. It can then be searched to retrieve them. Signed-off-by: David Howells <dhowells@redhat.com> cc: Jann Horn <jannh@google.com>
2019-06-26keys: Namespace keyring namesDavid Howells
Keyring names are held in a single global list that any process can pick from by means of keyctl_join_session_keyring (provided the keyring grants Search permission). This isn't very container friendly, however. Make the following changes: (1) Make default session, process and thread keyring names begin with a '.' instead of '_'. (2) Keyrings whose names begin with a '.' aren't added to the list. Such keyrings are system specials. (3) Replace the global list with per-user_namespace lists. A keyring adds its name to the list for the user_namespace that it is currently in. (4) When a user_namespace is deleted, it just removes itself from the keyring name list. The global keyring_name_lock is retained for accessing the name lists. This allows (4) to work. This can be tested by: # keyctl newring foo @s 995906392 # unshare -U $ keyctl show ... 995906392 --alswrv 65534 65534 \_ keyring: foo ... $ keyctl session foo Joined session keyring: 935622349 As can be seen, a new session keyring was created. The capability bit KEYCTL_CAPS1_NS_KEYRING_NAME is set if the kernel is employing this feature. Signed-off-by: David Howells <dhowells@redhat.com> cc: Eric W. Biederman <ebiederm@xmission.com>
2019-06-26keys: Add a 'recurse' flag for keyring searchesDavid Howells
Add a 'recurse' flag for keyring searches so that the flag can be omitted and recursion disabled, thereby allowing just the nominated keyring to be searched and none of the children. Signed-off-by: David Howells <dhowells@redhat.com>
2019-06-26keys: Cache the hash value to avoid lots of recalculationDavid Howells
Cache the hash of the key's type and description in the index key so that we're not recalculating it every time we look at a key during a search. The hash function does a bunch of multiplications, so evading those is probably worthwhile - especially as this is done for every key examined during a search. This also allows the methods used by assoc_array to get chunks of index-key to be simplified. Signed-off-by: David Howells <dhowells@redhat.com>
2019-06-26keys: Simplify key description managementDavid Howells
Simplify key description management by cramming the word containing the length with the first few chars of the description also. This simplifies the code that generates the index-key used by assoc_array. It should speed up key searching a bit too. Signed-off-by: David Howells <dhowells@redhat.com>
2019-06-26keys: Kill off request_key_async{,_with_auxdata}David Howells
Kill off request_key_async{,_with_auxdata}() as they're not currently used. Signed-off-by: David Howells <dhowells@redhat.com>
2019-06-26ipv4: reset rt_iif for recirculated mcast/bcast out pktsStephen Suryaputra
Multicast or broadcast egress packets have rt_iif set to the oif. These packets might be recirculated back as input and lookup to the raw sockets may fail because they are bound to the incoming interface (skb_iif). If rt_iif is not zero, during the lookup, inet_iif() function returns rt_iif instead of skb_iif. Hence, the lookup fails. v2: Make it non vrf specific (David Ahern). Reword the changelog to reflect it. Signed-off-by: Stephen Suryaputra <ssuryaextr@gmail.com> Reviewed-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-26net/mlx5e: Specifying known origin of packets matching the flowJianbo Liu
In vport metadata matching, source port number is replaced by metadata. While FW has no idea about what it is in the metadata, a syndrome will happen. Specify a known origin to avoid the syndrome. However, there is no functional change because ANY_VPORT (0) is filled in flow_source, the same default value as before, as a pre-step towards metadata matching for fast path. There are two other values can be filled in flow_source. When setting 0x1, packet matching this rule is from uplink, while 0x2 is for packet from other local vports. Signed-off-by: Jianbo Liu <jianbol@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-06-26net/mlx5: E-Switch, Tag packet with vport number in VF vports and uplink ↵Jianbo Liu
ingress ACLs When a dual-port VHCA sends a RoCE packet on its non-native port, and the packet arrives to its affiliated vport FDB, a mismatch might occur on the rules that match the packet source vport as it is not represented by single VHCA only in this case. So we change to match on metadata instead of source vport. To do that, a rule is created in all vports and uplink ingress ACLs, to save the source vport number and vhca id in the packet's metadata in order to match on it later. The metadata register used is the first of the 32-bit type C registers. It can be used for matching and header modify operations. The higher 16 bits of this register are for vhca id, and the lower 16 ones is for vport number. This change is not for dual-port RoCE only. If HW and FW allow, the vport metadata matching is enabled by default. Signed-off-by: Jianbo Liu <jianbol@mellanox.com> Reviewed-by: Eli Britstein <elibr@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>