summaryrefslogtreecommitdiff
path: root/include
AgeCommit message (Collapse)Author
2017-06-19dm kcopyd: add sequential write featureDamien Le Moal
When copyying blocks to host-managed zoned block devices, writes must be sequential. However, dm_kcopyd_copy() does not guarantee this as writes are issued in the completion order of reads, and reads may complete out of order despite being issued sequentially. Fix this by introducing the DM_KCOPYD_WRITE_SEQ feature flag. This can be specified when calling dm_kcopyd_copy() and should be set automatically if one of the destinations is a host-managed zoned block device. For a split job, the master job maintains the write position at which writes must be issued. This is checked with the pop() function which is modified to not return any write I/O sub job that is not at the correct write position. When DM_KCOPYD_WRITE_SEQ is specified for a job, errors cannot be ignored and the flag DM_KCOPYD_IGNORE_ERROR is ignored, even if specified by the user. Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2017-06-19dm: introduce dm_remap_zone_report()Damien Le Moal
A target driver support zoned block devices and exposing it as such may receive REQ_OP_ZONE_REPORT request for the user to determine the mapped device zone configuration. To process properly such request, the target driver may need to remap the zone descriptors provided in the report reply. The helper function dm_remap_zone_report() does this generically using only the target start offset and length and the start offset within the target device. dm_remap_zone_report() will remap the start sector of all zones reported. If the report includes sequential zones, the write pointer position of these zones will also be remapped. Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2017-06-19dm table: add zoned block devices validationDamien Le Moal
1) Introduce DM_TARGET_ZONED_HM feature flag: The target drivers currently available will not operate correctly if a table target maps onto a host-managed zoned block device. To avoid problems, introduce the new feature flag DM_TARGET_ZONED_HM to allow a target to explicitly state that it supports host-managed zoned block devices. This feature is checked for all targets in a table if any of the table's block devices are host-managed. Note that as host-aware zoned block devices are backward compatible with regular block devices, they can be used by any of the current target types. This new feature is thus restricted to host-managed zoned block devices. 2) Check device area zone alignment: If a target maps to a zoned block device, check that the device area is aligned on zone boundaries to avoid problems with REQ_OP_ZONE_RESET operations (resetting a partially mapped sequential zone would not be possible). This also facilitates the processing of zone report with REQ_OP_ZONE_REPORT bios. 3) Check block devices zone model compatibility When setting the DM device's queue limits, several possibilities exists for zoned block devices: 1) The DM target driver may want to expose a different zone model (e.g. host-managed device emulation or regular block device on top of host-managed zoned block devices) 2) Expose the underlying zone model of the devices as-is To allow both cases, the underlying block device zone model must be set in the target limits in dm_set_device_limits() and the compatibility of all devices checked similarly to the logical block size alignment. For this last check, introduce validate_hardware_zoned_model() to check that all targets of a table have the same zone model and that the zone size of the target devices are equal. Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com> [Mike Snitzer refactored Damien's original work to simplify the code] Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2017-06-19dm: convert DM printk macros to pr_<level> macrosJoe Perches
Using pr_<level> is the more common logging style. Standardize style and use new macro DM_FMT. Use no_printk in DMDEBUG macros when CONFIG_DM_DEBUG is not #defined. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2017-06-19dm ioctl: add a new DM_DEV_ARM_POLL ioctlMikulas Patocka
This ioctl will record the current global event number in the structure dm_file, so that next select or poll call will wait until new events arrived since this ioctl. The DM_DEV_ARM_POLL ioctl has the same effect as closing and reopening the handle. Using the DM_DEV_ARM_POLL ioctl is optional - if the userspace is OK with closing and reopening the /dev/mapper/control handle after select or poll, there is no need to re-arm via ioctl. Usage: 1. open the /dev/mapper/control device 2. send the DM_DEV_ARM_POLL ioctl 3. scan the event numbers of all devices we are interested in and process them 4. call select, poll or epoll on the handle (it waits until some new event happens since the DM_DEV_ARM_POLL ioctl) 5. go to step 2 Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Andy Grover <agrover@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2017-06-19mfd: intel_soc_pmic_bxtwc: Use chained IRQs for second level IRQ chipsKuppuswamy Sathyanarayanan
Whishkey cove PMIC has support to mask/unmask interrupts at two levels. At first level we can mask/unmask interrupt domains like TMU, GPIO, ADC, CHGR, BCU THERMAL and PWRBTN and at second level, it provides facility to mask/unmask individual interrupts belong each of this domain. For example, in case of TMU, at first level we have TMU interrupt domain, and at second level we have two interrupts, wake alarm, system alarm that belong to the TMU interrupt domain. Currently, in this driver all first level IRQs are registered as part of IRQ chip(bxtwc_regmap_irq_chip). By default, after you register the IRQ chip from your driver, all IRQs in that chip will masked and can only be enabled if that IRQ is requested using request_irq() call. This is the default Linux IRQ behavior model. And whenever a dependent device that belongs to PMIC requests only the second level IRQ and not explicitly unmask the first level IRQ, then in essence the second level IRQ will still be disabled. For example, if TMU device driver request wake_alarm IRQ and not explicitly unmask TMU level 1 IRQ then according to the default Linux IRQ model, wake_alarm IRQ will still be disabled. So the proper solution to fix this issue is to use the chained IRQ chip concept. We should chain all the second level chip IRQs to the corresponding first level IRQ. To do this, we need to create separate IRQ chips for every group of second level IRQs. In case of TMU, when adding second level IRQ chip, instead of using PMIC IRQ we should use the corresponding first level IRQ. So the following code will change from ret = regmap_add_irq_chip(pmic->regmap, pmic->irq, ...) to, virq = regmap_irq_get_virq(&pmic->irq_chip_data, BXTWC_TMU_LVL1_IRQ); ret = regmap_add_irq_chip(pmic->regmap, virq, ...) In case of Whiskey Cove Type-C driver, Since USBC IRQ is moved under charger level2 IRQ chip. We should use charger IRQ chip(irq_chip_data_chgr) to get the USBC virtual IRQ number. Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com> Revieved-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Signed-off-by: Lee Jones <lee.jones@linaro.org>
2017-06-19mm: larger stack guard gap, between vmasHugh Dickins
Stack guard page is a useful feature to reduce a risk of stack smashing into a different mapping. We have been using a single page gap which is sufficient to prevent having stack adjacent to a different mapping. But this seems to be insufficient in the light of the stack usage in userspace. E.g. glibc uses as large as 64kB alloca() in many commonly used functions. Others use constructs liks gid_t buffer[NGROUPS_MAX] which is 256kB or stack strings with MAX_ARG_STRLEN. This will become especially dangerous for suid binaries and the default no limit for the stack size limit because those applications can be tricked to consume a large portion of the stack and a single glibc call could jump over the guard page. These attacks are not theoretical, unfortunatelly. Make those attacks less probable by increasing the stack guard gap to 1MB (on systems with 4k pages; but make it depend on the page size because systems with larger base pages might cap stack allocations in the PAGE_SIZE units) which should cover larger alloca() and VLA stack allocations. It is obviously not a full fix because the problem is somehow inherent, but it should reduce attack space a lot. One could argue that the gap size should be configurable from userspace, but that can be done later when somebody finds that the new 1MB is wrong for some special case applications. For now, add a kernel command line option (stack_guard_gap) to specify the stack gap size (in page units). Implementation wise, first delete all the old code for stack guard page: because although we could get away with accounting one extra page in a stack vma, accounting a larger gap can break userspace - case in point, a program run with "ulimit -S -v 20000" failed when the 1MB gap was counted for RLIMIT_AS; similar problems could come with RLIMIT_MLOCK and strict non-overcommit mode. Instead of keeping gap inside the stack vma, maintain the stack guard gap as a gap between vmas: using vm_start_gap() in place of vm_start (or vm_end_gap() in place of vm_end if VM_GROWSUP) in just those few places which need to respect the gap - mainly arch_get_unmapped_area(), and and the vma tree's subtree_gap support for that. Original-patch-by: Oleg Nesterov <oleg@redhat.com> Original-patch-by: Michal Hocko <mhocko@suse.com> Signed-off-by: Hugh Dickins <hughd@google.com> Acked-by: Michal Hocko <mhocko@suse.com> Tested-by: Helge Deller <deller@gmx.de> # parisc Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-06-19crypto: engine - replace pr_xxx by dev_xxxCorentin LABBE
By adding a struct device *dev to struct engine, we could store the device used at register time and so use all dev_xxx functions instead of pr_xxx. Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2017-06-18Merge tag 'at91-ab-4.13-soc' of ↵Olof Johansson
git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux into next/soc SoC for 4.13: - New suspend/resume mode for sama5d2 - Initial support for armv7m based SoCs * tag 'at91-ab-4.13-soc' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux: ARM: at91: remove atmel_nand_data ARM: at91: fix at91_suspend_entering_slow_clock link error ARM: at91: debug: add samv7x support ARM: at91: add armv7m SoC detection ARM: at91: handle CONFIG_PM for armv7m configurations ARM: at91: Add armv7m support ARM: at91: Document armv7m compatibles ARM: at91: Documentation: add armv7m families ARM: at91: pm: fallback to slowclock when backup mode fails ARM: at91: pm: allow selecting standby and suspend modes ARM: at91: pm: Add sama5d2 backup mode Signed-off-by: Olof Johansson <olof@lixom.net>
2017-06-18Merge tag 'renesas-dt-bindings2-for-v4.13' of ↵Olof Johansson
https://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas into next/dt Second Round of Renesas ARM Based SoC DT Bindings Updates for v4.13 * Document: - Add Renesas H3-based Salvator-XS board DT bindings - Add iW-RainboW-G20D-Qseven-RZG1M board - Add iW-RainboW-G20M-Qseven-RZG1M system on module - Update R-Car Gen3 ULCB board part numbers * Add clock bit definitions for r7s72100 SoC * tag 'renesas-dt-bindings2-for-v4.13' of https://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas: ARM: shmobile: Document Renesas H3-based Salvator-XS board DT bindings ARM: shmobile: Update R-Car Gen3 ULCB board part numbers ARM: shmobile: document iW-RainboW-G20D-Qseven-RZG1M board ARM: shmobile: document iW-RainboW-G20M-Qseven-RZG1M system on module ARM: dts: r7s72100: add clock bit definitions Signed-off-by: Olof Johansson <olof@lixom.net>
2017-06-18Merge tag 'renesas-drivers-for-v4.13' of ↵Olof Johansson
https://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas into next/drivers Renesas ARM Based SoC Drivers Updates for v4.13 * Rework Kconfig and Makefile logic * tag 'renesas-drivers-for-v4.13' of https://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas: soc: renesas: Rework Kconfig and Makefile logic Signed-off-by: Olof Johansson <olof@lixom.net>
2017-06-19Merge tag 'mac80211-for-davem-2017-06-16' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211 Johannes Berg says: ==================== Here's just the fix for that ancient bug: * remove wext calling ndo_do_ioctl, since nobody needs that now and it makes the type change easier * use struct iwreq instead of struct ifreq almost everywhere in wireless extensions code * copy only struct iwreq from userspace in dev_ioctl for the wireless extensions, since it's smaller than struct ifreq ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-18Merge tag 'tegra-for-4.13-soc' of ↵Olof Johansson
git://git.kernel.org/pub/scm/linux/kernel/git/tegra/linux into next/drivers soc/tegra: Changes for v4.13-rc1 This contains an implementation of generic PM domains for Tegra186, based on the BPMP powergate request. * tag 'tegra-for-4.13-soc' of git://git.kernel.org/pub/scm/linux/kernel/git/tegra/linux: soc/tegra: flowctrl: Fix error handling soc/tegra: bpmp: Implement generic PM domains soc/tegra: bpmp: Update ABI header PM / Domains: Allow overriding the ->xlate() callback Signed-off-by: Olof Johansson <olof@lixom.net>
2017-06-18Merge tag 'scpi-updates-4.13' of ↵Olof Johansson
git://git.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux into next/drivers SCPI update for v4.13 Adds support to get DVFS transition latency and OPP for any device whose DVFS are managed by SCPI. This avoids code duplication in both cpufreq and devfreq SCPI drivers. * tag 'scpi-updates-4.13' of git://git.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux: cpufreq: scpi: use new scpi_ops functions to remove duplicate code firmware: arm_scpi: add support to populate OPPs and get transition latency Signed-off-by: Olof Johansson <olof@lixom.net>
2017-06-18Merge tag 'arm-soc/for-4.13/devicetree-arm64' of ↵Olof Johansson
http://github.com/Broadcom/stblinux into next/dt64 This pull request contains Broadcom ARM64-based SoCs Device Tree changes for 4.13. Please note the following from Eric: I've based this summary on the bcm2835-dt-next tag, to clarify what's in this patch series, but it does require being careful since it involves a cross-merge between branches. - Anup documents the Broadcom Stingray binding, common clocks, adds initial support for the Stingray DTSI and DTS files and adds support for the PL022, PL330 and SP805 - Sandeep adds the clock nodes to the Stingray Device Tree nodes - Pramod adds support for the NAND, pinctrl, GPIO to the Stingray Device Tree nodes - Oza adds I2C Device Tree nodes to the Stingray DTSes - Srinath adds PWM and SDHCI Device Tree nodes for the Stingray SoC - Ravijeta adds support for the USB Dual Role PHY on Northstar 2 - Gerd starts adding references to the sdhost and sdhci controllers, and then switches the sdcard to to use the SDHOST (faster than SDHCI) - Stefan defines the BCM2837 thermal coefficients in order for the Raspberry Pi thermal driver to work correctly * tag 'arm-soc/for-4.13/devicetree-arm64' of http://github.com/Broadcom/stblinux: arm64: dts: NS2: Add USB DRD PHY device tree node ARM64: dts: bcm2837: Define CPU thermal coefficients arm64: dts: Add PWM and SDHCI DT nodes for Stingray SOC arm64: dts: Add PL022, PL330 and SP805 DT nodes for Stingray arm64: dts: Add I2C DT nodes for Stingray SoC arm64: dts: Add GPIO DT nodes for Stingray SOC arm64: dts: Add pinctrl DT nodes for Stingray SOC arm64: dts: Add NAND DT nodes for Stingray SOC arm64: dts: Add clock DT nodes for Stingray SOC arm64: dts: Initial DTS files for Broadcom Stingray SOC dt-bindings: clk: Extend binding doc for Stingray SOC dt-bindings: bcm: Add Broadcom Stingray bindings document ARM: dts: bcm283x: switch from &sdhci to &sdhost arm64: dts: bcm2837: add &sdhci and &sdhost ARM: dts: bcm283x: Add CPU thermal zone with 1 trip point ARM: dts: Add devicetree for the Raspberry Pi 3, for arm32 (v6) Signed-off-by: Olof Johansson <olof@lixom.net>
2017-06-18Merge tag 'v4.12-next-soc' of https://github.com/mbgg/linux-mediatek into ↵Olof Johansson
next/drivers - enhance scpsys to support mt6797 - add mt6797 support to scpsys - fix error path in pmic-wrapper - fix possible NULL pointer dereference in pmic-wrapper * tag 'v4.12-next-soc' of https://github.com/mbgg/linux-mediatek: soc: mediatek: PMIC wrap: Fix possible NULL derefrence. soc: mediatek: PMIC wrap: Fix error handling soc: mediatek: add MT6797 scpsys support soc: mediatek: add vdec item for scpsys soc: mediatek: avoid using fixed spm power status defines Signed-off-by: Olof Johansson <olof@lixom.net>
2017-06-18Merge tag 'v4.13-rockchip-dts32-1' of ↵Olof Johansson
git://git.kernel.org/pub/scm/linux/kernel/git/mmind/linux-rockchip into next/dt A bunch of changes including mali gpu nodes for rk3288 boards following (and including) the new Mali Midgard binding; a lot of improvements for the rk3228/rk3229 socs (tsadc, operating points, usb, clock-rates, pinctrl, watchdog); finalizing the rk1108->rv1108 rename and adc buttons for the rk3288 firefly boards. * tag 'v4.13-rockchip-dts32-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mmind/linux-rockchip: ARM: dts: rockchip: enable usb for rk3229 evb board ARM: dts: rockchip: add usb nodes on rk322x ARM: dts: rockchip: add adc button for Firefly ARM: dts: rockchip: enable ARM Mali GPU on rk3288-veyron ARM: dts: rockchip: enable ARM Mali GPU on rk3288-firefly ARM: dts: rockchip: enable ARM Mali GPU on rk3288-rock2-som ARM: dts: rockchip: add ARM Mali GPU node for rk3288 dt-bindings: gpu: add bindings for the ARM Mali Midgard GPU ARM: dts: rockchip: set a sane frequence for tsadc on rk322x ARM: dts: rockchip: add operating-points-v2 for cpu on rk322x ARM: dts: rockchip: set default rates for core clocks on rk322x ARM: dts: rockchip: add second uart2 pinctrl on rk322x ARM: dts: rockchip: correct rk322x uart2 pinctrl ARM: dts: rockchip: add watchdog device node on rk322x clk: rockchip: add clock-ids for more rk3228 clocks clk: rockchip: add ids for camera on rk3399 ARM: dts: rockchip: fix rk322x i2s1 pinctrl error ARM: dts: rockchip: rename RK1108-evb to RV1108-evb ARM: dts: rockchip: rename core dtsi from RK1108 to RV1108 ARM: dts: rockchip: Setup usb vbus-supply on rk3288-rock2 Signed-off-by: Olof Johansson <olof@lixom.net>
2017-06-18Merge tag 'reset-for-4.13-2' of git://git.pengutronix.de/git/pza/linux into ↵Olof Johansson
next/drivers Reset controller changes for v4.13, part 2 - Add reset manager offsets for Stratix10 - Use kref for reset contol reference counting - Add new TI SCI reset driver for TI Keystone SoCs * tag 'reset-for-4.13-2' of git://git.pengutronix.de/git/pza/linux: reset: Add the TI SCI reset driver dt-bindings: reset: Add TI SCI reset binding reset: use kref for reference counting dt-bindings: reset: Add reset manager offsets for Stratix10 Signed-off-by: Olof Johansson <olof@lixom.net>
2017-06-18Merge tag 'reset-for-4.13' of git://git.pengutronix.de/git/pza/linux into ↵Olof Johansson
next/drivers Reset controller changes for v4.13 - Use devm_kcalloc to allocate the channels array in sti/reset-syscfg - Rename the TI_SYSCON_RESET Kconfig option to RESET_TI_SYSCON for consistency - Add new reset driver and DT bindings for Cortina Systems Gemini reset controller * tag 'reset-for-4.13' of git://git.pengutronix.de/git/pza/linux: reset: Add a Gemini reset controller reset: add DT bindings header for Gemini reset controller reset: ti_syscon: Rename TI_SYSCON_RESET to RESET_TI_SYSCON reset: sti: Use devm_kcalloc() in syscfg_reset_controller_register() Signed-off-by: Olof Johansson <olof@lixom.net>
2017-06-18NFC: nfcmrvl: allow gpio 0 for reset signallingJohan Hovold
Allow gpio 0 to be used for reset signalling, and instead use negative errnos to disable the reset functionality. Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2017-06-18blk-mq: don't stop queue for quiescingMing Lei
Queue can be started by other blk-mq APIs and can be used in different cases, this limits uses of blk_mq_quiesce_queue() if it is based on stopping queue, and make its usage very difficult, especially users have to use the stop queue APIs carefully for avoiding to break blk_mq_quiesce_queue(). We have applied the QUIESCED flag for draining and blocking dispatch, so it isn't necessary to stop queue any more. After stopping queue is removed, blk_mq_quiesce_queue() can be used safely and easily, then users won't worry about queue restarting during quiescing at all. Reviewed-by: Bart Van Assche <Bart.VanAssche@sandisk.com> Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-06-18blk-mq: use QUEUE_FLAG_QUIESCED to quiesce queueMing Lei
It is required that no dispatch can happen any more once blk_mq_quiesce_queue() returns, and we don't have such requirement on APIs of stopping queue. But blk_mq_quiesce_queue() still may not block/drain dispatch in the the case of BLK_MQ_S_START_ON_RUN, so use the new introduced flag of QUEUE_FLAG_QUIESCED and evaluate it inside RCU read-side critical sections for fixing this issue. Also blk_mq_quiesce_queue() is implemented via stopping queue, which limits its uses, and easy to cause race, because any queue restart in other paths may break blk_mq_quiesce_queue(). With the introduced flag of QUEUE_FLAG_QUIESCED, we don't need to depend on stopping queue for quiescing any more. Signed-off-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Bart Van Assche <Bart.VanAssche@sandisk.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-06-18blk-mq: introduce blk_mq_unquiesce_queueMing Lei
blk_mq_start_stopped_hw_queues() is used implictly as counterpart of blk_mq_quiesce_queue() for unquiescing queue, so we introduce blk_mq_unquiesce_queue() and make it as counterpart of blk_mq_quiesce_queue() explicitly. This function is for improving the current quiescing mechanism in the following patches. Reviewed-by: Bart Van Assche <Bart.VanAssche@sandisk.com> Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-06-18blk-mq: introduce blk_mq_quiesce_queue_nowait()Ming Lei
This patch introduces blk_mq_quiesce_queue_nowait() so that we can workaround mpt3sas for quiescing its queue. Once mpt3sas is fixed, we can remove this helper. Reviewed-by: Bart Van Assche <Bart.VanAssche@sandisk.com> Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-06-18blk-mq: move blk_mq_quiesce_queue() into include/linux/blk-mq.hMing Lei
We usually put blk_mq_*() into include/linux/blk-mq.h, so move this API into there. Signed-off-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Bart Van Assche <Bart.VanAssche@sandisk.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-06-18block: remove bio_clone() and all references.NeilBrown
bio_clone() is no longer used. Only bio_clone_bioset() or bio_clone_fast(). This is for the best, as bio_clone() used fs_bio_set, and filesystems are unlikely to want to use bio_clone(). So remove bio_clone() and all references. This includes a fix to some incorrect documentation. Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: NeilBrown <neilb@suse.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-06-18blk: make the bioset rescue_workqueue optional.NeilBrown
This patch converts bioset_create() to not create a workqueue by default, so alloctions will never trigger punt_bios_to_rescuer(). It also introduces a new flag BIOSET_NEED_RESCUER which tells bioset_create() to preserve the old behavior. All callers of bioset_create() that are inside block device drivers, are given the BIOSET_NEED_RESCUER flag. biosets used by filesystems or other top-level users do not need rescuing as the bio can never be queued behind other bios. This includes fs_bio_set, blkdev_dio_pool, btrfs_bioset, xfs_ioend_bioset, and one allocated by target_core_iblock.c. biosets used by md/raid do not need rescuing as their usage was recently audited and revised to never risk deadlock. It is hoped that most, if not all, of the remaining biosets can end up being the non-rescued version. Reviewed-by: Christoph Hellwig <hch@lst.de> Credit-to: Ming Lei <ming.lei@redhat.com> (minor fixes) Reviewed-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: NeilBrown <neilb@suse.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-06-18blk: replace bioset_create_nobvec() with a flags arg to bioset_create()NeilBrown
"flags" arguments are often seen as good API design as they allow easy extensibility. bioset_create_nobvec() is implemented internally as a variation in flags passed to __bioset_create(). To support future extension, make the internal structure part of the API. i.e. add a 'flags' argument to bioset_create() and discard bioset_create_nobvec(). Note that the bio_split allocations in drivers/md/raid* do not need the bvec mempool - they should have used bioset_create_nobvec(). Suggested-by: Christoph Hellwig <hch@infradead.org> Reviewed-by: Christoph Hellwig <hch@infradead.org> Reviewed-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: NeilBrown <neilb@suse.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-06-18blk: remove bio_set arg from blk_queue_split()NeilBrown
blk_queue_split() is always called with the last arg being q->bio_split, where 'q' is the first arg. Also blk_queue_split() sometimes uses the passed-in 'bs' and sometimes uses q->bio_split. This is inconsistent and unnecessary. Remove the last arg and always use q->bio_split inside blk_queue_split() Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Ming Lei <ming.lei@redhat.com> Credit-to: Javier González <jg@lightnvm.io> (Noticed that lightnvm was missed) Reviewed-by: Javier González <javier@cnexlabs.com> Tested-by: Javier González <javier@cnexlabs.com> Signed-off-by: NeilBrown <neilb@suse.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-06-18blk-mq-sched: unify request prepare methodsChristoph Hellwig
This patch makes sure we always allocate requests in the core blk-mq code and use a common prepare_request method to initialize them for both mq I/O schedulers. For Kyber and additional limit_depth method is added that is called before allocating the request. Also because none of the intializations can really fail the new method does not return an error - instead the bfq finish method is hardened to deal with the no-IOC case. Last but not least this removes the abuse of RQF_QUEUE by the blk-mq scheduling code as RQF_ELFPRIV is all that is needed now. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-06-18blk-mq-sched: unify request finished methodsChristoph Hellwig
No need to have two different callouts of bfq vs kyber. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-06-17net: add debug atomic_inc_not_zero() in dst_hold()Wei Wang
This patch is meant to add a debug warning on the situation where dst is being held during its destroy phase. This could potentially cause double free issue on the dst. Signed-off-by: Wei Wang <weiwan@google.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-17net: reorder all the dst flagsWei Wang
As some dst flags are removed, reorder the dst flags to fill in the blanks. Note: these flags are not exposed into user space. So it is safe to reorder. Signed-off-by: Wei Wang <weiwan@google.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-17net: remove DST_NOCACHE flagWei Wang
DST_NOCACHE flag check has been removed from dst_release() and dst_hold_safe() in a previous patch because all the dst are now ref counted properly and can be released based on refcnt only. Looking at the rest of the DST_NOCACHE use, all of them can now be removed or replaced with other checks. So this patch gets rid of all the DST_NOCACHE usage and remove this flag completely. Signed-off-by: Wei Wang <weiwan@google.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-17net: remove DST_NOGC flagWei Wang
Now that all the components have been changed to release dst based on refcnt only and not depend on dst gc anymore, we can remove the temporary flag DST_NOGC. Note that we also need to remove the DST_NOCACHE check in dst_release() and dst_hold_safe() because now all the dst are released based on refcnt and behaves as DST_NOCACHE. Signed-off-by: Wei Wang <weiwan@google.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-17net: remove dst gc related codeWei Wang
This patch removes all dst gc related code and all the dst free functions Signed-off-by: Wei Wang <weiwan@google.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-17xfrm: take refcnt of dst when creating struct xfrm_dst bundleWei Wang
During the creation of xfrm_dst bundle, always take ref count when allocating the dst. This way, xfrm_bundle_create() will form a linked list of dst with dst->child pointing to a ref counted dst child. And the returned dst pointer is also ref counted. This makes the link from the flow cache to this dst now ref counted properly. As the dst is always ref counted properly, we can safely mark DST_NOGC flag so dst_release() will release dst based on refcnt only. And dst gc is no longer needed and all dst_free() and its related function calls should be replaced with dst_release() or dst_release_immediate(). The special handling logic for dst->child in dst_destroy() can be replaced with a simple dst_release_immediate() call on the child to release the whole list linked by dst->child pointer. Previously used DST_NOHASH flag is not needed anymore as well. The reason that DST_NOHASH is used in the existing code is mainly to prevent the dst inserted in the fib tree to be wrongly destroyed during the deletion of the xfrm_dst bundle. So in the existing code, DST_NOHASH flag is marked in all the dst children except the one which is in the fib tree. However, with this patch series to remove dst gc logic and release dst only based on ref count, it is safe to release all the children from a xfrm_dst bundle as long as the dst children are all ref counted properly which is already the case in the existing code. So, this patch removes the use of DST_NOHASH flag. Signed-off-by: Wei Wang <weiwan@google.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-17ipv6: get rid of icmp6 dst garbage collectorWei Wang
icmp6 dst route is currently ref counted during creation and will be freed by user during its call of dst_release(). So no need of a garbage collector for it. Remove all icmp6 dst garbage collector related code. Signed-off-by: Wei Wang <weiwan@google.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-17ipv4: call dst_hold_safe() properlyWei Wang
This patch checks all the calls to dst_hold()/skb_dst_force()/dst_clone()/dst_use() to see if dst_hold_safe() is needed to avoid double free issue if dst gc is removed and dst_release() directly destroys dst when dst->__refcnt drops to 0. In tx path, TCP hold sk->sk_rx_dst ref count and also hold sock_lock(). UDP and other similar protocols always hold refcount for skb->_skb_refdst. So both paths seem to be safe. In rx path, as it is lockless and skb_dst_set_noref() is likely to be used, dst_hold_safe() should always be used when trying to hold dst. In the routing code, if dst is held during an rcu protected session, it is necessary to call dst_hold_safe() as the current dst might be in its rcu grace period. Signed-off-by: Wei Wang <weiwan@google.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-17net: introduce a new function dst_dev_put()Wei Wang
This function should be called when removing routes from fib tree after the dst gc is no longer in use. We first mark DST_OBSOLETE_DEAD on this dst to make sure next dst_ops->check() fails and returns NULL. Secondly, as we no longer keep the gc_list, we need to properly release dst->dev right at the moment when the dst is removed from the fib/fib6 tree. It does the following: 1. change dst->input and output pointers to dst_discard/dst_dscard_out to discard all packets 2. replace dst->dev with loopback interface Signed-off-by: Wei Wang <weiwan@google.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-17net: introduce DST_NOGC in dst_release() to destroy dst based on refcntWei Wang
The current mechanism of freeing dst is a bit complicated. dst has its ref count and when user grabs the reference to the dst, the ref count is properly taken in most cases except in IPv4/IPv6/decnet/xfrm routing code due to some historic reasons. If the reference to dst is always taken properly, we should be able to simplify the logic in dst_release() to destroy dst when dst->__refcnt drops from 1 to 0. And this should be the only condition to determine if we can call dst_destroy(). And as dst is always ref counted, there is no need for a dst garbage list to hold the dst entries that already get removed by the routing code but are still held by other users. And the task to periodically check the list to free dst if ref count become 0 is also not needed anymore. This patch introduces a temporary flag DST_NOGC(no garbage collector). If it is set in the dst, dst_release() will call dst_destroy() when dst->__refcnt drops to 0. dst_hold_safe() will also check for this flag and do atomic_inc_not_zero() similar as DST_NOCACHE to avoid double free issue. This temporary flag is mainly used so that we can make the transition component by component without breaking other parts. This flag will be removed after all components are properly transitioned. This patch also introduces a new function dst_release_immediate() which destroys dst without waiting on the rcu when refcnt drops to 0. It will be used in later patches. Follow-up patches will correct all the places to properly take ref count on dst and mark DST_NOGC. dst_release() or dst_release_immediate() will be used to release the dst instead of dst_free() and its related functions. And final clean-up patch will remove the DST_NOGC flag. Signed-off-by: Wei Wang <weiwan@google.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-16Merge tag 'meson-clk-for-4.13-2' of git://github.com/BayLibre/clk-meson into ↵Stephen Boyd
clk-next Pull Amlogic clk driver updates from Jerome Brunet: * Expose more clock gate on meson8 (SAR ADC, RNG, USB, SDIO, ETH) * Add new compatible to the meson8 clock controller for meson8b * Add missing parents to gxbb clk81 * tag 'meson-clk-for-4.13-2' of git://github.com/BayLibre/clk-meson: clk: meson: gxbb: add all clk81 parents clk: meson: meson8b: add compatibles for Meson8 and Meson8m2 clk: meson8b: export the ethernet gate clock clk: meson8b: export the USB clocks clk: meson8b: export the gate clock for the HW random number generator clk: meson8b: export the SDIO clock clk: meson8b: export the SAR ADC clocks
2017-06-16Merge branch 'for-4.13-ti-clkctrl' of https://github.com/t-kristo/linux-pm ↵Stephen Boyd
into clk-next * 'for-4.13-ti-clkctrl' of https://github.com/t-kristo/linux-pm: clk: ti: omap4: add clkctrl clock data dt-bindings: clk: add omap4 clkctrl definitions clk: ti: add support for clkctrl clocks Documentation: dt-bindings: Add binding documentation for TI clkctrl clocks
2017-06-16Merge tag 'sunxi-clk-for-4.13' of ↵Stephen Boyd
https://git.kernel.org/pub/scm/linux/kernel/git/sunxi/linux into clk-next Pull Allwinner clock patches from Maxime Ripard: Some new clock units are supported, for the display clocks unsed in the newer SoCs, and the A83T PRCM. There is also a bunch of minor fixes for clocks that are not used by anyone, and reworks needed by drivers that will land in 4.13. * tag 'sunxi-clk-for-4.13' of https://git.kernel.org/pub/scm/linux/kernel/git/sunxi/linux: (21 commits) clk: sunxi-ng: Move all clock types to a library clk: sunxi-ng: a83t: Add support for A83T's PRCM dt-bindings: clock: sunxi-ccu: Add compatible string for A83T PRCM clk: sunxi-ng: select SUNXI_CCU_MULT for sun8i-a83t clk: sunxi-ng: a83t: Fix audio PLL divider offset clk: sunxi-ng: a83t: Fix PLL lock status register offset clk: sunxi-ng: Add driver for A83T CCU clk: sunxi-ng: Support multiple variable pre-dividers dt-bindings: clock: sunxi-ccu: Add compatible string for A83T CCU clk: sunxi-ng: de2: fix wrong pointer passed to PTR_ERR() clk: sunxi-ng: sun5i: Export video PLLs clk: sunxi-ng: mux: Re-adjust parent rate clk: sunxi-ng: mux: Change pre-divider application function prototype clk: sunxi-ng: mux: split out the pre-divider computation code clk: sunxi-ng: mux: Don't just rely on the parent for CLK_SET_RATE_PARENT clk: sunxi-ng: div: Switch to divider_round_rate clk: sunxi-ng: Pass the parent and a pointer to the clocks round rate clk: divider: Make divider_round_rate take the parent clock clk: sunxi-ng: explicitly include linux/spinlock.h clk: sunxi-ng: add support for DE2 CCU ...
2017-06-16PCI: Test INTx masking during enumeration, not at run-timePiotr Gregor
The test for INTx masking via PCI_COMMAND_INTX_DISABLE performed in pci_intx_mask_supported() should be done before the device can be used. This is to avoid writing PCI_COMMAND while the driver owns the device, in case that has any effect on MSI/MSI-X interrupts. Move the content of pci_intx_mask_supported() to pci_intx_mask_broken() and call it from pci_setup_device(). The test result can be queried at any time later using the same pci_intx_mask_supported() interface as before (though with changed implementation), so callers (uio, vfio) should be unaffected. Signed-off-by: Piotr Gregor <piotrgregor@rsyncme.org> [bhelgaas: changelog, remove quirk check, remove locking, move dev->broken_intx_masking assignment to caller] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com>
2017-06-16amdgpu: use drm sync objects for shared semaphores (v6)Dave Airlie
This creates a new command submission chunk for amdgpu to add in and out sync objects around the submission. Sync objects are managed via the drm syncobj ioctls. The command submission interface is enhanced with two new chunks, one for syncobj pre submission dependencies, and one for post submission sync obj signalling, and just takes a list of handles for each. This is based on work originally done by David Zhou at AMD, with input from Christian Konig on what things should look like. In theory VkFences could be backed with sync objects and just get passed into the cs as syncobj handles as well. NOTE: this interface addition needs a version bump to expose it to userspace. TODO: update to dep_sync when rebasing onto amdgpu master. (with this - r-b from Christian) v1.1: keep file reference on import. v2: move to using syncobjs v2.1: change some APIs to just use p pointer. v3: make more robust against CS failures, we now add the wait sems but only remove them once the CS job has been submitted. v4: rewrite names of API and base on new syncobj code. v5: move post deps earlier, rename some apis v6: lookup post deps earlier, and just replace fences in post deps stage (Christian) Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-16mfd: cros_ec: add debugfs, console log fileEric Caruso
If the EC supports the new CONSOLE_READ command type, then we place a console_log file in debugfs for that EC device which allows us to grab EC logs. The kernel will poll every 10 seconds for the log and keep its own buffer, but userspace should grab this and write it out to some logs which actually get rotated. Signed-off-by: Eric Caruso <ejcaruso@chromium.org> Signed-off-by: Nicolas Boichat <drinkcat@chromium.org> Acked-by: Lee Jones <lee.jones@linaro.org> Tested-by: Enric Balletbo i Serra <enric.balletbo@collabora.com> [bleung: restored original version of this commit, with pointer size issue to be fixed in next commit] Signed-off-by: Benson Leung <bleung@chromium.org>
2017-06-16mfd: cros_ec: Add EC console read structures definitionsNicolas Boichat
ec_params_console_read_v1 is used to capture EC logs from kernel, and ec_params_get_cmd_versions_v1 is used to probe whether EC supports that command. Signed-off-by: Nicolas Boichat <drinkcat@chromium.org> Reviewed-by: Guenter Roeck <groeck@chromium.org> Acked-by: Lee Jones <lee.jones@linaro.org> Tested-by: Enric Balletbo i Serra <enric.balletbo@collabora.com> Signed-off-by: Benson Leung <bleung@chromium.org>
2017-06-16mfd: cros_ec: Add helper for event notifier.Gwendal Grignou
Add cros_ec_get_event() entry point to retrieve event within functions called by the notifier. Signed-off-by: Gwendal Grignou <gwendal@chromium.org> Signed-off-by: Enric Balletbo i Serra <enric.balletbo@collabora.com> Acked-by: Lee Jones <lee.jones@linaro.org> Signed-off-by: Benson Leung <bleung@chromium.org>
2017-06-16Merge tag 'mlx5-updates-2017-06-16' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== Mellanox mlx5 updates and cleanups 2017-06-16 mlx5-updates-2017-06-16 This series provide some updates and cleanups for mlx5 core and netdevice driver. From Eli Cohen, add a missing event string. From Or Gerlitz, some checkpatch cleanups. From Moni, Disalbe HW level LAG when SRIOV is enabled. From Tariq, A code reuse cleanup in aRFS flow. From Itay Aveksis, Typo fix. From Gal Pressman, ethtool statistics updates and "update stats" deferred work optimizations. From Majd Dibbiny, Fast unload support on kernel shutdown. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>