summaryrefslogtreecommitdiff
path: root/drivers
AgeCommit message (Collapse)Author
2015-04-22Merge MTD fixes from 4.0 into -nextBrian Norris
2015-04-22Merge tag 'mmc-4.1-rc1' of git://git.linaro.org/people/ulf.hansson/mmcLinus Torvalds
Pull MMC fixes from Ulf Hansson: "Here is two mmc core fixes for v.4.1 rc1: - fix error code propagation in mmc_pwrseq_simple_alloc() - revert 'mmc: core: Convert mmc_driver to device_driver'" * tag 'mmc-4.1-rc1' of git://git.linaro.org/people/ulf.hansson/mmc: Revert "mmc: core: Convert mmc_driver to device_driver" mmc: pwrseq: Fix error code propagation in mmc_pwrseq_simple_alloc()
2015-04-22dmaengine: hsu: don't prompt for hsu_core partVinod Koul
HSU_DMA is selected by the HSU_DMA_PCI driver, this should be user selected so remove the user prompt for this Signed-off-by: Vinod Koul <vinod.koul@intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-04-22Merge tag 'armsoc-multiplatform' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc Pull ARM SoC multiplatform code changes from Olof Johansson: "The changes here belong to two main platforms: - Atmel At91 is flipping the bit and going multiplatform. This includes some cleanups and removal of code, and the final flip of config dependencies - Shmobile has several platforms that are going multiplatform, but this branch also contains a bunch of cleanups that they weren't able to keep separate in a good way. THere's also a removal of one of their SoCs and the corresponding boards (sh7372 and mackerel)" * tag 'armsoc-multiplatform' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (67 commits) ARM: at91/pm: move AT91_MEMCTRL_* to pm.h ARM: at91/pm: move the standby functions to pm.c ARM: at91: fix pm_suspend.S compilation when ARMv6 is selected ARM: at91: add a Kconfig dependency on multi-platform ARM: at91: drop AT91_TIMER_HZ ARM: at91: remove hardware.h ARM: at91: remove SoC headers ARM: at91: remove useless mach/cpu.h ARM: at91: remove unused headers ARM: at91: switch at91_dt_defconfig to multiplatform ARM: at91: switch to multiplatform ARM: shmobile: r8a7778: enable multiplatform target ARM: shmobile: bockw: add sound to DT ARM: shmobile: r8a7778: add sound to DT ARM: shmobile: bockw: add devices hooked up to i2c0 to DT DT: i2c: add trivial binding for OKI ML86V7667 video decoder ARM: shmobile: r8a7778: common clock framework CPG driver ARM: shmobile: bockw dts: set extal clock frequency ARM: shmobile: bockw dts: Move Ethernet node to BSC ARM: shmobile: r8a73a4: Remove legacy code ...
2015-04-22Merge tag 'armsoc-drivers' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc Pull ARM SoC driver updates from Olof Johansson: "Driver updates for v4.1. Some of these are for drivers/soc, where we find more and more SoC-specific drivers these days. Some are for other driver subsystems where we have received acks from the appropriate maintainers. The larger parts of this branch are: - MediaTek support for their PMIC wrapper interface, a high-level interface for talking to the system PMIC over a dedicated I2C interface. - Qualcomm SCM driver has been moved to drivers/firmware. It's used for CPU up/down and needs to be in a shared location for arm/arm64 common code. - cleanup of ARM-CCI PMU code. - another set of cleanusp to the OMAP GPMC code" * tag 'armsoc-drivers' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (43 commits) soc/mediatek: Remove unused variables clocksource: atmel-st: select MFD_SYSCON soc: mediatek: Add PMIC wrapper for MT8135 and MT8173 SoCs arm-cci: Fix CCI PMU event validation arm-cci: Split the code for PMU vs driver support arm-cci: Get rid of secure transactions for PMU driver arm-cci: Abstract the CCI400 PMU specific definitions arm-cci: Rearrange code for splitting PMU vs driver code drivers: cci: reject groups spanning multiple HW PMUs ARM: at91: remove useless include clocksource: atmel-st: remove mach/hardware dependency clocksource: atmel-st: use syscon/regmap ARM: at91: time: move the system timer driver to drivers/clocksource ARM: at91: properly initialize timer ARM: at91: at91rm9200: remove deprecated arm_pm_restart watchdog: at91rm9200: implement restart handler watchdog: at91rm9200: use the system timer syscon mfd: syscon: Add atmel system timer registers definition ARM: at91/dt: declare atmel,at91rm9200-st as a syscon soc: qcom: gsbi: Add support for ADM CRCI muxing ...
2015-04-22Merge tag 'armsoc-dt' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc Pull ARM DT updates from Olof Johansson: "As always, this tends to be one of our bigger branches. There are lots of updates this release, but not that many jumps out as something that needs more detailed coverage. Some of the highlights are: - DTs for the new Annapurna Labs Alpine platform - more graphics DT pieces falling into place on Exynos, bridges, clocks. - plenty of DT updates for Qualcomm platforms for various IP blocks - some churn on Tegra due to switch-over to tool-generated pinctrl data - misc fixes and updates for Atmel at91 platforms - various DT updates to add IP block support on Broadcom's Cygnus platforms - more updates for Renesas platforms as DT support is added for various IP blocks (IPMMU, display, audio, etc)" * tag 'armsoc-dt' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (231 commits) ARM: dts: alpine: add internal pci Revert "ARM: dts: mt8135: Add pinctrl/GPIO/EINT node for mt8135." ARM: mvebu: use 0xf1000000 as internal registers on Armada 370 DB ARM: dts: qcom: Add idle state device nodes for 8064 ARM: dts: qcom: Add idle states device nodes for 8084 ARM: dts: qcom: Add idle states device nodes for 8974/8074 ARM: dts: qcom: Update power-controller device node for 8064 Krait CPUs ARM: dts: qcom: Add power-controller device node for 8084 Krait CPUs ARM: dts: qcom: Add power-controller device node for 8074 Krait CPUs devicetree: bindings: Document qcom,idle-states devicetree: bindings: Update qcom,saw2 node bindings dt-bindings: Add #defines for MSM8916 clocks and resets arm: dts: qcom: Add LPASS Audio HW to IPQ8064 device tree arm: dts: qcom: Add APQ8084 chipset SPMI PMIC's nodes arm: dts: qcom: Add 8x74 chipset SPMI PMIC's nodes arm: dts: qcom: Add SPMI PMIC Arbiter nodes for APQ8084 and MSM8974 arm: dts: qcom: Add LCC nodes arm: dts: qcom: Add TCSR support for MSM8960 arm: dts: qcom: Add TCSR support for MSM8660 arm: dts: qcom: Add TCSR support for IPQ8064 ...
2015-04-22Merge tag 'armsoc-soc' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc Pull ARM SoC platform updates from Olof Johansson: "Our SoC branch usually contains expanded support for new SoCs and other core platform code. In this case, that includes: - support for the new Annapurna Labs "Alpine" platform - a rework greatly simplifying adding new platform support to the MCPM subsystem (Multi-cluster power management) - cpuidle and PM improvements for Exynos3250 - misc updates for Renesas, OMAP, Meson, i.MX. Some of these could have gone in other branches but ended up here for various reasons" * tag 'armsoc-soc' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (53 commits) ARM: alpine: add support for generic pci ARM: Exynos: migrate DCSCB to the new MCPM backend abstraction ARM: vexpress: migrate DCSCB to the new MCPM backend abstraction ARM: vexpress: DCSCB: tighten CPU validity assertion ARM: vexpress: migrate TC2 to the new MCPM backend abstraction ARM: MCPM: move the algorithmic complexity to the core code ARM: EXYNOS: allow cpuidle driver usage on Exynos3250 SoC ARM: EXYNOS: add AFTR mode support for Exynos3250 ARM: EXYNOS: add code for setting/clearing boot flag ARM: EXYNOS: fix CPU1 hotplug on Exynos3250 ARM: S3C64XX: Use fixed IRQ bases to avoid conflicts on Cragganmore ARM: cygnus: fix const declaration bcm_cygnus_dt_compat ARM: DRA7: hwmod: Fix the hwmod class for GPTimer4 ARM: DRA7: hwmod: Add data for GPTimers 13 through 16 ARM: EXYNOS: Remove left over 'extra_save' ARM: EXYNOS: Constify exynos_pm_data array ARM: EXYNOS: use static in suspend.c ARM: EXYNOS: Use platform device name as power domain name ARM: EXYNOS: add support for async-bridge clocks for pm_domains ARM: omap-device: add missed callback for suspend-to-disk ...
2015-04-22Merge tag 'armsoc-cleanup' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc Pull ARM SoC cleanups from Olof Johansson: "We've got a fairly large cleanup branch this time. The bulk of this is removal of non-DT platforms of several flavors: - Atmel at91 platforms go full-DT, with removal of remaining board-file based support - OMAP removes legacy board files for three more platforms - removal of non-DT mach-msm, newer Qualcomm platforms now live in mach-qcom - Freescale i.MX25 also removes non-DT platform support" Most of the rest of the changes here are fallout from the above, i.e. for example removal of drivers that now lack platforms, etc. * tag 'armsoc-cleanup' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (58 commits) mmc: Remove msm_sdcc driver gpio: Remove gpio-msm-v1 driver ARM: Remove mach-msm and associated ARM architecture code ARM: shmobile: cpuidle: Remove the pointless default driver ARM: davinci: dm646x: Add interrupt resource for McASPs ARM: davinci: irqs: Correct McASP1 TX interrupt definition for DM646x ARM: davinci: dm646x: Clean up the McASP DMA resources ARM: davinci: devices-da8xx: Add support for McASP2 on da830 ARM: davinci: devices-da8xx: Clean up and correct the McASP device creation ARM: davinci: devices-da8xx: Add interrupt resource to McASP structs ARM: davinci: devices-da8xx: Add resource name for the McASP DMA request ARM: OMAP2+: Remove legacy support for omap3 TouchBook ARM: OMAP3: Remove legacy support for devkit8000 ARM: OMAP3: Remove legacy support for EMA-Tech Stalker board ARM: shmobile: Consolidate the pm code for R-Car Gen2 ARM: shmobile: r8a7791: Correct SYSCIER value ARM: shmobile: r8a7790: Correct SYSCIER value ARM: at91: remove old setup ARM: at91: sama5d4: remove useless map_io ARM: at91: sama5 use SoC detection infrastructure ...
2015-04-22Merge tag 'armsoc-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc Pull ARM SoC fixes from Olof Johansson: "Here's the usual "low-priority fixes that didn't make it into the last few -rcs, with a twist: We had a fixes pull request that I didn't send in time to get into 4.0, so we'll send some of them to Greg for -stable as well. Contents here is as usual not all that controversial: - a handful of randconfig fixes from Arnd, in particular for older Samsung platforms - Exynos fixes, !SMP building, DTS updates for MMC and lid switch - Kbuild fix to create output subdirectory for DTB files - misc minor fixes for OMAP" * tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (23 commits) ARM: at91/dt: sama5d3 xplained: add phy address for macb1 kbuild: Create directory for target DTB ARM: mvebu: Disable CPU Idle on Armada 38x ARM: DRA7: Enable Cortex A15 errata 798181 ARM: dts: am57xx-beagle-x15: Add thermal map to include fan and tmp102 ARM: dts: DRA7: Add bandgap and related thermal nodes bus: ocp2scp: SYNC2 value should be changed to 0x6 ARM: dts: am4372: Add "ti,am437x-ocp2scp" as compatible string for OCP2SCP ARM: OMAP2+: remove superfluous NULL pointer check ARM: EXYNOS: Fix build breakage cpuidle on !SMP ARM: dts: fix lid and power pin-functions for exynos5250-spring ARM: dts: fix mmc node updates for exynos5250-spring ARM: OMAP4: remove dead kconfig option OMAP4_ERRATA_I688 MAINTAINERS: add OMAP defconfigs under OMAP SUPPORT ARM: OMAP1: PM: fix some build warnings on 1510-only Kconfigs ARM: cns3xxx: don't export static symbol ARM: S3C24XX: avoid a Kconfig warning ARM: S3C24XX: fix header file inclusions ARM: S3C24XX: fix building without PM_SLEEP ARM: S3C24XX: use SAMSUNG_WAKEMASK for s3c2416 ...
2015-04-22rbd: rbd_wq comment is obsoleteIlya Dryomov
After the switch to blk-mq rbd_wq processes requests, not devices. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2015-04-22watchdog: stmp3xxx_rtc_wdt: fix broken email addressWolfram Sang
My Pengutronix address is not valid anymore, redirect people to the Pengutronix kernel team. Reported-by: Harald Geyer <harald@ccbib.org> Signed-off-by: Wolfram Sang <wsa@the-dreams.de> Acked-by: Robert Schwebel <r.schwebel@pengutronix.de> Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
2015-04-22watchdog: pnx4008_wdt: fix broken email addressWolfram Sang
My Pengutronix address is not valid anymore, redirect people to the Pengutronix kernel team. Reported-by: Harald Geyer <harald@ccbib.org> Signed-off-by: Wolfram Sang <wsa@the-dreams.de> Acked-by: Robert Schwebel <r.schwebel@pengutronix.de> Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
2015-04-22watchdog: octeon: use fixed length string for register namesAaro Koskinen
Use fixed length string for register names. This saves 416 bytes in text size. Signed-off-by: Aaro Koskinen <aaro.koskinen@iki.fi> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
2015-04-22watchdog: octeon: fix some trivial coding style issuesAaro Koskinen
Fix some trivial coding style issues to reduce noise from static analyzers. Signed-off-by: Aaro Koskinen <aaro.koskinen@iki.fi> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
2015-04-22watchdog: octeon: convert to WATCHDOG_CORE APIAaro Koskinen
Convert OCTEON watchdog to WATCHDOG_CORE API. This enables support for multiple watchdogs on OCTEON boards. Signed-off-by: Aaro Koskinen <aaro.koskinen@iki.fi> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
2015-04-22watchdog: cadence: Remove Kconfig dependency on ARCHMichal Simek
Remove Kconfig dependency and enable driver for all ARCHs. Signed-off-by: Michal Simek <michal.simek@xilinx.com> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
2015-04-22watchdog: qcom: use timer devicetree bindingMathieu Olivari
MSM watchdog configuration happens in the same register block as the timer, so we'll use the same binding as the existing timer. The qcom-wdt will now be probed when devicetree has an entry compatible with "qcom,kpss-timer" or "qcom-scss-timer". Signed-off-by: Mathieu Olivari <mathieu@codeaurora.org> Reviewed-by: Stephen Boyd <sboyd@codeaurora.org> Acked-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
2015-04-22watchdog: bcm281xx: Remove use of seq_printf return valueJoe Perches
The seq_printf return value, because it's frequently misused, will eventually be converted to void. See: commit 1f33c41c03da ("seq_file: Rename seq_overflow() to seq_has_overflowed() and make public") Signed-off-by: Joe Perches <joe@perches.com> Acked-by: Guenter Roeck <linux~roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
2015-04-22dmaengine: dw: don't prompt for DW_DMAC_COREVinod Koul
DW_DMAC_CORE is slected by PCI or Platform driver, so this symbol shouldn't be user selectable, so remove the prompt Signed-off-by: Vinod Koul <vinod.koul@intel.com>
2015-04-21Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds
Pull networking fixes from David Miller: "Just a few fixes trickling in at this point. 1) If we see an attached socket on an skb in the ipv4 forwarding path, bail. This can happen due to races with FIB rule addition, and deletion, and we should just drop such frames. From Sebastian Pöhn. 2) pppoe receive should only accept packets destined for this hosts's MAC address. From Joakim Tjernlund. 3) Handle checksum unwrapping properly in ppp receive properly when it's encapsulated in UDP in some way, fix from Tom Herbert. 4) Fix some bugs in mv88e6xxx DSA driver resulting from the conversion from register offset constants to mnenomic macros. From Vivien Didelot. 5) Fix handling of HCA max message size in mlx4 adapters, from Eran Ben ELisha" * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: net/mlx4_core: Fix reading HCA max message size in mlx4_QUERY_DEV_CAP tcp: add memory barriers to write space paths altera tse: Error-Bit on tx-avalon-stream always set. net: dsa: mv88e6xxx: use PORT_DEFAULT_VLAN net: dsa: mv88e6xxx: fix setup of port control 1 ppp: call skb_checksum_complete_unset in ppp_receive_frame net: add skb_checksum_complete_unset pppoe: Lacks DST MAC address check ip_forward: Drop frames with attached skb->sk
2015-04-22ACPI / EC: fix NULL pointer dereference in acpi_ec_remove_query_handler()Chris Bainbridge
Use list_for_each_entry_safe for iterating because handler may be freed in the loop. BUG: unable to handle kernel NULL pointer dereference at 000000000000002c IP: [<ffffffff814d69c8>] acpi_ec_put_query_handler+0x7/0x1a Call Trace: acpi_ec_remove_query_handler+0x87/0x97 acpi_smbus_hc_remove+0x2a/0x44 [sbshc] acpi_device_remove+0x7b/0x9a __device_release_driver+0x7e/0x110 driver_detach+0xb0/0xc0 bus_remove_driver+0x54/0xe0 driver_unregister+0x2b/0x60 acpi_bus_unregister_driver+0x10/0x12 acpi_smb_hc_driver_exit+0x10/0x12 [sbshc] SyS_delete_module+0x1b8/0x210 system_call_fastpath+0x12/0x6a Signed-off-by: Chris Bainbridge <chris.bainbridge@gmail.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-04-21Merge branch 'parisc-4.1-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux Pull parisc fixes from Helge Deller: "The patch by Guenter Roeck fixes the build on parisc which got broken because of commit f24ffde43237 ("parisc: expose number of page table levels on Kconfig level") and the patch from Matthew Wilcox converts our code to use the generic scatterlist.h header file" * 'parisc-4.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux: parisc: Replace PT_NLEVELS with CONFIG_PGTABLE_LEVELS parisc: Eliminate sg_virt_addr() and private scatterlist.h
2015-04-22md/raid5: don't do chunk aligned read on degraded array.Eric Mei
When array is degraded, read data landed on failed drives will result in reading rest of data in a stripe. So a single sequential read would result in same data being read twice. This patch is to avoid chunk aligned read for degraded array. The downside is to involve stripe cache which means associated CPU overhead and extra memory copy. Test Results: Following test are done on a enterprise storage node with Seagate 6T SAS drives and Xeon E5-2648L CPU (10 cores, 1.9Ghz), 10 disks MD RAID6 8+2, chunk size 128 KiB. I use FIO, using direct-io with various bs size, enough queue depth, tested sequential and 100% random read against 3 array config: 1) optimal, as baseline; 2) degraded; 3) degraded with this patch. Kernel version is 4.0-rc3. Each individual test I only did once so there might be some variations, but we just focus on big trend. Sequential Read: bs=(KiB) optimal(MiB/s) degraded(MiB/s) degraded-with-patch (MiB/s) 1024 1608 656 995 512 1624 710 956 256 1635 728 980 128 1636 771 983 64 1612 1119 1000 32 1580 1420 1004 16 1368 688 986 8 768 647 953 4 411 413 850 Random Read: bs=(KiB) optimal(IOPS) degraded(IOPS) degraded-with-patch (IOPS) 1024 163 160 156 512 274 273 272 256 426 428 424 128 576 592 591 64 726 724 726 32 849 848 837 16 900 970 971 8 927 940 929 4 948 940 955 Some notes: * In sequential + optimal, as bs size getting smaller, the FIO thread become CPU bound. * In sequential + degraded, there's big increase when bs is 64K and 32K, I don't have explanation. * In sequential + degraded-with-patch, the MD thread mostly become CPU bound. If you want to we can discuss specific data point in those data. But in general it seems with this patch, we have more predictable and in most cases significant better sequential read performance when array is degraded, and almost no noticeable impact on random read. Performance is a complicated thing, the patch works well for this particular configuration, but may not be universal. For example I imagine testing on all SSD array may have very different result. But I personally think in most cases IO bandwidth is more scarce resource than CPU. Signed-off-by: Eric Mei <eric.mei@seagate.com> Signed-off-by: NeilBrown <neilb@suse.de>
2015-04-22md/raid5: allow the stripe_cache to grow and shrink.NeilBrown
The default setting of 256 stripe_heads is probably much too small for many configurations. So it is best to make it auto-configure. Shrinking the cache under memory pressure is easy. The only interesting part here is that we put a fairly high cost ('seeks') on shrinking the cache as the cost is greater than just having to read more data, it reduces parallelism. Growing the cache on demand needs to be done carefully. If we allow fast growth, that can upset memory balance as lots of dirty memory can quickly turn into lots of memory queued in the stripe_cache. It is important for the raid5 block device to appear congested to allow write-throttling to work. So we only add stripes slowly. We set a flag when an allocation fails because all stripes are in use, allocate at a convenient time when that flag is set, and don't allow it to be set again until at least one stripe_head has been released for re-use. This means that a spurt of requests will only cause one stripe_head to be allocated, but a steady stream of requests will slowly increase the cache size - until memory pressure puts it back again. It could take hours to reach a steady state. The value written to, and displayed in, stripe_cache_size is used as a minimum. The cache can grow above this and shrink back down to it. The actual size is not directly visible, though it can be deduced to some extent by watching stripe_cache_active. Signed-off-by: NeilBrown <neilb@suse.de>
2015-04-22md/raid5: change ->inactive_blocked to a bit-flag.NeilBrown
This allows us to easily add more (atomic) flags. Signed-off-by: NeilBrown <neilb@suse.de>
2015-04-22md/raid5: move max_nr_stripes management into grow_one_stripe and ↵NeilBrown
drop_one_stripe Rather than adjusting max_nr_stripes whenever {grow,drop}_one_stripe() succeeds, do it inside the functions. Also choose the correct hash to handle next inside the functions. This removes duplication and will help with future new uses of {grow,drop}_one_stripe. This also fixes a minor bug where the "md/raid:%md: allocate XXkB" message always said "0kB". Signed-off-by: NeilBrown <neilb@suse.de>
2015-04-22md/raid5: pass gfp_t arg to grow_one_stripe()NeilBrown
This is needed for future improvement to stripe cache management. Signed-off-by: NeilBrown <neilb@suse.de>
2015-04-22md/raid5: introduce configuration option rmw_levelMarkus Stockhausen
Depending on the available coding we allow optimized rmw logic for write operations. To support easier testing this patch allows manual control of the rmw/rcw descision through the interface /sys/block/mdX/md/rmw_level. The configuration can handle three levels of control. rmw_level=0: Disable rmw for all RAID types. Hardware assisted P/Q calculation has no implementation path yet to factor in/out chunks of a syndrome. Enforcing this level can be benefical for slow CPUs with hardware syndrome support and fast SSDs. rmw_level=1: Estimate rmw IOs and rcw IOs. Execute rmw only if we will save IOs. This equals the "old" unpatched behaviour and will be the default. rmw_level=2: Execute rmw even if calculated IOs for rmw and rcw are equal. We might have higher CPU consumption because of calculating the parity twice but it can be benefical otherwise. E.g. RAID4 with fast dedicated parity disk/SSD. The option is implemented just to be forward-looking and will ONLY work with this patch! Signed-off-by: Markus Stockhausen <stockhausen@collogia.de> Signed-off-by: NeilBrown <neilb@suse.de>
2015-04-22md/raid5: activate raid6 rmw featureMarkus Stockhausen
Glue it altogehter. The raid6 rmw path should work the same as the already existing raid5 logic. So emulate the prexor handling/flags and split functions as needed. 1) Enable xor_syndrome() in the async layer. 2) Split ops_run_prexor() into RAID4/5 and RAID6 logic. Xor the syndrome at the start of a rmw run as we did it before for the single parity. 3) Take care of rmw run in ops_run_reconstruct6(). Again process only the changed pages to get syndrome back into sync. 4) Enhance set_syndrome_sources() to fill NULL pages if we are in a rmw run. The lower layers will calculate start & end pages from that and call the xor_syndrome() correspondingly. 5) Adapt the several places where we ignored Q handling up to now. Performance numbers for a single E5630 system with a mix of 10 7200k desktop/server disks. 300 seconds random write with 8 threads onto a 3,2TB (10*400GB) RAID6 64K chunk without spare (group_thread_cnt=4) bsize rmw_level=1 rmw_level=0 rmw_level=1 rmw_level=0 skip_copy=1 skip_copy=1 skip_copy=0 skip_copy=0 4K 115 KB/s 141 KB/s 165 KB/s 140 KB/s 8K 225 KB/s 275 KB/s 324 KB/s 274 KB/s 16K 434 KB/s 536 KB/s 640 KB/s 534 KB/s 32K 751 KB/s 1,051 KB/s 1,234 KB/s 1,045 KB/s 64K 1,339 KB/s 1,958 KB/s 2,282 KB/s 1,962 KB/s 128K 2,673 KB/s 3,862 KB/s 4,113 KB/s 3,898 KB/s 256K 7,685 KB/s 7,539 KB/s 7,557 KB/s 7,638 KB/s 512K 19,556 KB/s 19,558 KB/s 19,652 KB/s 19,688 Kb/s Signed-off-by: Markus Stockhausen <stockhausen@collogia.de> Signed-off-by: NeilBrown <neilb@suse.de>
2015-04-22raid5: handle expansion/resync case with stripe batchingshli@kernel.org
expansion/resync can grab a stripe when the stripe is in batch list. Since all stripes in batch list must be in the same state, we can't allow some stripes run into expansion/resync. So we delay expansion/resync for stripe in batch list. Signed-off-by: Shaohua Li <shli@fusionio.com> Signed-off-by: NeilBrown <neilb@suse.de>
2015-04-22raid5: handle io error of batch listshli@kernel.org
If io error happens in any stripe of a batch list, the batch list will be split, then normal process will run for the stripes in the list. Signed-off-by: Shaohua Li <shli@fusionio.com> Signed-off-by: NeilBrown <neilb@suse.de>
2015-04-22RAID5: batch adjacent full stripe writeshli@kernel.org
stripe cache is 4k size. Even adjacent full stripe writes are handled in 4k unit. Idealy we should use big size for adjacent full stripe writes. Bigger stripe cache size means less stripes runing in the state machine so can reduce cpu overhead. And also bigger size can cause bigger IO size dispatched to under layer disks. With below patch, we will automatically batch adjacent full stripe write together. Such stripes will be added to the batch list. Only the first stripe of the list will be put to handle_list and so run handle_stripe(). Some steps of handle_stripe() are extended to cover all stripes of the list, including ops_run_io, ops_run_biodrain and so on. With this patch, we have less stripes running in handle_stripe() and we send IO of whole stripe list together to increase IO size. Stripes added to a batch list have some limitations. A batch list can only include full stripe write and can't cross chunk boundary to make sure stripes have the same parity disks. Stripes in a batch list must be in the same state (no written, toread and so on). If a stripe is in a batch list, all new read/write to add_stripe_bio will be blocked to overlap conflict till the batch list is handled. The limitations will make sure stripes in a batch list be in exactly the same state in the life circly. I did test running 160k randwrite in a RAID5 array with 32k chunk size and 6 PCIe SSD. This patch improves around 30% performance and IO size to under layer disk is exactly 32k. I also run a 4k randwrite test in the same array to make sure the performance isn't changed with the patch. Signed-off-by: Shaohua Li <shli@fusionio.com> Signed-off-by: NeilBrown <neilb@suse.de>
2015-04-22raid5: track overwrite disk countshli@kernel.org
Track overwrite disk count, so we can know if a stripe is a full stripe write. Signed-off-by: Shaohua Li <shli@fusionio.com> Signed-off-by: NeilBrown <neilb@suse.de>
2015-04-22raid5: add a new flag to track if a stripe can be batchedshli@kernel.org
A freshly new stripe with write request can be batched. Any time the stripe is handled or new read is queued, the flag will be cleared. Signed-off-by: Shaohua Li <shli@fusionio.com> Signed-off-by: NeilBrown <neilb@suse.de>
2015-04-22raid5: use flex_array for scribble datashli@kernel.org
Use flex_array for scribble data. Next patch will batch several stripes together, so scribble data should be able to cover several stripes, so this patch also allocates scribble data for stripes across a chunk. Signed-off-by: Shaohua Li <shli@fusionio.com> Signed-off-by: NeilBrown <neilb@suse.de>
2015-04-22md raid0: access mddev->queue (request queue member) conditionally because ↵Heinz Mauelshagen
it is not set when accessed from dm-raid The patch makes 3 references to mddev->queue in the raid0 personality conditional in order to allow for it to be accessed from dm-raid. Mandatory, because md instances underneath dm-raid don't manage a request queue of their own which'd lead to oopses without the patch. Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Tested-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: NeilBrown <neilb@suse.de>
2015-04-22md: allow resync to go faster when there is competing IO.NeilBrown
When md notices non-sync IO happening while it is trying to resync (or reshape or recover) it slows down to the set minimum. The default minimum might have made sense many years ago but the drives have become faster. Changing the default to match the times isn't really a long term solution. This patch changes the code so that instead of waiting until the speed has dropped to the target, it just waits until pending requests have completed. This means that the delay inserted is a function of the speed of the devices. Testing shows that: - for some loads, the resync speed is unchanged. For those loads increasing the minimum doesn't change the speed either. So this is a good result. To increase resync speed under such loads we would probably need to increase the resync window size. - for other loads, resync speed does increase to a reasonable fraction (e.g. 20%) of maximum possible, and throughput of the load only drops a little bit (e.g. 10%) - for other loads, throughput of the non-sync load drops quite a bit more. These seem to be latency-sensitive loads. So it isn't a perfect solution, but it is mostly an improvement. Signed-off-by: NeilBrown <neilb@suse.de>
2015-04-22md: remove 'go_faster' option from ->sync_request()NeilBrown
This option is not well justified and testing suggests that it hardly ever makes any difference. The comment suggests there might be a need to wait for non-resync activity indicated by ->nr_waiting, however raise_barrier() already waits for all of that. So just remove it to simplify reasoning about speed limiting. This allows us to remove a 'FIXME' comment from raid5.c as that never used the flag. Signed-off-by: NeilBrown <neilb@suse.de>
2015-04-22md: don't require sync_min to be a multiple of chunk_size.NeilBrown
There is really no need for sync_min to be a multiple of chunk_size, and values read from here often aren't. That means you cannot read a value and expect to be able to write it back later. So remove the chunk_size check, and round down to a multiple of 4K, to be sure everything works with 4K-sector devices. Signed-off-by: NeilBrown <neilb@suse.de>
2015-04-22Merge branch 'cluster' into for-nextNeilBrown
2015-04-22md-cluster: re-add capabilitiesGoldwyn Rodrigues
When "re-add" is writted to /sys/block/mdXX/md/dev-YYY/state, the clustered md: 1. Sends RE_ADD message with the desc_nr. Nodes receiving the message clear the Faulty bit in their respective rdev->flags. 2. The node initiating re-add, gathers the bitmaps of all nodes and copies them into the local bitmap. It does not clear the bitmap from which it is copying. 3. Initiating node schedules a md recovery to sync the devices. Signed-off-by: Guoqing Jiang <gqjiang@suse.com> Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com> Signed-off-by: NeilBrown <neilb@suse.de>
2015-04-22md: re-add a failed diskGoldwyn Rodrigues
This adds the capability of re-adding a failed disk by writing "re-add" to /sys/block/mdXX/md/dev-YYY/state. This facilitates adding disks which have encountered a temporary error such as a network disconnection/hiccup in an iSCSI device, or a SAN cable disconnection which has been restored. In such a situation, you do not need to remove and re-add the device. Writing re-add to the failed device's state would add it again to the array and perform the recovery of only the blocks which were written after the device failed. This works for generic md, and is not related to clustering. However, this patch is to ease re-add operations listed above in clustering environments. Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com> Signed-off-by: NeilBrown <neilb@suse.de>
2015-04-22md-cluster: remove capabilitiesGoldwyn Rodrigues
This adds "remove" capabilities for the clustered environment. When a user initiates removal of a device from the array, a REMOVE message with disk number in the array is sent to all the nodes which kick the respective device in their own array. This facilitates the removal of failed devices. Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com> Signed-off-by: NeilBrown <neilb@suse.de>
2015-04-22md: Export and rename find_rdev_nr_rcuGoldwyn Rodrigues
This is required by the clustering module (patches to follow) to find the device to remove or re-add. Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com> Signed-off-by: NeilBrown <neilb@suse.de>
2015-04-22md: Export and rename kick_rdev_from_arrayGoldwyn Rodrigues
This export is required for clustering module in order to co-ordinate remove/readd a rdev from all nodes. Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com> Signed-off-by: NeilBrown <neilb@suse.de>
2015-04-22md-cluster: correct the num for comparisonGuoqing Jiang
Since the node num of md-cluster is from zero, and cinfo->slot_number represents the slot num of dlm, no need to check for equality. Signed-off-by: Guoqing Jiang <gqjiang@suse.com> Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com> Signed-off-by: NeilBrown <neilb@suse.de>
2015-04-21net/mlx4_core: Fix reading HCA max message size in mlx4_QUERY_DEV_CAPEran Ben Elisha
Currently we parse max_msg_sz from the wrong offset in QUERY_DEV_CAP, fix to use the right offset. Fixes: 0b131561a7d6 ('net/mlx4_en: Add Flow control statistics [..]') Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-21parisc: Eliminate sg_virt_addr() and private scatterlist.hMatthew Wilcox
The only reason to keep parisc's private asm/scatterlist.h was that it had the macro sg_virt_addr(). Convert all callers to use something else (sometimes just sg->offset was enough, others should use sg_virt()), and we can just use the asm-generic scatterlist.h instead. Signed-off-by: Matthew Wilcox <willy@linux.intel.com> Signed-off-by: Dave Anglin <dave.anglin@bell.net> Signed-off-by: Helge Deller <deller@gmx.de>
2015-04-21Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input Pull more input subsystem updates from Dmitry Torokhov: - an update to Atmel MXT driver that makes it functional on Google Pixel 2 boxes (both touchpad and touchscreen) - a new VMware VMMouse driver that should allow us drop X vmmouse driver that requires root privileges (since it accesses ioports) - XBox One controllers now support force feedback (rumble) * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: Input: lm8333 - fix broken email address Input: cyapa - fix setting suspend scan rate Input: elan_i2c - fix calculating number of x and y traces. Input: elan_i2c - report hovering contacts Input: elants_i2c - zero-extend hardware ID in firmware name Input: alps - document separate pointstick button bits for V2 devices Input: atmel_mxt_ts - add support for Google Pixel 2 Input: xpad - add rumble support for Xbox One controller Input: ff-core - use new debug macros Input: add vmmouse driver Input: elan_i2c - adjust for newer firmware pressure reporting
2015-04-21media: remove unused variable that causes a warningLinus Torvalds
My 'allmodconfig' build is _almost_ free of warnings, and most of the remaining ones are for legacy drivers that just do bad things that I can't find it in my black heart to care too much about. But this one was just annoying me: drivers/media/v4l2-core/videobuf2-core.c:3256:26: warning: unused variable ‘fileio’ [-Wunused-variable] because commit 0e661006370b ("[media] vb2: fix 'UNBALANCED' warnings when calling vb2_thread_stop()") removed all users of 'fileio' and instead calls "__vb2_cleanup_fileio(q)" to clean up q->fileio. But the now unused 'fileio' variable was left around. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>