summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2015-11-02Merge branch 'acpi-processor'Rafael J. Wysocki
* acpi-processor: ACPI / CPPC: Fix potential memory leak ACPI / CPPC: signedness bug in register_pcc_channel() ACPI: Allow selection of the ACPI processor driver for ARM64 CPPC: Probe for CPPC tables for each ACPI Processor object ACPI: Add weak routines for ACPI CPU Hotplug ACPI / CPPC: Add a CPUFreq driver for use with CPPC ACPI: Introduce CPU performance controls using CPPC
2015-11-02Merge branch 'acpica'Rafael J. Wysocki
* acpica: ACPICA: Update version to 20150930 ACPICA: Debugger: Fix dead lock issue ocurred in single stepping mode ACPI: Enable build of AML interpreter debugger ACPICA: Debugger: Add thread ID support so that single step mode can only apply to the debugger thread ACPICA: Debugger: Fix "terminate" command by cleaning up subsystem shutdown logic ACPICA: Debugger: Fix "quit/exit" command by cleaning up user commands termination logic ACPICA: Linuxize: Export debugger files to Linux ACPICA: iASL: General cleanup of the file suffix #defines ACPICA: Improve typechecking, both compile-time and runtime ACPICA: Update NFIT table to rename a flags field ACPICA: Debugger: Update mutexes used for multithreaded debugger ACPICA: Update exception code for "file not found" error ACPICA: iASL: Add symbolic operator support for Index() operator ACPICA: Remove unnecessary conditional compilation
2015-11-01Merge branch 'libnvdimm-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm Pull memremap fix from Dan Williams: "The new memremap() api introduced in the 4.3 cycle to unify/replace ioremap_cache() and ioremap_wt() is mishandling the highmem case. This patch has received a build success notification from a 0day-kbuild-robot run and has received an ack from Ard" From the commit message: "The impact of this bug is low for now since the pmem driver is the only user of memremap(), but this is important to fix before more conversions to memremap arrive in 4.4" * 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: memremap: fix highmem support
2015-11-01stmmac: Correctly report PTP capabilities.Phil Reid
priv->hwts_*_en indicate if timestamping is enabled/disabled at run time. But priv->dma_cap.time_stamp and priv->dma_cap.atime_stamp indicates HW is support for PTPv1/PTPv2. Signed-off-by: Phil Reid <preid@electromag.com.au> Acked-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-01Use 64-bit timekeepingTina Ruchandani
This patch changes the use of struct timespec in dccp_probe to use struct timespec64 instead. timespec uses a 32-bit seconds field which will overflow in the year 2038 and beyond. timespec64 uses a 64-bit seconds field. Note that the correctness of the code isn't changed, since the original code only uses the timestamps to compute a small elapsed interval. This patch is part of a larger attempt to remove instances of 32-bit timekeeping structures (timespec, timeval, time_t) from the kernel so it is easier to identify where the real 2038 issues are. Signed-off-by: Tina Ruchandani <ruchandani.tina@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-01Merge branch 'ipv4_link_down'David S. Miller
Julian Anastasov says: ==================== ipv4: fix problems from the RTNH_F_LINKDOWN introduction Fix two problems from the change that introduced RTNH_F_LINKDOWN flag. The first patch deals with the removal of local route on DOWN event. The second patch makes sure the RTNH_F_LINKDOWN flag is properly updated on UP event because the DOWN event sets it in all cases. v2->v3: - use bool for force var v1->v2: - forgot to add ifconfig dummy0 down in the test case - split to 2 patches ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-01ipv4: update RTNH_F_LINKDOWN flag on UP eventJulian Anastasov
When nexthop is part of multipath route we should clear the LINKDOWN flag when link goes UP or when first address is added. This is needed because we always set LINKDOWN flag when DEAD flag was set but now on UP the nexthop is not dead anymore. Examples when LINKDOWN bit can be forgotten when no NETDEV_CHANGE is delivered: - link goes down (LINKDOWN is set), then link goes UP and device shows carrier OK but LINKDOWN remains set - last address is deleted (LINKDOWN is set), then address is added and device shows carrier OK but LINKDOWN remains set Steps to reproduce: modprobe dummy ifconfig dummy0 192.168.168.1 up here add a multipath route where one nexthop is for dummy0: ip route add 1.2.3.4 nexthop dummy0 nexthop SOME_OTHER_DEVICE ifconfig dummy0 down ifconfig dummy0 up now ip route shows nexthop that is not dead. Now set the sysctl var: echo 1 > /proc/sys/net/ipv4/conf/dummy0/ignore_routes_with_linkdown now ip route will show a dead nexthop because the forgotten RTNH_F_LINKDOWN is propagated as RTNH_F_DEAD. Fixes: 8a3d03166f19 ("net: track link-status of ipv4 nexthops") Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-01ipv4: fix to not remove local route on link downJulian Anastasov
When fib_netdev_event calls fib_disable_ip on NETDEV_DOWN event we should not delete the local routes if the local address is still present. The confusion comes from the fact that both fib_netdev_event and fib_inetaddr_event use the NETDEV_DOWN constant. Fix it by returning back the variable 'force'. Steps to reproduce: modprobe dummy ifconfig dummy0 192.168.168.1 up ifconfig dummy0 down ip route list table local | grep dummy | grep host local 192.168.168.1 dev dummy0 proto kernel scope host src 192.168.168.1 Fixes: 8a3d03166f19 ("net: track link-status of ipv4 nexthops") Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-01net: hisilicon: Remove .owner assignment from platform_driverhuangdaode
platform_driver doesn't need to set .owner, because platform_driver_register() will set it. Signed-off-by: huangdaode <huangdaode@hisilicon.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-01net: dsa: use switchdev obj for VLAN add/del opsVivien Didelot
Simplify DSA by pushing the switchdev objects for VLAN add and delete operations down to its drivers. Currently only mv88e6xxx is affected. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-01net: bcmgenet: Software reset EPHY after power onFlorian Fainelli
The EPHY on GENET v1->v3 is extremely finicky, and will show occasional failures based on the timing and reset sequence, ranging from duplicate packets, to extremely high latencies. Perform an additional software reset, and re-configuration to make sure it is in a consistent and working state. Fixes: 6ac3ce8295e6 ("net: bcmgenet: Remove excessive PHY reset") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-01Merge branch 'x86-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Thomas Gleixner: "This set of updates contains: - Another bugfix for the pathologic vm86 machinery. Clear thread.vm86 on fork to prevent corrupting the parent state. This comes along with an update to the vm86 selftest case - Fix another corner case in the ioapic setup code which causes a boot crash on some oddball systems - Fix the fallout from the dma allocation consolidation work, which leads to a NULL pointer dereference when the allocation code is called with a NULL device" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/vm86: Set thread.vm86 to NULL on fork/clone selftests/x86: Add a fork() to entry_from_vm86 to catch fork bugs x86/ioapic: Prevent NULL pointer dereference in setup_ioapic_dest() x86/dma-mapping: Fix arch_dma_alloc_attrs() oops with NULL dev
2015-11-01Merge branch 'timers-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer fixes from Thomas Gleixner: "The last round of minimalistic fixes for clocksource drivers: - Prevent multiple shutdown of the sh_mtu2 clocksource - Annotate a bunch of clocksource/schedclock functions with notrace to prevent an annoying ftrace recursion issue" * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: clocksource/drivers/sh_mtu2: Fix multiple shutdown call issue clocksource/drivers/digicolor: Prevent ftrace recursion clocksource/drivers/fsl_ftm_timer: Prevent ftrace recursion clocksource/drivers/vf_pit_timer: Prevent ftrace recursion clocksource/drivers/prima2: Prevent ftrace recursion clocksource/drivers/samsung_pwm_timer: Prevent ftrace recursion clocksource/drivers/pistachio: Prevent ftrace recursion clocksource/drivers/arm_global_timer: Prevent ftrace recursion
2015-11-01Merge branch 'irq-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq fixes from Thomas Gleixner: "The last two one-liners for 4.3 from the irqchip space: - Regression fix for armada SoC which addresses the fallout from the set_irq_flags() cleanup - Add the missing propagation of the irq_set_type() callback to the parent interrupt controller of the tegra interrupt chip" * 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: irqchip/tegra: Propagate IRQ type setting to parent irqchip/armada-370-xp: Fix regression by clearing IRQ_NOAUTOEN
2015-11-01VSOCK: define VSOCK_SS_LISTEN once onlyStefan Hajnoczi
The SS_LISTEN socket state is defined by both af_vsock.c and vmci_transport.c. This is risky since the value could be changed in one file and the other would be out of sync. Rename from SS_LISTEN to VSOCK_SS_LISTEN since the constant is not part of enum socket_state (SS_CONNECTED, ...). This way it is clear that the constant is vsock-specific. The big text reflow in af_vsock.c was necessary to keep to the maximum line length. Text is unchanged except for s/SS_LISTEN/VSOCK_SS_LISTEN/. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-01net: smsc911x: Fix crash if loopback test failsPavel Fedin
On certain hardware in certain situations loopback test fails and the driver gets removed. During mdiobus_unregister() instance of PHY driver gets disposed. But by this time it has already been started using phy_connect_direct(). PHY driver uses DELAYED_WORK in order to maintain its state. Attempting to dispose the driver without calling phy_disconnect() causes deallocation of DELAYED_WORK being active. This shortly causes a bad crash in timer code. The problem can be discovered by enabling CONFIG_DEBUG_OBJECTS_TIMERS and CONFIG_DEBUG_OBJECTS_FREE Signed-off-by: Pavel Fedin <p.fedin@samsung.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-01tipc: linearize arriving NAME_DISTR and LINK_PROTO buffersJon Paul Maloy
Testing of the new UDP bearer has revealed that reception of NAME_DISTRIBUTOR, LINK_PROTOCOL/RESET and LINK_PROTOCOL/ACTIVATE message buffers is not prepared for the case that those may be non-linear. We now linearize all such buffers before they are delivered up to the generic reception layer. In order for the commit to apply cleanly to 'net' and 'stable', we do the change in the function tipc_udp_recv() for now. Later, we will post a commit to 'net-next' moving the linearization to generic code, in tipc_named_rcv() and tipc_link_proto_rcv(). Fixes: commit d0f91938bede ("tipc: add ip/udp media type") Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-01fec: Use gpio_set_value_cansleep()Fabio Estevam
We are in a context where we can sleep, and the FEC PHY reset gpio may be on an I2C expander. Use the cansleep() variant when setting the GPIO value. Based on a patch from Russell King for pci-mvebu.c. Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-01Merge branch 'csum_partial_frags'David S. Miller
Hannes Frederic Sowa says: ==================== net: clean up interactions of CHECKSUM_PARTIAL and fragmentation This series fixes wrong checksums on the wire for IPv4 and IPv6. Large send buffers and especially NFS lead to wrong checksums in both IPv4 and IPv6. CHECKSUM_PARTIAL skbs should not receive the respective fragmentations functions, so we add WARN_ON_ONCE to those functions to fix up those as soon as they get reported. Changelog: v2: added v4 checks v3: removed WARN_ON_ONCES (advice by Tom Herbert) ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-01ipv6: add defensive check for CHECKSUM_PARTIAL skbs in ip_fragmentHannes Frederic Sowa
CHECKSUM_PARTIAL skbs should never arrive in ip_fragment. If we get one of those warn about them once and handle them gracefully by recalculating the checksum. Fixes: commit 32dce968dd987 ("ipv6: Allow for partial checksums on non-ufo packets") See-also: commit 72e843bb09d45 ("ipv6: ip6_fragment() should check CHECKSUM_PARTIAL") Cc: Eric Dumazet <edumazet@google.com> Cc: Vlad Yasevich <vyasevich@gmail.com> Cc: Benjamin Coddington <bcodding@redhat.com> Cc: Tom Herbert <tom@herbertland.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-01ipv6: no CHECKSUM_PARTIAL on MSG_MORE corked socketsHannes Frederic Sowa
We cannot reliable calculate packet size on MSG_MORE corked sockets and thus cannot decide if they are going to be fragmented later on, so better not use CHECKSUM_PARTIAL in the first place. The IPv6 code also intended to protect and not use CHECKSUM_PARTIAL in the existence of IPv6 extension headers, but the condition was wrong. Fix it up, too. Also the condition to check whether the packet fits into one fragment was wrong and has been corrected. Fixes: commit 32dce968dd987 ("ipv6: Allow for partial checksums on non-ufo packets") See-also: commit 72e843bb09d45 ("ipv6: ip6_fragment() should check CHECKSUM_PARTIAL") Cc: Eric Dumazet <edumazet@google.com> Cc: Vlad Yasevich <vyasevich@gmail.com> Cc: Benjamin Coddington <bcodding@redhat.com> Cc: Tom Herbert <tom@herbertland.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-01ipv4: add defensive check for CHECKSUM_PARTIAL skbs in ip_fragmentHannes Frederic Sowa
CHECKSUM_PARTIAL skbs should never arrive in ip_fragment. If we get one of those warn about them once and handle them gracefully by recalculating the checksum. Cc: Eric Dumazet <edumazet@google.com> Cc: Vlad Yasevich <vyasevich@gmail.com> Cc: Benjamin Coddington <bcodding@redhat.com> Cc: Tom Herbert <tom@herbertland.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-01ipv4: no CHECKSUM_PARTIAL on MSG_MORE corked socketsHannes Frederic Sowa
We cannot reliable calculate packet size on MSG_MORE corked sockets and thus cannot decide if they are going to be fragmented later on, so better not use CHECKSUM_PARTIAL in the first place. Cc: Eric Dumazet <edumazet@google.com> Cc: Vlad Yasevich <vyasevich@gmail.com> Cc: Benjamin Coddington <bcodding@redhat.com> Cc: Tom Herbert <tom@herbertland.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-01x86/cpu: Add CLZERO detectionWan Zongshun
AMD Fam17h processors introduce support for the CLZERO instruction. It zeroes out the 64 byte cache line specified in RAX. Add the bit here to allow /proc/cpuinfo to list the feature. Boris: we're adding this as a separate ->x86_capability leaf because CPUID_80000008_EBX is going to contain more feature bits and it will fill out with time. Signed-off-by: Wan Zongshun <Vincent.Wan@amd.com> Signed-off-by: Aravind Gopalakrishnan <aravind.gopalakrishnan@amd.com> [ Wrap code in patch form, fix comments. ] Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Huang Rui <ray.huang@amd.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Link: http://lkml.kernel.org/r/1446207099-24948-4-git-send-email-bp@alien8.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-11-01x86/mce: Add a default case to the switch in __mcheck_cpu_ancient_init()Borislav Petkov
Caught by building with W= which enable -Wswitch-default also. Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Link: http://lkml.kernel.org/r/1446207099-24948-3-git-send-email-bp@alien8.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-11-01x86/mce: Add a Scalable MCA vendor flags bitAravind Gopalakrishnan
Scalable MCA (SMCA) is a new feature in AMD Fam17h processors which indicates presence of MCA extensions. MCA extensions expands existing register space for the MCE banks and also introduces a new MSR range to accommodate new banks. Add the detection bit. Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com> [ Reformat mce_vendor_flags definitions and save indentation levels. Improve comments. ] Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Ashok Raj <ashok.raj@intel.com> Cc: Borislav Petkov <bp@alien8.de> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1446207099-24948-2-git-send-email-bp@alien8.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-11-01gpio: fix up SPI submenuLinus Walleij
- Relax dependencies on SPI_MASTER for drivers in the SPI menu that already has this dependency. - Move out the expander that would be hidden for I2C access if SPI_MASTER was not selected. Tentatively create a separate menu for this. - Move the ZX SoC driver to memory-mapped drivers, this must be a mistake and only worked because the system has an SPI master enabled at the same time. Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2015-11-01gpio: drop surplus I2C dependenciesLinus Walleij
The I2C expander menu already depends on I2C, drop subdependecies on individual drivers. Keep the instances of depends on I2C=y though, so these are still restricted to the compiled-in case. Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2015-11-01gpio: drop surplus X86 dependenciesLinus Walleij
Port-mapped I/O depends on X86 already, so individual drivers need not specify this dependency. Suggested-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2015-10-31Merge tag 'armsoc-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc Pull ARM SoC fixes from Olof Johansson: "This should be our final batch of fixes for 4.3: - A patch from Sudeep Holla that fixes annotation of wakeup sources properly, old unused format seems to have spread through copying. - Two patches from Tony for OMAP. One dealing with MUSB setup problems due to runtime PM being enabled too early on the parent device. The other fixes IRQ numbering for OMAP1" * tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: usb: musb: omap2430: Fix regression caused by driver core change ARM: OMAP1: fix incorrect INT_DMA_LCD ARM: dts: fix gpio-keys wakeup-source property
2015-10-31Merge tag 'scsi-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "This is three essential bug fixes for various SCSI parts. The only affected users are SCSI multi-path via device handler (basically all the enterprise) and mvsas users. The dh bugs are an async entanglement in boot resulting in a serious WARN_ON trip and a use after free on remove leading to a crash with strict memory accounting. The mvsas bug manifests as a null deref oops but only on abort sequences; however, these can commonly occur with SATA attached devices, hence the fix" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi_dh: don't try to load a device handler during async probing scsi_dh: fix use-after-free when removing scsi device mvsas: Fix NULL pointer dereference in mvs_slot_task_free
2015-10-31Merge tag 'md/4.3-rc7-fixes' of git://neil.brown.name/mdLinus Torvalds
Pull md bug fixes from Neil Brown: "Two more bug fixes for md. One bugfix for a list corruption in raid5 because of incorrect locking. Other for possible data corruption when a recovering device is failed, removed, and re-added. Both tagged for -stable" * tag 'md/4.3-rc7-fixes' of git://neil.brown.name/md: Revert "md: allow a partially recovered device to be hot-added to an array." md/raid5: fix locking in handle_stripe_clean_event()
2015-11-01Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
2015-11-01MD: when RAID journal is missing/faulty, block RESTART_ARRAY_RWSong Liu
When RAID-4/5/6 array suffers from missing journal device, we put the array in read only state. We should not allow trasition to read-write states (clean and active) before replacing journal device. Signed-off-by: Song Liu <songliubraving@fb.com> Signed-off-by: Shaohua Li <shli@fb.com> Signed-off-by: NeilBrown <neilb@suse.com>
2015-11-01MD: set journal disk ->raid_diskShaohua Li
Set journal disk ->raid_disk to >=0, I choose raid_disks + 1 instead of 0, because we already have a disk with ->raid_disk 0 and this causes sysfs entry creation conflict. A lot of places assumes disk with ->raid_disk >=0 is normal raid disk, so we add check for journal disk. Signed-off-by: Shaohua Li <shli@fb.com> Signed-off-by: NeilBrown <neilb@suse.com>
2015-11-01MD: kick out journal disk if it's not freshSong Liu
When journal disk is faulty and we are reassemabling the raid array, the journal disk is old. We don't allow the journal disk added to the raid array. Since journal disk is missing in the array, the raid5 will mark the array readonly. Signed-off-by: Song Liu <songliubraving@fb.com> Signed-off-by: Shaohua Li <shli@fb.com> Signed-off-by: NeilBrown <neilb@suse.com>
2015-11-01raid5-cache: start raid5 readonly if journal is missingShaohua Li
If raid array is expected to have journal (eg, journal is set in MD superblock feature map) and the array is started without journal disk, start the array readonly. Signed-off-by: Shaohua Li <shli@fb.com> Signed-off-by: NeilBrown <neilb@suse.com>
2015-11-01MD: add new bit to indicate raid array with journalSong Liu
If a raid array has journal feature bit set, add a new bit to indicate this. If the array is started without journal disk existing, we know there is something wrong. Signed-off-by: Song Liu <songliubraving@fb.com> Signed-off-by: Shaohua Li <shli@fb.com> Signed-off-by: NeilBrown <neilb@suse.com>
2015-11-01raid5-cache: IO error handlingShaohua Li
There are 3 places the raid5-cache dispatches IO. The discard IO error doesn't matter, so we ignore it. The superblock write IO error can be handled in MD core. The remaining are log write and flush. When the IO error happens, we mark log disk faulty and fail all write IO. Read IO is still allowed to run. Userspace will get a notification too and corresponding daemon can choose setting raid array readonly for example. Signed-off-by: Shaohua Li <shli@fb.com> Signed-off-by: NeilBrown <neilb@suse.com>
2015-11-01raid5: journal disk can't be removedShaohua Li
raid5-cache uses journal disk rdev->bdev, rdev->mddev in several places. Don't allow journal disk disappear magically. On the other hand, we do need to update superblock for other disks to bump up ->events, so next time journal disk will be identified as stale. Signed-off-by: Shaohua Li <shli@fb.com> Signed-off-by: NeilBrown <neilb@suse.com>
2015-11-01raid5-cache: add trim support for logShaohua Li
Since superblock is updated infrequently, we do a simple trim of log disk (a synchronous trim) Signed-off-by: Shaohua Li <shli@fb.com> Signed-off-by: NeilBrown <neilb@suse.com>
2015-11-01MD: fix info output for journal diskShaohua Li
journal disk can be faulty. The Journal and Faulty aren't exclusive with each other. Signed-off-by: Shaohua Li <shli@fb.com> Signed-off-by: NeilBrown <neilb@suse.com>
2015-11-01raid5-cache: use bio chainingChristoph Hellwig
Simplify the bio completion handler by using bio chaining and submitting bios as soon as they are full. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Shaohua Li <shli@fb.com> Signed-off-by: NeilBrown <neilb@suse.com>
2015-11-01raid5-cache: small log->seq cleanupChristoph Hellwig
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Shaohua Li <shli@fb.com> Signed-off-by: NeilBrown <neilb@suse.com>
2015-11-01raid5-cache: new helper: r5_reserve_log_entryChristoph Hellwig
Factor out code to reserve log space. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Shaohua Li <shli@fb.com> Signed-off-by: NeilBrown <neilb@suse.com>
2015-11-01raid5-cache: inline r5l_alloc_io_unit into r5l_new_metaChristoph Hellwig
This is the only user, and keeping all code initializing the io_unit structure together improves readbility. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Shaohua Li <shli@fb.com> Signed-off-by: NeilBrown <neilb@suse.com>
2015-11-01raid5-cache: take rdev->data_offset into account early onChristoph Hellwig
Set up bi_sector properly when we allocate an bio instead of updating it at submission time. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: NeilBrown <neilb@suse.com>
2015-11-01raid5-cache: refactor bio allocationChristoph Hellwig
Split out a helper to allocate a bio for log writes. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Shaohua Li <shli@fb.com> Signed-off-by: NeilBrown <neilb@suse.com>
2015-11-01raid5-cache: clean up r5l_get_metaChristoph Hellwig
Remove the only partially used local 'io' variable to simplify the code flow. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Shaohua Li <shli@fb.com> Signed-off-by: NeilBrown <neilb@suse.com>
2015-11-01raid5-cache: simplify state machine when caches flushes are not neededChristoph Hellwig
For devices without a volatile write cache we don't need to send a FLUSH command to ensure writes are stable on disk, and thus can avoid the whole step of batching up bios for processing by the MD thread. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Shaohua Li <shli@fb.com> Signed-off-by: NeilBrown <neilb@suse.com>