summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2019-09-06iwlwifi: bump FW API to 49 for 22000 seriesLuca Coelho
Start supporting API version 49 for 22000 series. Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
2019-09-06PM: runtime: Documentation: add runtime_status ABI documentAkinobu Mita
/sys/devices/.../power/runtime_status is introduced by commit c92445fadb91 ("PM / Runtime: Add sysfs debug files"). Then commit 0fcb4eef8294 ("PM / Runtime: Make runtime_status attribute not debug-only (v. 2)") made the runtime_status attribute available without CONFIG_PM_ADVANCED_DEBUG. This adds missing runtime_status ABI document. Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2019-09-06Merge tag 'irqchip-5.4' of ↵Thomas Gleixner
git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms into irq/core Pull irqchip updates for Linux 5.4 from Marc Zyngier: - Large GICv3 updates to support new PPI and SPI ranges - Conver all alloc_fwnode() users to use PAs instead of VAs - Add support for Marvell's MMP3 irqchip - Add support for Amlogic Meson SM1 - Various cleanups and fixes
2019-09-06Add Acer Aspire Ethos 8951G model quirkSergey Bostandzhyan
This notebook has 6 built in speakers for 5.1 surround support, however only two got autodetected and have also not been assigned correctly. This patch enables all speakers and also fixes muting when headphones are plugged in. The speaker layout is as follows: pin 0x15 Front Left / Front Right pin 0x18 Front Center / Subwoofer pin 0x1b Rear Left / Rear Right (Surround) The quirk will be enabled automatically on this hardware, but can also be activated manually via the model=aspire-ethos module parameter. Caveat: pin 0x1b is shared between headphones jack and surround speakers. When headphones are plugged in, the surround speakers get muted automatically by the hardware, however all other speakers remain unmuted. Currently it's not possible to make use of the generic automute function in the driver, because such shared pins are not supported. If we would change the pin settings to identify the pin as headphones, the surround channel and thus the ability to select 5.1 profiles would get lost. This quirk solves the above problem by monitoring jack state of 0x1b and by connecting/disconnecting all remaining speaker pins when something gets plugged in or unplugged from the headphones jack port. Signed-off-by: Sergey Bostandzhyan <jin@mediatomb.cc> Link: https://lore.kernel.org/r/20190906093343.GA7640@xn--80adja5bqm.su Signed-off-by: Takashi Iwai <tiwai@suse.de>
2019-09-06Documentation/process/embargoed-hardware-issues: Microsoft ambassadorSasha Levin
Add Sasha Levin as Microsoft's process ambassador. Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Link: https://lore.kernel.org/r/20190906095852.23568-1-sashal@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-09-06gpio: Fix further merge errorsLinus Walleij
The previous merge of v5.3-rc7 was struggle enough, now it gave rise to new errors and now I fix those too. Fixes: 151a41014bff ("Merge tag 'v5.3-rc7' into devel") Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2019-09-06soc: qcom: geni: Provide parameter error checkingLee Jones
When booting with ACPI, the Geni Serial Engine is not set as the I2C/SPI parent and thus, the wrapper (parent device) is unassigned. This causes the kernel to crash with a null dereference error. Link: https://lore.kernel.org/r/20190905082555.15020-1-lee.jones@linaro.org Fixes: 8bc529b25354 ("soc: qcom: geni: Add support for ACPI") Acked-by: Bjorn Andersson <bjorn.andersson@linaro.org> Reviewed-by: Stephen Boyd <sboyd@kernel.org> Signed-off-by: Lee Jones <lee.jones@linaro.org> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2019-09-06Merge tag 'mt76-for-kvalo-2019-09-05' of https://github.com/nbd168/wirelessKalle Valo
mt76 patches for 5.4 * beacon tx fix for mt76x02 * sparse/checkpatch warning fixes * DFS pattern detector for mt7615 (DFS channels not enabled yet) * CSA support for mt7615 * mt7615 cleanup/fixes * mt7615 rate control improvements * usb fixes * mt7615 powersave buffering fix * new device support for mt76x0 * support for more ciphers in mt7615 * watchdog time fixes * smart carrier sense on mt7615 * survey support on mt7615 * multiple interfaces on mt76x02u * calibration data fix for mt7615 * fix for sending BAR after disassoc
2019-09-06iommu/amd: Fix race in increase_address_space()Joerg Roedel
After the conversion to lock-less dma-api call the increase_address_space() function can be called without any locking. Multiple CPUs could potentially race for increasing the address space, leading to invalid domain->mode settings and invalid page-tables. This has been happening in the wild under high IO load and memory pressure. Fix the race by locking this operation. The function is called infrequently so that this does not introduce a performance regression in the dma-api path again. Reported-by: Qian Cai <cai@lca.pw> Fixes: 256e4621c21a ('iommu/amd: Make use of the generic IOVA allocator') Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-09-06x86/asm: Make some functions local labelsJiri Slaby
Boris suggests to make a local label (prepend ".L") to these functions to eliminate them from the symbol table. These are functions with very local names and really should not be visible anywhere. Note that objtool won't see these functions anymore (to generate ORC debug info). But all the functions are not annotated with ENDPROC, so they won't have objtool's attention anyway. Signed-off-by: Jiri Slaby <jslaby@suse.cz> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: Cao jin <caoj.fnst@cn.fujitsu.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steve Winslow <swinslow@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Wei Huang <wei@redhat.com> Cc: x86-ml <x86@kernel.org> Cc: Xiaoyao Li <xiaoyao.li@linux.intel.com> Link: https://lkml.kernel.org/r/20190906075550.23435-2-jslaby@suse.cz
2019-09-06iommu/amd: Flush old domains in kdump kernelStuart Hayes
When devices are attached to the amd_iommu in a kdump kernel, the old device table entries (DTEs), which were copied from the crashed kernel, will be overwritten with a new domain number. When the new DTE is written, the IOMMU is told to flush the DTE from its internal cache--but it is not told to flush the translation cache entries for the old domain number. Without this patch, AMD systems using the tg3 network driver fail when kdump tries to save the vmcore to a network system, showing network timeouts and (sometimes) IOMMU errors in the kernel log. This patch will flush IOMMU translation cache entries for the old domain when a DTE gets overwritten with a new domain number. Signed-off-by: Stuart Hayes <stuart.w.hayes@gmail.com> Fixes: 3ac3e5ee5ed5 ('iommu/amd: Copy old trans table from old kernel') Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-09-06x86/asm/suspend: Get rid of bogus_64_magicJiri Slaby
bogus_64_magic is only a dead-end loop. There is no need for an out-of-order function (and unannotated local label), so just handle it in-place and also store 0xbad-m-a-g-i-c to %rcx beforehand, in case someone is inspecting registers. Here a qemu+gdb example: Remote debugging using localhost:1235 wakeup_long64 () at arch/x86/kernel/acpi/wakeup_64.S:26 26 jmp 1b (gdb) info registers rax 0x123456789abcdef0 1311768467463790320 rbx 0x0 0 rcx 0xbad6d61676963 3286910041024867 ^^^^^^^^^^^^^^^ [ bp: Add the gdb example. ] Signed-off-by: Jiri Slaby <jslaby@suse.cz> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Len Brown <lenb@kernel.org> Cc: linux-pm@vger.kernel.org Cc: Pavel Machek <pavel@ucw.cz> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: x86-ml <x86@kernel.org> Link: https://lkml.kernel.org/r/20190906075550.23435-1-jslaby@suse.cz
2019-09-06coccinelle: platform_get_irq: Fix parse errorYueHaibing
When do coccicheck, I get this error: spatch -D report --no-show-diff --very-quiet --cocci-file ./scripts/coccinelle/api/platform_get_irq.cocci --include-headers --dir . -I ./arch/x86/include -I ./arch/x86/include/generated -I ./include -I ./arch/x86/include/uapi -I ./arch/x86/include/generated/uapi -I ./include/uapi -I ./include/generated/uapi --include ./include/linux/kconfig.h --jobs 192 --chunksize 1 minus: parse error: File "./scripts/coccinelle/api/platform_get_irq.cocci", line 24, column 9, charpos = 355 around = '\(', whole content = if ( ret \( < \| <= \) 0 ) In commit e56476897448 ("fpga: Remove dev_err() usage after platform_get_irq()") log, I found the semantic patch, it fix this issue. Fixes: 98051ba2b28b ("coccinelle: Add script to check for platform_get_irq() excessive prints") Signed-off-by: YueHaibing <yuehaibing@huawei.com> Reviewed-by: Stephen Boyd <swboyd@chromium.org> Acked-by: Julia Lawall <julia.lawall@lip6.fr> Link: https://lore.kernel.org/r/20190906033006.17616-1-yuehaibing@huawei.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-09-06x86/purgatory: Change compiler flags from -mcmodel=kernel to -mcmodel=large ↵Steve Wahl
to fix kexec relocation errors The last change to this Makefile caused relocation errors when loading a kdump kernel. Restore -mcmodel=large (not -mcmodel=kernel), -ffreestanding, and -fno-zero-initialized-bsss, without reverting to the former practice of resetting KBUILD_CFLAGS. Purgatory.ro is a standalone binary that is not linked against the rest of the kernel. Its image is copied into an array that is linked to the kernel, and from there kexec relocates it wherever it desires. With the previous change to compiler flags, the error "kexec: Overflow in relocation type 11 value 0x11fffd000" was encountered when trying to load the crash kernel. This is from kexec code trying to relocate the purgatory.ro object. From the error message, relocation type 11 is R_X86_64_32S. The x86_64 ABI says: "The R_X86_64_32 and R_X86_64_32S relocations truncate the computed value to 32-bits. The linker must verify that the generated value for the R_X86_64_32 (R_X86_64_32S) relocation zero-extends (sign-extends) to the original 64-bit value." This type of relocation doesn't work when kexec chooses to place the purgatory binary in memory that is not reachable with 32 bit addresses. The compiler flag -mcmodel=kernel allows those type of relocations to be emitted, so revert to using -mcmodel=large as was done before. Also restore the -ffreestanding and -fno-zero-initialized-bss flags because they are appropriate for a stand alone piece of object code which doesn't explicitly zero the bss, and one other report has said undefined symbols are encountered without -ffreestanding. These identical compiler flag changes need to happen for every object that becomes part of the purgatory.ro object, so gather them together first into PURGATORY_CFLAGS_REMOVE and PURGATORY_CFLAGS, and then apply them to each of the objects that have C source. Do not apply any of these flags to kexec-purgatory.o, which is not part of the standalone object but part of the kernel proper. Tested-by: Vaibhav Rustagi <vaibhavrustagi@google.com> Tested-by: Andreas Smas <andreas@lonelycoder.com> Signed-off-by: Steve Wahl <steve.wahl@hpe.com> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Cc: Borislav Petkov <bp@alien8.de> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: None Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: clang-built-linux@googlegroups.com Cc: dimitri.sivanich@hpe.com Cc: mike.travis@hpe.com Cc: russ.anderson@hpe.com Fixes: b059f801a937 ("x86/purgatory: Use CFLAGS_REMOVE rather than reset KBUILD_CFLAGS") Link: https://lkml.kernel.org/r/20190905202346.GA26595@swahl-linux Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-09-06Merge tag 'drm-misc-fixes-2019-09-05' of ↵Dave Airlie
git://anongit.freedesktop.org/drm/drm-misc into drm-fixes drm-misc-fixes for v5.3 final: - Make ingenic panel type DPI insteado f unknown. - Fixes for command line parser modes. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/606d87b2-1840-c893-eb30-d6c471c9e50a@linux.intel.com
2019-09-06Merge branch 'vmwgfx-fixes-5.3' of ↵Dave Airlie
git://people.freedesktop.org/~thomash/linux into drm-fixes Single vmwgfx double free fix. Signed-off-by: Dave Airlie <airlied@redhat.com>
2019-09-06perf/hw_breakpoint: Fix arch_hw_breakpoint use-before-initializationMark-PK Tsai
If we disable the compiler's auto-initialization feature, if -fplugin-arg-structleak_plugin-byref or -ftrivial-auto-var-init=pattern are disabled, arch_hw_breakpoint may be used before initialization after: 9a4903dde2c86 ("perf/hw_breakpoint: Split attribute parse and commit") On our ARM platform, the struct step_ctrl in arch_hw_breakpoint, which used to be zero-initialized by kzalloc(), may be used in arch_install_hw_breakpoint() without initialization. Signed-off-by: Mark-PK Tsai <mark-pk.tsai@mediatek.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alix Wu <alix.wu@mediatek.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: YJ Chiang <yj.chiang@mediatek.com> Link: https://lkml.kernel.org/r/20190906060115.9460-1-mark-pk.tsai@mediatek.com [ Minor edits. ] Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-09-06iio: hid-sensor-attributes: Fix divisions for 32-bit platformsAndy Shevchenko
The commit 473d12f7638c ("iio: hid-sensor-attributes: Convert to use int_pow()") converted to use generic int_pow() helper. Though, the generic one returns 64-bit value and, in cases when it is used as divisor, it compels 64-bit division from compiler. In order to fix this, introduce a temporary 32-bit variable to hold the result of int_pow() and use it as divisor afterwards. In couple of cases, replace int_pow() with a predefined unit factors for time and frequency. Fixes: 473d12f7638c ("iio: hid-sensor-attributes: Convert to use int_pow()") Reported-by: kbuild test robot <lkp@intel.com> Reported-by: Nathan Chancellor <natechancellor@gmail.com> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Link: https://lore.kernel.org/r/20190905112759.13035-1-andriy.shevchenko@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-09-06Merge tag 'misc-habanalabs-next-2019-09-05' of ↵Greg Kroah-Hartman
git://people.freedesktop.org/~gabbayo/linux into char-misc-next Oded writes: This tag contains the following changes for kernel 5.4: - Create an additional char device per PCI device. The new char device allows any application to query the device for stats, information, idle state and more. This is needed to support system/monitoring applications, while also allowing the deep-learning application to send work to the ASIC through the main (original) char device. - Fix possible kernel crash in case user supplies a smaller-than-required buffer to the DEBUG IOCTL. - Expose the device to userspace only after initialization was done, to prevent a race between the initialization and user submitting workloads. - Add uapi, as part of INFO IOCTL, to allow user to query the device utilization rate. - Add uapi, as part of INFO IOCTL, to allow user to retrieve aggregate H/W events, i.e. counting H/W events from the loading of the driver. - Register to the HWMON subsystem with the board's name, to allow the user to prepare a custom sensor file per board. - Use correct macros for endian swapping. - Improve error printing in multiple places. - Small bug fixes. * tag 'misc-habanalabs-next-2019-09-05' of git://people.freedesktop.org/~gabbayo/linux: (30 commits) habanalabs: correctly cast variable to __le32 habanalabs: show correct id in error print habanalabs: stop using the acronym KMD habanalabs: display card name as sensors header habanalabs: add uapi to retrieve aggregate H/W events habanalabs: add uapi to retrieve device utilization habanalabs: Make the Coresight timestamp perpetual habanalabs: explicitly set the queue-id enumerated numbers habanalabs: print to kernel log when reset is finished habanalabs: replace __le32_to_cpu with le32_to_cpu habanalabs: replace __cpu_to_le32/64 with cpu_to_le32/64 habanalabs: Handle HW_IP_INFO if device disabled or in reset habanalabs: Expose devices after initialization is done habanalabs: improve security in Debug IOCTL habanalabs: use default structure for user input in Debug IOCTL habanalabs: Add descriptive name to PSOC app status register habanalabs: Add descriptive names to PSOC scratch-pad registers habanalabs: create two char devices per ASIC habanalabs: change device_setup_cdev() to be more generic habanalabs: maintain a list of file private data objects ...
2019-09-06x86/platform/uv: Fix kmalloc() NULL check routineAustin Kim
The result of kmalloc() should have been checked ahead of below statement: pqp = (struct bau_pq_entry *)vp; Move BUG_ON(!vp) before above statement. Signed-off-by: Austin Kim <austindh.kim@gmail.com> Cc: Dimitri Sivanich <dimitri.sivanich@hpe.com> Cc: Hedi Berriche <hedi.berriche@hpe.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Travis <mike.travis@hpe.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Russ Anderson <russ.anderson@hpe.com> Cc: Steve Wahl <steve.wahl@hpe.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: allison@lohutok.net Cc: andy@infradead.org Cc: armijn@tjaldur.nl Cc: bp@alien8.de Cc: dvhart@infradead.org Cc: gregkh@linuxfoundation.org Cc: hpa@zytor.com Cc: kjlu@umn.edu Cc: platform-driver-x86@vger.kernel.org Link: https://lkml.kernel.org/r/20190905232951.GA28779@LGEARND20B15 Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-09-06Merge tag 'v5.3-rc7' into x86/platform, to refresh the branchIngo Molnar
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-09-06x86/cpu: Update init data for new Airmont CPU modelRahul Tanwar
Update properties for newly added Airmont CPU variant. Signed-off-by: Rahul Tanwar <rahul.tanwar@linux.intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com> Cc: Gayatri Kammela <gayatri.kammela@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/20190905193020.14707-5-tony.luck@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-09-06x86/cpu: Add new Airmont variant to Intel familyRahul Tanwar
Add new Airmont variant CPU model to Intel family. Signed-off-by: Rahul Tanwar <rahul.tanwar@linux.intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com> Cc: Gayatri Kammela <gayatri.kammela@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/20190905193020.14707-4-tony.luck@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-09-06x86/cpu: Add Elkhart Lake to Intel familyGayatri Kammela
Add the model number/CPUID of atom based Elkhart Lake to the Intel family. Signed-off-by: Gayatri Kammela <gayatri.kammela@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rahul Tanwar <rahul.tanwar@linux.intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/20190905193020.14707-3-tony.luck@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-09-06x86/cpu: Add Tiger Lake to Intel familyGayatri Kammela
Add the model numbers/CPUIDs of Tiger Lake mobile and desktop to the Intel family. Suggested-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Gayatri Kammela <gayatri.kammela@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com> Reviewed-by: Tony Luck <tony.luck@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rahul Tanwar <rahul.tanwar@linux.intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/20190905193020.14707-2-tony.luck@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-09-06Merge branch 'x86/cleanups' into x86/cpu, to pick up dependent changesIngo Molnar
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-09-05xfs: push the grant head when the log head moves forwardDave Chinner
When the log fills up, we can get into the state where the outstanding items in the CIL being committed and aggregated are larger than the range that the reservation grant head tail pushing will attempt to clean. This can result in the tail pushing range being trimmed back to the the log head (l_last_sync_lsn) and so may not actually move the push target at all. When the iclogs associated with the CIL commit finally land, the log head moves forward, and this removes the restriction on the AIL push target. However, if we already have transactions sleeping on the grant head, and there's nothing in the AIL still to flush from the current push target, then nothing will move the tail of the log and trigger a log reservation wakeup. Hence the there is nothing that will trigger xlog_grant_push_ail() to recalculate the AIL push target and start pushing on the AIL again to write back the metadata objects that pin the tail of the log and hence free up space and allow the transaction reservations to be woken and make progress. Hence we need to push on the grant head when we move the log head forward, as this may be the only trigger we have that can move the AIL push target forwards in this situation. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-09-05xfs: push iclog state cleaning into xlog_state_clean_logDave Chinner
xlog_state_clean_log() is only called from one place, and it occurs when an iclog is transitioning back to ACTIVE. Prior to calling xlog_state_clean_log, the iclog we are processing has a hard coded state check to DIRTY so that xlog_state_clean_log() processes it correctly. We also have a hard coded wakeup after xlog_state_clean_log() to enfore log force waiters on that iclog are woken correctly. Both of these things are operations required to finish processing an iclog and return it to the ACTIVE state again, so they make little sense to be separated from the rest of the clean state transition code. Hence push these things inside xlog_state_clean_log(), document the behaviour and rename it xlog_state_clean_iclog() to indicate that it's being driven by an iclog state change and does the iclog state change work itself. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-09-05xfs: factor iclog state processing out of xlog_state_do_callback()Dave Chinner
The iclog IO completion state processing is somewhat complex, and because it's inside two nested loops it is highly indented and very hard to read. Factor it out, flatten the logic flow and clean up the comments so that it much easier to see what the code is doing both in processing the individual iclogs and in the over xlog_state_do_callback() operation. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-09-05xfs: factor callbacks out of xlog_state_do_callback()Dave Chinner
Simplify the code flow by lifting the iclog callback work out of the main iclog iteration loop. This isolates the log juggling and callbacks from the iclog state change logic in the loop. Note that the loopdidcallbacks variable is not actually tracking whether callbacks are actually run - it is tracking whether the icloglock was dropped during the loop and so determines if we completed the entire iclog scan loop atomically. Hence we know for certain there are either no more ordered completions to run or that the next completion will run the remaining ordered iclog completions. Hence rename that variable appropriately for it's function. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-09-05xfs: factor debug code out of xlog_state_do_callback()Dave Chinner
Start making this function readable by lifting the debug code into a conditional function. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-09-05xfs: prevent CIL push holdoff in log recoveryDave Chinner
generic/530 on a machine with enough ram and a non-preemptible kernel can run the AGI processing phase of log recovery enitrely out of cache. This means it never blocks on locks, never waits for IO and runs entirely through the unlinked lists until it either completes or blocks and hangs because it has run out of log space. It runs out of log space because the background CIL push is scheduled but never runs. queue_work() queues the CIL work on the current CPU that is busy, and the workqueue code will not run it on any other CPU. Hence if the unlinked list processing never yields the CPU voluntarily, the push work is delayed indefinitely. This results in the CIL aggregating changes until all the log space is consumed. When the log recoveyr processing evenutally blocks, the CIL flushes but because the last iclog isn't submitted for IO because it isn't full, the CIL flush never completes and nothing ever moves the log head forwards, or indeed inserts anything into the tail of the log, and hence nothing is able to get the log moving again and recovery hangs. There are several problems here, but the two obvious ones from the trace are that: a) log recovery does not yield the CPU for over 4 seconds, b) binding CIL pushes to a single CPU is a really bad idea. This patch addresses just these two aspects of the problem, and are suitable for backporting to work around any issues in older kernels. The more fundamental problem of preventing the CIL from consuming more than 50% of the log without committing will take more invasive and complex work, so will be done as followup work. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-09-05xfs: fix missed wakeup on l_flush_waitRik van Riel
The code in xlog_wait uses the spinlock to make adding the task to the wait queue, and setting the task state to UNINTERRUPTIBLE atomic with respect to the waker. Doing the wakeup after releasing the spinlock opens up the following race condition: Task 1 task 2 add task to wait queue wake up task set task state to UNINTERRUPTIBLE This issue was found through code inspection as a result of kworkers being observed stuck in UNINTERRUPTIBLE state with an empty wait queue. It is rare and largely unreproducable. Simply moving the spin_unlock to after the wake_up_all results in the waker not being able to see a task on the waitqueue before it has set its state to UNINTERRUPTIBLE. This bug dates back to the conversion of this code to generic waitqueue infrastructure from a counting semaphore back in 2008 which didn't place the wakeups consistently w.r.t. to the relevant spin locks. [dchinner: Also fix a similar issue in the shutdown path on xc_commit_wait. Update commit log with more details of the issue.] Fixes: d748c62367eb ("[XFS] Convert l_flushsema to a sv_t") Reported-by: Chris Mason <clm@fb.com> Signed-off-by: Rik van Riel <riel@surriel.com> Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-09-05xfs: push the AIL in xlog_grant_head_wakeDave Chinner
In the situation where the log is full and the CIL has not recently flushed, the AIL push threshold is throttled back to the where the last write of the head of the log was completed. This is stored in log->l_last_sync_lsn. Hence if the CIL holds > 25% of the log space pinned by flushes and/or aggregation in progress, we can get the situation where the head of the log lags a long way behind the reservation grant head. When this happens, the AIL push target is trimmed back from where the reservation grant head wants to push the log tail to, back to where the head of the log currently is. This means the push target doesn't reach far enough into the log to actually move the tail before the transaction reservation goes to sleep. When the CIL push completes, it moves the log head forward such that the AIL push target can now be moved, but that has no mechanism for puhsing the log tail. Further, if the next tail movement of the log is not large enough wake the waiter (i.e. still not enough space for it to have a reservation granted), we don't wake anything up, and hence we do not update the AIL push target to take into account the head of the log moving and allowing the push target to be moved forwards. To avoid this particular condition, if we fail to wake the first waiter on the grant head because we don't have enough space, push on the AIL again. This will pick up any movement of the log head and allow the push target to move forward due to completion of CIL pushing. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-09-05xfs: Use WARN_ON_ONCE for bailout mount-operationAustin Kim
If the CONFIG_BUG is enabled, BUG is executed and then system is crashed. However, the bailout for mount is no longer proceeding. Using WARN_ON_ONCE rather than BUG can prevent this situation. Signed-off-by: Austin Kim <austindh.kim@gmail.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-09-05sd: Set ELEVATOR_F_ZBD_SEQ_WRITE for ZBC disksDamien Le Moal
Using the helper blk_queue_required_elevator_features(), set the elevator feature ELEVATOR_F_ZBD_SEQ_WRITE as required for the request queue of SCSI ZBC disks. This feature requirement can always be satisfied as the mq-deadline elevator is always selected for in-kernel compilation when CONFIG_BLK_DEV_ZONED (zoned block device support) is enabled. Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-09-05block: Set ELEVATOR_F_ZBD_SEQ_WRITE for nullblk zoned disksDamien Le Moal
Using the helper blk_queue_required_elevator_features(), set the elevator feature ELEVATOR_F_ZBD_SEQ_WRITE as required for the request queue of null_blk devices created with zoned mode enabled. This feature requirement can always be satisfied as the mq-deadline elevator is always selected for in-kernel compilation when CONFIG_BLK_DEV_ZONED (zoned block device support) is enabled. Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-09-05block: Delay default elevator initializationDamien Le Moal
When elevator_init_mq() is called from blk_mq_init_allocated_queue(), the only information known about the device is the number of hardware queues as the block device scan by the device driver is not completed yet for most drivers. The device type and elevator required features are not set yet, preventing to correctly select the default elevator most suitable for the device. This currently affects all multi-queue zoned block devices which default to the "none" elevator instead of the required "mq-deadline" elevator. These drives currently include host-managed SMR disks connected to a smartpqi HBA and null_blk block devices with zoned mode enabled. Upcoming NVMe Zoned Namespace devices will also be affected. Fix this by adding the boolean elevator_init argument to blk_mq_init_allocated_queue() to control the execution of elevator_init_mq(). Two cases exist: 1) elevator_init = false is used for calls to blk_mq_init_allocated_queue() within blk_mq_init_queue(). In this case, a call to elevator_init_mq() is added to __device_add_disk(), resulting in the delayed initialization of the queue elevator after the device driver finished probing the device information. This effectively allows elevator_init_mq() access to more information about the device. 2) elevator_init = true preserves the current behavior of initializing the elevator directly from blk_mq_init_allocated_queue(). This case is used for the special request based DM devices where the device gendisk is created before the queue initialization and device information (e.g. queue limits) is already known when the queue initialization is executed. Additionally, to make sure that the elevator initialization is never done while requests are in-flight (there should be none when the device driver calls device_add_disk()), freeze and quiesce the device request queue before calling blk_mq_init_sched() in elevator_init_mq(). Reviewed-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-09-05block: Improve default elevator selectionDamien Le Moal
For block devices that do not specify required features, preserve the current default elevator selection (mq-deadline for single queue devices, none for multi-queue devices). However, for devices specifying required features (e.g. zoned block devices ELEVATOR_F_ZBD_SEQ_WRITE feature), select the first available elevator providing the required features. In all cases, default to "none" if no elevator is available or if the initialization of the default elevator fails. Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-09-05block: Introduce elevator featuresDamien Le Moal
Introduce the definition of elevator features through the elevator_features flags in the elevator_type structure. Each flag can represent a feature supported by an elevator. The first feature defined by this patch is support for zoned block device sequential write constraint with the flag ELEVATOR_F_ZBD_SEQ_WRITE, which is implemented by the mq-deadline elevator using zone write locking. Other possible features are IO priorities, write hints, latency targets or single-LUN dual-actuator disks (for which the elevator could maintain one LBA ordered list per actuator). The required_elevator_features field is also added to the request_queue structure to allow a device driver to specify elevator feature flags that an elevator must support for the correct operation of the device (e.g. device drivers for zoned block devices can have the ELEVATOR_F_ZBD_SEQ_WRITE flag as a required feature). The helper function blk_queue_required_elevator_features() is defined for setting this new field. With these two new fields in place, the elevator functions elevator_match() and elevator_find() are modified to allow a user to set only an elevator with a set of features that satisfies the device required features. Elevators not matching the device requirements are not shown in the device sysfs queue/scheduler file to prevent their use. The "none" elevator can always be selected as before. Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-09-05block: Change elevator_init_mq() to always succeedDamien Le Moal
If the default elevator chosen is mq-deadline, elevator_init_mq() may return an error if mq-deadline initialization fails, leading to blk_mq_init_allocated_queue() returning an error, which in turn will cause the block device initialization to fail and the device not being exposed. Instead of taking such extreme measure, handle mq-deadline initialization failures in the same manner as when mq-deadline is not available (no module to load), that is, default to the "none" scheduler. With this change, elevator_init_mq() return type can be changed to void. Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-09-05block: Cleanup elevator_init_mq() useDamien Le Moal
Instead of checking a queue tag_set BLK_MQ_F_NO_SCHED flag before calling elevator_init_mq() to make sure that the queue supports IO scheduling, use the elevator.c function elv_support_iosched() in elevator_init_mq(). This does not introduce any functional change but ensure that elevator_init_mq() does the right thing based on the queue settings. Reviewed-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-09-06parisc: Save some bytes in dino driverHelge Deller
Signed-off-by: Helge Deller <deller@gmx.de>
2019-09-05net/mlx5e: Add port buffer's congestion countersAya Levin
Add 3 counters per priority to ethtool using PPCNT: 1) rx_prio[p]_buf_discard - the number of packets discarded by device due to lack of per host receive buffers 2) rx_prio[p]_cong_discard - the number of packets discarded by device due to per host congestion 3) rx_prio[p]_marked - the number of packets ECN marked by device due to per host congestion Signed-off-by: Aya Levin <ayal@mellanox.com> Reviewed-by: Moshe Shemesh <moshe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-09-05net/mlx5: Expose HW capability bits for port buffer per priority congestion ↵Aya Levin
counters Map capability bit indicating that HCA supports port buffer's congestion counters. Also map registers with the corresponding counters. Signed-off-by: Aya Levin <ayal@mellanox.com> Reviewed-by: Moshe Shemesh <moshe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-09-05net/mlx5: DR, Remove redundant dev_name print from err logSaeed Mahameed
mlx5_core_err already prints the name of the device. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-09-05net/mlx5: DR, Fix error return code in dr_domain_init_resources()Wei Yongjun
Fix to return negative error code -ENOMEM from the error handling case instead of 0, as done elsewhere in this function. Fixes: 4ec9e7b02697 ("net/mlx5: DR, Expose steering domain functionality") Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-09-05net/mlx5: DR, Remove useless set memory to zero use memset()Wei Yongjun
The memory return by kzalloc() has already be set to zero, so remove useless memset(0). Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-09-05net/mlx5e: Remove unnecessary clear_bit()sMaxim Mikityanskiy
Don't clear MLX5E_SQ_STATE_ENABLED on error in mlx5e_open_txqsq and mlx5e_open_icosq, because it's not set there, and is 0 by default. Fixes: acc6c5953af1 ("net/mlx5e: Split open/close channels to stages") Fixes: 9d18b5144a0a ("net/mlx5e: Split open/close ICOSQ into stages") Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-09-05net/mlx5e: kTLS, Remove unused function parameterTariq Toukan
SKB parameter is no longer used in tx_post_resync_dump(), remove it. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>