summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2017-02-04ext4: preserve the needs_recovery flag when the journal is abortedTheodore Ts'o
If the journal is aborted, the needs_recovery feature flag should not be removed. Otherwise, it's the journal might not get replayed and this could lead to more data getting lost. Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@vger.kernel.org
2017-02-04jbd2: don't leak modified metadata buffers on an aborted journalTheodore Ts'o
If the journal has been aborted, we shouldn't mark the underlying buffer head as dirty, since that will cause the metadata block to get modified. And if the journal has been aborted, we shouldn't allow this since it will almost certainly lead to a corrupted file system. Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@vger.kernel.org
2017-02-04ext4: fix inline data error pathsTheodore Ts'o
The write_end() function must always unlock the page and drop its ref count, even on an error. Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@vger.kernel.org
2017-02-04netlabel: out of bound access in cipso_v4_validate()Eric Dumazet
syzkaller found another out of bound access in ip_options_compile(), or more exactly in cipso_v4_validate() Fixes: 20e2a8648596 ("cipso: handle CIPSO options correctly when NetLabel is disabled") Fixes: 446fda4f2682 ("[NetLabel]: CIPSOv4 engine") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Dmitry Vyukov <dvyukov@google.com> Cc: Paul Moore <paul@paul-moore.com> Acked-by: Paul Moore <paul@paul-moore.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-04ipv4: keep skb->dst around in presence of IP optionsEric Dumazet
Andrey Konovalov got crashes in __ip_options_echo() when a NULL skb->dst is accessed. ipv4_pktinfo_prepare() should not drop the dst if (evil) IP options are present. We could refine the test to the presence of ts_needtime or srr, but IP options are not often used, so let's be conservative. Thanks to syzkaller team for finding this bug. Fixes: d826eb14ecef ("ipv4: PKTINFO doesnt need dst reference") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Andrey Konovalov <andreyknvl@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-05x86/cpufeature: Enable RING3MWAIT for Knights MillPiotr Luc
Enable ring 3 MONITOR/MWAIT for Intel Xeon Phi codenamed Knights Mill. We can't guarantee that this (KNM) will be the last CPU model that needs this hack. But, we do recognize that this is far from optimal, and there is an effort to ensure we don't keep doing extending this hack forever. Signed-off-by: Piotr Luc <piotr.luc@intel.com> Cc: Piotr.Luc@intel.com Cc: dave.hansen@linux.intel.com Link: http://lkml.kernel.org/r/1484918557-15481-6-git-send-email-grzegorz.andrejczuk@intel.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2017-02-04libnvdimm, pfn: fix memmap reservation size versus 4K alignmentDan Williams
When vmemmap_populate() allocates space for the memmap it does so in 2MB sized chunks. The libnvdimm-pfn driver incorrectly accounts for this when the alignment of the device is set to 4K. When this happens we trigger memory allocation failures in altmap_alloc_block_buf() and trigger warnings of the form: WARNING: CPU: 0 PID: 3376 at arch/x86/mm/init_64.c:656 arch_add_memory+0xe4/0xf0 [..] Call Trace: dump_stack+0x86/0xc3 __warn+0xcb/0xf0 warn_slowpath_null+0x1d/0x20 arch_add_memory+0xe4/0xf0 devm_memremap_pages+0x29b/0x4e0 Fixes: 315c562536c4 ("libnvdimm, pfn: add 'align' attribute, default to HPAGE_SIZE") Cc: <stable@vger.kernel.org> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2017-02-04gpio: aspeed: Remove dependence on GPIOF_* macrosAndrew Jeffery
1736f75d35e47409ad776273133d0f558a4c8253 is a (v2) patch which had unresolved review comments[1]. Address the comments by removing the use of macros from the consumer header (this patch represents the diff between v2 and v3[2]). [1] https://lkml.org/lkml/2017/1/26/337 [2] https://lkml.org/lkml/2017/1/26/786 Fixes: 1736f75d35e4 ("gpio: aspeed: Add banks Y, Z, AA, AB and AC") Signed-off-by: Andrew Jeffery <andrew@aj.id.au> Acked-by: Joel Stanley <joel@jms.id.au> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2017-02-04Merge branch 'irq-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq fixes from Thomas Gleixner: - Prevent double activation of interrupt lines, which causes problems on certain interrupt controllers - Handle the fallout of the above because x86 (ab)uses the activation function to reconfigure interrupts under the hood. * 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/irq: Make irq activate operations symmetric irqdomain: Avoid activating interrupts more than once
2017-02-04Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull KVM fix from Radim Krčmář: "Fix a regression that prevented migration between hosts with different XSAVE features even if the missing features were not used by the guest (for stable)" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: x86: do not save guest-unsupported XSAVE state
2017-02-04Merge tag 'char-misc-4.10-rc7' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc Pull char/misc driver fixes from Greg KH: "Here are two bugfixes that resolve some reported issues. One in the firmware loader, that should fix the much-reported problem of crashes with it. The other is a hyperv fix for a reported regression. Both have been in linux-next for a week or so with no reported issues" * tag 'char-misc-4.10-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: Drivers: hv: vmbus: finally fix hv_need_to_signal_on_read() firmware: fix NULL pointer dereference in __fw_load_abort()
2017-02-04Merge tag 'staging-4.10-rc7' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging Pull staging/IIO fixes from Greg KH: "Here are a few small IIO and one staging driver fix for 4.10-rc7. They fix some reported issues with the drivers. All of them have been in linux-next for a week or so with no reported issues" * tag 'staging-4.10-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging: staging: greybus: timesync: validate platform state callback iio: dht11: Use usleep_range instead of msleep for start signal iio: adc: palmas_gpadc: retrieve a valid iio_dev in suspend/resume iio: health: max30100: fixed parenthesis around FIFO count check iio: health: afe4404: retrieve a valid iio_dev in suspend/resume iio: health: afe4403: retrieve a valid iio_dev in suspend/resume
2017-02-04Merge tag 'usb-4.10-rc7' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb Pull USB fixes from Greg KH: "Here are some small USB fixes for some reported issues, and the usual number of new device ids for 4.10-rc7. All of these, except the last new device id, have been in linux-next for a while with no reported issues" * tag 'usb-4.10-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: USB: serial: pl2303: add ATEN device ID usb: gadget: f_fs: Assorted buffer overflow checks. USB: Add quirk for WORLDE easykey.25 MIDI keyboard usb: musb: Fix external abort on non-linefetch for musb_irq_work() usb: musb: Fix host mode error -71 regression USB: serial: option: add device ID for HP lt2523 (Novatel E371) USB: serial: qcserial: add Dell DW5570 QDL
2017-02-04dm: don't allow ioctls to targets that don't map to whole devicesChristoph Hellwig
.. at least for unprivileged users. Before we called into the SCSI ioctl code to allow excemptions for a few SCSI passthrough ioctls, but this is pretty unsafe and except for this call dm knows nothing about SCSI ioctls. As the SCSI ioctl code is now optional, we really don't want to drag it in for DM, and the exception is not very useful anyway. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Mike Snitzer <snitzer@redhat.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Johannes Thumshirn <jthumshirn@kernel.org> Signed-off-by: Jens Axboe <axboe@fb.com>
2017-02-04x86/mm/pat: Use rb_entry()Geliang Tang
To make the code clearer, use rb_entry() instead of open coding it Signed-off-by: Geliang Tang <geliangtang@gmail.com> Cc: Masahiro Yamada <yamada.masahiro@socionext.com> Cc: Toshi Kani <toshi.kani@hpe.com> Cc: Paul Gortmaker <paul.gortmaker@windriver.com> Link: http://lkml.kernel.org/r/974a91cd4ed2d04c92e4faa4765077e38f248d6b.1482157956.git.geliangtang@gmail.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2017-02-04regulator: gpio: correct default typeHans Holmberg
The driver defaults to voltage, not current, type so correct this in the device tree binding documentation. Signed-off-by: Hans Holmberg <hans@pixelmunchies.com> Signed-off-by: Mark Brown <broonie@kernel.org>
2017-02-04regulator: cpcap: Add basic regulator supportTony Lindgren
Many Motorola phones like droid 4 are using a custom PMIC called CPCAP or 6556002. This PMIC is used with several SoCs, I've noticed at least omap3, omap4 and Tegra2 based Motorola phones and tablets using it. Cc: devicetree@vger.kernel.org Cc: Marcel Partap <mpartap@gmx.net> Cc: Michael Scott <michael.scott@linaro.org> Cc: Rob Herring <robh@kernel.org> Signed-off-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Mark Brown <broonie@kernel.org>
2017-02-04regulator: core: fix typo in regulator_bulk_disable()Dmitry Torokhov
"re-enable" was misspelled as "reename". Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Signed-off-by: Mark Brown <broonie@kernel.org>
2017-02-04regulator: core: optimize devm_regulator_bulk_get()Dmitry Torokhov
When performing this bulk operation, there is no need to track every supply individually. It is more efficient to treat entire group as a single managed resource. Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Signed-off-by: Mark Brown <broonie@kernel.org>
2017-02-04regulator: core: simplify regulator_bulk_force_disable()Dmitry Torokhov
There is no need to have two loops there, we can store error for subsequent reporting. Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Signed-off-by: Mark Brown <broonie@kernel.org>
2017-02-04regulator: core: have _regulator_get() accept get_type argumentDmitry Torokhov
Instead of separate "exclusive" and "allow_dummy" arguments, that formed 3 valid combinations (normal, exclusive and optional) and an invalid one, let's accept explicit "get_type", like we did in devm-managed code. Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Signed-off-by: Mark Brown <broonie@kernel.org>
2017-02-04regulator: core: remove dead code in _regulator_get()Dmitry Torokhov
There is no point in assigning value to 'ret' before calling regulator_dev_lookup() as it will clobber 'ret' anyway. Also, let's explicitly return -PROBE_DEFER when try_module_get() fails, instead of relying that earlier initialization of "regulator" carries correct value. Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Signed-off-by: Mark Brown <broonie@kernel.org>
2017-02-04x86/traps: Get rid of unnecessary preempt_disable/preempt_enable_no_reschedAlexander Kuleshov
Exception handlers which may run on IST stack call ist_enter() at the start of execution and ist_exit() in the end. ist_enter() disables preemption unconditionally and ist_exit() enables it. So the extra preempt_disable/enable() pairs nested inside the ist_enter/exit() regions are pointless and can be removed. Signed-off-by: Alexander Kuleshov <kuleshovmail@gmail.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Jianyu Zhan <nasa4836@gmail.com> Cc: Paul Gortmaker <paul.gortmaker@windriver.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@suse.de> Link: http://lkml.kernel.org/r/20161128075057.7724-1-kuleshovmail@gmail.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2017-02-04x86/pci-calgary: Fix iommu_free() comparison of unsigned expression >= 0Nikola Pajkovsky
commit 8fd524b355da ("x86: Kill bad_dma_address variable") has killed bad_dma_address variable and used instead of macro DMA_ERROR_CODE which is always zero. Since dma_addr is unsigned, the statement dma_addr >= DMA_ERROR_CODE is always true, and not needed. arch/x86/kernel/pci-calgary_64.c: In function ‘iommu_free’: arch/x86/kernel/pci-calgary_64.c:299:2: warning: comparison of unsigned expression >= 0 is always true [-Wtype-limits] if (unlikely((dma_addr >= DMA_ERROR_CODE) && (dma_addr < badend))) { Fixes: 8fd524b355da ("x86: Kill bad_dma_address variable") Signed-off-by: Nikola Pajkovsky <npajkovsky@suse.cz> Cc: iommu@lists.linux-foundation.org Cc: Jon Mason <jdmason@kudzu.us> Cc: Muli Ben-Yehuda <mulix@mulix.org> Link: http://lkml.kernel.org/r/7612c0f9dd7c1290407dbf8e809def922006920b.1479161177.git.npajkovsky@suse.cz Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2017-02-04debugobjects: Scale thresholds with # of CPUsWaiman Long
On a large SMP systems with hundreds of CPUs, the current thresholds for allocating and freeing debug objects (256 and 1024 respectively) may not work well. This can cause a lot of needless calls to kmem_aloc() and kmem_free() on those systems. To alleviate this thrashing problem, the object freeing threshold is now increased to "1024 + # of CPUs * 32". Whereas the object allocation threshold is increased to "256 + # of CPUs * 4". That should make the debug objects subsystem scale better with the number of CPUs available in the system. Signed-off-by: Waiman Long <longman@redhat.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: "Du Changbin" <changbin.du@intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Jan Stancek <jstancek@redhat.com> Link: http://lkml.kernel.org/r/1483647425-4135-3-git-send-email-longman@redhat.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2017-02-04debugobjects: Track number of kmem_cache_alloc/kmem_cache_free doneWaiman Long
New debugfs stat counters are added to track the numbers of kmem_cache_alloc() and kmem_cache_free() function calls to get a sense of how the internal debug objects cache management is performing. Signed-off-by: Waiman Long <longman@redhat.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: "Du Changbin" <changbin.du@intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Jan Stancek <jstancek@redhat.com> Link: http://lkml.kernel.org/r/1483647425-4135-2-git-send-email-longman@redhat.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2017-02-04tick/broadcast: Reduce lock cacheline contentionWaiman Long
It was observed that on an Intel x86 system without the ARAT (Always running APIC timer) feature and with fairly large number of CPUs as well as CPUs coming in and out of intel_idle frequently, the lock contention on the tick_broadcast_lock can become significant. To reduce contention, the lock is put into its own cacheline and all the cpumask_var_t variables are put into the __read_mostly section. Running the SP benchmark of the NAS Parallel Benchmarks on a 4-socket 16-core 32-thread Nehalam system, the performance number improved from 3353.94 Mop/s to 3469.31 Mop/s when this patch was applied on a 4.9.6 kernel. This is a 3.4% improvement. Signed-off-by: Waiman Long <longman@redhat.com> Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Link: http://lkml.kernel.org/r/1485799063-20857-1-git-send-email-longman@redhat.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2017-02-04x86/cpufeature: Enable RING3MWAIT for Knights LandingGrzegorz Andrejczuk
Enable ring 3 MONITOR/MWAIT for Intel Xeon Phi x200 codenamed Knights Landing. Presence of this feature cannot be detected automatically (by reading any other MSR) therefore it is required to explicitly check for the family and model of the CPU before attempting to enable it. Signed-off-by: Grzegorz Andrejczuk <grzegorz.andrejczuk@intel.com> Cc: Piotr.Luc@intel.com Cc: dave.hansen@linux.intel.com Link: http://lkml.kernel.org/r/1484918557-15481-5-git-send-email-grzegorz.andrejczuk@intel.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2017-02-04x86/cpufeature: Add RING3MWAIT to CPU featuresGrzegorz Andrejczuk
Add software-defined CPUID bit for the non-architectural ring 3 MONITOR/MWAIT feature. Signed-off-by: Grzegorz Andrejczuk <grzegorz.andrejczuk@intel.com> Cc: Piotr.Luc@intel.com Cc: dave.hansen@linux.intel.com Link: http://lkml.kernel.org/r/1484918557-15481-4-git-send-email-grzegorz.andrejczuk@intel.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2017-02-04x86/elf: Add HWCAP2 to expose ring 3 MONITOR/MWAITGrzegorz Andrejczuk
Introduce ELF_HWCAP2 variable for x86 and reserve its bit 0 to expose the ring 3 MONITOR/MWAIT. HWCAP variables contain bitmasks which can be used by userspace applications to detect which instruction sets are supported by CPU. On x86 architecture information about CPU capabilities can be checked via CPUID instructions, unfortunately presence of ring 3 MONITOR/MWAIT feature cannot be checked this way. ELF_HWCAP cannot be used as well, because on x86 it is set to CPUID[1].EDX which means that all bits are reserved there. HWCAP2 approach was chosen because it reuses existing solution present in other architectures, so only minor modifications are required to the kernel and userspace applications. When ELF_HWCAP2 is defined kernel maps it to AT_HWCAP2 during the start of the application. This way the ring 3 MONITOR/MWAIT feature can be detected using getauxval() API in a simple and fast manner. ELF_HWCAP2 type is u32 to be consistent with x86 ELF_HWCAP type. Signed-off-by: Grzegorz Andrejczuk <grzegorz.andrejczuk@intel.com> Cc: Piotr.Luc@intel.com Cc: dave.hansen@linux.intel.com Link: http://lkml.kernel.org/r/1484918557-15481-3-git-send-email-grzegorz.andrejczuk@intel.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2017-02-04x86/msr: Add MSR_MISC_FEATURE_ENABLES and RING3MWAIT bitGrzegorz Andrejczuk
Define new MSR MISC_FEATURE_ENABLES (0x140). On supported CPUs if bit 1 of this MSR is set, then calling MONITOR and MWAIT instructions outside of ring 0 will not cause invalid-opcode exception. The MSR MISC_FEATURE_ENABLES is not yet documented in the SDM. Here is the relevant documentation: Hex Dec Name Scope 140H 320 MISC_FEATURE_ENABLES Thread 0 Reserved 1 If set to 1, the MONITOR and MWAIT instructions do not cause invalid-opcode exceptions when executed with CPL > 0 or in virtual-8086 mode. If MWAIT is executed when CPL > 0 or in virtual-8086 mode, and if EAX indicates a C-state other than C0 or C1, the instruction operates as if EAX indicated the C-state C1. 63:2 Reserved Signed-off-by: Grzegorz Andrejczuk <grzegorz.andrejczuk@intel.com> Cc: Piotr.Luc@intel.com Cc: dave.hansen@linux.intel.com Link: http://lkml.kernel.org/r/1484918557-15481-2-git-send-email-grzegorz.andrejczuk@intel.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2017-02-03Merge tag 'scsi-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fix from James Bottomley: "A single fix this time: a fix for a virtqueue removal bug which only appears to affect S390, but which results in the queue hanging forever thus causing the machine to fail shutdown" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: virtio_scsi: Reject commands when virtqueue is broken
2017-02-04cpufreq: Fix typos in commentsViresh Kumar
- s/freqnency/frequency/ - s/accomodating/accommodating/ Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-02-04PM / runtime: Avoid false-positive warnings from might_sleep_if()Rafael J. Wysocki
The might_sleep_if() assertions in __pm_runtime_idle(), __pm_runtime_suspend() and __pm_runtime_resume() may generate false-positive warnings in some situations. For example, that happens if a nested pm_runtime_get_sync()/pm_runtime_put() pair is executed with disabled interrupts within an outer pm_runtime_get_sync()/pm_runtime_put() section for the same device. [Generally, pm_runtime_get_sync() may sleep, so it should not be called with disabled interrupts, but in this particular case the previous pm_runtime_get_sync() guarantees that the device will not be suspended, so the inner pm_runtime_get_sync() will return immediately after incrementing the device's usage counter.] That started to happen in the i915 driver in 4.10-rc, leading to the following splat: BUG: sleeping function called from invalid context at drivers/base/power/runtime.c:1032 in_atomic(): 1, irqs_disabled(): 0, pid: 1500, name: Xorg 1 lock held by Xorg/1500: #0: (&dev->struct_mutex){+.+.+.}, at: [<ffffffffa0680c13>] i915_mutex_lock_interruptible+0x43/0x140 [i915] CPU: 0 PID: 1500 Comm: Xorg Not tainted Call Trace: dump_stack+0x85/0xc2 ___might_sleep+0x196/0x260 __might_sleep+0x53/0xb0 __pm_runtime_resume+0x7a/0x90 intel_runtime_pm_get+0x25/0x90 [i915] aliasing_gtt_bind_vma+0xaa/0xf0 [i915] i915_vma_bind+0xaf/0x1e0 [i915] i915_gem_execbuffer_relocate_entry+0x513/0x6f0 [i915] i915_gem_execbuffer_relocate_vma.isra.34+0x188/0x250 [i915] ? trace_hardirqs_on+0xd/0x10 ? i915_gem_execbuffer_reserve_vma.isra.31+0x152/0x1f0 [i915] ? i915_gem_execbuffer_reserve.isra.32+0x372/0x3a0 [i915] i915_gem_do_execbuffer.isra.38+0xa70/0x1a40 [i915] ? __might_fault+0x4e/0xb0 i915_gem_execbuffer2+0xc5/0x260 [i915] ? __might_fault+0x4e/0xb0 drm_ioctl+0x206/0x450 [drm] ? i915_gem_execbuffer+0x340/0x340 [i915] ? __fget+0x5/0x200 do_vfs_ioctl+0x91/0x6f0 ? __fget+0x111/0x200 ? __fget+0x5/0x200 SyS_ioctl+0x79/0x90 entry_SYSCALL_64_fastpath+0x23/0xc6 even though the code triggering it is correct. Unfortunately, the might_sleep_if() assertions in question are too coarse-grained to cover such cases correctly, so make them a bit less sensitive in order to avoid the false-positives. Reported-and-tested-by: Sedat Dilek <sedat.dilek@gmail.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-02-03Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhostLinus Torvalds
Pull virtio/vhost fixes from Michael S. Tsirkin: "Last minute fixes: - ARM DMA fix revert - vhost endian-ness fix - MAINTAINERS: email address change for Amit" * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: MAINTAINERS: update email address for Amit Shah vhost: fix initialization for vq->is_le Revert "vring: Force use of DMA API for ARM-based systems with legacy devices"
2017-02-03Merge tag 'vfio-v4.10-rc7' of git://github.com/awilliam/linux-vfioLinus Torvalds
Pull VFIO fix from Alex Williamson: "Fix an error path in SPAPR IOMMU backend (Alexey Kardashevskiy)" * tag 'vfio-v4.10-rc7' of git://github.com/awilliam/linux-vfio: vfio/spapr: Fix missing mutex unlock when creating a window
2017-02-04cpufreq: intel_pstate: Disable energy efficiency optimizationSrinivas Pandruvada
Some Kabylake desktop processors may not reach max turbo when running in HWP mode, even if running under sustained 100% utilization. This occurs when the HWP.EPP (Energy Performance Preference) is set to "balance_power" (0x80) -- the default on most systems. It occurs because the platform BIOS may erroneously enable an energy-efficiency setting -- MSR_IA32_POWER_CTL BIT-EE, which is not recommended to be enabled on this SKU. On the failing systems, this BIOS issue was not discovered when the desktop motherboard was tested with Windows, because the BIOS also neglects to provide the ACPI/CPPC table, that Windows requires to enable HWP, and so Windows runs in legacy P-state mode, where this setting has no effect. Linux' intel_pstate driver does not require ACPI/CPPC to enable HWP, and so it runs in HWP mode, exposing this incorrect BIOS configuration. There are several ways to address this problem. First, Linux can also run in legacy P-state mode on this system. As intel_pstate is how Linux enables HWP, booting with "intel_pstate=disable" will run in acpi-cpufreq/ondemand legacy p-state mode. Or second, the "performance" governor can be used with intel_pstate, which will modify HWP.EPP to 0. Or third, starting in 4.10, the /sys/devices/system/cpu/cpufreq/policy*/energy_performance_preference attribute in can be updated from "balance_power" to "performance". Or fourth, apply this patch, which fixes the erroneous setting of MSR_IA32_POWER_CTL BIT_EE on this model, allowing the default configuration to function as designed. Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Reviewed-by: Len Brown <len.brown@intel.com> Cc: 4.6+ <stable@vger.kernel.org> # 4.6+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-02-04cpufreq: intel_pstate: Calculate guaranteed performance for HWPSrinivas Pandruvada
When HWP is active, turbo activation ratio is not used to calculate max non turbo ratio. But on these systems the max non turbo ratio is decided by config TDP settings. This change removes usage of MSR_TURBO_ACTIVATION_RATIO for HWP systems, instead directly use TDP ratios, when more than one TDPs are available. Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-02-04cpufreq: intel_pstate: Make HWP limits compatible with legacySrinivas Pandruvada
Under HWP the performance limits are calculated using max_perf_pct and min_perf_pct using possible performance, not available performance. The available performance can be reduced by no_turbo setting. To make compatible with legacy mode, use max/min performance percentage with respect to available performance. Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-02-04cpufreq: intel_pstate: Lower frequency than expected under no_turboSrinivas Pandruvada
When turbo is not disabled by BIOS, but user disabled from intel P-State sysfs and changes max/min using cpufreq sysfs, the resultant frequency is lower than what user requested. The reason for this, when the perf limits are calculated in set_policy() callback, they are with reference to max cpu frequency (turbo frequency ), but when enforced in the intel_pstate_get_min_max() they are with reference to max available performance as documented in the intel_pstate documentation (in this case max non turbo P-State). This needs similar change as done in intel_cpufreq_verify_policy() for passive mode. Set policy->cpuinfo.max_freq based on the turbo status. Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-02-04cpufreq: intel_pstate: Operation mode control from sysfsRafael J. Wysocki
Make it possible to change the operation mode of intel_pstate with the help of a new sysfs attribute called "status". There are three possible configurations that can be selected using this attribute: "off" - The driver is not in use at this time. "active" - The driver works as a P-state governor (default). "passive" - The driver works as a regular cpufreq one and collaborates with the generic cpufreq governors (it sets P-states as requested by those governors). [This is the same mode the driver can be started in by passing intel_pstate=passive in the kernel command line.] The current setting is returned by reads from this attribute. Writing one of the above strings to it changes the operation mode as indicated by that string, if possible. If HW-managed P-states (HWP) feature is enabled, it is not possible to change the driver's operation mode and attempts to write to this attribute will fail. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-02-04cpufreq: intel_pstate: Expose global sysfs attributes upfrontRafael J. Wysocki
Expose the intel_pstate's global sysfs attributes before registering the driver to prepare for the addition of an attribute that also will have to work if the driver is not registered. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-02-04cpufreq: Remove CPUFREQ_START notifier eventViresh Kumar
Its not used anymore, remove it. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-02-04ACPI: processor_perflib: Simplify code and stop using CPUFREQ_STARTViresh Kumar
acpi_processor_ppc_notifier() can live without using CPUFREQ_START (which is gonna be removed soon), as it is only used while setting ignore_ppc to 0. This can be done with the help of "ignore_ppc < 0" check alone. The notifier function anyway ignores all events except CPUFREQ_ADJUST and dropping CPUFREQ_START wouldn't harm at all. Once CPUFREQ_START event is removed from the cpufreq core, acpi_processor_ppc_notifier() will get called only for CPUFREQ_NOTIFY or CPUFREQ_ADJUST event. Drop the return statement from the first if block to make sure we don't ignore any such events. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-02-03cpufreq: powernv: Add boost files to export ultra-turbo frequenciesShilpasri G Bhat
In P8+, Workload Optimized Frequency(WOF) provides the capability to boost the cpu frequency based on the utilization of the other cpus running in the chip. The On-Chip-Controller(OCC) firmware will control the achievability of these frequencies depending on the power headroom available in the chip. Currently the ultra-turbo frequencies provided by this feature are exported along with the turbo and sub-turbo frequencies as scaling_available_frequencies. This patch will export the ultra-turbo frequencies separately as scaling_boost_frequencies in WOF enabled systems. This patch will add the boost sysfs file which can be used to disable/enable ultra-turbo frequencies. Signed-off-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com> Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-02-03cpufreq: Documentation: Updates based on current codeViresh Kumar
The cpufreq core has gone though lots of updates in recent times, but on many occasions the documentation wasn't updated along with the code. This patch tries to catchup the documentation with the code. Also add Rafael and Viresh as the contributors to the documentation. Based on a patch from Claudio Scordino. Signed-off-by: Claudio Scordino <claudio@evidence.eu.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-02-03cpufreq: Documentation: Minor reformattingViresh Kumar
This patch doesn't change the content of the documentation, but rather reformat it to make it more readable. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-02-03cpufreq: Remove CONFIG_CPU_FREQ_STAT_DETAILS config optionViresh Kumar
This doesn't have any benefit apart from saving a small amount of memory when it is disabled. The ifdef hackery in the code makes it dirty unnecessarily. Clean it up by removing the Kconfig option completely. Few defconfigs are also updated and CONFIG_CPU_FREQ_STAT_DETAILS is replaced with CONFIG_CPU_FREQ_STAT now in them, as users wanted stats to be enabled. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Gregory CLEMENT <gregory.clement@free-electrons.com> Reviewed-by: Chanwoo Choi <cw00.choi@samsung.com> Acked-by: Krzysztof Kozlowski <krzk@kernel.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-02-03cpufreq: Remove policy create/remove notifiersViresh Kumar
Those were added by: commit fcd7af917abb ("cpufreq: stats: handle cpufreq_unregister_driver() and suspend/resume properly") but aren't used anymore since: commit 1aefc75b2449 ("cpufreq: stats: Make the stats code non-modular"). Remove them. Also remove the redundant parameter to the respective routines. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-02-03Merge branch 'akpm' (patches from Andrew)Linus Torvalds
Merge fixes from Andrew Morton: "8 fixes" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: mm, fs: check for fatal signals in do_generic_file_read() fs: break out of iomap_file_buffered_write on fatal signals base/memory, hotplug: fix a kernel oops in show_valid_zones() mm/memory_hotplug.c: check start_pfn in test_pages_in_a_zone() jump label: pass kbuild_cflags when checking for asm goto support shmem: fix sleeping from atomic context kasan: respect /proc/sys/kernel/traceoff_on_warning zswap: disable changing params if init fails