summaryrefslogtreecommitdiff
path: root/include/linux
AgeCommit message (Collapse)Author
2017-02-01efi/x86: Move the EFI BGRT init code to early init codeDave Young
Before invoking the arch specific handler, efi_mem_reserve() reserves the given memory region through memblock. efi_bgrt_init() will call efi_mem_reserve() after mm_init(), at which time memblock is dead and should not be used anymore. The EFI BGRT code depends on ACPI initialization to get the BGRT ACPI table, so move parsing of the BGRT table to ACPI early boot code to ensure that efi_mem_reserve() in EFI BGRT code still use memblock safely. Tested-by: Bhupesh Sharma <bhsharma@redhat.com> Signed-off-by: Dave Young <dyoung@redhat.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Len Brown <lenb@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Matt Fleming <matt@codeblueprint.co.uk> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rafael J. Wysocki <rjw@rjwysocki.net> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-acpi@vger.kernel.org Cc: linux-efi@vger.kernel.org Link: http://lkml.kernel.org/r/1485868902-20401-9-git-send-email-ard.biesheuvel@linaro.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-02-01efi: Use typed function pointers for the runtime services tableArd Biesheuvel
Instead of using void pointers, and casting them to correctly typed function pointers upon use, declare the runtime services pointers as function pointers using their respective prototypes, for which typedefs are already available. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Reviewed-by: Matt Fleming <matt@codeblueprint.co.uk> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-efi@vger.kernel.org Link: http://lkml.kernel.org/r/1485868902-20401-8-git-send-email-ard.biesheuvel@linaro.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-02-01efi: Introduce the EFI_MEM_ATTR bit and set it from the memory attributes tableSai Praneeth
UEFI v2.6 introduces a configuration table called EFI_MEMORY_ATTRIBUTES_TABLE which provides additional information about EFI runtime regions. Currently this table describes memory protections that may be applied to the EFI Runtime code and data regions by the kernel. Allocate a EFI_XXX bit to keep track of whether this feature is published by firmware or not. Signed-off-by: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com> Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Reviewed-by: Lee, Chun-Yi <jlee@suse.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Shankar <ravi.v.shankar@intel.com> Cc: Ricardo Neri <ricardo.neri@intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-efi@vger.kernel.org Link: http://lkml.kernel.org/r/1485868902-20401-5-git-send-email-ard.biesheuvel@linaro.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-02-01x86/efi: Deduplicate efi_char16_printk()Lukas Wunner
Eliminate the separate 32-bit and 64x- bit code paths by way of the shiny new efi_call_proto() macro. No functional change intended. Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-efi@vger.kernel.org Link: http://lkml.kernel.org/r/1485868902-20401-3-git-send-email-ard.biesheuvel@linaro.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-02-01perf/x86/intel/uncore: Make package handling more robustThomas Gleixner
The package management code in uncore relies on package mapping being available before a CPU is started. This changed with: 9d85eb9119f4 ("x86/smpboot: Make logical package management more robust") because the ACPI/BIOS information turned out to be unreliable, but that left uncore in broken state. This was not noticed because on a regular boot all CPUs are online before uncore is initialized. Move the allocation to the CPU online callback and simplify the hotplug handling. At this point the package mapping is established and correct. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sebastian Siewior <bigeasy@linutronix.de> Cc: Stephane Eranian <eranian@google.com> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: Yasuaki Ishimatsu <yasu.isimatu@gmail.com> Fixes: 9d85eb9119f4 ("x86/smpboot: Make logical package management more robust") Link: http://lkml.kernel.org/r/20170131230141.377156255@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-02-01perf/x86/intel/rapl: Make package handling more robustThomas Gleixner
The package management code in RAPL relies on package mapping being available before a CPU is started. This changed with: 9d85eb9119f4 ("x86/smpboot: Make logical package management more robust") because the ACPI/BIOS information turned out to be unreliable, but that left RAPL in broken state. This was not noticed because on a regular boot all CPUs are online before RAPL is initialized. A possible fix would be to reintroduce the mess which allocates a package data structure in CPU prepare and when it turns out to already exist in starting throw it away later in the CPU online callback. But that's a horrible hack and not required at all because RAPL becomes functional for perf only in the CPU online callback. That's correct because user space is not yet informed about the CPU being onlined, so nothing caan rely on RAPL being available on that particular CPU. Move the allocation to the CPU online callback and simplify the hotplug handling. At this point the package mapping is established and correct. This also adds a missing check for available package data in the event_init() function. Reported-by: Yasuaki Ishimatsu <yasu.isimatu@gmail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sebastian Siewior <bigeasy@linutronix.de> Cc: Stephane Eranian <eranian@google.com> Cc: Vince Weaver <vincent.weaver@maine.edu> Fixes: 9d85eb9119f4 ("x86/smpboot: Make logical package management more robust") Link: http://lkml.kernel.org/r/20170131230141.212593966@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-01-31Merge branch 'for-4.10-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu Pull percpu fix from Tejun Heo: "Douglas found and fixed a ref leak bug in percpu_ref_tryget[_live](). The bug is caused by storing the return value of atomic_long_inc_not_zero() into an int temp variable before returning it as a bool. The interim cast to int loses the upper bits and can lead to false negatives. As percpu_ref uses a high bit to mark a draining counter, this can happen relatively easily. Fixed by using bool for the temp variable" * 'for-4.10-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu: percpu-refcount: fix reference leak during percpu-atomic transition
2017-01-31block: move internal_tag to same cache line as tagJens Axboe
Since we removed cmd_type, we now have a hole in the struct. Move the internal_tag member to the same cacheline as tag, since we use them at the same time. This doesn't fix the hole, just moves it elsewhere. Signed-off-by: Jens Axboe <axboe@fb.com>
2017-01-31block: fold cmd_type into the REQ_OP_ spaceChristoph Hellwig
Instead of keeping two levels of indirection for requests types, fold it all into the operations. The little caveat here is that previously cmd_type only applied to struct request, while the request and bio op fields were set to plain REQ_OP_READ/WRITE even for passthrough operations. Instead this patch adds new REQ_OP_* for SCSI passthrough and driver private requests, althought it has to add two for each so that we can communicate the data in/out nature of the request. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>
2017-01-31ide: don't abuse cmd_typeChristoph Hellwig
Currently the legacy ide driver defines several request types of it's own, which is in the way of removing that field entirely. Instead add a type field to struct ide_request and use that to distinguish the different types of IDE-internal requests. It's a bit of a mess, but so is the surrounding code.. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Jens Axboe <axboe@fb.com>
2017-01-31block: introduce blk_rq_is_passthroughChristoph Hellwig
This can be used to check for fs vs non-fs requests and basically removes all knowledge of BLOCK_PC specific from the block layer, as well as preparing for removing the cmd_type field in struct request. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>
2017-01-31fscache: Fix dead object requeueDavid Howells
Under some circumstances, an fscache object can become queued such that it fscache_object_work_func() can be called once the object is in the OBJECT_DEAD state. This results in the kernel oopsing when it tries to invoke the handler for the state (which is hard coded to 0x2). The way this comes about is something like the following: (1) The object dispatcher is processing a work state for an object. This is done in workqueue context. (2) An out-of-band event comes in that isn't masked, causing the object to be queued, say EV_KILL. (3) The object dispatcher finishes processing the current work state on that object and then sees there's another event to process, so, without returning to the workqueue core, it processes that event too. It then follows the chain of events that initiates until we reach OBJECT_DEAD without going through a wait state (such as WAIT_FOR_CLEARANCE). At this point, object->events may be 0, object->event_mask will be 0 and oob_event_mask will be 0. (4) The object dispatcher returns to the workqueue processor, and in due course, this sees that the object's work item is still queued and invokes it again. (5) The current state is a work state (OBJECT_DEAD), so the dispatcher jumps to it - resulting in an OOPS. When I'm seeing this, the work state in (1) appears to have been either LOOK_UP_OBJECT or CREATE_OBJECT (object->oob_table is fscache_osm_lookup_oob). The window for (2) is very small: (A) object->event_mask is cleared whilst the event dispatch process is underway - though there's no memory barrier to force this to the top of the function. The window, therefore is from the time the object was selected by the workqueue processor and made requeueable to the time the mask was cleared. (B) fscache_raise_event() will only queue the object if it manages to set the event bit and the corresponding event_mask bit was set. The enqueuement is then deferred slightly whilst we get a ref on the object and get the per-CPU variable for workqueue congestion. This slight deferral slightly increases the probability by allowing extra time for the workqueue to make the item requeueable. Handle this by giving the dead state a processor function and checking the for the dead state address rather than seeing if the processor function is address 0x2. The dead state processor function can then set a flag to indicate that it's occurred and give a warning if it occurs more than once per object. If this race occurs, an oops similar to the following is seen (note the RIP value): BUG: unable to handle kernel NULL pointer dereference at 0000000000000002 IP: [<0000000000000002>] 0x1 PGD 0 Oops: 0010 [#1] SMP Modules linked in: ... CPU: 17 PID: 16077 Comm: kworker/u48:9 Not tainted 3.10.0-327.18.2.el7.x86_64 #1 Hardware name: HP ProLiant DL380 Gen9/ProLiant DL380 Gen9, BIOS P89 12/27/2015 Workqueue: fscache_object fscache_object_work_func [fscache] task: ffff880302b63980 ti: ffff880717544000 task.ti: ffff880717544000 RIP: 0010:[<0000000000000002>] [<0000000000000002>] 0x1 RSP: 0018:ffff880717547df8 EFLAGS: 00010202 RAX: ffffffffa0368640 RBX: ffff880edf7a4480 RCX: dead000000200200 RDX: 0000000000000002 RSI: 00000000ffffffff RDI: ffff880edf7a4480 RBP: ffff880717547e18 R08: 0000000000000000 R09: dfc40a25cb3a4510 R10: dfc40a25cb3a4510 R11: 0000000000000400 R12: 0000000000000000 R13: ffff880edf7a4510 R14: ffff8817f6153400 R15: 0000000000000600 FS: 0000000000000000(0000) GS:ffff88181f420000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000002 CR3: 000000000194a000 CR4: 00000000001407e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Stack: ffffffffa0363695 ffff880edf7a4510 ffff88093f16f900 ffff8817faa4ec00 ffff880717547e60 ffffffff8109d5db 00000000faa4ec18 0000000000000000 ffff8817faa4ec18 ffff88093f16f930 ffff880302b63980 ffff88093f16f900 Call Trace: [<ffffffffa0363695>] ? fscache_object_work_func+0xa5/0x200 [fscache] [<ffffffff8109d5db>] process_one_work+0x17b/0x470 [<ffffffff8109e4ac>] worker_thread+0x21c/0x400 [<ffffffff8109e290>] ? rescuer_thread+0x400/0x400 [<ffffffff810a5acf>] kthread+0xcf/0xe0 [<ffffffff810a5a00>] ? kthread_create_on_node+0x140/0x140 [<ffffffff816460d8>] ret_from_fork+0x58/0x90 [<ffffffff810a5a00>] ? kthread_create_on_node+0x140/0x140 Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Jeremy McNicoll <jeremymc@redhat.com> Tested-by: Frank Sorenson <sorenson@redhat.com> Tested-by: Benjamin Coddington <bcodding@redhat.com> Reviewed-by: Benjamin Coddington <bcodding@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2017-01-31lightnvm: allow targets to use sysfsJavier González
In order to register through the sysfs interface, a driver needs to know its kobject. On a disk structure, this happens when the partition information is added (device_add_disk), which for lightnvm takes place after the target has been initialized. This means that on target initialization, the kboject has not been created yet. This patch adds a target function to let targets initialize their own kboject as a child of the disk kobject. Signed-off-by: Javier González <javier@cnexlabs.com> Added exit typedef and passed gendisk instead of void pointer for exit. Signed-off-by: Matias Bjørling <matias@cnexlabs.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2017-01-31lightnvm: Add CRC read errorJavier González
Let the host differentiate between a read error and a CRC check error on the device side. Signed-off-by: Javier González <javier@cnexlabs.com> Signed-off-by: Matias Bjørling <matias@cnexlabs.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2017-01-31lightnvm: use end_io callback instead of instanceMatias Bjørling
When the lightnvm core had the "gennvm" layer between the device and the target, there was a need for the core to be able to figure out which target it should send an end_io callback to. Leading to a "double" end_io, first for the media manager instance, and then for the target instance. Now that core and gennvm is merged, there is no longer a need for this, and a single end_io callback will do. Signed-off-by: Matias Bjørling <matias@cnexlabs.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2017-01-31lightnvm: reduce number of nvm_id groups to oneMatias Bjørling
The number of configuration groups has been limited to one in current code, even if there is support for up to four. With the introduction of the open-channel SSD 1.3 specification, only a single group is exposed onwards. Reflect this in the nvm_id structure. Signed-off-by: Matias Bjørling <matias@cnexlabs.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2017-01-31lightnvm: cleanup nvm transformation functionsMatias Bjørling
Going from target specific ppa addresses to device was accomplished by first converting target to generic ppa addresses and generic to device addresses. The conversion was either open-coded or used the built-in nvm_trans_* and nvm_map_* functions for conversion. Simplify the interface and cleanup the calls to provide clean functions that now either take a list of ppas or a nvm_rq, and is exposed through: void nvm_ppa_* - target to/from device with a list of PPAs, void nvm_rq_* - target to/from device with a nvm_rq. Signed-off-by: Matias Bjørling <matias@cnexlabs.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2017-01-31lightnvm: remove nvm_get_bb_tbl and nvm_set_bb_tblMatias Bjørling
Since the merge of gennvm and core, there is no longer a need for the device specific bad block functions. Signed-off-by: Matias Bjørling <matias@cnexlabs.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2017-01-31lightnvm: remove nvm_submit_ppa* functionsMatias Bjørling
The nvm_submit_ppa* functions are no longer needed after gennvm and core have been merged. Signed-off-by: Matias Bjørling <matias@cnexlabs.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2017-01-31lightnvm: collapse nvm_erase_ppa and nvm_erase_blkMatias Bjørling
After gennvm and core have been merged, there are no more callers to nvm_erase_ppa. Therefore collapse the device specific and target specific erase functions. Signed-off-by: Matias Bjørling <matias@cnexlabs.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2017-01-31lightnvm: merge gennvm with coreMatias Bjørling
For the first iteration of Open-Channel SSDs, it was anticipated that there could be various media managers on top of an open-channel SSD, such to allow vendors to plug in their own host-side FTLs, without the media manager in between. Now that an Open-Channel SSD is exposed as a traditional block device, there is no longer a need for this. Therefore lets merge the gennvm code with core and simplify the stack. Signed-off-by: Matias Bjørling <matias@cnexlabs.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2017-01-31iommu/vt-d: Fix some macros that are incorrectly specified in intel-iommuCQ Tang
Some of the macros are incorrect with wrong bit-shifts resulting in picking the incorrect invalidation granularity. Incorrect Source-ID in extended devtlb invalidation caused device side errors. To: Joerg Roedel <joro@8bytes.org> To: David Woodhouse <dwmw2@infradead.org> Cc: iommu@lists.linux-foundation.org Cc: linux-kernel@vger.kernel.org Cc: stable@vger.kernel.org Cc: CQ Tang <cq.tang@intel.com> Cc: Ashok Raj <ashok.raj@intel.com> Fixes: 2f26e0a9 ("iommu/vt-d: Add basic SVM PASID support") Signed-off-by: CQ Tang <cq.tang@intel.com> Signed-off-by: Ashok Raj <ashok.raj@intel.com> Tested-by: CQ Tang <cq.tang@intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2017-01-31Drivers: hv: vmbus: finally fix hv_need_to_signal_on_read()Dexuan Cui
Commit a389fcfd2cb5 ("Drivers: hv: vmbus: Fix signaling logic in hv_need_to_signal_on_read()") added the proper mb(), but removed the test "prev_write_sz < pending_sz" when making the signal decision. As a result, the guest can signal the host unnecessarily, and then the host can throttle the guest because the host thinks the guest is buggy or malicious; finally the user running stress test can perceive intermittent freeze of the guest. This patch brings back the test, and properly handles the in-place consumption APIs used by NetVSC (see get_next_pkt_raw(), put_pkt_raw() and commit_rd_index()). Fixes: a389fcfd2cb5 ("Drivers: hv: vmbus: Fix signaling logic in hv_need_to_signal_on_read()") Signed-off-by: Dexuan Cui <decui@microsoft.com> Reported-by: Rolf Neugebauer <rolf.neugebauer@docker.com> Tested-by: Rolf Neugebauer <rolf.neugebauer@docker.com> Cc: "K. Y. Srinivasan" <kys@microsoft.com> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Stephen Hemminger <sthemmin@microsoft.com> Cc: <stable@vger.kernel.org> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-01-31Merge ath-next from git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/ath.gitKalle Valo
ath.git patches for 4.11. Major changes: wcn36xx * convert to a proper QCOM_SMD driver (from the platform_driver interface) ath10k * VHT160 support * dump Copy Engine registers during firmware crash * search board file extension from SMBIOS wil6210 * add disable_ap_sme module parameter
2017-01-31PM / devfreq: Fix available_governor sysfsChanwoo Choi
The devfreq using passive governor is not able to change the governor. So, the user can not change the governor through 'available_governor' sysfs entry. Also, the devfreq which don't use the passive governor is not able to change to 'passive' governor on the fly. Fixes: 996133119f57 ("PM / devfreq: Add new passive governor") Cc: stable@vger.kernel.org Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com> Signed-off-by: MyungJoo Ham <myungjoo.ham@samsung.com>
2017-01-31Merge branch 'for-mingo' of ↵Ingo Molnar
git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into core/rcu Pull RCU changes from Paul E. McKenney: - Dynticks updates, consolidating open-coded counter accesses into a well-defined API - SRCU updates: Simplify algorithm, add formal verification - Documentation updates - Miscellaneous fixes - Torture-test updates Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-01-31dmaengine: Provide a wrapper for memcpy operationsBoris Brezillon
Almost all ->device_prep_dma_xx() methods have a wrapper defined in dmaengine.h. Add one for ->device_prep_dma_memcpy(). Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com> Signed-off-by: Vinod Koul <vinod.koul@intel.com>
2017-01-31cpuidle:powernv: Add helper function to populate powernv idle states.Gautham R. Shenoy
In the current code for powernv_add_idle_states, there is a lot of code duplication while initializing an idle state in powernv_states table. Add an inline helper function to populate the powernv_states[] table for a given idle state. Invoke this for populating the "Nap", "Fastsleep" and the stop states in powernv_add_idle_states. Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com> Acked-by: Balbir Singh <bsingharora@gmail.com> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-01-30net/mlx4_en: Pass user MTU value to Firmware at set port commandShaker Daibes
When starting the port, driver will inform Firmware about the actual MTU which does not include implicit headers, such as FCS or VLAN tags. Signed-off-by: Shaker Daibes <shakerd@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-30net/mlx4_en: Adding support of turning off link autonegotiation via ethtoolAriel Levkovich
This feature will allow the user to disable auto negotiation on the port for mlx4 devices while setting the speed is limited to 1GbE speeds. Other speeds will not be accepted in autoneg off mode. This functionality is permitted providing that the firmware is compatible with this feature. The above is determined by querying a new dedicated capability bit in the device. Signed-off-by: Ariel Levkovich <lariel@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-30[media] ir-rx51: port to rc-coreSean Young
This driver was written using lirc since rc-core did not support transmitter-only hardware at that time. Now that it does, port this driver. Compile tested only. Signed-off-by: Sean Young <sean@mess.org> Cc: Timo Kokkonen <timo.t.kokkonen@iki.fi> Cc: Ivaylo Dimitrov <ivo.g.dimitrov.75@gmail.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
2017-01-30Merge branch 'iommu/iommu-priv' of ↵Joerg Roedel
git://git.kernel.org/pub/scm/linux/kernel/git/will/linux into arm/core
2017-01-30Merge branch 'for-joerg/arm-smmu/updates' of ↵Joerg Roedel
git://git.kernel.org/pub/scm/linux/kernel/git/will/linux into arm/smmu
2017-01-30Merge branch 'iommu/guest-msi' of ↵Joerg Roedel
git://git.kernel.org/pub/scm/linux/kernel/git/will/linux into arm/core
2017-01-30irqdomain: Avoid activating interrupts more than onceMarc Zyngier
Since commit f3b0946d629c ("genirq/msi: Make sure PCI MSIs are activated early"), we can end-up activating a PCI/MSI twice (once at allocation time, and once at startup time). This is normally of no consequences, except that there is some HW out there that may misbehave if activate is used more than once (the GICv3 ITS, for example, uses the activate callback to issue the MAPVI command, and the architecture spec says that "If there is an existing mapping for the EventID-DeviceID combination, behavior is UNPREDICTABLE"). While this could be worked around in each individual driver, it may make more sense to tackle the issue at the core level. In order to avoid getting in that situation, let's have a per-interrupt flag to remember if we have already activated that interrupt or not. Fixes: f3b0946d629c ("genirq/msi: Make sure PCI MSIs are activated early") Reported-and-tested-by: Andre Przywara <andre.przywara@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/1484668848-24361-1-git-send-email-marc.zyngier@arm.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2017-01-30Merge tag 'v4.10-rc6' into develLinus Walleij
Linux 4.10-rc6 Resolved conflicts in: drivers/pinctrl/pinctrl-amd.c drivers/pinctrl/samsung/pinctrl-exynos.c
2017-01-30perf/core: Remove perf_cpu_context::unique_pmuDavid Carrillo-Cisneros
cpuctx->unique_pmu was originally introduced as a way to identify cpuctxs with shared pmus in order to avoid visiting the same cpuctx more than once in a for_each_pmu loop. cpuctx->unique_pmu == cpuctx->pmu in non-software task contexts since they have only one pmu per cpuctx. Since perf_pmu_sched_task() is only called in hw contexts, this patch replaces cpuctx->unique_pmu by cpuctx->pmu in it. The change above, together with the previous patch in this series, removed the remaining uses of cpuctx->unique_pmu, so we remove it altogether. Signed-off-by: David Carrillo-Cisneros <davidcc@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Mark Rutland <mark.rutland@arm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Borislav Petkov <bp@suse.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul Turner <pjt@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vikas Shivappa <vikas.shivappa@linux.intel.com> Cc: Vince Weaver <vince@deater.net> Cc: Vince Weaver <vincent.weaver@maine.edu> Link: http://lkml.kernel.org/r/20170118192454.58008-3-davidcc@google.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-01-30perf/core: Make cgroup switch visit only cpuctxs with cgroup eventsDavid Carrillo-Cisneros
This patch follows from a conversation in CQM/CMT's last series about speeding up the context switch for cgroup events: https://patchwork.kernel.org/patch/9478617/ This is a low-hanging fruit optimization. It replaces the iteration over the "pmus" list in cgroup switch by an iteration over a new list that contains only cpuctxs with at least one cgroup event. This is necessary because the number of PMUs have increased over the years e.g modern x86 server systems have well above 50 PMUs. The iteration over the full PMU list is unneccessary and can be costly in heavy cache contention scenarios. Below are some instrumentation measurements with 10, 50 and 90 percentiles of the total cost of context switch before and after this optimization for a simple array read/write microbenchark. Contention Level Nr events Before (us) After (us) Median L2 L3 types (10%, 50%, 90%) (10%, 50%, 90% Speedup -------------------------------------------------------------------------- Low Low 1 (1.72, 2.42, 5.85) (1.35, 1.64, 5.46) 29% High Low 1 (2.08, 4.56, 19.8) (1720, 2.20, 13.7) 51% High High 1 (2.86, 10.4, 12.7) (2.54, 4.32, 12.1) 58% Low Low 2 (1.98, 3.20, 6.89) (1.68, 2.41, 8.89) 24% High Low 2 (2.48, 5.28, 22.4) (2150, 3.69, 14.6) 30% High High 2 (3.32, 8.09, 13.9) (2.80, 5.15, 13.7) 36% where: 1 event type = cycles 2 event types = cycles,intel_cqm/llc_occupancy/ Contention L2 Low: workset < L2 cache size. High: " >> L2 " " . Contention L3 Low: workset of task on all sockets < L3 cache size. High: " " " " " " >> L3 " " . Median Speedup is (50%ile Before - 50%ile After) / 50%ile Before Unsurprisingly, the benefits of this optimization decrease with the number of cpuctxs with a cgroup events, yet, is never detrimental. Tested-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: David Carrillo-Cisneros <davidcc@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Mark Rutland <mark.rutland@arm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Borislav Petkov <bp@suse.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul Turner <pjt@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vikas Shivappa <vikas.shivappa@linux.intel.com> Cc: Vince Weaver <vince@deater.net> Cc: Vince Weaver <vincent.weaver@maine.edu> Link: http://lkml.kernel.org/r/20170118192454.58008-2-davidcc@google.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-01-30mtd: nand: fsmc: remove stale non-DT probe pathLinus Walleij
The FSMC driver has an execution path and a header file in <linux/mtd/fsmc.h> that serves to support passing in platform data through board files, albeit no upstream users of this mechanism exist. The header file also contains function headers for functions that do not exist in the kernel. Delete this and move the platform data struct, parsing and handling into the driver, assume we are using OF and make the driver depend on OF, remove the ifdefs making that optional. Cc: Viresh Kumar <viresh.kumar@linaro.org> Cc: Stefan Roese <sr@denx.de> Cc: Vipin Kumar <vipin.kumar@st.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Reviewed-by: Stefan Roese <sr@denx.de> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
2017-01-30mtd: nand: fix nand.h kernel-doc warningsRandy Dunlap
Fix kernel-doc warnings in <linux/mtd/nand.h>: ..//include/linux/mtd/nand.h:658: warning: No description found for parameter 'tCEH_min' ..//include/linux/mtd/nand.h:877: warning: No description found for parameter 'data_interface' Fixes: eee64b700e26 ("mtd: nand: Introduce nand_data_interface") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
2017-01-30Merge branch 'perf/urgent' into perf/core, to pick up fixesIngo Molnar
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-01-30Merge branch 'fortglx/4.11/time' of ↵Thomas Gleixner
https://git.linaro.org/people/john.stultz/linux into timers/core - Remove unused functions - Document udelay inaccuracy - Remove posix timer data from task struct when posix timers are off
2017-01-30PM / OPP: Add 'struct kref' to struct dev_pm_oppViresh Kumar
Add kref to struct dev_pm_opp for easier accounting of the OPPs. Note that the OPPs are freed under the opp_table->lock mutex only. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Reviewed-by: Stephen Boyd <sboyd@codeaurora.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-01-30PM / OPP: Return opp_table from dev_pm_opp_set_*() routinesViresh Kumar
Now that we have proper kernel reference infrastructure in place for OPP tables, use it to guarantee that the OPP table isn't freed while being used by the callers of dev_pm_opp_set_*() APIs. Make them all return the pointer to the OPP table after taking its reference and put the reference back with dev_pm_opp_put_*() APIs. Now that the OPP table wouldn't get freed while these routines are executing after dev_pm_opp_get_opp_table() is called, there is no need to take opp_table_lock. Drop them as well. Remove the rcu specific comments from these routines as they aren't relevant anymore. Note that prototypes of dev_pm_opp_{set|put}_regulators() were already updated by another patch. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Reviewed-by: Stephen Boyd <sboyd@codeaurora.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-01-30PM / OPP: Add 'struct kref' to OPP tableViresh Kumar
Add kref to struct opp_table for easier accounting of the OPP table. Note that the new routine dev_pm_opp_get_opp_table() takes the reference from under the opp_table_lock, which guarantees that the OPP table doesn't get freed unless dev_pm_opp_put_opp_table() is called for the OPP table. Two separate release mechanisms are added: locked and unlocked. In unlocked version the routines aren't required to take/drop opp_table_lock as the callers have already done that. This is required to avoid breaking git bisect, otherwise we may get lockdeps between commits. Once all the users of OPP table are updated the unlocked version shall be removed. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Reviewed-by: Stephen Boyd <sboyd@codeaurora.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-01-29net: add devm version of alloc_etherdev_mqs functionRafał Miłecki
This patch adds devm_alloc_etherdev_mqs function and devm_alloc_etherdev macro. These can be used for simpler netdev allocation without having to care about calling free_netdev. Thanks to this change drivers, their error paths and removal paths may get simpler by a bit. Signed-off-by: Rafał Miłecki <rafal@milecki.pl> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-29can: Fix kernel panic at security_sock_rcv_skbEric Dumazet
Zhang Yanmin reported crashes [1] and provided a patch adding a synchronize_rcu() call in can_rx_unregister() The main problem seems that the sockets themselves are not RCU protected. If CAN uses RCU for delivery, then sockets should be freed only after one RCU grace period. Recent kernels could use sock_set_flag(sk, SOCK_RCU_FREE), but let's ease stable backports with the following fix instead. [1] BUG: unable to handle kernel NULL pointer dereference at (null) IP: [<ffffffff81495e25>] selinux_socket_sock_rcv_skb+0x65/0x2a0 Call Trace: <IRQ> [<ffffffff81485d8c>] security_sock_rcv_skb+0x4c/0x60 [<ffffffff81d55771>] sk_filter+0x41/0x210 [<ffffffff81d12913>] sock_queue_rcv_skb+0x53/0x3a0 [<ffffffff81f0a2b3>] raw_rcv+0x2a3/0x3c0 [<ffffffff81f06eab>] can_rcv_filter+0x12b/0x370 [<ffffffff81f07af9>] can_receive+0xd9/0x120 [<ffffffff81f07beb>] can_rcv+0xab/0x100 [<ffffffff81d362ac>] __netif_receive_skb_core+0xd8c/0x11f0 [<ffffffff81d36734>] __netif_receive_skb+0x24/0xb0 [<ffffffff81d37f67>] process_backlog+0x127/0x280 [<ffffffff81d36f7b>] net_rx_action+0x33b/0x4f0 [<ffffffff810c88d4>] __do_softirq+0x184/0x440 [<ffffffff81f9e86c>] do_softirq_own_stack+0x1c/0x30 <EOI> [<ffffffff810c76fb>] do_softirq.part.18+0x3b/0x40 [<ffffffff810c8bed>] do_softirq+0x1d/0x20 [<ffffffff81d30085>] netif_rx_ni+0xe5/0x110 [<ffffffff8199cc87>] slcan_receive_buf+0x507/0x520 [<ffffffff8167ef7c>] flush_to_ldisc+0x21c/0x230 [<ffffffff810e3baf>] process_one_work+0x24f/0x670 [<ffffffff810e44ed>] worker_thread+0x9d/0x6f0 [<ffffffff810e4450>] ? rescuer_thread+0x480/0x480 [<ffffffff810ebafc>] kthread+0x12c/0x150 [<ffffffff81f9ccef>] ret_from_fork+0x3f/0x70 Reported-by: Zhang Yanmin <yanmin.zhang@intel.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Oliver Hartkopp <socketcan@hartkopp.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-29leds: class: Add new optional brightness_hw_changed attributeHans de Goede
Some LEDs may have their brightness level changed autonomously (outside of kernel control) by hardware / firmware. This commit adds support for an optional brightness_hw_changed attribute to signal such changes to userspace (if a driver can detect them): What: /sys/class/leds/<led>/brightness_hw_changed Date: January 2017 KernelVersion: 4.11 Description: Last hardware set brightness level for this LED. Some LEDs may be changed autonomously by hardware/firmware. Only LEDs where this happens and the driver can detect this, will have this file. This file supports poll() to detect when the hardware changes the brightness. Reading this file will return the last brightness level set by the hardware, this may be different from the current brightness. Drivers which want to support this, simply add LED_BRIGHT_HW_CHANGED to their flags field and call led_classdev_notify_brightness_hw_changed() with the hardware set brightness when they detect a hardware / firmware triggered brightness change. Signed-off-by: Hans de Goede <hdegoede@redhat.com> Acked-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Jacek Anaszewski <jacek.anaszewski@gmail.com>
2017-01-28Merge tag 'nfs-for-4.10-4' of git://git.linux-nfs.org/projects/trondmy/linux-nfsLinus Torvalds
Pull NFS client bugfixes from Trond Myklebust: "Stable patches: - NFSv4.1: Fix a deadlock in layoutget - NFSv4 must not bump sequence ids on NFS4ERR_MOVED errors - NFSv4 Fix a regression with OPEN EXCLUSIVE4 mode - Fix a memory leak when removing the SUNRPC module Bugfixes: - Fix a reference leak in _pnfs_return_layout" * tag 'nfs-for-4.10-4' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: pNFS: Fix a reference leak in _pnfs_return_layout nfs: Fix "Don't increment lock sequence ID after NFS4ERR_MOVED" SUNRPC: cleanup ida information when removing sunrpc module NFSv4.0: always send mode in SETATTR after EXCLUSIVE4 nfs: Don't increment lock sequence ID after NFS4ERR_MOVED NFSv4.1: Fix a deadlock in layoutget
2017-01-28Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
Two trivial overlapping changes conflicts in MPLS and mlx5. Signed-off-by: David S. Miller <davem@davemloft.net>