linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2016-09-22	net: ethernet: mediatek: fix missing changes merged for conflicts ↵	Sean Wang
	overlapping commits add the missing commits about 1) Commit d3bd1ce4db8e843dce421e2f8f123e5251a9c7d3 ("remove redundant free_irq for devm_request_ir allocated irq") 2) Commit 7c6b0d76fa02213393815e3b6d5e4a415bf3f0e2 ("fix logic unbalance between probe and remove") during merge for conflicts overlapping commits by Commit b20b378d49926b82c0a131492fa8842156e0e8a9 ("Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net") Signed-off-by: Sean Wang <sean.wang@mediatek.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-22	net/mlx4_core: Fix to clean devlink resources	Kamal Heib
	This patch cleans devlink resources by calling devlink_port_unregister() to avoid the following issues: - Kernel panic when triggering reset flow. - Memory leak due to unfreed resources in mlx4_init_port_info(). Fixes: 09d4d087cd48 ("mlx4: Implement devlink interface") Signed-off-by: Kamal Heib <kamalh@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-22	cxgb4: add support for drop and redirect actions	Rahul Lakkireddy
	Add support for dropping matched packets in hardware. Also add support for re-directing matched packets to a specified port in hardware. Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-22	cxgb4: add support for offloading u32 filters	Rahul Lakkireddy
	Add support for offloading u32 filter onto hardware. Links are stored in a jump table to perform necessary jumps to match TCP/UDP header. When inserting rules in the linked bucket, the TCP/UDP match fields in the corresponding entry of the jump table are appended to the filter rule before insertion. If a link is deleted, then all corresponding filters associated with the link are also deleted. Also enable hardware tc offload as a supported feature. Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-22	cxgb4: add parser to translate u32 filters to internal spec	Rahul Lakkireddy
	Parse information sent by u32 into internal filter specification. Add support for parsing several fields in IPv4, IPv6, TCP, and UDP. Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-22	cxgb4: add common api support for configuring filters	Rahul Lakkireddy
	Enable filters for non-offload configuration and add common api support for setting and deleting filters in LE-TCAM region of the hardware. IPv4 filters occupy one slot. IPv6 filters occupy 4 slots and must be on a 4-slot boundary. IPv4 filters can not occupy a slot belonging to IPv6 and the vice-versa is also true. Filters are set and deleted asynchronously. Use completion to wait for reply from firmware in order to allow for synchronization if needed. Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-22	cxgb4: move common filter code to separate file	Rahul Lakkireddy
	Move common filter code to separate files. Also fix the following checkpatch checks. CHECK: Comparison to NULL could be written "!f->l2t" + if (f->l2t == NULL) { CHECK: spaces preferred around that '/' (ctx:VxV) + fwr->len16_pkd = htonl(FW_WR_LEN16_V(sizeof(*fwr)/16)); Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-21	net/mlx4_core: Fix deadlock when switching between polling and event fw commands	Jack Morgenstein
	When switching from polling-based fw commands to event-based fw commands, there is a race condition which could cause a fw command in another task to hang: that task will keep waiting for the polling sempahore, but may never be able to acquire it. This is due to mlx4_cmd_use_events, which "down"s the sempahore back to 0. During driver initialization, this is not a problem, since no other tasks which invoke FW commands are active. However, there is a problem if the driver switches to polling mode and then back to event mode during normal operation. The "test_interrupts" feature does exactly that. Running "ethtool -t <eth device> offline" causes the PF driver to temporarily switch to polling mode, and then back to event mode. (Note that for VF drivers, such switching is not performed). Fix this by adding a read-write semaphore for protection when switching between modes. Fixes: 225c7b1feef1 ("IB/mlx4: Add a driver Mellanox ConnectX InfiniBand adapters") Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-21	net/mlx4_core: Use RCU to perform radix tree lookup for SRQ	Leon Romanovsky
	Radix tree lookup can be performed without locking. Fixes: 225c7b1feef1 ("IB/mlx4: Add a driver Mellanox ConnectX InfiniBand adapters") Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Suggested-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-21	net/mlx4_en: Fix wrong indentation	Kamal Heib
	Use tabs instead of spaces before if statement, no functional change. Fixes: e7c1c2c46201 ("mlx4_en: Added self diagnostics test implementation") Signed-off-by: Kamal Heib <kamalh@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-21	net/mlx4_en: Add branch prediction hints in RX data-path	Tariq Toukan
	Add likely/unlikely hints to improve branch predictions in the RX data-path. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-21	Merge tag 'wireless-drivers-for-davem-2016-09-20' of ↵	David S. Miller
	git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers Kalle Valo says: ==================== wireless-drivers fixes for 4.8 iwlwifi * fix to prevent firmware crash when sending off-channel frames ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-21	nfp: bpf: add offload of TC direct action mode	Jakub Kicinski
	Add offload of TC in direct action mode. We just need to provide appropriate checks in the verifier and a new outro block to translate the exit codes to what data path expects Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-21	nfp: bpf: add support for legacy redirect action	Jakub Kicinski
	Data path has redirect support so expressing redirect to the port frame came from is a trivial matter of setting the right result code. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-21	nfp: bpf: add packet marking support	Jakub Kicinski
	Add missing ABI defines and eBPF instructions to allow mark to be passed on and extend prepend parsing on the RX path to pick it up from packet metadata. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-21	nfp: bpf: allow offloaded filters to update stats	Jakub Kicinski
	Periodically poll stats and call into offloaded actions to update them. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-21	nfp: bpf: add hardware bpf offload	Jakub Kicinski
	Add hardware bpf offload on our smart NICs. Detect if capable firmware is loaded and use it to load the code JITed with just added translator onto programmable engines. This commit only supports offloading cls_bpf in legacy mode (non-direct action). Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-21	nfp: add BPF to NFP code translator	Jakub Kicinski
	Add translator for JITing eBPF to operations which can be executed on NFP's programmable engines. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-21	nvdimm: remove duplicate nd_mapping declaration	Dave Jiang
	Signed-off-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2016-09-22	rtc: ac100: Add NULL checking for devm_kzalloc call	Axel Lin
	devm_kzalloc can return NULL, add NULL checking to prevent NULL pointer dereference. Signed-off-by: Axel Lin <axel.lin@ingics.com> Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
2016-09-22	rtc: ds1347: changed raw spi calls to register map calls	Raghavendra Ganiga
	This patch changes calls of spi read write calls to register map read and write calls in rtc ds1347 Signed-off-by: Raghavendra Chandra Ganiga <ravi23ganiga@gmail.com> Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
2016-09-21	i2c: mux: pca954x: retry updating the mux selection on failure	Peter Rosin
	The cached value of the last selected channel prevents retries on the next call, even on failure to update the selected channel. Fix that. Signed-off-by: Peter Rosin <peda@axentia.se> Signed-off-by: Wolfram Sang <wsa@the-dreams.de> Cc: stable@kernel.org
2016-09-21	i2c: octeon: Fix high-level controller status check	Jan Glauber
	In case the high-level controller (HLC) is used the status code is reported at a different location. Check that location after HLC write operations if the ready bit is not set and return an appropriate error code instead of always returning -EAGAIN. Signed-off-by: Jan Glauber <jglauber@cavium.com> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2016-09-21	i2c: octeon: Avoid sending STOP during recovery	Dmitry Bazhenov
	Due to a bug in the ThunderX I2C hardware sending STOP during a recovery attempt could lock up the hardware. To work around this problem do not send STOP at the beginning of the recovery but use the override registers to bring the TWSI including the high-level controller out of the bad state. Signed-off-by: Dmitry Bazhenov <dmitry.bazhenov@auriga.com> Signed-off-by: Jan Glauber <jglauber@cavium.com> [Changed commit message] Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2016-09-21	i2c: octeon: Fix set SCL recovery function	Dmitry Bazhenov
	The set SCL recovery function unconditionally pulls the SCL line low. Only pull SCL line low according to val parameter. Signed-off-by: Dmitry Bazhenov <dmitry.bazhenov@auriga.com> Signed-off-by: Jan Glauber <jglauber@cavium.com> [Changed commit message] Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2016-09-21	clk: nxp: clk-lpc32xx: Unmap region obtained by of_iomap	Arvind Yadav
	Free memory mapping, if lpc32xx_clk_init is not successful. Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com> Acked-by: Vladimir Zapolskiy <vz@mleia.com> Acked-by: Sylvain Lemieux <slemieux.tyco@gmail.com> Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
2016-09-21	scsi: hpsa: correct call to hpsa_do_reset	Don Brace
	calling fill_cmd() using a MACRO definition not handled in switch statement causes BUG() to be called. Signed-off-by: Don Brace <don.brace@microsemi.com> Reviewed-by: Scott Teel <scott.teel@microsemi.com> Reviewed-by: Kevin Barnett <kevin.barnett@microsemi.com> Reviewed-by: Tomas Henzl <thenzl@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-09-21	Merge tag 'qcom-ebi2-arm-soc' of ↵	Arnd Bergmann
	git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-integrator into next/drivers Pull "Qualcomm EBI2 bindings and bus driver" from Linus Walleij * tag 'qcom-ebi2-arm-soc' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-integrator: bus: qcom: add EBI2 driver bus: qcom: add EBI2 device tree bindings Acked-by: Andy Gross <andy.gross@linaro.org>
2016-09-21	Merge tag 'mvebu-drivers-4.9-1' of git://git.infradead.org/linux-mvebu into ↵	Arnd Bergmann
	next/drivers Pull "mvebu drivers for 4.9 (part 1)" from Gregory CLEMENT: - Add pinctrl and clk support for the Orion5x SoC mv88f5181 variant * tag 'mvebu-drivers-4.9-1' of git://git.infradead.org/linux-mvebu: pinctrl: mvebu: orion5x: Generalise mv88f5181l support for 88f5181 clk: mvebu: Add clk support for the orion5x SoC mv88f5181
2016-09-21	scsi: ufs: Get a TM service response from the correct offset	Kiwoong Kim
	When any UFS host controller receives a TM(Task Management) response from a UFS device, UFS driver has been recognize like receiving a message of "Task Management Function Complete"(00h) in all cases, so far. That means there is no pending task for a tag of the TM request sent before in the UFS device. That's because the byte offset 6 in TM response which has been used to get a TM service response so far represents just whether or not a TM transmission passes. Regarding UFS spec, the correct byte offset to get TM service response is 15, not 6. I tested that UFS driver responds properly for the TM response from a UFS device with an reference board with exynos8890, as follow: No pending task -> Task Management Function Complete (00h) Pending task -> Task Management Function Succeeded (08h) [mkp: applied by hand] Signed-off-by: Kiwoong Kim <kwmad.kim@samsung.com> Signed-off-by: HeonGwang Chu <hg.chu@samsung.com> Tested-by: : Kiwoong Kim <kwmad.kim@samsung.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-09-21	rtc: cmos: Restore alarm after resume	Gabriele Mazzotta
	Some platform firmware may interfere with the RTC alarm over suspend, resulting in the kernel and hardware having different ideas about system state but also potentially causing problems with firmware that assumes the OS will clean this case up. This patch restores the RTC alarm on resume to ensure that kernel and hardware are in sync. The case we've seen is Intel Rapid Start, which is a firmware-mediated feature that automatically transitions systems from suspend-to-RAM to suspend-to-disk without OS involvement. It does this by setting the RTC alarm and a flag that indicates that on wake it should perform the transition rather than re-starting the OS. However, if the OS has set a wakeup alarm that would wake the machine earlier, it refuses to overwrite it and allows the system to wake instead. This fails in the following situation: 1) User configures Intel Rapid Start to transition after (say) 15 minutes 2) User suspends to RAM. Firmware sets the wakeup alarm for 15 minutes in the future 3) User resumes after 5 minutes. Firmware does not reset the alarm, and as such it is still set for 10 minutes in the future 4) User suspends after 5 minutes. Firmware notices that the alarm is set for 5 minutes in the future, which is less than the 15 minute transition threshold. It therefore assumes that the user wants the machine to wake in 5 minutes 5) System resumes after 5 minutes The worst case scenario here is that the user may have put the system in a bag between (4) and (5), resulting in it running in a confined space and potentially overheating. This seems reasonably important. The Rapid Start support code got added in 3.11, but it can be configured in the firmware regardless of kernel support. Signed-off-by: Gabriele Mazzotta <gabriele.mzt@gmail.com> Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
2016-09-21	rtc: cmos: Clear ACPI-driven alarms upon resume	Gabriele Mazzotta
	Currently ACPI-driven alarms are not cleared when they wake the system. As consequence, expired alarms must be manually cleared to program a new alarm. Fix this by correctly handling ACPI-driven alarms. More specifically, the ACPI specification [1] provides for two alternative implementations of the RTC. Depending on the implementation, the driver either clear the alarm from the resume callback or from ACPI interrupt handler: - The platform has the RTC wakeup status fixed in hardware (ACPI_FADT_FIXED_RTC is 0). In this case the driver can determine if the RTC was the reason of the wakeup from the resume callback by reading the RTC status register. - The platform has no fixed hardware feature event bits. In this case a GPE is used to wake the system and the driver clears the alarm from its handler. [1] http://www.acpi.info/DOWNLOADS/ACPI_5_Errata%20A.pdf Signed-off-by: Gabriele Mazzotta <gabriele.mzt@gmail.com> Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
2016-09-21	rtc: omap: Support ext_wakeup configuration	Marcin Niestroj
	Support configuration of ext_wakeup sources. This patch makes it possible to enable ext_wakeup and set it's polarity, depending on board configuration. AM335x's dedicated PMIC (tps65217) uses ext_wakeup to notify about power-button presses. Handling power-button presses enables to recover from RTC-only power states correctly. Signed-off-by: Marcin Niestroj <m.niestroj@grinn-global.com> Acked-by: Tony Lindgren <tony@atomide.com> Acked-by: Rob Herring <robh@kernel.org> Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
2016-09-21	acpi: Validate processor id when mapping the processor	Dou Liyang
	When we want to identify whether the proc_id is unreasonable or not, we can call the "acpi_processor_validate_proc_id" function. It will search in the duplicate IDs. If we find the proc_id in the IDs, we return true to the call function. Conversely, the false represents available. When we establish all possible cpuid <-> nodeid mapping to handle the cpu hotplugs, we will use the proc_id from ACPI table. We do validation when we get the proc_id. If the result is true, we will stop the mapping. [ tglx: Mark the new function __init ] Signed-off-by: Dou Liyang <douly.fnst@cn.fujitsu.com> Acked-by: Ingo Molnar <mingo@kernel.org> Cc: mika.j.penttila@gmail.com Cc: len.brown@intel.com Cc: rafael@kernel.org Cc: rjw@rjwysocki.net Cc: yasu.isimatu@gmail.com Cc: linux-mm@kvack.org Cc: linux-acpi@vger.kernel.org Cc: isimatu.yasuaki@jp.fujitsu.com Cc: gongzhaogang@inspur.com Cc: tj@kernel.org Cc: izumi.taku@jp.fujitsu.com Cc: cl@linux.com Cc: chen.tang@easystack.cn Cc: akpm@linux-foundation.org Cc: kamezawa.hiroyu@jp.fujitsu.com Cc: lenb@kernel.org Link: http://lkml.kernel.org/r/1472114120-3281-8-git-send-email-douly.fnst@cn.fujitsu.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-09-21	acpi: Provide mechanism to validate processors in the ACPI tables	Dou Liyang
	[Problem] When we set cpuid <-> nodeid mapping to be persistent, it will use the DSDT As we know, the ACPI tables are just like user's input in that respect, and we don't crash if user's input is unreasonable. Such as, the mapping of the proc_id and pxm in some machine's ACPI table is like this: proc_id \| pxm -------------------- 0 <-> 0 1 <-> 0 2 <-> 1 3 <-> 1 89 <-> 0 89 <-> 0 89 <-> 0 89 <-> 1 89 <-> 1 89 <-> 2 89 <-> 3 ..... We can't be sure which one is correct to the proc_id 89. We may map a wrong node to a cpu. When pages are allocated, this may cause a kernal panic. So, we should provide mechanisms to validate the ACPI tables, just like we do validation to check user's input in web project. The mechanism is that the processor objects which have the duplicate IDs are not valid. [Solution] We add a validation function, like this: foreach Processor in DSDT proc_id = get_ACPI_Processor_number(Processor) if (proc_id exists ) mark both of them as being unreasonable; The function will record the unique or duplicate processor IDs. The duplicate processor IDs such as 89 are regarded as the unreasonable IDs which mean that the processor objects in question are not valid. [ tglx: Add __init[data] annotations ] Signed-off-by: Dou Liyang <douly.fnst@cn.fujitsu.com> Acked-by: Ingo Molnar <mingo@kernel.org> Cc: mika.j.penttila@gmail.com Cc: len.brown@intel.com Cc: rafael@kernel.org Cc: rjw@rjwysocki.net Cc: yasu.isimatu@gmail.com Cc: linux-mm@kvack.org Cc: linux-acpi@vger.kernel.org Cc: isimatu.yasuaki@jp.fujitsu.com Cc: gongzhaogang@inspur.com Cc: tj@kernel.org Cc: izumi.taku@jp.fujitsu.com Cc: cl@linux.com Cc: chen.tang@easystack.cn Cc: akpm@linux-foundation.org Cc: kamezawa.hiroyu@jp.fujitsu.com Cc: lenb@kernel.org Link: http://lkml.kernel.org/r/1472114120-3281-7-git-send-email-douly.fnst@cn.fujitsu.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-09-21	x86/acpi: Set persistent cpuid <-> nodeid mapping when booting	Gu Zheng
	The whole patch-set aims at making cpuid <-> nodeid mapping persistent. So that, when node online/offline happens, cache based on cpuid <-> nodeid mapping such as wq_numa_possible_cpumask will not cause any problem. It contains 4 steps: 1. Enable apic registeration flow to handle both enabled and disabled cpus. 2. Introduce a new array storing all possible cpuid <-> apicid mapping. 3. Enable _MAT and MADT relative apis to return non-present or disabled cpus' apicid. 4. Establish all possible cpuid <-> nodeid mapping. This patch finishes step 4. This patch set the persistent cpuid <-> nodeid mapping for all enabled/disabled processors at boot time via an additional acpi namespace walk for processors. [ tglx: Remove the unneeded exports ] Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com> Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com> Signed-off-by: Zhu Guihua <zhugh.fnst@cn.fujitsu.com> Signed-off-by: Dou Liyang <douly.fnst@cn.fujitsu.com> Acked-by: Ingo Molnar <mingo@kernel.org> Cc: mika.j.penttila@gmail.com Cc: len.brown@intel.com Cc: rafael@kernel.org Cc: rjw@rjwysocki.net Cc: yasu.isimatu@gmail.com Cc: linux-mm@kvack.org Cc: linux-acpi@vger.kernel.org Cc: isimatu.yasuaki@jp.fujitsu.com Cc: gongzhaogang@inspur.com Cc: tj@kernel.org Cc: izumi.taku@jp.fujitsu.com Cc: cl@linux.com Cc: chen.tang@easystack.cn Cc: akpm@linux-foundation.org Cc: kamezawa.hiroyu@jp.fujitsu.com Cc: lenb@kernel.org Link: http://lkml.kernel.org/r/1472114120-3281-6-git-send-email-douly.fnst@cn.fujitsu.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-09-21	x86/acpi: Enable MADT APIs to return disabled apicids	Gu Zheng
	The whole patch-set aims at making cpuid <-> nodeid mapping persistent. So that, when node online/offline happens, cache based on cpuid <-> nodeid mapping such as wq_numa_possible_cpumask will not cause any problem. It contains 4 steps: 1. Enable apic registeration flow to handle both enabled and disabled cpus. 2. Introduce a new array storing all possible cpuid <-> apicid mapping. 3. Enable _MAT and MADT relative apis to return non-present or disabled cpus' apicid. 4. Establish all possible cpuid <-> nodeid mapping. This patch finishes step 3. There are four mappings in the kernel: 1. nodeid (logical node id) <-> pxm (persistent) 2. apicid (physical cpu id) <-> nodeid (persistent) 3. cpuid (logical cpu id) <-> apicid (not persistent, now persistent by step 2) 4. cpuid (logical cpu id) <-> nodeid (not persistent) So, in order to setup persistent cpuid <-> nodeid mapping for all possible CPUs, we should: 1. Setup cpuid <-> apicid mapping for all possible CPUs, which has been done in step 1, 2. 2. Setup cpuid <-> nodeid mapping for all possible CPUs. But before that, we should obtain all apicids from MADT. All processors' apicids can be obtained by _MAT method or from MADT in ACPI. The current code ignores disabled processors and returns -ENODEV. After this patch, a new parameter will be added to MADT APIs so that caller is able to control if disabled processors are ignored. Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com> Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com> Signed-off-by: Zhu Guihua <zhugh.fnst@cn.fujitsu.com> Signed-off-by: Dou Liyang <douly.fnst@cn.fujitsu.com> Acked-by: Ingo Molnar <mingo@kernel.org> Cc: mika.j.penttila@gmail.com Cc: len.brown@intel.com Cc: rafael@kernel.org Cc: rjw@rjwysocki.net Cc: yasu.isimatu@gmail.com Cc: linux-mm@kvack.org Cc: linux-acpi@vger.kernel.org Cc: isimatu.yasuaki@jp.fujitsu.com Cc: gongzhaogang@inspur.com Cc: tj@kernel.org Cc: izumi.taku@jp.fujitsu.com Cc: cl@linux.com Cc: chen.tang@easystack.cn Cc: akpm@linux-foundation.org Cc: kamezawa.hiroyu@jp.fujitsu.com Cc: lenb@kernel.org Link: http://lkml.kernel.org/r/1472114120-3281-5-git-send-email-douly.fnst@cn.fujitsu.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-09-21	carl9170: fix debugfs crashes	Christian Lamparter
	Ben Greear reported: > I see lots of instability as soon as I load up the carl9710 NIC. > My application is going to be poking at it's debugfs files... > > BUG: KASAN: slab-out-of-bounds in carl9170_debugfs_read+0xd5/0x2a0 > [carl9170] at addr 0xffff8801bc1208b0 > Read of size 8 by task btserver/5888 > ======================================================================= > BUG kmalloc-256 (Tainted: G W ): kasan: bad access detected > ----------------------------------------------------------------------- > > INFO: Allocated in seq_open+0x50/0x100 age=2690 cpu=2 pid=772 >... This breakage was caused by the introduction of intermediate fops in debugfs by commit 9fd4dcece43a ("debugfs: prevent access to possibly dead file_operations at file open") Thankfully, the original/real fops are still available in d_fsdata. Reported-by: Ben Greear <greearb@candelatech.com> Signed-off-by: Christian Lamparter <chunkeey@gmail.com> Cc: stable <stable@vger.kernel.org> # 4.7+ Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-21	libnvdimm, namespace: debug invalid interleave-set-cookie values	Dan Williams
	If platform firmware fails to populate unique / non-zero serial number data for each nvdimm in an interleave-set it may cause pmem region initialization to fail. Add a debug message for this case. Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2016-09-21	nfit: fail DSMs that return non-zero status by default	Dan Williams
	For the DSMs where the kernel knows the format of the output buffer and originates those DSMs from within the kernel, return -EIO for any non-zero status. If the BIOS is indicating a status that we do not know how to handle, fail the DSM. Cc: <stable@vger.kernel.org> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2016-09-21	libnvdimm: fix devm_nvdimm_memremap() error path	Dan Williams
	The internal alloc_nvdimm_map() helper might fail, particularly if the memory region is already busy. Report request_mem_region() failures and check for the failure. Reported-by: Ryan Chen <ryan.chan105@gmail.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2016-09-21	usb: misc: legousbtower: Fix NULL pointer deference	Greg Kroah-Hartman
	This patch fixes a NULL pointer dereference caused by a race codition in the probe function of the legousbtower driver. It re-structures the probe function to only register the interface after successfully reading the board's firmware ID. The probe function does not deregister the usb interface after an error receiving the devices firmware ID. The device file registered (/dev/usb/legousbtower%d) may be read/written globally before the probe function returns. When tower_delete is called in the probe function (after an r/w has been initiated), core dev structures are deleted while the file operation functions are still running. If the 0 address is mappable on the machine, this vulnerability can be used to create a Local Priviege Escalation exploit via a write-what-where condition by remapping dev->interrupt_out_buffer in tower_write. A forged USB device and local program execution would be required for LPE. The USB device would have to delay the control message in tower_probe and accept the control urb in tower_open whilst guest code initiated a write to the device file as tower_delete is called from the error in tower_probe. This bug has existed since 2003. Patch tested by emulated device. Reported-by: James Patrick-Evans <james@jmp-e.com> Tested-by: James Patrick-Evans <james@jmp-e.com> Signed-off-by: James Patrick-Evans <james@jmp-e.com> Cc: stable <stable@vger.kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-21	raid5: handle register_shrinker failure	Shaohua Li
	register_shrinker() now can fail. When it happens, shrinker.nr_deferred is null. We use it to determine if unregister_shrinker is required. Signed-off-by: Shaohua Li <shli@fb.com>
2016-09-21	raid5: fix to detect failure of register_shrinker	Chao Yu
	register_shrinker can fail after commit 1d3d4437eae1 ("vmscan: per-node deferred work"), we should detect the failure of it, otherwise we may fail to register shrinker after raid5 configuration was setup successfully. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Shaohua Li <shli@fb.com>
2016-09-21	md: fix a potential deadlock	Shaohua Li
	lockdep reports a potential deadlock. Fix this by droping the mutex before md_import_device [ 1137.126601] ====================================================== [ 1137.127013] [ INFO: possible circular locking dependency detected ] [ 1137.127013] 4.8.0-rc4+ #538 Not tainted [ 1137.127013] ------------------------------------------------------- [ 1137.127013] mdadm/16675 is trying to acquire lock: [ 1137.127013] (&bdev->bd_mutex){+.+.+.}, at: [<ffffffff81243cf3>] __blkdev_get+0x63/0x450 [ 1137.127013] but task is already holding lock: [ 1137.127013] (detected_devices_mutex){+.+.+.}, at: [<ffffffff81a5138c>] md_ioctl+0x2ac/0x1f50 [ 1137.127013] which lock already depends on the new lock. [ 1137.127013] the existing dependency chain (in reverse order) is: [ 1137.127013] -> #1 (detected_devices_mutex){+.+.+.}: [ 1137.127013] [<ffffffff810b6f19>] lock_acquire+0xb9/0x220 [ 1137.127013] [<ffffffff81c51647>] mutex_lock_nested+0x67/0x3d0 [ 1137.127013] [<ffffffff81a4eeaf>] md_autodetect_dev+0x3f/0x90 [ 1137.127013] [<ffffffff81595be8>] rescan_partitions+0x1a8/0x2c0 [ 1137.127013] [<ffffffff81590081>] __blkdev_reread_part+0x71/0xb0 [ 1137.127013] [<ffffffff815900e5>] blkdev_reread_part+0x25/0x40 [ 1137.127013] [<ffffffff81590c4b>] blkdev_ioctl+0x51b/0xa30 [ 1137.127013] [<ffffffff81242bf1>] block_ioctl+0x41/0x50 [ 1137.127013] [<ffffffff81214c96>] do_vfs_ioctl+0x96/0x6e0 [ 1137.127013] [<ffffffff81215321>] SyS_ioctl+0x41/0x70 [ 1137.127013] [<ffffffff81c56825>] entry_SYSCALL_64_fastpath+0x18/0xa8 [ 1137.127013] -> #0 (&bdev->bd_mutex){+.+.+.}: [ 1137.127013] [<ffffffff810b6af2>] __lock_acquire+0x1662/0x1690 [ 1137.127013] [<ffffffff810b6f19>] lock_acquire+0xb9/0x220 [ 1137.127013] [<ffffffff81c51647>] mutex_lock_nested+0x67/0x3d0 [ 1137.127013] [<ffffffff81243cf3>] __blkdev_get+0x63/0x450 [ 1137.127013] [<ffffffff81244307>] blkdev_get+0x227/0x350 [ 1137.127013] [<ffffffff812444f6>] blkdev_get_by_dev+0x36/0x50 [ 1137.127013] [<ffffffff81a46d65>] lock_rdev+0x35/0x80 [ 1137.127013] [<ffffffff81a49bb4>] md_import_device+0xb4/0x1b0 [ 1137.127013] [<ffffffff81a513d6>] md_ioctl+0x2f6/0x1f50 [ 1137.127013] [<ffffffff815909b3>] blkdev_ioctl+0x283/0xa30 [ 1137.127013] [<ffffffff81242bf1>] block_ioctl+0x41/0x50 [ 1137.127013] [<ffffffff81214c96>] do_vfs_ioctl+0x96/0x6e0 [ 1137.127013] [<ffffffff81215321>] SyS_ioctl+0x41/0x70 [ 1137.127013] [<ffffffff81c56825>] entry_SYSCALL_64_fastpath+0x18/0xa8 [ 1137.127013] other info that might help us debug this: [ 1137.127013] Possible unsafe locking scenario: [ 1137.127013] CPU0 CPU1 [ 1137.127013] ---- ---- [ 1137.127013] lock(detected_devices_mutex); [ 1137.127013] lock(&bdev->bd_mutex); [ 1137.127013] lock(detected_devices_mutex); [ 1137.127013] lock(&bdev->bd_mutex); [ 1137.127013] * DEADLOCK * Cc: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: Shaohua Li <shli@fb.com>
2016-09-21	md/bitmap: fix wrong cleanup	Shaohua Li
	if bitmap_create fails, the bitmap is already cleaned up and the returned value is an error number. We can't do the cleanup again. Reported-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: Shaohua Li <shli@fb.com>
2016-09-21	raid5: allow arbitrary max_hw_sectors	Shaohua Li
	raid5 will split bio to proper size internally, there is no point to use underlayer disk's max_hw_sectors. In my qemu system, without the change, the raid5 only receives 128k size bio, which reduces the chance of bio merge sending to underlayer disks. Signed-off-by: Shaohua Li <shli@fb.com>
2016-09-21	md-cluster: make resync lock also could be interruptted	Guoqing Jiang
	When one node is perform resync or recovery, other nodes can't get resync lock and could block for a while before it holds the lock, so we can't stop array immediately for this scenario. To make array could be stop quickly, we check MD_CLOSING in dlm_lock_sync_interruptible to make us can interrupt the lock request. Reviewed-by: NeilBrown <neilb@suse.com> Signed-off-by: Guoqing Jiang <gqjiang@suse.com> Signed-off-by: Shaohua Li <shli@fb.com>
2016-09-21	md-cluster: introduce dlm_lock_sync_interruptible to fix tasks hang	Guoqing Jiang
	When some node leaves cluster, then it's bitmap need to be synced by another node, so "md*_recover" thread is triggered for the purpose. However, with below steps. we can find tasks hang happened either in B or C. 1. Node A create a resyncing cluster raid1, assemble it in other two nodes (B and C). 2. stop array in B and C. 3. stop array in A. linux44:~ # ps aux\|grep md\|grep D root 5938 0.0 0.1 19852 1964 pts/0 D+ 14:52 0:00 mdadm -S md0 root 5939 0.0 0.0 0 0 ? D 14:52 0:00 [md0_recover] linux44:~ # cat /proc/5939/stack [<ffffffffa04cf321>] dlm_lock_sync+0x71/0x90 [md_cluster] [<ffffffffa04d0705>] recover_bitmaps+0x125/0x220 [md_cluster] [<ffffffffa052105d>] md_thread+0x16d/0x180 [md_mod] [<ffffffff8107ad94>] kthread+0xb4/0xc0 [<ffffffff8152a518>] ret_from_fork+0x58/0x90 linux44:~ # cat /proc/5938/stack [<ffffffff8107afde>] kthread_stop+0x6e/0x120 [<ffffffffa0519da0>] md_unregister_thread+0x40/0x80 [md_mod] [<ffffffffa04cfd20>] leave+0x70/0x120 [md_cluster] [<ffffffffa0525e24>] md_cluster_stop+0x14/0x30 [md_mod] [<ffffffffa05269ab>] bitmap_free+0x14b/0x150 [md_mod] [<ffffffffa0523f3b>] do_md_stop+0x35b/0x5a0 [md_mod] [<ffffffffa0524e83>] md_ioctl+0x873/0x1590 [md_mod] [<ffffffff81288464>] blkdev_ioctl+0x214/0x7d0 [<ffffffff811dd3dd>] block_ioctl+0x3d/0x40 [<ffffffff811b92d4>] do_vfs_ioctl+0x2d4/0x4b0 [<ffffffff811b9538>] SyS_ioctl+0x88/0xa0 [<ffffffff8152a5c9>] system_call_fastpath+0x16/0x1b The problem is caused by recover_bitmaps can't reliably abort when the thread is unregistered. So dlm_lock_sync_interruptible is introduced to detect the thread's situation to fix the problem. Reviewed-by: NeilBrown <neilb@suse.com> Signed-off-by: Guoqing Jiang <gqjiang@suse.com> Signed-off-by: Shaohua Li <shli@fb.com>
2016-09-21	md-cluster: convert the completion to wait queue	Guoqing Jiang
	Previously, we used completion to sync between require dlm lock and sync_ast, however we will have to expose completion.wait and completion.done in dlm_lock_sync_interruptible (introduced later), it is not a common usage for completion, so convert related things to wait queue. Reviewed-by: NeilBrown <neilb@suse.com> Signed-off-by: Guoqing Jiang <gqjiang@suse.com> Signed-off-by: Shaohua Li <shli@fb.com>