summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2018-10-18r8169: fix NAPI handling under high loadHeiner Kallweit
rtl_rx() and rtl_tx() are called only if the respective bits are set in the interrupt status register. Under high load NAPI may not be able to process all data (work_done == budget) and it will schedule subsequent calls to the poll callback. rtl_ack_events() however resets the bits in the interrupt status register, therefore subsequent calls to rtl8169_poll() won't call rtl_rx() and rtl_tx() - chip interrupts are still disabled. Fix this by calling rtl_rx() and rtl_tx() independent of the bits set in the interrupt status register. Both functions will detect if there's nothing to do for them. Fixes: da78dbff2e05 ("r8169: remove work from irq handler.") Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-18docs: Fix typos in histogram.rstMasanari Iida
This patch fixes some spelling typos. Signed-off-by: Masanari Iida <standby24x7@gmail.com> Acked-by: Randy Dunlap <rdunlap@infradead.org> Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2018-10-18sparc: Revert unintended perf changes.David S. Miller
Some local debugging hacks accidently slipped into the VDSO commit. Sorry! Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-18Merge branch 'sctp-fix-sk_wmem_queued-and-use-it-to-check-for-writable-space'David S. Miller
Xin Long says: ==================== sctp: fix sk_wmem_queued and use it to check for writable space sctp doesn't count and use asoc sndbuf_used, sk sk_wmem_alloc and sk_wmem_queued properly, which also causes some problem. This patchset is to improve it. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-18sctp: use sk_wmem_queued to check for writable spaceXin Long
sk->sk_wmem_queued is used to count the size of chunks in out queue while sk->sk_wmem_alloc is for counting the size of chunks has been sent. sctp is increasing both of them before enqueuing the chunks, and using sk->sk_wmem_alloc to check for writable space. However, sk_wmem_alloc is also increased by 1 for the skb allocked for sending in sctp_packet_transmit() but it will not wake up the waiters when sk_wmem_alloc is decreased in this skb's destructor. If msg size is equal to sk_sndbuf and sendmsg is waiting for sndbuf, the check 'msg_len <= sctp_wspace(asoc)' in sctp_wait_for_sndbuf() will keep waiting if there's a skb allocked in sctp_packet_transmit, and later even if this skb got freed, the waiting thread will never get waked up. This issue has been there since very beginning, so we change to use sk->sk_wmem_queued to check for writable space as sk_wmem_queued is not increased for the skb allocked for sending, also as TCP does. SOCK_SNDBUF_LOCK check is also removed here as it's for tx buf auto tuning which I will add in another patch. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-18sctp: count both sk and asoc sndbuf with skb truesize and sctp_chunk sizeXin Long
Now it's confusing that asoc sndbuf_used is doing memory accounting with SCTP_DATA_SNDSIZE(chunk) + sizeof(sk_buff) + sizeof(sctp_chunk) while sk sk_wmem_alloc is doing that with skb->truesize + sizeof(sctp_chunk). It also causes sctp_prsctp_prune to count with a wrong freed memory when sndbuf_policy is not set. To make this right and also keep consistent between asoc sndbuf_used, sk sk_wmem_alloc and sk_wmem_queued, use skb->truesize + sizeof(sctp_chunk) for them. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-18drm: Get ref on CRTC commit object when waiting for flip_doneLeo Li
This fixes a general protection fault, caused by accessing the contents of a flip_done completion object that has already been freed. It occurs due to the preemption of a non-blocking commit worker thread W by another commit thread X. X continues to clear its atomic state at the end, destroying the CRTC commit object that W still needs. Switching back to W and accessing the commit objects then leads to bad results. Worker W becomes preemptable when waiting for flip_done to complete. At this point, a frequently occurring commit thread X can take over. Here's an example where W is a worker thread that flips on both CRTCs, and X does a legacy cursor update on both CRTCs: ... 1. W does flip work 2. W runs commit_hw_done() 3. W waits for flip_done on CRTC 1 4. > flip_done for CRTC 1 completes 5. W finishes waiting for CRTC 1 6. W waits for flip_done on CRTC 2 7. > Preempted by X 8. > flip_done for CRTC 2 completes 9. X atomic_check: hw_done and flip_done are complete on all CRTCs 10. X updates cursor on both CRTCs 11. X destroys atomic state 12. X done 13. > Switch back to W 14. W waits for flip_done on CRTC 2 15. W raises general protection fault The error looks like so: general protection fault: 0000 [#1] PREEMPT SMP PTI **snip** Call Trace: lock_acquire+0xa2/0x1b0 _raw_spin_lock_irq+0x39/0x70 wait_for_completion_timeout+0x31/0x130 drm_atomic_helper_wait_for_flip_done+0x64/0x90 [drm_kms_helper] amdgpu_dm_atomic_commit_tail+0xcae/0xdd0 [amdgpu] commit_tail+0x3d/0x70 [drm_kms_helper] process_one_work+0x212/0x650 worker_thread+0x49/0x420 kthread+0xfb/0x130 ret_from_fork+0x3a/0x50 Modules linked in: x86_pkg_temp_thermal amdgpu(O) chash(O) gpu_sched(O) drm_kms_helper(O) syscopyarea sysfillrect sysimgblt fb_sys_fops ttm(O) drm(O) Note that i915 has this issue masked, since hw_done is signaled after waiting for flip_done. Doing so will block the cursor update from happening until hw_done is signaled, preventing the cursor commit from destroying the state. v2: The reference on the commit object needs to be obtained before hw_done() is signaled, since that's the point where another commit is allowed to modify the state. Assuming that the new_crtc_state->commit object still exists within flip_done() is incorrect. Fix by getting a reference in setup_commit(), and releasing it during default_clear(). Signed-off-by: Leo Li <sunpeng.li@amd.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Harry Wentland <harry.wentland@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/1539611200-6184-1-git-send-email-sunpeng.li@amd.com
2018-10-18docs: Introduce deprecated APIs listKees Cook
As discussed in the "API replacement/deprecation" thread[1], this makes an effort to document what things shouldn't get (re)added to the kernel, by introducing Documentation/process/deprecated.rst. [1] https://lists.linuxfoundation.org/pipermail/ksummit-discuss/2018-September/005282.html Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2018-10-18clk: qcom: gcc-sdm660: Add MODULE_LICENSEStephen Boyd
Add a module license to match the license at the top of this file and silence a build warning. Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2018-10-18kernel-doc: fix declaration type determinationRandy Dunlap
Make declaration type determination more robust. When scripts/kernel-doc is deciding if some kernel-doc notation contains an enum, a struct, a union, a typedef, or a function, it does a pattern match on the beginning of the string, looking for a match with one of "struct", "union", "enum", or "typedef", and otherwise defaults to a function declaration type. However, if a function or a function-like macro has a name that begins with "struct" (e.g., struct_size()), then kernel-doc incorrectly decides that this is a struct declaration. Fix this by looking for the declaration type keywords having an ending word boundary (\b), so that "struct_size" will not match a struct declaration. I compared lots of html before/after output from core-api, driver-api, and networking. There were no differences in any of the files that I checked. Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Acked-by: Jani Nikula <jani.nikula@intel.com> Tested-by: Kees Cook <keescook@chromium.org> Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2018-10-18orangefs: no need to check for service_operation returns > 0Mike Marshall
service_operation returns > 0 is undefined. Signed-off-by: Mike Marshall <hubcap@omnibond.com>
2018-10-18doc: fix a typo in adding-syscalls.rstGuillaume Dore
There was a typo in adding-syscalls.rst that could mislead developers to add a C filename in a makefile instead of an object filename. This error, while not keeping developers from contributing could slow the development process down by introducing build errors. Signed-off-by: Guillaume Dore <corwin@poussif.eu> Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2018-10-18orangefs: some error code paths missed kmem_cache_freeMike Marshall
If a slab cache object is allocated, it needs to be freed eventually, certainly before anyone unloads the module that allocated it. Signed-off-by: Mike Marshall <hubcap@omnibond.com>
2018-10-18orangefs: don't let orangefs_iget return NULL.Mike Marshall
Suggested by Dan Carpenter. Signed-off-by: Mike Marshall <hubcap@omnibond.com>
2018-10-18orangefs: don't let orangefs_new_inode return NULLMike Marshall
Suggested by Dan Carpenter Signed-off-by: Mike Marshall <hubcap@omnibond.com>
2018-10-18usb: phy: ab8500: silence some uninitialized variable warningsDan Carpenter
Smatch complains that "reg" can be uninitialized if the abx500_get_register_interruptible() call fails. It's an interruptable function, so we should check if the user presses CTRL-C. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-10-18usb: xhci: tegra: Add genpd supportJon Hunter
The generic power-domain framework has been updated to allow devices that require more than one power-domain to create a new device for each power-domain required and then link these new power-domain devices to the consumer device. Update the Tegra xHCI driver to use the new APIs provided by the generic power-domain framework so we can use the generic power-domain framework for managing the xHCI controllers power-domains. Please note that to maintain backward compatibility with older device-tree blobs these new generic power-domain APIs are only used if the 'power-domains' property is present and otherwise we fall back to using the legacy Tegra APIs for managing the power-domains. Signed-off-by: Jon Hunter <jonathanh@nvidia.com> Acked-by: Thierry Reding <treding@nvidia.com> Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-10-18usb: xhci: tegra: Power-off power-domains on removalJon Hunter
Currently the XUSB power domains used by the Tegra xHCI controller are never powered off on the removal of the driver, however, they will be powered off on probe failure. Update the removal code to be consistent with the probe failure path to power off the XUSB power domains. Signed-off-by: Jon Hunter <jonathanh@nvidia.com> Acked-by: Thierry Reding <treding@nvidia.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-10-18usbip:vudc: BUG kmalloc-2048 (Not tainted): Poison overwrittenShuah Khan (Samsung OSG)
In rmmod path, usbip_vudc does platform_device_put() twice once from platform_device_unregister() and then from put_vudc_device(). The second put results in: BUG kmalloc-2048 (Not tainted): Poison overwritten error or BUG: KASAN: use-after-free in kobject_put+0x1e/0x230 if KASAN is enabled. [ 169.042156] calling init+0x0/0x1000 [usbip_vudc] @ 1697 [ 169.042396] ============================================================================= [ 169.043678] probe of usbip-vudc.0 returned 1 after 350 usecs [ 169.044508] BUG kmalloc-2048 (Not tainted): Poison overwritten [ 169.044509] ----------------------------------------------------------------------------- ... [ 169.057849] INFO: Freed in device_release+0x2b/0x80 age=4223 cpu=3 pid=1693 [ 169.057852] kobject_put+0x86/0x1b0 [ 169.057853] 0xffffffffc0c30a96 [ 169.057855] __x64_sys_delete_module+0x157/0x240 Fix it to call platform_device_del() instead and let put_vudc_device() do the platform_device_put(). Reported-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Shuah Khan (Samsung OSG) <shuah@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-10-18usbip: tools: fix atoi() on non-null terminated stringColin Ian King
Currently the call to atoi is being passed a single char string that is not null terminated, so there is a potential read overrun along the stack when parsing for an integer value. Fix this by instead using a 2 char string that is initialized to all zeros to ensure that a 1 char read into the string is always terminated with a \0. Detected by cppcheck: "Invalid atoi() argument nr 1. A nul-terminated string is required." Fixes: 3391ba0e2792 ("usbip: tools: Extract generic code to be shared with vudc backend") Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-10-18USB: misc: appledisplay: fix backlight update_status return codeMattias Jacobsson
Upon success the update_status handler returns a positive number corresponding to the number of bytes transferred by usb_control_msg. However the return code of the update_status handler should indicate if an error occurred(negative) or how many bytes of the user's input to sysfs that was consumed. Return code zero indicates all bytes were consumed. The bug can for example result in the update_status handler being called twice, the second time with only the "unconsumed" part of the user's input to sysfs. Effectively setting an incorrect brightness. Change the update_status handler to return zero for all successful transactions and forward usb_control_msg's error code upon failure. Signed-off-by: Mattias Jacobsson <2pi@mok.nu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-10-18phy: phy-pxa-usb: add a new driverLubomir Rintel
Turned from arch/arm/mach-mmp/devices.c into a proper PHY driver, so that in can be instantiated from a DT. Acked-by: Kishon Vijay Abraham I <kishon@ti.com> Signed-off-by: Lubomir Rintel <lkundrak@v3.sk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-10-18Merge branch '1GbE' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue Jeff Kirsher says: ==================== 1GbE Intel Wired LAN Driver Updates 2018-10-17 This series adds support for the new igc driver. The igc driver is the new client driver supporting the Intel I225 Ethernet Controller, which supports 2.5GbE speeds. The reason for creating a new client driver, instead of adding support for the new device in e1000e, is that the silicon behaves more like devices supported in igb driver. It also did not make sense to add a client part, to the igb driver which supports only 1GbE server parts. This initial set of patches is designed for basic support (i.e. link and pass traffic). Follow-on patch series will add more advanced support like VLAN, Wake-on-LAN, etc.. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-18Merge tag 'mlx5-updates-2018-10-17' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux mlx5-updates-2018-10-17 ======================================================================== From Or Gerlitz <ogerlitz@mellanox.com>: This series from Paul adds support to mlx5 e-switch tc offloading of multiple priorities and chains. This is made of four building blocks (along with few minor driver refactors): [1] Split FDB fast path prio to multiple namespaces Currently the FDB name-space contains two priorities, fast path (p0) and slow path (p1). The slow path contains the per representor SQ send-to-vport TX rule and the match-all RX miss rule. As a pre-step to support multi-chains and priorities, we split the FDB fast path to multiple namespaces (sub namespaces), each with multiple priorities. [2] E-Switch chains and priorities A chain is a group of priorities. We use the fdb parallel sub-namespaces to implement chains, and a flow table for each priority in them. Because these namespaces are parallel and in series to the slow path fdb, the chains aren't connected to each other (but to the slow path), and one must use a explicit goto action to reach a different chain. Flow tables for the priorities are created on demand and destroyed once not used. [3] Add a no-append flow insertion mode, use it for TC offloads Enhance the driver fs core, such that if a no-append flag is set by the caller, we add a new FTE, instead of appending the actions of the inserted rule when the same match already exists. For encap rules, we defer the HW offloading till we have a valid neighbor. This can result in the packet hitting a lower priority rule in the HW DP. Use the no-append API to push these packets to the slow path FDB table, so they go to the TC kernel DP as done before priorities where supported. [4] Offloading tc priorities and chains for eswitch flows Using [1], [2] and [3] above we add the support for offloading both chains and priorities. To get to a new chain, use the tc goto action. We support a fixed prio range 1-16, and chains 0-3. ============================================================================= Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-18Merge tag 'v4.20-rockchip-clk1' of ↵Stephen Boyd
git://git.kernel.org/pub/scm/linux/kernel/git/mmind/linux-rockchip into clk-rockchip Pull Rockchip clk driver updates from Heiko Stuebner: Fixes for static checker warning and a wrong shift value for the mmc on rk3328, as well as setting a hdmi-id on rk3066 for a later driver for the hdmi encoder on it and some adapted rk3288 pll rates to support more and better hdmi rates. * tag 'v4.20-rockchip-clk1' of git://git.kernel.org/pub/scm/linux/kernel/git/mmind/linux-rockchip: clk: rockchip: Fix static checker warning in rockchip_ddrclk_get_parent call clk: rockchip: use the newly added clock-id for hdmi on RK3066 clk: rockchip: add clock-id for HCLK_HDMI on rk3066 clk: rockchip: fix wrong mmc sample phase shift for rk3328 clk: rockchip: improve rk3288 pll rates for better hdmi output
2018-10-18Merge branch 'master' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next Steffen Klassert says: ==================== pull request (net-next): ipsec-next 2018-10-18 1) Remove an unnecessary dev->tstats check in xfrmi_get_stats64. From Li RongQing. 2) We currently do a sizeof(element) instead of a sizeof(array) check when initializing the ovec array of the secpath. Currently this array can have only one element, so code is OK but error-prone. Change this to do a sizeof(array) check so that we can add more elements in future. From Li RongQing. 3) Improve xfrm IPv6 address hashing by using the complete IPv6 addresses for a hash. From Michal Kubecek. Please pull or let me know if there are problems. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-18Merge branch 'master' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec Steffen Klassert says: ==================== pull request (net): ipsec 2018-10-18 1) Free the xfrm interface gro_cells when deleting the interface, otherwise we leak it. From Li RongQing. 2) net/core/flow.c does not exist anymore, so remove it from the MAINTAINERS file. 3) Fix a slab-out-of-bounds in _decode_session6. From Alexei Starovoitov. 4) Fix RCU protection when policies inserted into thei bydst lists. From Florian Westphal. Please pull or let me know if there are problems. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-18PCI: aardvark: Implement emulated root PCI bridge config spaceZachary Zhang
The PCI controller in the Marvell Armada 3720 does not implement a software-accessible root port PCI bridge configuration space. This causes a number of problems when using PCIe switches or when the Max Payload size needs to be aligned between the root complex and the endpoint. Implementing an emulated root PCI bridge, like is already done in the pci-mvebu driver for older Marvell platforms allows to solve those issues, and also to support features such as ASR, PME, VC, HP. Signed-off-by: Zachary Zhang <zhangzg@marvell.com> [Thomas: convert to the common emulated PCI bridge logic.] Signed-off-by: Thomas Petazzoni <thomas.petazzoni@bootlin.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
2018-10-18PCI: mvebu: Convert to PCI emulated bridge config spaceThomas Petazzoni
Convert the pci-mvebu driver to use the pci-bridge-emul logic, that helps emulating a root port PCI bridge configuration space. Signed-off-by: Thomas Petazzoni <thomas.petazzoni@bootlin.com> [lorenzo.pieralisi@arm.com: updated commit log] Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
2018-10-18PCI: mvebu: Drop unused PCI express capability codeThomas Petazzoni
Commit dc0352ab0b2a0 ("PCI: mvebu: Add PCI Express root complex capability block") added support for emulating the PCI Express capability block. As part of this, the pcie_sltcap, pcie_devctl and pcie_rtctl fields were added to the mvebu_sw_pci_bridge structure, and used when reading the corresponding PCI Express capability block registers. However, those structure members are never set to any value other than zero. This makes them unneeded because: - pcie_devctl is used to OR *value, so with pcie_devctl always zero, it has no effect. - for pcie_sltcap and pcie_rtstl, the mvebu_sw_pci_bridge_read() function always returns 0 for registers that are not explicitly handled. In preparation for reworking the PCI bridge emulation logic in pci-mvebu, let's simplify the code by dropping those structure members. Signed-off-by: Thomas Petazzoni <thomas.petazzoni@bootlin.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
2018-10-18PCI: Introduce PCI bridge emulated config space common logicThomas Petazzoni
Some PCI host controllers do not expose a configuration space for the root port PCI bridge. Due to this, the Marvell Armada 370/38x/XP PCI controller driver (pci-mvebu) emulates a root port PCI bridge configuration space, and uses that to (among other things) dynamically create the memory windows that correspond to the PCI MEM and I/O regions. Since we now need to add a very similar logic for the Marvell Armada 37xx PCI controller driver (pci-aardvark), instead of duplicating the code, we create in this commit a common logic called pci-bridge-emul. The idea of this logic is to emulate a root port PCI bridge configuration space by providing configuration space read/write operations, and faking behind the scenes the configuration space of a PCI bridge. A PCI host controller driver simply has to call pci_bridge_emul_conf_read() and pci_bridge_emul_conf_write() to read/write the configuration space of the bridge. By default, the PCI bridge configuration space is simply emulated by a chunk of memory, but the PCI host controller can override the behavior of the read and write operations on a per-register basis to do additional actions if needed. We take care of complying with the behavior of the PCI configuration space registers in terms of bits that are read-write, read-only, reserved and write-1-to-clear. Signed-off-by: Thomas Petazzoni <thomas.petazzoni@bootlin.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Russell King <rmk+kernel@armlinux.org.uk>
2018-10-18md-cluster: remove suspend_infoGuoqing Jiang
Previously, we allow multiple nodes can resync device, but we had changed it to only support one node can do resync at one time, but suspend_info is still used. Now, let's remove the structure and use suspend_lo/hi to record the range. Reviewed-by: NeilBrown <neilb@suse.com> Signed-off-by: Guoqing Jiang <gqjiang@suse.com> Signed-off-by: Shaohua Li <shli@fb.com>
2018-10-18md-cluster: send BITMAP_NEEDS_SYNC message if reshaping is interruptedGuoqing Jiang
We need to continue the reshaping if it was interrupted in original node. So original node should call resync_bitmap in case reshaping is aborted. Then BITMAP_NEEDS_SYNC message is broadcasted to other nodes, node which continues the reshaping should restart reshape from mddev->reshape_position instead of from the first beginning. Reviewed-by: NeilBrown <neilb@suse.com> Signed-off-by: Guoqing Jiang <gqjiang@suse.com> Signed-off-by: Shaohua Li <shli@fb.com>
2018-10-18md-cluster/bitmap: don't call md_bitmap_sync_with_cluster during reshaping stageGuoqing Jiang
When reshape is happening in one node, other nodes could receive lots of RESYNCING messages, so md_bitmap_sync_with_cluster is called. Since the resyncing window is typically small in these RESYNCING messages, so WARN is always triggered, so we should not call the func when reshape is happening. Reviewed-by: NeilBrown <neilb@suse.com> Signed-off-by: Guoqing Jiang <gqjiang@suse.com> Signed-off-by: Shaohua Li <shli@fb.com>
2018-10-18md-cluster/raid10: don't call remove_and_add_spares during reshaping stageGuoqing Jiang
remove_and_add_spares is not needed if reshape is happening in another node, because raid10_add_disk called inside raid10_start_reshape would handle the role changes of disk. Plus, remove_and_add_spares can't deal with the role change due to reshape. Reviewed-by: NeilBrown <neilb@suse.com> Signed-off-by: Guoqing Jiang <gqjiang@suse.com> Signed-off-by: Shaohua Li <shli@fb.com>
2018-10-18md-cluster/raid10: call update_size in md_reap_sync_threadGuoqing Jiang
We need to change the capacity in all nodes after one node finishs reshape. And as we did before, we can't change the capacity directly in md_do_sync, instead, the capacity should be only changed in update_size or received CHANGE_CAPACITY msg. So master node calls update_size after completes reshape in md_reap_sync_thread, but we need to skip ops->update_size if MD_CLOSING is set since reshaping could not be finish. Reviewed-by: NeilBrown <neilb@suse.com> Signed-off-by: Guoqing Jiang <gqjiang@suse.com> Signed-off-by: Shaohua Li <shli@fb.com>
2018-10-18md-cluster: introduce resync_info_get interface for sanity checkGuoqing Jiang
Since the resync region from suspend_info means one node is reshaping this area, so the position of reshape_progress should be included in the area. Reviewed-by: NeilBrown <neilb@suse.com> Signed-off-by: Guoqing Jiang <gqjiang@suse.com> Signed-off-by: Shaohua Li <shli@fb.com>
2018-10-18md-cluster/raid10: support add disk under grow modeGuoqing Jiang
For clustered raid10 scenario, we need to let all the nodes know about that a new disk is added to the array, and the reshape caused by add new member just need to be happened in one node, but other nodes should know about the change. Since reshape means read data from somewhere (which is already used by array) and write data to unused region. Obviously, it is awful if one node is reading data from address while another node is writing to the same address. Considering we have implemented suspend writes in the resyncing area, so we can just broadcast the reading address to other nodes to avoid the trouble. For master node, it would call reshape_request then update sb during the reshape period. To avoid above trouble, we call resync_info_update to send RESYNC message in reshape_request. Then from slave node's view, it receives two type messages: 1. RESYNCING message Slave node add the address (where master node reading data from) to suspend list. 2. METADATA_UPDATED message Once slave nodes know the reshaping is started in master node, it is time to update reshape position and call start_reshape to follow master node's step. After reshape is done, only reshape position is need to be updated, so the majority task of reshaping is happened on the master node. Reviewed-by: NeilBrown <neilb@suse.com> Signed-off-by: Guoqing Jiang <gqjiang@suse.com> Signed-off-by: Shaohua Li <shli@fb.com>
2018-10-18md-cluster/raid10: resize all the bitmaps before start reshapeGuoqing Jiang
To support add disk under grow mode, we need to resize all the bitmaps of each node before reshape, so that we can ensure all nodes have the same view of the bitmap of the clustered raid. So after the master node resized the bitmap, it broadcast a message to other slave nodes, and it checks the size of each bitmap are same or not by compare pages. We can only continue the reshaping after all nodes update the bitmap to the same size (by checking the pages), otherwise revert bitmap size to previous value. The resize_bitmaps interface and BITMAP_RESIZE message are introduced in md-cluster.c for the purpose. Reviewed-by: NeilBrown <neilb@suse.com> Signed-off-by: Guoqing Jiang <gqjiang@suse.com> Signed-off-by: Shaohua Li <shli@fb.com>
2018-10-18PCI: vmd: Detach resources after stopping root busJon Derrick
The VMD removal path calls pci_stop_root_busi(), which tears down the pcie tree, including detaching all of the attached drivers. During driver detachment, devices may use pci_release_region() to release resources. This path relies on the resource being accessible in resource tree. By detaching the child domain from the parent resource domain prior to stopping the bus, we are preventing the list traversal from finding the resource to be freed. If we instead detach the resource after stopping the bus, we will have properly freed the resource and detaching is simply accounting at that point. Without this order, the resource is never freed and is orphaned on VMD removal, leading to a warning: [ 181.940162] Trying to free nonexistent resource <e5a10000-e5a13fff> Fixes: 2c2c5c5cd213 ("x86/PCI: VMD: Attach VMD resources to parent domain's resource tree") Signed-off-by: Jon Derrick <jonathan.derrick@intel.com> [lorenzo.pieralisi@arm.com: updated commit log] Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Reviewed-by: Keith Busch <keith.busch@intel.com>
2018-10-18dm crypt: make workqueue names device-specificMichał Mirosław
Make cpu-usage debugging easier by naming workqueues per device. Example ps output: root 413 0.0 0.0 0 0 ? I< paź02 0:00 [kcryptd_io/253:0] root 414 0.0 0.0 0 0 ? I< paź02 0:00 [kcryptd/253:0] root 415 0.0 0.0 0 0 ? S paź02 1:10 [dmcrypt_write/253:0] root 465 0.0 0.0 0 0 ? I< paź02 0:00 [kcryptd_io/253:2] root 466 0.0 0.0 0 0 ? I< paź02 0:00 [kcryptd/253:2] root 467 0.0 0.0 0 0 ? S paź02 2:06 [dmcrypt_write/253:2] root 15359 0.2 0.0 0 0 ? I< 19:43 0:25 [kworker/u17:8-kcryptd/253:0] root 16563 0.2 0.0 0 0 ? I< 20:10 0:18 [kworker/u17:0-kcryptd/253:2] root 23205 0.1 0.0 0 0 ? I< 21:21 0:04 [kworker/u17:4-kcryptd/253:0] root 13383 0.1 0.0 0 0 ? I< 21:32 0:02 [kworker/u17:2-kcryptd/253:2] root 2610 0.1 0.0 0 0 ? I< 21:42 0:01 [kworker/u17:12-kcryptd/253:2] root 20124 0.1 0.0 0 0 ? I< 21:56 0:01 [kworker/u17:1-kcryptd/253:2] Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2018-10-18softirq: Fix typo in __do_softirq() commentsYangtao Li
s/s/as [ mingo: Also add a missing 'the', add proper punctuation and clarify what 'swap' means here. ] Signed-off-by: Yangtao Li <tiny.windzz@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: alexander.levin@verizon.com Cc: frederic@kernel.org Cc: joel@joelfernandes.org Cc: paulmck@linux.vnet.ibm.com Cc: rostedt@goodmis.org Link: http://lkml.kernel.org/r/20181018142133.12341-1-tiny.windzz@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-10-18dm: add dm_table_device_name()Michał Mirosław
Add a shortcut for dm_device_name(dm_table_get_md(t)). Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2018-10-18dm ioctl: harden copy_params()'s copy_from_user() from malicious usersWenwen Wang
In copy_params(), the struct 'dm_ioctl' is first copied from the user space buffer 'user' to 'param_kernel' and the field 'data_size' is checked against 'minimum_data_size' (size of 'struct dm_ioctl' payload up to its 'data' member). If the check fails, an error code EINVAL will be returned. Otherwise, param_kernel->data_size is used to do a second copy, which copies from the same user-space buffer to 'dmi'. After the second copy, only 'dmi->data_size' is checked against 'param_kernel->data_size'. Given that the buffer 'user' resides in the user space, a malicious user-space process can race to change the content in the buffer between the two copies. This way, the attacker can inject inconsistent data into 'dmi' (versus previously validated 'param_kernel'). Fix redundant copying of 'minimum_data_size' from user-space buffer by using the first copy stored in 'param_kernel'. Also remove the 'data_size' check after the second copy because it is now unnecessary. Cc: stable@vger.kernel.org Signed-off-by: Wenwen Wang <wang6495@umn.edu> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2018-10-19powerpc/io: remove old GCC version implementationChristophe Leroy
GCC 4.6 is the minimum supported now. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-19powerpc: Add -Werror at arch/powerpc levelMichael Ellerman
Back when I added -Werror in commit ba55bd74360e ("powerpc: Add configurable -Werror for arch/powerpc") I did it by adding it to most of the arch Makefiles. At the time we excluded math-emu, because apparently it didn't build cleanly. But that seems to have been fixed somewhere in the interim. So move the -Werror addition to the top-level of the arch, this saves us from repeating it in every Makefile and means we won't forget to add it to any new sub-dirs. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-19powerpc: Move core kernel logic into arch/powerpc/KbuildMichael Ellerman
This is a nice cleanup, arch/powerpc/Makefile is long and messy so moving this out helps a little. It also allows us to do: $ make arch/powerpc Which can be helpful if you just want to compile test some changes to arch code and not link everything. Finally it also gives us a single place to do subdir-cc-flags assignments which affect the whole of arch/powerpc, which we will do in a future patch. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-19macintosh/windfarm_smu_sat: Fix debug outputBenjamin Herrenschmidt
There's some antiquated debug output that's trying to do a hand-made hexdump and turning into horrible 1-byte-per-line output these days. Use print_hex_dump() instead Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-19powerpc/traps: remove redundant in_interrupt panic in die()Christophe Leroy
do_exit() already includes a test to panic() is in_interrupt() This patch removes powerpc one which is redundant. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-19powerpc/prom_init: Generate "phandle" instead of "linux, phandle"Benjamin Herrenschmidt
When creating the boot-time FDT from an actual Open Firmware live tree, let's generate "phandle" properties for the phandles instead of the old deprecated "linux,phandle". Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> [mpe: Unsplit warning printf()] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>