summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2018-09-29ax88179_178a: Check for supported Wake-on-LAN modesFlorian Fainelli
The driver currently silently accepts unsupported Wake-on-LAN modes (other than WAKE_PHY or WAKE_MAGIC) without reporting that to the user, which is confusing. Fixes: e2ca90c276e1 ("ax88179_178a: ASIX AX88179_178A USB 3.0/2.0 to gigabit ethernet adapter driver") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-29asix: Check for supported Wake-on-LAN modesFlorian Fainelli
The driver currently silently accepts unsupported Wake-on-LAN modes (other than WAKE_PHY or WAKE_MAGIC) without reporting that to the user, which is confusing. Fixes: 2e55cc7210fe ("[PATCH] USB: usbnet (3/9) module for ASIX Ethernet adapters") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-29Merge branch 'ieee802154-for-davem-2018-09-28' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/sschmidt/wpan Stefan Schmidt says: ==================== pull-request: ieee802154 for net 2018-09-28 An update from ieee802154 for your *net* tree. Some cleanup patches throughout the drivers from the Huawei tag team Yue Haibing and Zhong Jiang. Xue is replacing some magic numbers with defines in his mcr20a driver. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-29Merge tag 'rxrpc-fixes-20180928' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs David Howells says: ==================== rxrpc: Fixes Here are some miscellaneous fixes for AF_RXRPC: (1) Remove a duplicate variable initialisation. (2) Fix one of the checks made when we decide to set up a new incoming service call in which a flag is being checked in the wrong field of the packet header. This check is abstracted out into helper functions. (3) Fix RTT gathering. The code has been trying to make use of socket timestamps, but wasn't actually enabling them. The code has also been recording a transmit time for the outgoing packet for which we're going to measure the RTT after sending the message - but we can get the incoming packet before we get to that and record a negative RTT. (4) Fix the emission of BUSY packets (we are emitting ABORTs instead). (5) Improve error checking on incoming packets. (6) Try to fix a bug in new service call handling whereby a BUG we should never be able to reach somehow got triggered. Do this by moving much of the checking as early as possible and not repeating it later (depends on (5) above). (7) Fix the sockopts set on a UDP6 socket to include the ones set on a UDP4 socket so that we receive UDP4 errors and packet-too-large notifications too. (8) Fix the distribution of errors so that we do it at the point of receiving an error in the UDP callback rather than deferring it thereby cutting short any transmissions that would otherwise occur in the window. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-29Merge branch 'tipc-next'David S. Miller
Jon Maloy says: ==================== tipc: make connection setup more robust In this series we make a few improvements to the connection setup and probing mechanism, culminating in the last commit where we make it possible for a client socket to make multiple setup attempts in case it encounters receive buffer overflow at the listener socket. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-29tipc: buffer overflow handling in listener socketTung Nguyen
Default socket receive buffer size for a listener socket is 2Mb. For each arriving empty SYN, the linux kernel allocates a 768 bytes buffer. This means that a listener socket can serve maximum 2700 simultaneous empty connection setup requests before it hits a receive buffer overflow, and much fewer if the SYN is carrying any significant amount of data. When this happens the setup request is rejected, and the client receives an ECONNREFUSED error. This commit mitigates this problem by letting the client socket try to retransmit the SYN message multiple times when it sees it rejected with the code TIPC_ERR_OVERLOAD. Retransmission is done at random intervals in the range of [100 ms, setup_timeout / 4], as many times as there is room for within the setup timeout limit. Signed-off-by: Tung Nguyen <tung.q.nguyen@dektech.com.au> Acked-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-29tipc: add SYN bit to connection setup messagesJon Maloy
Messages intended for intitating a connection are currently indistinguishable from regular datagram messages. The TIPC protocol specification defines bit 17 in word 0 as a SYN bit to allow sanity check of such messages in the listening socket, but this has so far never been implemented. We do that in this commit. Acked-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-29tipc: refactor function tipc_sk_filter_connect()Jon Maloy
We refactor the function tipc_sk_filter_connect(), both to make it more readable and as a preparation for the next commit. Acked-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-29tipc: refactor function tipc_sk_timeout()Jon Maloy
We refactor this function as a preparation for the coming commits in the same series. Acked-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-29tipc: refactor function tipc_msg_reverse()Jon Maloy
The function tipc_msg_reverse() is reversing the header of a message while reusing the original buffer. We have seen at several occasions that this may have unfortunate side effects when the buffer to be reversed is a clone. In one of the following commits we will again need to reverse cloned buffers, so this is the right time to permanently eliminate this problem. In this commit we let the said function always consume the original buffer and replace it with a new one when applicable. Acked-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-29tcp: up initial rmem to 128KB and SYN rwin to around 64KBYuchung Cheng
Previously TCP initial receive buffer is ~87KB by default and the initial receive window is ~29KB (20 MSS). This patch changes the two numbers to 128KB and ~64KB (rounding down to the multiples of MSS) respectively. The patch also simplifies the calculations s.t. the two numbers are directly controlled by sysctl tcp_rmem[1]: 1) Initial receiver buffer budget (sk_rcvbuf): while this should be configured via sysctl tcp_rmem[1], previously tcp_fixup_rcvbuf() always override and set a larger size when a new connection establishes. 2) Initial receive window in SYN: previously it is set to 20 packets if MSS <= 1460. The number 20 was based on the initial congestion window of 10: the receiver needs twice amount to avoid being limited by the receive window upon out-of-order delivery in the first window burst. But since this only applies if the receiving MSS <= 1460, connection using large MTU (e.g. to utilize receiver zero-copy) may be limited by the receive window. With this patch TCP memory configuration is more straight-forward and more properly sized to modern high-speed networks by default. Several popular stacks have been announcing 64KB rwin in SYNs as well. Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Wei Wang <weiwan@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-29hns3: Another build fix.David S. Miller
drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c: In function ‘hclge_get_sset_count’: drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c:496:31: error: ‘HNAE3_REVISION_ID_21’ undeclared (first use in this function); did you mean ‘FADT2_REVISION_ID’? if (hdev->pdev->revision >= HNAE3_REVISION_ID_21 || ^~~~~~~~~~~~~~~~~~~~ FADT2_REVISION_ID Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-29hns3: Fix the build.David S. Miller
drivers/net/ethernet/hisilicon/hns3/hns3_ethtool.c: In function ‘hns3_self_test’: drivers/net/ethernet/hisilicon/hns3/hns3_ethtool.c:278:15: error: ‘HNS3_SELF_TEST_TYPE_NUM’ undeclared (first use in this function); did you mean ‘HNS3_SELF_TEST_TPYE_NUM’? int st_param[HNS3_SELF_TEST_TYPE_NUM][2]; ^~~~~~~~~~~~~~~~~~~~~~~ HNS3_SELF_TEST_TPYE_NUM Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-29Merge tag 'pm-4.19-rc6' of ↵Greg Kroah-Hartman
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Rafael writes: "Power management fix for 4.19-rc6 Fix incorrect __init and __exit annotations in the Qualcomm Kryo cpufreq driver (Nathan Chancellor)." * tag 'pm-4.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: cpufreq: qcom-kryo: Fix section annotations
2018-09-29cpufreq: qcom-kryo: Fix section annotationsNathan Chancellor
There is currently a warning when building the Kryo cpufreq driver into the kernel image: WARNING: vmlinux.o(.text+0x8aa424): Section mismatch in reference from the function qcom_cpufreq_kryo_probe() to the function .init.text:qcom_cpufreq_kryo_get_msm_id() The function qcom_cpufreq_kryo_probe() references the function __init qcom_cpufreq_kryo_get_msm_id(). This is often because qcom_cpufreq_kryo_probe lacks a __init annotation or the annotation of qcom_cpufreq_kryo_get_msm_id is wrong. Remove the '__init' annotation from qcom_cpufreq_kryo_get_msm_id so that there is no more mismatch warning. Additionally, Nick noticed that the remove function was marked as '__init' when it should really be marked as '__exit'. Fixes: 46e2856b8e18 (cpufreq: Add Kryo CPU scaling driver) Fixes: 5ad7346b4ae2 (cpufreq: kryo: Add module remove and exit) Reported-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Cc: 4.18+ <stable@vger.kernel.org> # 4.18+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-09-29Merge tag 'dma-mapping-4.19-3' of git://git.infradead.org/users/hch/dma-mappingGreg Kroah-Hartman
Christoph writes: "dma mapping fix for 4.19-rc6 fix a missing Kconfig symbol for commits introduced in 4.19-rc" * tag 'dma-mapping-4.19-3' of git://git.infradead.org/users/hch/dma-mapping: dma-mapping: add the missing ARCH_HAS_SYNC_DMA_FOR_CPU_ALL declaration
2018-09-29iomap: set page dirty after partial delalloc on mkwriteBrian Foster
The iomap page fault mechanism currently dirties the associated page after the full block range of the page has been allocated. This leaves the page susceptible to delayed allocations without ever being set dirty on sub-page block sized filesystems. For example, consider a page fault on a page with one preexisting real (non-delalloc) block allocated in the middle of the page. The first iomap_apply() iteration performs delayed allocation on the range up to the preexisting block, the next iteration finds the preexisting block, and the last iteration attempts to perform delayed allocation on the range after the prexisting block to the end of the page. If the first allocation succeeds and the final allocation fails with -ENOSPC, iomap_apply() returns the error and iomap_page_mkwrite() fails to dirty the page having already performed partial delayed allocation. This eventually results in the page being invalidated without ever converting the delayed allocation to real blocks. This problem is reliably reproduced by generic/083 on XFS on ppc64 systems (64k page size, 4k block size). It results in leaked delalloc blocks on inode reclaim, which triggers an assert failure in xfs_fs_destroy_inode() and filesystem accounting inconsistency. Move the set_page_dirty() call from iomap_page_mkwrite() to the actor callback, similar to how the buffer head implementation works. The actor callback is called iff ->iomap_begin() returns success, so ensures the page is dirtied as soon as possible after an allocation. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2018-09-29xfs: remove invalid log recovery first/last cycle checkBrian Foster
One of the first steps of log recovery is to check for the special case of a zeroed log. If the first cycle in the log is zero or the tail portion of the log is zeroed, the head is set to the first instance of cycle 0. xlog_find_zeroed() includes a sanity check that enforces that the first cycle in the log must be 1 if the last cycle is 0. While this is true in most cases, the check is not totally valid because it doesn't consider the case where the filesystem crashed after a partial/out of order log buffer completion that wraps around the end of the physical log. For example, consider a filesystem that has completed most of the first cycle of the log, reaches the end of the physical log and splits the next single log buffer write into two in order to wrap around the end of the log. If these I/Os are reordered, the second (wrapped) I/O completes and the first happens to fail, the log is left in a state where the last cycle of the log is 0 and the first cycle is 2. This causes the xlog_find_zeroed() sanity check to fail and prevents the filesystem from mounting. This situation has been reproduced on particular systems via repeated runs of generic/475. This is an expected state that log recovery already knows how to deal with, however. Since the log is still partially zeroed, the head is detected correctly and points to a valid tail. The subsequent stale block detection clears blocks beyond the head up to the tail (within a maximum range), with the express purpose of clearing such out of order writes. As expected, this removes the out of order cycle 2 blocks at the physical start of the log. In other words, the only thing that prevents a clean mount and recovery of the filesystem in this scenario is the specific (last == 0 && first != 1) sanity check in xlog_find_zeroed(). Since the log head/tail are now independently validated via cycle, log record and CRC checks, this highly specific first cycle check is of dubious value. Remove it and rely on the higher level validation to determine whether log content is sane and recoverable. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2018-09-29xfs: validate inode di_forkoffEric Sandeen
Verify the inode di_forkoff, lifted from xfs_repair's process_check_inode_forkoff(). Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2018-09-29xfs: skip delalloc COW blocks in xfs_reflink_end_cowChristoph Hellwig
The iomap direct I/O code issues a single ->end_io call for the whole I/O request, and if some of the extents cowered needed a COW operation it will call xfs_reflink_end_cow over the whole range. When we do AIO writes we drop the iolock after doing the initial setup, but before the I/O completion. Between dropping the lock and completing the I/O we can have a racing buffered write create new delalloc COW fork extents in the region covered by the outstanding direct I/O write, and thus see delalloc COW fork extents in xfs_reflink_end_cow. As concurrent writes are fundamentally racy and no guarantees are given we can simply skip those. This can be easily reproduced with xfstests generic/208 in always_cow mode. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2018-09-29xfs: don't treat unknown di_flags2 as corruption in scrubEric Sandeen
xchk_inode_flags2() currently treats any di_flags2 values that the running kernel doesn't recognize as corruption, and calls xchk_ino_set_corrupt() if they are set. However, it's entirely possible that these flags were set in some newer kernel and are quite valid, but ignored in this kernel. (Validators don't care one bit about unknown di_flags2.) Call xchk_ino_set_warning instead, because this may or may not actually indicate a problem. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2018-09-29xfs: remove duplicated include from alloc.cYueHaibing
Remove duplicated include xfs_alloc.h Signed-off-by: YueHaibing <yuehaibing@huawei.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2018-09-29xfs: don't bring in extents in xfs_bmap_punch_delalloc_rangeChristoph Hellwig
This function is only used to punch out delayed allocations on I/O failure, which means we need to have read the extents earlier. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2018-09-29xfs: fix transaction leak in xfs_reflink_allocate_cow()Dave Chinner
When xfs_reflink_allocate_cow() allocates a transaction, it drops the ILOCK to perform the operation. This Introduces a race condition where another thread modifying the file can perform the COW allocation operation underneath us. This result in the retry loop finding an allocated block and jumping straight to the conversion code. It does not, however, cancel the transaction it holds and so this gets leaked. This results in a lockdep warning: ================================================ WARNING: lock held when returning to user space! 4.18.5 #1 Not tainted ------------------------------------------------ worker/6123 is leaving the kernel with locks still held! 1 lock held by worker/6123: #0: 000000009eab4f1b (sb_internal#2){.+.+}, at: xfs_trans_alloc+0x17c/0x220 And eventually the filesystem deadlocks because it runs out of log space that is reserved by the leaked transaction and never gets released. The logic flow in xfs_reflink_allocate_cow() is a convoluted mess of gotos - it's no surprise that it has bug where the flow through several goto jumps then fails to clean up context from a non-obvious logic path. CLean up the logic flow and make sure every path does the right thing. Reported-by: Alexander Y. Fomichev <git.user@gmail.com> Tested-by: Alexander Y. Fomichev <git.user@gmail.com> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=200981 Signed-off-by: Dave Chinner <dchinner@redhat.com> [hch: slight refactor] Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2018-09-29xfs: avoid lockdep false positives in xfs_trans_allocDave Chinner
We've had a few reports of lockdep tripping over memory reclaim context vs filesystem freeze "deadlocks". They all have looked to be false positives on analysis, but it seems that they are being tripped because we take freeze references before we run a GFP_KERNEL allocation for the struct xfs_trans. We can avoid this false positive vector just by re-ordering the operations in xfs_trans_alloc(). That is. we need allocate the structure before we take the freeze reference and enter the GFP_NOFS allocation context that follows the xfs_trans around. This prevents lockdep from seeing the GFP_KERNEL allocation inside the transaction context, and that prevents it from triggering the freeze level vs alloc context vs reclaim warnings. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2018-09-29xfs: refactor xfs_buf_log_item reference count handlingBrian Foster
The xfs_buf_log_item structure has a reference counter with slightly tricky semantics. In the common case, a buffer is logged and committed in a transaction, committed to the on-disk log (added to the AIL) and then finally written back and removed from the AIL. The bli refcount covers two potentially overlapping timeframes: 1. the bli is held in an active transaction 2. the bli is pinned by the log The caveat to this approach is that the reference counter does not purely dictate the lifetime of the bli. IOW, when a dirty buffer is physically logged and unpinned, the bli refcount may go to zero as the log item is inserted into the AIL. Only once the buffer is written back can the bli finally be freed. The above semantics means that it is not enough for the various refcount decrementing contexts to release the bli on decrement to zero. xfs_trans_brelse(), transaction commit (->iop_unlock()) and unpin (->iop_unpin()) must all drop the associated reference and make additional checks to determine if the current context is responsible for freeing the item. For example, if a transaction holds but does not dirty a particular bli, the commit may drop the refcount to zero. If the bli itself is clean, it is also not AIL resident and must be freed at this time. The same is true for xfs_trans_brelse(). If the transaction dirties a bli and then aborts or an unpin results in an abort due to a log I/O error, the last reference count holder is expected to explicitly remove the item from the AIL and release it (since an abort means filesystem shutdown and metadata writeback will never occur). This leads to fairly complex checks being replicated in a few different places. Since ->iop_unlock() and xfs_trans_brelse() are nearly identical, refactor the logic into a common helper that implements and documents the semantics in one place. This patch does not change behavior. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2018-09-29xfs: clean up xfs_trans_brelse()Brian Foster
xfs_trans_brelse() is a bit of a historical mess, similar to xfs_buf_item_unlock(). It is unnecessarily verbose, has snippets of commented out code, inconsistency with regard to stale items, etc. Clean up xfs_trans_brelse() to use similar logic and flow as xfs_buf_item_unlock() with regard to bli reference count handling. This patch makes no functional changes, but facilitates further refactoring of the common bli reference count handling code. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2018-09-29xfs: don't unlock invalidated buf on aborted tx commitBrian Foster
xfstests generic/388,475 occasionally reproduce assertion failures in xfs_buf_item_unpin() when the final bli reference is dropped on an invalidated buffer and the buffer is not locked as it is expected to be. Invalidated buffers should remain locked on transaction commit until the final unpin, at which point the buffer is removed from the AIL and the bli is freed since stale buffers are not written back. The assert failures are associated with filesystem shutdown, typically due to log I/O errors injected by the test. The problematic situation can occur if the shutdown happens to cause a race between an active transaction that has invalidated a particular buffer and an I/O error on a log buffer that contains the bli associated with the same (now stale) buffer. Both transaction and log contexts acquire a bli reference. If the transaction has already invalidated the buffer by the time the I/O error occurs and ends up aborting due to shutdown, the transaction and log hold the last two references to a stale bli. If the transaction cancel occurs first, it treats the buffer as non-stale due to the aborted state: the bli reference is dropped and the buffer is released/unlocked. The log buffer I/O error handling eventually calls into xfs_buf_item_unpin(), drops the final reference to the bli and treats it as stale. The buffer wasn't left locked by xfs_buf_item_unlock(), however, so the assert fails and the buffer is double unlocked. The latter problem is mitigated by the fact that the fs is shutdown and no further damage is possible. ->iop_unlock() of an invalidated buffer should behave consistently with respect to the bli refcount, regardless of aborted state. If the refcount remains elevated on commit, we know the bli is awaiting an unpin (since it can't be in another transaction) and will be handled appropriately on log buffer completion. If the final bli reference of an invalidated buffer is dropped in ->iop_unlock(), we can assume the transaction has aborted because invalidation implies a dirty transaction. In the non-abort case, the log would have acquired a bli reference in ->iop_pin() and prevented bli release at ->iop_unlock() time. In the abort case the item must be freed and buffer unlocked because it wasn't pinned by the log. Rework xfs_buf_item_unlock() to simplify the currently circuitous and duplicate logic and leave invalidated buffers locked based on bli refcount, regardless of aborted state. This ensures that a pinned, stale buffer is always found locked when eventually unpinned. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2018-09-29xfs: remove last of unnecessary xfs_defer_cancel() callersBrian Foster
Now that deferred operations are completely managed via transactions, it's no longer necessary to cancel the dfops in error paths that already cancel the associated transaction. There are a few such calls lingering throughout the codebase. Remove all remaining unnecessary calls to xfs_defer_cancel(). This leaves xfs_defer_cancel() calls in two places. The first is the call in the transaction cancel path itself, which facilitates this patch. The second is made via the xfs_defer_finish() error path to provide consistent error semantics with transaction commit. For example, xfs_trans_commit() expects an xfs_defer_finish() failure to clean up the dfops structure before it returns. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2018-09-29xfs: don't crash the vfs on a garbage inline symlinkDarrick J. Wong
The VFS routine that calls ->get_link blindly copies whatever's returned into the user's buffer. If we return a NULL pointer, the vfs will crash on the null pointer. Therefore, return -EFSCORRUPTED instead of blowing up the kernel. [dgc: clean up with hch's suggestions] Reported-by: wen.xu@gatech.edu Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Allison Henderson <allison.henderson@oracle.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2018-09-28Merge branch 'for-linus' of ↵Greg Kroah-Hartman
git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input Dmitry writes: "Input updates for v4.19-rc5 Just a few driver fixes" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: Input: uinput - allow for max == min during input_absinfo validation Input: elantech - enable middle button of touchpad on ThinkPad P72 Input: atakbd - fix Atari CapsLock behaviour Input: atakbd - fix Atari keymap Input: egalax_ts - add system wakeup support Input: gpio-keys - fix a documentation index issue
2018-09-28Merge tag 'spi-fix-v4.19-rc5' of ↵Greg Kroah-Hartman
https://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi Mark writes: "spi: Fixes for v4.19 Quite a few fixes for the Renesas drivers in here, plus a fix for the Tegra driver and some documentation fixes for the recently added spi-mem code. The Tegra fix is relatively large but fairly straightforward and mechanical, it runs on probe so it's been reasonably well covered in -next testing." * tag 'spi-fix-v4.19-rc5' of https://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi: spi: spi-mem: Move the DMA-able constraint doc to the kerneldoc header spi: spi-mem: Add missing description for data.nbytes field spi: rspi: Fix interrupted DMA transfers spi: rspi: Fix invalid SPI use during system suspend spi: sh-msiof: Fix handling of write value for SISTR register spi: sh-msiof: Fix invalid SPI use during system suspend spi: gpio: Fix copy-and-paste error spi: tegra20-slink: explicitly enable/disable clock
2018-09-28Merge tag 'regulator-v4.19-rc5' of ↵Greg Kroah-Hartman
https://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator Mark writes: "regulator: Fixes for 4.19 A collection of fairly minor bug fixes here, a couple of driver specific ones plus two core fixes. There's one fix for the new suspend state code which fixes some confusion with constant values that are supposed to indicate noop operation and another fixing a race condition with the creation of sysfs files on new regulators." * tag 'regulator-v4.19-rc5' of https://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator: regulator: fix crash caused by null driver data regulator: Fix 'do-nothing' value for regulators without suspend state regulator: da9063: fix DT probing with constraints regulator: bd71837: Disable voltage monitoring for LDO3/4
2018-09-28Merge tag 'powerpc-4.19-3' of ↵Greg Kroah-Hartman
https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Michael writes: "powerpc fixes for 4.19 #3 A reasonably big batch of fixes due to me being away for a few weeks. A fix for the TM emulation support on Power9, which could result in corrupting the guest r11 when running under KVM. Two fixes to the TM code which could lead to userspace GPR corruption if we take an SLB miss at exactly the wrong time. Our dynamic patching code had a bug that meant we could patch freed __init text, which could lead to corrupting userspace memory. csum_ipv6_magic() didn't work on little endian platforms since we optimised it recently. A fix for an endian bug when reading a device tree property telling us how many storage keys the machine has available. Fix a crash seen on some configurations of PowerVM when migrating the partition from one machine to another. A fix for a regression in the setup of our CPU to NUMA node mapping in KVM guests. A fix to our selftest Makefiles to make them work since a recent change to the shared Makefile logic." * tag 'powerpc-4.19-3' of https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: selftests/powerpc: Fix Makefiles for headers_install change powerpc/numa: Use associativity if VPHN hcall is successful powerpc/tm: Avoid possible userspace r1 corruption on reclaim powerpc/tm: Fix userspace r13 corruption powerpc/pseries: Fix unitialized timer reset on migration powerpc/pkeys: Fix reading of ibm, processor-storage-keys property powerpc: fix csum_ipv6_magic() on little endian platforms powerpc/powernv/ioda2: Reduce upper limit for DMA window size (again) powerpc: Avoid code patching freed init sections KVM: PPC: Book3S HV: Fix guest r11 corruption with POWER9 TM workarounds
2018-09-28Merge tag 'pinctrl-v4.19-4' of ↵Greg Kroah-Hartman
git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl Linus writes: "Pin control fixes for v4.19: - Fixes to x86 hardware: - AMD interrupt debounce issues - Faulty Intel cannonlake register offset - Revert pin translation IRQ locking" * tag 'pinctrl-v4.19-4' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl: Revert "pinctrl: intel: Do pin translation when lock IRQ" pinctrl: cannonlake: Fix HOSTSW_OWN register offset of H variant pinctrl/amd: poll InterruptEnable bits in amd_gpio_irq_set_type
2018-09-28MAINTAINERS: Remove obsolete drivers/pci pattern from ACPI sectionBjorn Helgaas
Prior to 256a45937093 ("PCI/AER: Squash aerdrv_acpi.c into aerdrv.c"), drivers/pci/pcie/aer/aerdrv_acpi.c contained code to parse the ACPI HEST table. That code now lives in drivers/pci/pcie/aer.c. Remove the "F: drivers/pci/*/*/*acpi*" pattern because it matches nothing. We could add a "F: drivers/pci/pcie/aer.c" pattern to the ACPI APEI section, but that file sees a lot of changes, almost none of which are of interest to the ACPI folks. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-09-28perf/core: Add sanity check to deal with pinned event failureReinette Chatre
It is possible that a failure can occur during the scheduling of a pinned event. The initial portion of perf_event_read_local() contains the various error checks an event should pass before it can be considered valid. Ensure that the potential scheduling failure of a pinned event is checked for and have a credible error. Suggested-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: fenghua.yu@intel.com Cc: tony.luck@intel.com Cc: acme@kernel.org Cc: gavin.hindman@intel.com Cc: jithu.joseph@intel.com Cc: dave.hansen@intel.com Cc: hpa@zytor.com Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/6486385d1f30336e9973b24c8c65f5079543d3d3.1537377064.git.reinette.chatre@intel.com
2018-09-28Bluetooth: Fix debugfs NULL pointer dereferenceMatias Karhumaa
Fix crash caused by NULL pointer dereference when debugfs functions le_max_key_read, le_max_key_size_write, le_min_key_size_read or le_min_key_size_write and Bluetooth adapter was powered off. Fix is to move max_key_size and min_key_size from smp_dev to hci_dev. At the same time they were renamed to le_max_key_size and le_min_key_size. BUG: unable to handle kernel NULL pointer dereference at 00000000000002e8 PGD 0 P4D 0 Oops: 0000 [#24] SMP PTI CPU: 2 PID: 6255 Comm: cat Tainted: G D OE 4.18.9-200.fc28.x86_64 #1 Hardware name: LENOVO 4286CTO/4286CTO, BIOS 8DET76WW (1.46 ) 06/21/2018 RIP: 0010:le_max_key_size_read+0x45/0xb0 [bluetooth] Code: 00 00 00 48 83 ec 10 65 48 8b 04 25 28 00 00 00 48 89 44 24 08 31 c0 48 8b 87 c8 00 00 00 48 8d 7c 24 04 48 8b 80 48 0a 00 00 <48> 8b 80 e8 02 00 00 0f b6 48 52 e8 fb b6 b3 ed be 04 00 00 00 48 RSP: 0018:ffffab23c3ff3df0 EFLAGS: 00010246 RAX: 0000000000000000 RBX: 00007f0b4ca2e000 RCX: ffffab23c3ff3f08 RDX: ffffffffc0ddb033 RSI: 0000000000000004 RDI: ffffab23c3ff3df4 RBP: 0000000000020000 R08: 0000000000000000 R09: 0000000000000000 R10: ffffab23c3ff3ed8 R11: 0000000000000000 R12: ffffab23c3ff3f08 R13: 00007f0b4ca2e000 R14: 0000000000020000 R15: ffffab23c3ff3f08 FS: 00007f0b4ca0f540(0000) GS:ffff91bd5e280000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00000000000002e8 CR3: 00000000629fa006 CR4: 00000000000606e0 Call Trace: full_proxy_read+0x53/0x80 __vfs_read+0x36/0x180 vfs_read+0x8a/0x140 ksys_read+0x4f/0xb0 do_syscall_64+0x5b/0x160 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Signed-off-by: Matias Karhumaa <matias.karhumaa@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2018-09-28ice: fix changing of ring descriptor size (ethtool -G)Bruce Allan
rx_mini_pending was set to an incorrect value. This was causing EINVAL to always be returned to 'ethtool -G'. The driver does not support mini or jumbo rings so the respective settings should be zero. Also, change the valid range of the number of descriptors in the rings to make the code simpler and easier for users to understand (this removes the valid settings of 8 and 16). Add a system log message indicating when the number is rounded-up from what the user specifies with the 'ethtool -G' command (i.e. when it is not a multiple of 32), and update the log message when a user-provided value is out of range to also indicate the stride. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-09-28ice: Update to capabilities admin queue commandAnirudh Venkataramanan
This patch makes a couple of changes in the way the driver uses the "get capabilities" command. 1. Get device capabilities in addition to function capabilities 2. Align to latest spec by using cap_count to determine size of the buffer in case of length error. Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-09-28ice: Query the Tx scheduler node before adding itAnirudh Venkataramanan
Query the Tx scheduler tree node information from FW before adding it to the driver's software database. This will keep the node information current in driver. Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-09-28ice: Update comment for ice_fltr_mgmt_list_entryBrett Creeley
Previously the comment stated that VSI lists should be used when a second VSI becomes a subscriber to the "VLAN address". VSI lists are always used for VLAN membership, so replace "VLAN address" with "MAC address". Also note that VLAN(s) always use VSI list rules. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-09-28ice: update fw version check logicJacob Keller
We have MAX_FW_API_VER_BRANCH, MAX_FW_API_VER_MAJOR, and MAX_FW_API_VER_MINOR that we use in ice_controlq.h to test when a firmware version is newer than expected. This is currently tested by comparing each field separately. Thus, we compare the branch field against the MAX_FW_API_VER_BRANCH, and so forth. This means that currently, if we suppose that the max firmware version is defined as 0.2.1, i.e. Then firmware 0.1.3 will fail to load. This is because the minor version 3 is greater than the max minor version 1. This is not intuitive, because of the notion that increasing the major firmware version to 2 should mean any firmware version with a major version is less than 2 should be considered older than 2... In order to allow both 0.2.1 and 0.1.3 to load, you would have to define the "max" firmware version as 0.2.3.. It is possible that such a firmware version doesn't even exist yet! Fix this by replacing the current logic with an updated check that behaves as follows: First, we check the major version. If it is greater than the expected version, then we prevent driver load. Additionally, a warning message is logged to indicate to the system administrator that they need to update their driver. This is now the only case where the driver will refuse to load. Second, if the major version is less than the expected version, we log an information message indicating the NVM should be updated. Third, if the major version is exact, we'll then check the minor version. If the minor version is more than two versions less than expected, we log an information message indicating the NVM should be updated. If it is more than two versions greater than the expected version, we log an information message that the driver should be updated. To support this, the ice_aq_ver_check function needs its signature updated to pass the HW structure. Since we now pass this structure, there is no need to pass the firmware API versions separately. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-09-28ice: update branding strings and supported device idsBruce Allan
Update branding strings and remove device ids 0x1594 and 0x1595. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-09-28ice: replace unnecessary memcpy with direct assignmentBruce Allan
Direct assignment is preferred over a memcpy() Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-09-28ice: use [sr]q.count when checking if queue is initializedJacob Keller
When shutting down the controlqs, we check if they are initialized before we shut them down and destroy the lock. This is important, as it prevents attempts to access the lock of an already shutdown queue. Unfortunately, we checked rq.head and sq.head as the value to determine if the queue was initialized. This doesn't work, because head is not reset when the queue is shutdown. In some flows, the adminq will have already been shut down prior to calling ice_shutdown_all_ctrlqs. This can result in a crash due to attempting to access the already destroyed mutex. Fix this by using rq.count and sq.count instead. Indeed, ice_shutdown_sq and ice_shutdown_rq already indicate that this is the value we should be using to determine of the queue was initialized. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-09-28qed: fix spelling mistake "b_cb_registred" -> "b_cb_registered"Colin Ian King
Trivial fix to spelling mistake struct field name, rename it. Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-28Merge branch 'netpoll-second-round-of-fixes'David S. Miller
Eric Dumazet says: ==================== netpoll: second round of fixes. As diagnosed by Song Liu, ndo_poll_controller() can be very dangerous on loaded hosts, since the cpu calling ndo_poll_controller() might steal all NAPI contexts (for all RX/TX queues of the NIC). This capture, showing one ksoftirqd eating all cycles can last for unlimited amount of time, since one cpu is generally not able to drain all the queues under load. It seems that all networking drivers that do use NAPI for their TX completions, should not provide a ndo_poll_controller() : Most NAPI drivers have netpoll support already handled in core networking stack, since netpoll_poll_dev() uses poll_napi(dev) to iterate through registered NAPI contexts for a device. First patch is a fix in poll_one_napi(). Then following patches take care of ten drivers. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-28ibmvnic: remove ndo_poll_controllerEric Dumazet
As diagnosed by Song Liu, ndo_poll_controller() can be very dangerous on loaded hosts, since the cpu calling ndo_poll_controller() might steal all NAPI contexts (for all RX/TX queues of the NIC). This capture can last for unlimited amount of time, since one cpu is generally not able to drain all the queues under load. ibmvnic uses NAPI for TX completions, so we better let core networking stack call the napi->poll() to avoid the capture. ibmvnic_netpoll_controller() was completely wrong anyway, as it was scheduling NAPI to service RX queues (instead of TX), so I doubt netpoll ever worked on this driver. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Thomas Falcon <tlfalcon@linux.vnet.ibm.com> Cc: John Allen <jallen@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-28sfc-falcon: remove ndo_poll_controllerEric Dumazet
As diagnosed by Song Liu, ndo_poll_controller() can be very dangerous on loaded hosts, since the cpu calling ndo_poll_controller() might steal all NAPI contexts (for all RX/TX queues of the NIC). This capture can last for unlimited amount of time, since one cpu is generally not able to drain all the queues under load. sfc-falcon uses NAPI for TX completions, so we better let core networking stack call the napi->poll() to avoid the capture. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Solarflare linux maintainers <linux-net-drivers@solarflare.com> Cc: Edward Cree <ecree@solarflare.com> Cc: Bert Kenward <bkenward@solarflare.com> Acked-By: Bert Kenward <bkenward@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>