summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2018-09-29xfs: avoid lockdep false positives in xfs_trans_allocDave Chinner
We've had a few reports of lockdep tripping over memory reclaim context vs filesystem freeze "deadlocks". They all have looked to be false positives on analysis, but it seems that they are being tripped because we take freeze references before we run a GFP_KERNEL allocation for the struct xfs_trans. We can avoid this false positive vector just by re-ordering the operations in xfs_trans_alloc(). That is. we need allocate the structure before we take the freeze reference and enter the GFP_NOFS allocation context that follows the xfs_trans around. This prevents lockdep from seeing the GFP_KERNEL allocation inside the transaction context, and that prevents it from triggering the freeze level vs alloc context vs reclaim warnings. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2018-09-29xfs: refactor xfs_buf_log_item reference count handlingBrian Foster
The xfs_buf_log_item structure has a reference counter with slightly tricky semantics. In the common case, a buffer is logged and committed in a transaction, committed to the on-disk log (added to the AIL) and then finally written back and removed from the AIL. The bli refcount covers two potentially overlapping timeframes: 1. the bli is held in an active transaction 2. the bli is pinned by the log The caveat to this approach is that the reference counter does not purely dictate the lifetime of the bli. IOW, when a dirty buffer is physically logged and unpinned, the bli refcount may go to zero as the log item is inserted into the AIL. Only once the buffer is written back can the bli finally be freed. The above semantics means that it is not enough for the various refcount decrementing contexts to release the bli on decrement to zero. xfs_trans_brelse(), transaction commit (->iop_unlock()) and unpin (->iop_unpin()) must all drop the associated reference and make additional checks to determine if the current context is responsible for freeing the item. For example, if a transaction holds but does not dirty a particular bli, the commit may drop the refcount to zero. If the bli itself is clean, it is also not AIL resident and must be freed at this time. The same is true for xfs_trans_brelse(). If the transaction dirties a bli and then aborts or an unpin results in an abort due to a log I/O error, the last reference count holder is expected to explicitly remove the item from the AIL and release it (since an abort means filesystem shutdown and metadata writeback will never occur). This leads to fairly complex checks being replicated in a few different places. Since ->iop_unlock() and xfs_trans_brelse() are nearly identical, refactor the logic into a common helper that implements and documents the semantics in one place. This patch does not change behavior. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2018-09-29xfs: clean up xfs_trans_brelse()Brian Foster
xfs_trans_brelse() is a bit of a historical mess, similar to xfs_buf_item_unlock(). It is unnecessarily verbose, has snippets of commented out code, inconsistency with regard to stale items, etc. Clean up xfs_trans_brelse() to use similar logic and flow as xfs_buf_item_unlock() with regard to bli reference count handling. This patch makes no functional changes, but facilitates further refactoring of the common bli reference count handling code. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2018-09-29xfs: don't unlock invalidated buf on aborted tx commitBrian Foster
xfstests generic/388,475 occasionally reproduce assertion failures in xfs_buf_item_unpin() when the final bli reference is dropped on an invalidated buffer and the buffer is not locked as it is expected to be. Invalidated buffers should remain locked on transaction commit until the final unpin, at which point the buffer is removed from the AIL and the bli is freed since stale buffers are not written back. The assert failures are associated with filesystem shutdown, typically due to log I/O errors injected by the test. The problematic situation can occur if the shutdown happens to cause a race between an active transaction that has invalidated a particular buffer and an I/O error on a log buffer that contains the bli associated with the same (now stale) buffer. Both transaction and log contexts acquire a bli reference. If the transaction has already invalidated the buffer by the time the I/O error occurs and ends up aborting due to shutdown, the transaction and log hold the last two references to a stale bli. If the transaction cancel occurs first, it treats the buffer as non-stale due to the aborted state: the bli reference is dropped and the buffer is released/unlocked. The log buffer I/O error handling eventually calls into xfs_buf_item_unpin(), drops the final reference to the bli and treats it as stale. The buffer wasn't left locked by xfs_buf_item_unlock(), however, so the assert fails and the buffer is double unlocked. The latter problem is mitigated by the fact that the fs is shutdown and no further damage is possible. ->iop_unlock() of an invalidated buffer should behave consistently with respect to the bli refcount, regardless of aborted state. If the refcount remains elevated on commit, we know the bli is awaiting an unpin (since it can't be in another transaction) and will be handled appropriately on log buffer completion. If the final bli reference of an invalidated buffer is dropped in ->iop_unlock(), we can assume the transaction has aborted because invalidation implies a dirty transaction. In the non-abort case, the log would have acquired a bli reference in ->iop_pin() and prevented bli release at ->iop_unlock() time. In the abort case the item must be freed and buffer unlocked because it wasn't pinned by the log. Rework xfs_buf_item_unlock() to simplify the currently circuitous and duplicate logic and leave invalidated buffers locked based on bli refcount, regardless of aborted state. This ensures that a pinned, stale buffer is always found locked when eventually unpinned. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2018-09-29xfs: remove last of unnecessary xfs_defer_cancel() callersBrian Foster
Now that deferred operations are completely managed via transactions, it's no longer necessary to cancel the dfops in error paths that already cancel the associated transaction. There are a few such calls lingering throughout the codebase. Remove all remaining unnecessary calls to xfs_defer_cancel(). This leaves xfs_defer_cancel() calls in two places. The first is the call in the transaction cancel path itself, which facilitates this patch. The second is made via the xfs_defer_finish() error path to provide consistent error semantics with transaction commit. For example, xfs_trans_commit() expects an xfs_defer_finish() failure to clean up the dfops structure before it returns. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2018-09-29xfs: don't crash the vfs on a garbage inline symlinkDarrick J. Wong
The VFS routine that calls ->get_link blindly copies whatever's returned into the user's buffer. If we return a NULL pointer, the vfs will crash on the null pointer. Therefore, return -EFSCORRUPTED instead of blowing up the kernel. [dgc: clean up with hch's suggestions] Reported-by: wen.xu@gatech.edu Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Allison Henderson <allison.henderson@oracle.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2018-09-28Merge branch 'for-linus' of ↵Greg Kroah-Hartman
git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input Dmitry writes: "Input updates for v4.19-rc5 Just a few driver fixes" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: Input: uinput - allow for max == min during input_absinfo validation Input: elantech - enable middle button of touchpad on ThinkPad P72 Input: atakbd - fix Atari CapsLock behaviour Input: atakbd - fix Atari keymap Input: egalax_ts - add system wakeup support Input: gpio-keys - fix a documentation index issue
2018-09-28Merge tag 'spi-fix-v4.19-rc5' of ↵Greg Kroah-Hartman
https://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi Mark writes: "spi: Fixes for v4.19 Quite a few fixes for the Renesas drivers in here, plus a fix for the Tegra driver and some documentation fixes for the recently added spi-mem code. The Tegra fix is relatively large but fairly straightforward and mechanical, it runs on probe so it's been reasonably well covered in -next testing." * tag 'spi-fix-v4.19-rc5' of https://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi: spi: spi-mem: Move the DMA-able constraint doc to the kerneldoc header spi: spi-mem: Add missing description for data.nbytes field spi: rspi: Fix interrupted DMA transfers spi: rspi: Fix invalid SPI use during system suspend spi: sh-msiof: Fix handling of write value for SISTR register spi: sh-msiof: Fix invalid SPI use during system suspend spi: gpio: Fix copy-and-paste error spi: tegra20-slink: explicitly enable/disable clock
2018-09-28Merge tag 'regulator-v4.19-rc5' of ↵Greg Kroah-Hartman
https://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator Mark writes: "regulator: Fixes for 4.19 A collection of fairly minor bug fixes here, a couple of driver specific ones plus two core fixes. There's one fix for the new suspend state code which fixes some confusion with constant values that are supposed to indicate noop operation and another fixing a race condition with the creation of sysfs files on new regulators." * tag 'regulator-v4.19-rc5' of https://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator: regulator: fix crash caused by null driver data regulator: Fix 'do-nothing' value for regulators without suspend state regulator: da9063: fix DT probing with constraints regulator: bd71837: Disable voltage monitoring for LDO3/4
2018-09-28Merge tag 'powerpc-4.19-3' of ↵Greg Kroah-Hartman
https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Michael writes: "powerpc fixes for 4.19 #3 A reasonably big batch of fixes due to me being away for a few weeks. A fix for the TM emulation support on Power9, which could result in corrupting the guest r11 when running under KVM. Two fixes to the TM code which could lead to userspace GPR corruption if we take an SLB miss at exactly the wrong time. Our dynamic patching code had a bug that meant we could patch freed __init text, which could lead to corrupting userspace memory. csum_ipv6_magic() didn't work on little endian platforms since we optimised it recently. A fix for an endian bug when reading a device tree property telling us how many storage keys the machine has available. Fix a crash seen on some configurations of PowerVM when migrating the partition from one machine to another. A fix for a regression in the setup of our CPU to NUMA node mapping in KVM guests. A fix to our selftest Makefiles to make them work since a recent change to the shared Makefile logic." * tag 'powerpc-4.19-3' of https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: selftests/powerpc: Fix Makefiles for headers_install change powerpc/numa: Use associativity if VPHN hcall is successful powerpc/tm: Avoid possible userspace r1 corruption on reclaim powerpc/tm: Fix userspace r13 corruption powerpc/pseries: Fix unitialized timer reset on migration powerpc/pkeys: Fix reading of ibm, processor-storage-keys property powerpc: fix csum_ipv6_magic() on little endian platforms powerpc/powernv/ioda2: Reduce upper limit for DMA window size (again) powerpc: Avoid code patching freed init sections KVM: PPC: Book3S HV: Fix guest r11 corruption with POWER9 TM workarounds
2018-09-28Merge tag 'pinctrl-v4.19-4' of ↵Greg Kroah-Hartman
git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl Linus writes: "Pin control fixes for v4.19: - Fixes to x86 hardware: - AMD interrupt debounce issues - Faulty Intel cannonlake register offset - Revert pin translation IRQ locking" * tag 'pinctrl-v4.19-4' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl: Revert "pinctrl: intel: Do pin translation when lock IRQ" pinctrl: cannonlake: Fix HOSTSW_OWN register offset of H variant pinctrl/amd: poll InterruptEnable bits in amd_gpio_irq_set_type
2018-09-28MAINTAINERS: Remove obsolete drivers/pci pattern from ACPI sectionBjorn Helgaas
Prior to 256a45937093 ("PCI/AER: Squash aerdrv_acpi.c into aerdrv.c"), drivers/pci/pcie/aer/aerdrv_acpi.c contained code to parse the ACPI HEST table. That code now lives in drivers/pci/pcie/aer.c. Remove the "F: drivers/pci/*/*/*acpi*" pattern because it matches nothing. We could add a "F: drivers/pci/pcie/aer.c" pattern to the ACPI APEI section, but that file sees a lot of changes, almost none of which are of interest to the ACPI folks. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-09-28perf/core: Add sanity check to deal with pinned event failureReinette Chatre
It is possible that a failure can occur during the scheduling of a pinned event. The initial portion of perf_event_read_local() contains the various error checks an event should pass before it can be considered valid. Ensure that the potential scheduling failure of a pinned event is checked for and have a credible error. Suggested-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: fenghua.yu@intel.com Cc: tony.luck@intel.com Cc: acme@kernel.org Cc: gavin.hindman@intel.com Cc: jithu.joseph@intel.com Cc: dave.hansen@intel.com Cc: hpa@zytor.com Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/6486385d1f30336e9973b24c8c65f5079543d3d3.1537377064.git.reinette.chatre@intel.com
2018-09-28Merge branch 'netpoll-second-round-of-fixes'David S. Miller
Eric Dumazet says: ==================== netpoll: second round of fixes. As diagnosed by Song Liu, ndo_poll_controller() can be very dangerous on loaded hosts, since the cpu calling ndo_poll_controller() might steal all NAPI contexts (for all RX/TX queues of the NIC). This capture, showing one ksoftirqd eating all cycles can last for unlimited amount of time, since one cpu is generally not able to drain all the queues under load. It seems that all networking drivers that do use NAPI for their TX completions, should not provide a ndo_poll_controller() : Most NAPI drivers have netpoll support already handled in core networking stack, since netpoll_poll_dev() uses poll_napi(dev) to iterate through registered NAPI contexts for a device. First patch is a fix in poll_one_napi(). Then following patches take care of ten drivers. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-28ibmvnic: remove ndo_poll_controllerEric Dumazet
As diagnosed by Song Liu, ndo_poll_controller() can be very dangerous on loaded hosts, since the cpu calling ndo_poll_controller() might steal all NAPI contexts (for all RX/TX queues of the NIC). This capture can last for unlimited amount of time, since one cpu is generally not able to drain all the queues under load. ibmvnic uses NAPI for TX completions, so we better let core networking stack call the napi->poll() to avoid the capture. ibmvnic_netpoll_controller() was completely wrong anyway, as it was scheduling NAPI to service RX queues (instead of TX), so I doubt netpoll ever worked on this driver. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Thomas Falcon <tlfalcon@linux.vnet.ibm.com> Cc: John Allen <jallen@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-28sfc-falcon: remove ndo_poll_controllerEric Dumazet
As diagnosed by Song Liu, ndo_poll_controller() can be very dangerous on loaded hosts, since the cpu calling ndo_poll_controller() might steal all NAPI contexts (for all RX/TX queues of the NIC). This capture can last for unlimited amount of time, since one cpu is generally not able to drain all the queues under load. sfc-falcon uses NAPI for TX completions, so we better let core networking stack call the napi->poll() to avoid the capture. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Solarflare linux maintainers <linux-net-drivers@solarflare.com> Cc: Edward Cree <ecree@solarflare.com> Cc: Bert Kenward <bkenward@solarflare.com> Acked-By: Bert Kenward <bkenward@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-28sfc: remove ndo_poll_controllerEric Dumazet
As diagnosed by Song Liu, ndo_poll_controller() can be very dangerous on loaded hosts, since the cpu calling ndo_poll_controller() might steal all NAPI contexts (for all RX/TX queues of the NIC). This capture can last for unlimited amount of time, since one cpu is generally not able to drain all the queues under load. sfc uses NAPI for TX completions, so we better let core networking stack call the napi->poll() to avoid the capture. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Edward Cree <ecree@solarflare.com> Cc: Bert Kenward <bkenward@solarflare.com> Cc: Solarflare linux maintainers <linux-net-drivers@solarflare.com> Acked-By: Bert Kenward <bkenward@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-28net: ena: remove ndo_poll_controllerEric Dumazet
As diagnosed by Song Liu, ndo_poll_controller() can be very dangerous on loaded hosts, since the cpu calling ndo_poll_controller() might steal all NAPI contexts (for all RX/TX queues of the NIC). This capture can last for unlimited amount of time, since one cpu is generally not able to drain all the queues under load. ena uses NAPI for TX completions, so we better let core networking stack call the napi->poll() to avoid the capture. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Netanel Belgazal <netanel@amazon.com> Cc: Saeed Bishara <saeedb@amazon.com> Cc: Zorik Machulsky <zorik@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-28qlogic: netxen: remove ndo_poll_controllerEric Dumazet
As diagnosed by Song Liu, ndo_poll_controller() can be very dangerous on loaded hosts, since the cpu calling ndo_poll_controller() might steal all NAPI contexts (for all RX/TX queues of the NIC). This capture can last for unlimited amount of time, since one cpu is generally not able to drain all the queues under load. netxen uses NAPI for TX completions, so we better let core networking stack call the napi->poll() to avoid the capture. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Manish Chopra <manish.chopra@cavium.com> Cc: Rahul Verma <rahul.verma@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-28qlcnic: remove ndo_poll_controllerEric Dumazet
As diagnosed by Song Liu, ndo_poll_controller() can be very dangerous on loaded hosts, since the cpu calling ndo_poll_controller() might steal all NAPI contexts (for all RX/TX queues of the NIC). This capture can last for unlimited amount of time, since one cpu is generally not able to drain all the queues under load. qlcnic uses NAPI for TX completions, so we better let core networking stack call the napi->poll() to avoid the capture. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Harish Patil <harish.patil@cavium.com> Cc: Manish Chopra <manish.chopra@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-28virtio_net: remove ndo_poll_controllerEric Dumazet
As diagnosed by Song Liu, ndo_poll_controller() can be very dangerous on loaded hosts, since the cpu calling ndo_poll_controller() might steal all NAPI contexts (for all RX/TX queues of the NIC). This capture can last for unlimited amount of time, since one cpu is generally not able to drain all the queues under load. virto_net uses NAPI for TX completions, so we better let core networking stack call the napi->poll() to avoid the capture. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-28net: hns: remove ndo_poll_controllerEric Dumazet
As diagnosed by Song Liu, ndo_poll_controller() can be very dangerous on loaded hosts, since the cpu calling ndo_poll_controller() might steal all NAPI contexts (for all RX/TX queues of the NIC). This capture can last for unlimited amount of time, since one cpu is generally not able to drain all the queues under load. hns uses NAPI for TX completions, so we better let core networking stack call the napi->poll() to avoid the capture. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Yisen Zhuang <yisen.zhuang@huawei.com> Cc: Salil Mehta <salil.mehta@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-28ehea: remove ndo_poll_controllerEric Dumazet
As diagnosed by Song Liu, ndo_poll_controller() can be very dangerous on loaded hosts, since the cpu calling ndo_poll_controller() might steal all NAPI contexts (for all RX/TX queues of the NIC). This capture can last for unlimited amount of time, since one cpu is generally not able to drain all the queues under load. ehea uses NAPI for TX completions, so we better let core networking stack call the napi->poll() to avoid the capture. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Douglas Miller <dougmill@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-28hinic: remove ndo_poll_controllerEric Dumazet
As diagnosed by Song Liu, ndo_poll_controller() can be very dangerous on loaded hosts, since the cpu calling ndo_poll_controller() might steal all NAPI contexts (for all RX/TX queues of the NIC). This capture can last for unlimited amount of time, since one cpu is generally not able to drain all the queues under load. hinic uses NAPI for TX completions, so we better let core networking stack call the napi->poll() to avoid the capture. Note that hinic_netpoll() was incorrectly scheduling NAPI on both RX and TX queues. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Aviad Krawczyk <aviad.krawczyk@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-28netpoll: do not test NAPI_STATE_SCHED in poll_one_napi()Eric Dumazet
Since we do no longer require NAPI drivers to provide an ndo_poll_controller(), napi_schedule() has not been done before poll_one_napi() invocation. So testing NAPI_STATE_SCHED is likely to cause early returns. While we are at it, remove outdated comment. Note to future bisections : This change might surface prior bugs in drivers. See commit 73f21c653f93 ("bnxt_en: Fix TX timeout during netpoll.") for one occurrence. Fixes: ac3d9dd034e5 ("netpoll: make ndo_poll_controller() optional") Signed-off-by: Eric Dumazet <edumazet@google.com> Tested-by: Song Liu <songliubraving@fb.com> Cc: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-28Merge tag 'mac80211-for-davem-2018-09-27' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211 Johannes Berg says: ==================== More patches than I'd like perhaps, but each seems reasonable: * two new spectre-v1 mitigations in nl80211 * TX status fix in general, and mesh in particular * powersave vs. offchannel fix * regulatory initialization fix * fix for a queue hang due to a bad return value * allocate TXQs for active monitor interfaces, fixing my earlier patch to avoid unnecessary allocations where I missed this case needed them * fix TDLS data frames priority assignment * fix scan results processing to take into account duplicate channel numbers (over different operating classes, but we don't necessarily know the operating class) * various hwsim fixes for radio destruction and new radio announcement messages * remove an extraneous kernel-doc line ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-28qed: Fix shmem structure inconsistency between driver and the mfw.Sudarsana Reddy Kalluru
The structure shared between driver and the management FW (mfw) differ in sizes. This would lead to issues when driver try to access the structure members which are not-aligned with the mfw copy e.g., data_ptr usage in the case of mfw_tlv request. Align the driver structure with mfw copy, add reserved field(s) to driver structure for the members not used by the driver. Fixes: dd006921d67f ("qed: Add MFW interfaces for TLV request support.) Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com> Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>
2018-09-28Update maintainers for bnx2/bnx2x/qlge/qlcnic drivers.Sudarsana Reddy Kalluru
Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com> Signed-off-by: Ameen Rahman <Ameen.Rahman@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-28MAINTAINERS: change bridge maintainersStephen Hemminger
I haven't been doing reviews only but not active development on bridge code for several years. Roopa and Nikolay have been doing most of the new features and have agreed to take over as new co-maintainers. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Acked-by: Roopa Prabhu <roopa@cumulusnetworks.com> Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
2018-09-28Merge branch 's390-qeth-fixes'David S. Miller
Julian Wiedmann says: ==================== s390/qeth: fixes 2019-09-26 please apply two qeth patches for -net. The first is a trivial cleanup required for patch #2 by Jean, which fixes a potential endless loop. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-28s390: qeth: Fix potential array overrun in cmd/rc lookupJean Delvare
Functions qeth_get_ipa_msg and qeth_get_ipa_cmd_name are modifying the last member of global arrays without any locking that I can see. If two instances of either function are running at the same time, it could cause a race ultimately leading to an array overrun (the contents of the last entry of the array is the only guarantee that the loop will ever stop). Performing the lookups without modifying the arrays is admittedly slower (two comparisons per iteration instead of one) but these are operations which are rare (should only be needed in error cases or when debugging, not during successful operation) and it seems still less costly than introducing a mutex to protect the arrays in question. As a side bonus, it allows us to declare both arrays as const data. Signed-off-by: Jean Delvare <jdelvare@suse.de> Cc: Julian Wiedmann <jwi@linux.ibm.com> Cc: Ursula Braun <ubraun@linux.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-28s390: qeth_core_mpc: Use ARRAY_SIZE instead of reimplementing its functionzhong jiang
Use the common code ARRAY_SIZE macro instead of a private implementation. Reviewed-by: Jean Delvare <jdelvare@suse.de> Signed-off-by: zhong jiang <zhongjiang@huawei.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-28Merge tag 'drm-fixes-2018-09-28' of git://anongit.freedesktop.org/drm/drmGreg Kroah-Hartman
Dave writes: "drm fixes for 4.19-rc6 Looks like a pretty normal week for graphics, core: syncobj fix, panel link regression revert amd: suspend/resume fixes, EDID emulation fix mali-dp: NV12 writeback and vblank reset fixes etnaviv: DMA setup fix" * tag 'drm-fixes-2018-09-28' of git://anongit.freedesktop.org/drm/drm: drm/amd/display: Fix Edid emulation for linux drm/amd/display: Fix Vega10 lightup on S3 resume drm/amdgpu: Fix vce work queue was not cancelled when suspend Revert "drm/panel: Add device_link from panel device to DRM device" drm/syncobj: Don't leak fences when WAIT_FOR_SUBMIT is set drm/malidp: Fix writeback in NV12 drm: mali-dp: Call drm_crtc_vblank_reset on device init drm/etnaviv: add DMA configuration for etnaviv platform device
2018-09-28Merge tag 'riscv-for-linus-4.19-rc6' of ↵Greg Kroah-Hartman
git://git.kernel.org/pub/scm/linux/kernel/git/palmer/riscv-linux Palmer writes: "A Single RISC-V Update for 4.19-rc6 The Debian guys have been pushing on our port and found some unversioned symbols leaking into modules. This PR contains a single fix for that issue." * tag 'riscv-for-linus-4.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/palmer/riscv-linux: RISC-V: include linux/ftrace.h in asm-prototypes.h
2018-09-28Merge tag 'pci-v4.19-fixes-2' of ↵Greg Kroah-Hartman
ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Bjorn writes: "PCI fixes: - Fix ACPI hotplug issue that causes black screen crash at boot (Mika Westerberg) - Fix DesignWare "scheduling while atomic" issues (Jisheng Zhang) - Add PPC contacts to MAINTAINERS for PCI core error handling (Bjorn Helgaas) - Sort Mobiveil MAINTAINERS entry (Lorenzo Pieralisi)" * tag 'pci-v4.19-fixes-2' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: ACPI / hotplug / PCI: Don't scan for non-hotplug bridges if slot is not bridge PCI: dwc: Fix scheduling while atomic issues MAINTAINERS: Move mobiveil PCI driver entry where it belongs MAINTAINERS: Update PPC contacts for PCI core error handling
2018-09-28mmc: slot-gpio: Fix debounce time to use miliseconds againMarek Szyprowski
The debounce value passed to mmc_gpiod_request_cd() function is in microseconds, but msecs_to_jiffies() requires the value to be in miliseconds to properly calculate the delay, so adjust the value stored in cd_debounce_delay_ms context entry. Fixes: 1d71926bbd59 ("mmc: core: Fix debounce time to use microseconds") Fixes: bfd694d5e21c ("mmc: core: Add tunable delay before detecting card after card is inserted") Cc: stable@vger.kernel.org # v4.18+ Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2018-09-28Merge branch 'nvme-4.19' of git://git.infradead.org/nvme into for-linusJens Axboe
Pull NVMe fix from Christoph. * 'nvme-4.19' of git://git.infradead.org/nvme: nvme: properly propagate errors in nvme_mpath_init
2018-09-28xen/blkfront: correct purging of persistent grantsJuergen Gross
Commit a46b53672b2c2e3770b38a4abf90d16364d2584b ("xen/blkfront: cleanup stale persistent grants") introduced a regression as purged persistent grants were not pu into the list of free grants again. Correct that. Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-09-28Revert "xen/blkfront: When purging persistent grants, keep them in the buffer"Jens Axboe
Fix didn't work for all cases, reverting to add a (hopefully) better fix. This reverts commit f151ba989d149bbdfc90e5405724bbea094f9b17. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-09-28bpf: harden flags check in cgroup_storage_update_elem()Roman Gushchin
cgroup_storage_update_elem() shouldn't accept any flags argument values except BPF_ANY and BPF_EXIST to guarantee the backward compatibility, had a new flag value been added. Fixes: de9cbbaadba5 ("bpf: introduce cgroup storage maps") Signed-off-by: Roman Gushchin <guro@fb.com> Reported-by: Daniel Borkmann <daniel@iogearbox.net> Cc: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-09-28netfilter: xt_socket: check sk before checking for netns.Flavio Leitner
Only check for the network namespace if the socket is available. Fixes: f564650106a6 ("netfilter: check if the socket netns is correct.") Reported-by: Guenter Roeck <linux@roeck-us.net> Tested-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Flavio Leitner <fbl@redhat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-09-28netfilter: avoid erronous array bounds warningFlorian Westphal
Unfortunately some versions of gcc emit following warning: $ make net/xfrm/xfrm_output.o linux/compiler.h:252:20: warning: array subscript is above array bounds [-Warray-bounds] hook_head = rcu_dereference(net->nf.hooks_arp[hook]); ^~~~~~~~~~~~~~~~~~~~~ xfrm_output_resume passes skb_dst(skb)->ops->family as its 'pf' arg so compiler can't know that we'll never access hooks_arp[]. (NFPROTO_IPV4 or NFPROTO_IPV6 are only possible cases). Avoid this by adding an explicit WARN_ON_ONCE() check. This patch has no effect if the family is a compile-time constant as gcc will remove the switch() construct entirely. Reported-by: David Ahern <dsahern@gmail.com> Signed-off-by: Florian Westphal <fw@strlen.de> Reviewed-by: David Ahern <dsahern@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-09-28netfilter: nft_set_rbtree: add missing rb_erase() in GC routineTaehee Yoo
The nft_set_gc_batch_check() checks whether gc buffer is full. If gc buffer is full, gc buffer is released by the nft_set_gc_batch_complete() internally. In case of rbtree, the rb_erase() should be called before calling the nft_set_gc_batch_complete(). therefore the rb_erase() should be called before calling the nft_set_gc_batch_check() too. test commands: table ip filter { set set1 { type ipv4_addr; flags interval, timeout; gc-interval 10s; timeout 1s; elements = { 1-2, 3-4, 5-6, ... 10000-10001, } } } %nft -f test.nft splat looks like: [ 430.273885] kasan: GPF could be caused by NULL-ptr deref or user memory access [ 430.282158] general protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN PTI [ 430.283116] CPU: 1 PID: 190 Comm: kworker/1:2 Tainted: G B 4.18.0+ #7 [ 430.283116] Workqueue: events_power_efficient nft_rbtree_gc [nf_tables_set] [ 430.313559] RIP: 0010:rb_next+0x81/0x130 [ 430.313559] Code: 08 49 bd 00 00 00 00 00 fc ff df 48 bb 00 00 00 00 00 fc ff df 48 85 c0 75 05 eb 58 48 89 d4 [ 430.313559] RSP: 0018:ffff88010cdb7680 EFLAGS: 00010207 [ 430.313559] RAX: 0000000000b84854 RBX: dffffc0000000000 RCX: ffffffff83f01973 [ 430.313559] RDX: 000000000017090c RSI: 0000000000000008 RDI: 0000000000b84864 [ 430.313559] RBP: ffff8801060d4588 R08: fffffbfff09bc349 R09: fffffbfff09bc349 [ 430.313559] R10: 0000000000000001 R11: fffffbfff09bc348 R12: ffff880100f081a8 [ 430.313559] R13: dffffc0000000000 R14: ffff880100ff8688 R15: dffffc0000000000 [ 430.313559] FS: 0000000000000000(0000) GS:ffff88011b400000(0000) knlGS:0000000000000000 [ 430.313559] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 430.313559] CR2: 0000000001551008 CR3: 000000005dc16000 CR4: 00000000001006e0 [ 430.313559] Call Trace: [ 430.313559] nft_rbtree_gc+0x112/0x5c0 [nf_tables_set] [ 430.313559] process_one_work+0xc13/0x1ec0 [ 430.313559] ? _raw_spin_unlock_irq+0x29/0x40 [ 430.313559] ? pwq_dec_nr_in_flight+0x3c0/0x3c0 [ 430.313559] ? set_load_weight+0x270/0x270 [ 430.313559] ? __switch_to_asm+0x34/0x70 [ 430.313559] ? __switch_to_asm+0x40/0x70 [ 430.313559] ? __switch_to_asm+0x34/0x70 [ 430.313559] ? __switch_to_asm+0x34/0x70 [ 430.313559] ? __switch_to_asm+0x40/0x70 [ 430.313559] ? __switch_to_asm+0x34/0x70 [ 430.313559] ? __switch_to_asm+0x40/0x70 [ 430.313559] ? __switch_to_asm+0x34/0x70 [ 430.313559] ? __switch_to_asm+0x34/0x70 [ 430.313559] ? __switch_to_asm+0x40/0x70 [ 430.313559] ? __switch_to_asm+0x34/0x70 [ 430.313559] ? __schedule+0x6d3/0x1f50 [ 430.313559] ? find_held_lock+0x39/0x1c0 [ 430.313559] ? __sched_text_start+0x8/0x8 [ 430.313559] ? cyc2ns_read_end+0x10/0x10 [ 430.313559] ? save_trace+0x300/0x300 [ 430.313559] ? sched_clock_local+0xd4/0x140 [ 430.313559] ? find_held_lock+0x39/0x1c0 [ 430.313559] ? worker_thread+0x353/0x1120 [ 430.313559] ? worker_thread+0x353/0x1120 [ 430.313559] ? lock_contended+0xe70/0xe70 [ 430.313559] ? __lock_acquire+0x4500/0x4500 [ 430.535635] ? do_raw_spin_unlock+0xa5/0x330 [ 430.535635] ? do_raw_spin_trylock+0x101/0x1a0 [ 430.535635] ? do_raw_spin_lock+0x1f0/0x1f0 [ 430.535635] ? _raw_spin_lock_irq+0x10/0x70 [ 430.535635] worker_thread+0x15d/0x1120 [ ... ] Fixes: 8d8540c4f5e0 ("netfilter: nft_set_rbtree: add timeout support") Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-09-28rxrpc: Fix error distributionDavid Howells
Fix error distribution by immediately delivering the errors to all the affected calls rather than deferring them to a worker thread. The problem with the latter is that retries and things can happen in the meantime when we want to stop that sooner. To this end: (1) Stop the error distributor from removing calls from the error_targets list so that peer->lock isn't needed to synchronise against other adds and removals. (2) Require the peer's error_targets list to be accessed with RCU, thereby avoiding the need to take peer->lock over distribution. (3) Don't attempt to affect a call's state if it is already marked complete. Signed-off-by: David Howells <dhowells@redhat.com>
2018-09-28rxrpc: Fix transport sockopts to get IPv4 errors on an IPv6 socketDavid Howells
It seems that enabling IPV6_RECVERR on an IPv6 socket doesn't also turn on IP_RECVERR, so neither local errors nor ICMP-transported remote errors from IPv4 peer addresses are returned to the AF_RXRPC protocol. Make the sockopt setting code in rxrpc_open_socket() fall through from the AF_INET6 case to the AF_INET case to turn on all the AF_INET options too in the AF_INET6 case. Fixes: f2aeed3a591f ("rxrpc: Fix error reception on AF_INET6 sockets") Signed-off-by: David Howells <dhowells@redhat.com>
2018-09-28rxrpc: Make service call handling more robustDavid Howells
Make the following changes to improve the robustness of the code that sets up a new service call: (1) Cache the rxrpc_sock struct obtained in rxrpc_data_ready() to do a service ID check and pass that along to rxrpc_new_incoming_call(). This means that I can remove the check from rxrpc_new_incoming_call() without the need to worry about the socket attached to the local endpoint getting replaced - which would invalidate the check. (2) Cache the rxrpc_peer struct, thereby allowing the peer search to be done once. The peer is passed to rxrpc_new_incoming_call(), thereby saving the need to repeat the search. This also reduces the possibility of rxrpc_publish_service_conn() BUG()'ing due to the detection of a duplicate connection, despite the initial search done by rxrpc_find_connection_rcu() having turned up nothing. This BUG() shouldn't ever get hit since rxrpc_data_ready() *should* be non-reentrant and the result of the initial search should still hold true, but it has proven possible to hit. I *think* this may be due to __rxrpc_lookup_peer_rcu() cutting short the iteration over the hash table if it finds a matching peer with a zero usage count, but I don't know for sure since it's only ever been hit once that I know of. Another possibility is that a bug in rxrpc_data_ready() that checked the wrong byte in the header for the RXRPC_CLIENT_INITIATED flag might've let through a packet that caused a spurious and invalid call to be set up. That is addressed in another patch. (3) Fix __rxrpc_lookup_peer_rcu() to skip peer records that have a zero usage count rather than stopping and returning not found, just in case there's another peer record behind it in the bucket. (4) Don't search the peer records in rxrpc_alloc_incoming_call(), but rather either use the peer cached in (2) or, if one wasn't found, preemptively install a new one. Fixes: 8496af50eb38 ("rxrpc: Use RCU to access a peer's service connection tree") Signed-off-by: David Howells <dhowells@redhat.com>
2018-09-28rxrpc: Improve up-front incoming packet checkingDavid Howells
Do more up-front checking on incoming packets to weed out invalid ones and also ones aimed at services that we don't support. Whilst we're at it, replace the clearing of call and skew if we don't find a connection with just initialising the variables to zero at the top of the function. Signed-off-by: David Howells <dhowells@redhat.com>
2018-09-28rxrpc: Emit BUSY packets when supposed to rather than ABORTsDavid Howells
In the input path, a received sk_buff can be marked for rejection by setting RXRPC_SKB_MARK_* in skb->mark and, if needed, some auxiliary data (such as an abort code) in skb->priority. The rejection is handled by queueing the sk_buff up for dealing with in process context. The output code reads the mark and priority and, theoretically, generates an appropriate response packet. However, if RXRPC_SKB_MARK_BUSY is set, this isn't noticed and an ABORT message with a random abort code is generated (since skb->priority wasn't set to anything). Fix this by outputting the appropriate sort of packet. Also, whilst we're at it, most of the marks are no longer used, so remove them and rename the remaining two to something more obvious. Fixes: 248f219cb8bc ("rxrpc: Rewrite the data and ack handling code") Signed-off-by: David Howells <dhowells@redhat.com>
2018-09-28rxrpc: Fix RTT gatheringDavid Howells
Fix RTT information gathering in AF_RXRPC by the following means: (1) Enable Rx timestamping on the transport socket with SO_TIMESTAMPNS. (2) If the sk_buff doesn't have a timestamp set when rxrpc_data_ready() collects it, set it at that point. (3) Allow ACKs to be requested on the last packet of a client call, but not a service call. We need to be careful lest we undo: bf7d620abf22c321208a4da4f435e7af52551a21 Author: David Howells <dhowells@redhat.com> Date: Thu Oct 6 08:11:51 2016 +0100 rxrpc: Don't request an ACK on the last DATA packet of a call's Tx phase but that only really applies to service calls that we're handling, since the client side gets to send the final ACK (or not). (4) When about to transmit an ACK or DATA packet, record the Tx timestamp before only; don't update the timestamp afterwards. (5) Switch the ordering between recording the serial and recording the timestamp to always set the serial number first. The serial number shouldn't be seen referenced by an ACK packet until we've transmitted the packet bearing it - so in the Rx path, we don't need the timestamp until we've checked the serial number. Fixes: cf1a6474f807 ("rxrpc: Add per-peer RTT tracker") Signed-off-by: David Howells <dhowells@redhat.com>
2018-09-28rxrpc: Fix checks as to whether we should set up a new callDavid Howells
There's a check in rxrpc_data_ready() that's checking the CLIENT_INITIATED flag in the packet type field rather than in the packet flags field. Fix this by creating a pair of helper functions to check whether the packet is going to the client or to the server and use them generally. Fixes: 248f219cb8bc ("rxrpc: Rewrite the data and ack handling code") Signed-off-by: David Howells <dhowells@redhat.com>