linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2018-12-20	igb: Change RXPBSIZE size when setting Qav mode	Jesus Sanchez-Palencia
	Section 4.5.9 of the datasheet says that the total size of all packet buffers combined (TxPB 0 + 1 + 2 + 3 + RxPB + BMC2OS + OS2BMC) must not exceed 60KB. Today we are configuring a total of 62KB, so reduce the RxPB from 32KB to 30KB in order to respect that. The choice of changing RxPBSIZE here is mainly because it seems more correct to give more priority to the transmit packet buffers over the receiver ones when running in Qav mode. Also, the BMC2OS and OS2BMC sizes are already too short. Signed-off-by: Jesus Sanchez-Palencia <jesus.s.palencia@gmail.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-12-20	kyber: use sbitmap add_wait_queue/list_del wait helpers	Jens Axboe
	sbq_wake_ptr() checks sbq->ws_active to know if it needs to loop the wait indexes or not. This requires the use of the sbitmap waitqueue wrappers, but kyber doesn't use those for its domain token waitqueue handling. Convert kyber to use the helpers. This fixes a hang with waiting for domain tokens. Fixes: 5d2ee7122c73 ("sbitmap: optimize wakeup check") Tested-by: Ming Lei <ming.lei@redhat.com> Reported-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Omar Sandoval <osandov@fb.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-12-20	sbitmap: add helpers for add/del wait queue handling	Jens Axboe
	After commit 5d2ee7122c73, users of sbitmap that need wait queue handling must use the provided helpers. But we only added prepare_to_wait()/finish_wait() style helpers, add the equivalent add_wait_queue/list_del wrappers as we.. This is needed to ensure kyber plays by the sbitmap waitqueue rules. Tested-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Omar Sandoval <osandov@fb.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-12-20	igb: reduce CPU0 latency when updating statistics	Jeff Kirsher
	This change is based off of the work and suggestion of Jan Jablonsky <jan.jablonsky@thalesgroup.com>. The Watchdog workqueue in igb driver is scheduled every 2s for each network interface. That includes updating a statistics protected by spinlock. Function igb_update_stats in this case will be protected against preemption. According to number of a statistics registers (cca 60), processing this function might cause additional cpu load on CPU0. In case of statistics spinlock may be replaced with mutex, which reduce latency on CPU0. CC: Bernhard Kaindl <bernhard.kaindl@thalesgroup.com> CC: Jan Jablonsky <jan.jablonsky@thalesgroup.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-12-20	MIPS: math-emu: Write-protect delay slot emulation pages	Paul Burton
	Mapping the delay slot emulation page as both writeable & executable presents a security risk, in that if an exploit can write to & jump into the page then it can be used as an easy way to execute arbitrary code. Prevent this by mapping the page read-only for userland, and using access_process_vm() with the FOLL_FORCE flag to write to it from mips_dsemul(). This will likely be less efficient due to copy_to_user_page() performing cache maintenance on a whole page, rather than a single line as in the previous use of flush_cache_sigtramp(). However this delay slot emulation code ought not to be running in any performance critical paths anyway so this isn't really a problem, and we can probably do better in copy_to_user_page() anyway in future. A major advantage of this approach is that the fix is small & simple to backport to stable kernels. Reported-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Paul Burton <paul.burton@mips.com> Fixes: 432c6bacbd0c ("MIPS: Use per-mm page to execute branch delay slot instructions") Cc: stable@vger.kernel.org # v4.8+ Cc: linux-mips@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: Rich Felker <dalias@libc.org> Cc: David Daney <david.daney@cavium.com>
2018-12-20	security: integrity: partial revert of make ima_main explicitly non-modular	Paul Gortmaker
	In commit 4f83d5ea643a ("security: integrity: make ima_main explicitly non-modular") I'd removed <linux/module.h> after assuming that the function is_module_sig_enforced() was an LSM function and not a core kernel module function. Unfortunately the typical .config selections used in build testing provide an implicit <linux/module.h> presence, and so normal/typical build testing did not immediately reveal my incorrect assumption. Cc: Mimi Zohar <zohar@linux.vnet.ibm.com> Cc: Dmitry Kasatkin <dmitry.kasatkin@gmail.com> Cc: James Morris <james.l.morris@oracle.com> Cc: "Serge E. Hallyn" <serge@hallyn.com> Cc: linux-ima-devel@lists.sourceforge.net Cc: linux-security-module@vger.kernel.org Reported-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: James Morris <james.morris@microsoft.com>
2018-12-20	Merge tag 'perf-core-for-mingo-4.21-20181218' of ↵	Ingo Molnar
	git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo: - Implement BPF based syscall filtering in 'perf trace', using BPF maps and the augmented_raw_syscalls.c BPF proggie (Arnaldo Carvalho de Melo) - Allow specifying in .perfconfig a set of events use in 'perf trace' in addition to any other specified from the command line. This initially will be used to always use the augmented_raw_syscalls.o precompiled BPF program for getting pointer contents. (Arnaldo Carvalho de Melo) - Allow fine grained control about how the syscall output should be formatted. This will be used to allow producing the same output produced by the 'strace' tool, to then use in regression tests comparing the output of 'perf trace' with the one produced from 'strace' (Arnaldo Carvalho de Melo) - Beautify the renameat2 olddirfd, newdirfd and flags arguments (Arnaldo Carvalho de Melo) - Beautify arch_prctl 'code' syscall arg (Arnaldo Carvalho de Melo) - Beautify fadvise64 'advice' syscall arg (Arnaldo Carvalho de Melo) - Relax checks on perf-PID.map ownership, resulting in symbols in executable anonymous maps setup by JITs in things like node.js to be resolved in a 'perf top' session run by root without the need for --force to be used (Arnaldo Carvalho de Melo) - Update asm-generic/unistd.h copy (Arnaldo Carvalho de Melo) - Do not use the first and last symbols when setting up address filters in auxtrace, this fails when we don't have a symbol table, filter the entire area based on the dso size. (Adrian Hunter) - Do not use kernel headers to build libsubcmd, we shouldn't use anything from outside tools/, fixes the build with the Android NDK (Arnaldo Carvalho de Melo) - Add several prototypes for systems lacking those, such as open_memstream(), sigqueue(), fixing warnings building with Android's bionic libc that were preventing the use of -Werror there (Arnaldo Carvalho de Melo) - Use LDFLAGS in the libtraceevent build commands, allowing developers to override its values (Jiri Olsa) - Link libperf-jvmti.so with LDFLAGS variable, allowing distro packages to propagate its settings when building this library (Jiri Olsa) - cs-etm (ARM CoreSight) fixes: (Leo Yan) - Correct packets swapping in cs_etm__flush() - Avoid stale branch samples when flush packet - Remove unused 'trace_on' in cs_etm_decoder - Refactor enumeration cs_etm_sample_type - Rename CS_ETM_TRACE_ON to CS_ETM_DISCONTINUITY - Treat NO_SYNC element as trace discontinuity - Treat EO_TRACE element as trace discontinuity - Generate branch sample for exception packet - Use shebangs in the 'perf test' shell scripts, making them identifiable as shell scripts (Michael Petlan) - Avoid segfaults caused by negated options in 'perf stat' (Michael Petlan) - Fix processing of dereferenced args in bprintk events in libtracevent (Steven Rostedt) Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-12-20	Merge ath-next from git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/ath.git	Kalle Valo
	ath.git patches for 4.21. Major changes: ath10k * add amsdu support for QCA6174 monitor mode * report tx rate using the new ieee80211_tx_rate_update() API * wcn3990 support is not experimental anymore
2018-12-20	Merge tag 'drm-misc-fixes-2018-12-20' of ↵	Daniel Vetter
	git://anongit.freedesktop.org/drm/drm-misc into drm-fixes Fix spectre v1 vuln in drm_ioctl Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> From: Sean Paul <sean@poorly.run> Link: https://patchwork.freedesktop.org/patch/msgid/20181220165740.GA42344@art_vandelay
2018-12-20	staging: mt7621-mmc: Correct spelling mistakes in comments	Jona Crasselt
	Changed "avaiable" to "available" and "interupt" to "interrupt". Signed-off-by: Jona Crasselt <jona.crasselt@fau.de> Signed-off-by: Felix Windsheimer <felix.windsheimer@fau.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-12-20	ath10k: add support to configure BB timing over wmi	Bhagavathi Perumal S
	Add wmi configuration cmd to configure base band(BB) power amplifier(PA) off timing values in hardware. The default PA off timings were fine tuned to make proper DFS radar detection in QCA reference design. If ODM uses different PA in their design, then the same default PA off timing values cannot be used, it requires different settling time to detect radar pulses very sooner and avoid radar detection problems. In that case it provides provision to select proper PA off timing values based on the PA hardware used. The PA component is part of FEM hardware and new device tree entry "ext-fem-name" is used to indentify the FEM hardware. And this wmi configuration cmd is enabled via wmi service flag "WMI_SERVICE_BB_TIMING_CONFIG_SUPPORT". Other way is to apply these values through calibration data, but recalibration of all boards out there might not be feasible. This change tested on firmware ver 10.2.4-1.0-00042 in QCA988X chipset. Signed-off-by: Bhagavathi Perumal S <bperumal@codeaurora.org> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2018-12-20	dt-bindings: net: ath10k: add new dt entry to identify external FEM	Bhagavathi Perumal S
	This adds new dt entry ext-fem-name, it is used by ath10k driver to select correct timing parameters and configure it in target wifi hardware. The Front End Module(FEM) normally includes tx power amplifier(PA) and rx low noise amplifier(LNA). The default timing parameters like tx end to PA off timing values were fine tuned for internal FEM used in reference design. And these timing values can not be same if ODM modifies hardware design with different external FEM. This DT entry helps to choose correct timing values in driver if different external FEM hardware used. Signed-off-by: Bhagavathi Perumal S <bperumal@codeaurora.org> Reviewed-by: Rob Herring <robh@kernel.org> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2018-12-20	dt-bindings: net: ath10k: fix node name and device type in qcom ath10k example	Bhagavathi Perumal S
	In qcom,ath10k documentation, ath10k is used as node name in the example of pci based device. Normally, node name should be class of device and not the model name, so fix it to node name "wifi". And remove the property device_type pci since only pci bridges should have this property. Signed-off-by: Bhagavathi Perumal S <bperumal@codeaurora.org> Reviewed-by: Rob Herring <robh@kernel.org> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2018-12-20	ath10k: fix tx_stats memory leak	Zhi Chen
	Memory of tx_stats was allocated when a STA was added. But it's not freed if the STA failed to be added to driver. This issue could be seen in MDK3 attack case when STA number reached the limit. Tested: QCA9984 with firmware ver 10.4-3.9.0.1-00005 Signed-off-by: Zhi Chen <zhichen@codeaurora.org> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2018-12-20	ath10k: fix peer stats null pointer dereference	Zhi Chen
	There was a race condition in SMP that an ath10k_peer was created but its member sta was null. Following are procedures of ath10k_peer creation and member sta access in peer statistics path. 1. Peer creation: ath10k_peer_create() =>ath10k_wmi_peer_create() =>ath10k_wait_for_peer_created() ... # another kernel path, RX from firmware ath10k_htt_t2h_msg_handler() =>ath10k_peer_map_event() =>wake_up() # ar->peer_map[id] = peer //add peer to map #wake up original path from waiting ... # peer->sta = sta //sta assignment 2. RX path of statistics ath10k_htt_t2h_msg_handler() =>ath10k_update_per_peer_tx_stats() =>ath10k_htt_fetch_peer_stats() # peer->sta //sta accessing Any access of peer->sta after peer was added to peer_map but before sta was assigned could cause a null pointer issue. And because these two steps are asynchronous, no proper lock can protect them. So both peer and sta need to be checked before access. Tested: QCA9984 with firmware ver 10.4-3.9.0.1-00005 Signed-off-by: Zhi Chen <zhichen@codeaurora.org> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2018-12-20	dt: bindings: ath10k: add bindings for wifi iommu node	Govind Singh
	WCN3990 wifi module can optionally make use of the IOMMU. Add binding documentation for phandle to the IOMMU and the stream id of wifi iommu block. Signed-off-by: Govind Singh <govinds@codeaurora.org> Reviewed-by: Rob Herring <robh@kernel.org> Reviewed-by: Brian Norris <briannorris@chromium.org> Tested-by: Brian Norris <briannorris@chromium.org> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2018-12-20	dt: bindings: ath10k: add missing dt properties for WCN3990 wifi node	Govind Singh
	Add missing optional properties in WCN3990 wifi node. Signed-off-by: Govind Singh <govinds@codeaurora.org> Reviewed-by: Rob Herring <robh@kernel.org> Reviewed-by: Brian Norris <briannorris@chromium.org> Tested-by: Brian Norris <briannorris@chromium.org> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2018-12-20	ath10k: remove an unnecessary NULL check	Dan Carpenter
	The "survey" pointer is the address of an array element. We know that it can't be NULL so this check can be removed. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2018-12-20	ath10k: move non-fatal warn logs to dbg level	Govind Singh
	During driver load below warn logs are printed in the console. Since driver may not implement all wmi events sent by fw and all of them are non-fatal, move this log to debug level to remove un-necessary warn message on console. [ 361.887230] ath10k_snoc a000000.wifi: Unknown eventid: 16393 [ 361.907037] ath10k_snoc a000000.wifi: Unknown eventid: 237569 Signed-off-by: Govind Singh <govinds@codeaurora.org> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2018-12-20	ath10k: fix a NULL vs IS_ERR() check	Dan Carpenter
	The devm_memremap() function doesn't return NULLs, it returns error pointers. Fixes: ba94c753ccb4 ("ath10k: add QMI message handshake for wcn3990 client") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2018-12-20	ath10k: remove work in progress logs from snoc driver	Govind Singh
	All the necessary patches to make wifi running (over SNOC) are merged and tested on SDM845/QCS404 platform with WCN3990 wifi module, hence remove work in progress debug from snoc driver and Kconfig. Signed-off-by: Govind Singh <govinds@codeaurora.org> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2018-12-20	ath10k: fix warning due to msdu limit error	Bhagavathi Perumal S
	Some hardwares variants (QCA99x0) are limiting msdu deaggregation with some threshold value(default limit in QCA99x0 is 64 msdus), it was introduced to avoid excessive MSDU-deaggregation in error cases. When number of sub frames exceeds the limit, target hardware will send all msdus starting from present msdu in RAW format as a single msdu packet and it will be indicated with error status bit "RX_MSDU_END_INFO0_MSDU_LIMIT_ERR" set in rx descriptor. This msdu frame is a partial raw MSDU and does't have first msdu and ieee80211 header. It caused below warning message. [ 320.151332] ------------[ cut here ]------------ [ 320.155006] WARNING: CPU: 0 PID: 3 at drivers/net/wireless/ath/ath10k/htt_rx.c:1188 In our issue case, MSDU limit error happened due to FCS error and generated this warning message. This fixes the warning by handling the MSDU limit error. If msdu limit error happens, driver adds first MSDU's ieee80211 header and sets A-MSDU present bit in QOS header so that upper layer processes this frame if it is valid or drop it if FCS error set. And removed the warning message, hence partial msdus without first msdu is expected in msdu limit error cases. Tested on QCA9984, Firmware 10.4-3.6-00104 Signed-off-by: Bhagavathi Perumal S <bperumal@codeaurora.org> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2018-12-20	ath10k: disable 4addr source port learning in 10.4 FW by default	Sathishkumar Muruganandam
	Currently in 10.4 FW, all the received 4addr frames are processed for source port learning which is enabled by default. This learning can't be disabled by default in FW since it breaks backward compatibility. Since ath10k uses mac80211 based 4addr mode, source port learning done in 10.4 FW is redundant and also causes issues when 3addr frames are transmitted/received for a 4addr station. One such visible functional impact is when GTK rekey frame from hostapd based AP to 4addr STA is dropped in AP's 10.4 FW. This is since GTK rekey EAPOL frame is 3addr frame on AP interface and STA enabled with 4addr is already allowed for receiving 3addr EAPOL frames. Source port learning implementation in 10.4 FW drops this 3addr GTK rekey frame in AP destinated for 4addr STA causing disassociation and re-association for every GTK rekey session. GTK rekey issue is not seen when learning is disabled in FW. To prevent such issues without breaking backward compatibility, FW advertises new service bit making the source port learning configurable and this learning is being currently disabled during ath10k vdev creation. * Tested HW: QCA9984 * Tested FW: 10.4-3.6.0.1-00004 Signed-off-by: Sathishkumar Muruganandam <murugana@codeaurora.org> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2018-12-20	ath10k: report tx rate using ieee80211_tx_rate_update()	Anilkumar Kolli
	Mesh path metric needs tx rate information from ieee80211_tx_status() call but in ath10k there is no mechanism to report tx rate information via ieee80211_tx_status(), the tx rate is only accessible via sta_statiscs() op. Per peer tx stats has tx rate info available, Tx rate is available to ath10k driver after every 4 PPDU sent in the air. For each PPDU, ath10k driver updates rate informattion to mac80211 using ieee80211_tx_rate_update(). Per peer txrate information is updated through per peer statistics and is available for QCA9888/QCA9984/QCA4019/QCA998X only Tested on QCA9984 with firmware-5.bin_10.4-3.5.3-00053 Tested on QCA998X with firmware-5.bin_10.2.4-1.0-00036 Signed-off-by: Anilkumar Kolli <akolli@codeaurora.org> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2018-12-20	ath10k: add amsdu support for monitor mode	Yu Wang
	When processing HTT_T2H_MSG_TYPE_RX_IN_ORD_PADDR_IND, if the length of a msdu is larger than the tailroom of the rx skb, skb_over_panic issue will happen when calling skb_put. In monitor mode, amsdu will be handled in this path, and msdu_len of the first msdu_desc is the length of the entire amsdu, which might be larger than the maximum length of a skb, in such case, it will hit the issue upon. To fix this issue, process msdu list separately for monitor mode. Successfully tested with: QCA6174 (FW version: RM.4.4.1.c2-00057-QCARMSWP-1). Signed-off-by: Yu Wang <yyuwang@codeaurora.org> [kvalo@codeaurora.org: cosmetic cleanup] Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2018-12-20	drbd: Change drbd_request_detach_interruptible's return type to int	Nathan Chancellor
	Clang warns when an implicit conversion is done between enumerated types: drivers/block/drbd/drbd_state.c:708:8: warning: implicit conversion from enumeration type 'enum drbd_ret_code' to different enumeration type 'enum drbd_state_rv' [-Wenum-conversion] rv = ERR_INTR; ~ ^~~~~~~~ drbd_request_detach_interruptible's only call site is in the return statement of adm_detach, which returns an int. Change the return type of drbd_request_detach_interruptible to match, silencing Clang's warning. Reported-by: Nick Desaulniers <ndesaulniers@google.com> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-12-20	drbd: Avoid Clang warning about pointless switch statment	Nathan Chancellor
	There are several warnings from Clang about no case statement matching the constant 0: In file included from drivers/block/drbd/drbd_receiver.c:48: In file included from drivers/block/drbd/drbd_int.h:48: In file included from ./include/linux/drbd_genl_api.h:54: In file included from ./include/linux/genl_magic_struct.h:236: ./include/linux/drbd_genl.h:321:1: warning: no case matching constant switch condition '0' GENL_struct(DRBD_NLA_HELPER, 24, drbd_helper_info, ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ./include/linux/genl_magic_struct.h:220:10: note: expanded from macro 'GENL_struct' switch (0) { ^ Silence this warning by adding a 'case 0:' statement. Additionally, adjust the alignment of the statements in the ct_assert_unique macro to avoid a checkpatch warning. This solution was originally sent by Arnd Bergmann with a default case statement: https://lore.kernel.org/patchwork/patch/756723/ Link: https://github.com/ClangBuiltLinux/linux/issues/43 Suggested-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-12-20	drbd: introduce P_ZEROES (REQ_OP_WRITE_ZEROES on the "wire")	Lars Ellenberg
	And also re-enable partial-zero-out + discard aligned. With the introduction of REQ_OP_WRITE_ZEROES, we started to use that for both WRITE_ZEROES and DISCARDS, hoping that WRITE_ZEROES would "do what we want", UNMAP if possible, zero-out the rest. The example scenario is some LVM "thin" backend. While an un-allocated block on dm-thin reads as zeroes, on a dm-thin with "skip_block_zeroing=true", after a partial block write allocated that block, that same block may well map "undefined old garbage" from the backends on LBAs that have not yet been written to. If we cannot distinguish between zero-out and discard on the receiving side, to avoid "undefined old garbage" to pop up randomly at later times on supposedly zero-initialized blocks, we'd need to map all discards to zero-out on the receiving side. But that would potentially do a full alloc on thinly provisioned backends, even when the expectation was to unmap/trim/discard/de-allocate. We need to distinguish on the protocol level, whether we need to guarantee zeroes (and thus use zero-out, potentially doing the mentioned full-alloc), or if we want to put the emphasis on discard, and only do a "best effort zeroing" (by "discarding" blocks aligned to discard-granularity, and zeroing only potential unaligned head and tail clippings to at least try to avoid "false positives" in an online-verify later), hoping that someone set skip_block_zeroing=false. For some discussion regarding this on dm-devel, see also https://www.mail-archive.com/dm-devel%40redhat.com/msg07965.html https://www.redhat.com/archives/dm-devel/2018-January/msg00271.html For backward compatibility, P_TRIM means zero-out, unless the DRBD_FF_WZEROES feature flag is agreed upon during handshake. To have upper layers even try to submit WRITE ZEROES requests, we need to announce "efficient zeroout" independently. We need to fixup max_write_zeroes_sectors after blk_queue_stack_limits(): if we can handle "zeroes" efficiently on the protocol, we want to do that, even if our backend does not announce max_write_zeroes_sectors itself. Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-12-20	drbd: skip spurious timeout (ping-timeo) when failing promote	Lars Ellenberg
	If you try to promote a Secondary while connected to a Primary and allow-two-primaries is NOT set, we will wait for "ping-timeout" to give this node a chance to detect a dead primary, in case the cluster manager noticed faster than we did. But if we then are still connected to a Primary, we fail (after an additional timeout of ping-timout). This change skips the spurious second timeout. Most people won't notice really, since "ping-timeout" by default is half a second. But in some installations, ping-timeout may be 10 or 20 seconds or more, and spuriously delaying the error return becomes annoying. Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-12-20	drbd: don't retry connection if peers do not agree on "authentication" settings	Lars Ellenberg
	emma: "Unexpected data packet AuthChallenge (0x0010)" ava: "expected AuthChallenge packet, received: ReportProtocol (0x000b)" "Authentication of peer failed, trying again." Pattern repeats. There is no point in retrying the handshake, if we expect to receive an AuthChallenge, but the peer is not even configured to expect or use a shared secret. Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-12-20	drbd: fix print_st_err()'s prototype to match the definition	Luc Van Oostenryck
	print_st_err() is defined with its 4th argument taking an 'enum drbd_state_rv' but its prototype use an int for it. Fix this by using 'enum drbd_state_rv' in the prototype too. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com> Signed-off-by: Roland Kammerer <roland.kammerer@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-12-20	drbd: avoid spurious self-outdating with concurrent disconnect / down	Lars Ellenberg
	If peers are "simultaneously" told to disconnect from each other, either explicitly, or implicitly by taking down the resource, with bad timing, one side may see its disconnect "fail" with a result of "state change failed by peer", and interpret this as "please oudate yourself". Try to catch this by checking for current connection status, and possibly retry as local-only state change instead. Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-12-20	drbd: do not block when adjusting "disk-options" while IO is frozen	Lars Ellenberg
	"suspending" IO is overloaded. It can mean "do not allow new requests" (obviously), but it also may mean "must not complete pending IO", for example while the fencing handlers do their arbitration. When adjusting disk options, we suspend io (disallow new requests), then wait for the activity-log to become unused (drain all IO completions), and possibly replace it with a new activity log of different size. If the other "suspend IO" aspect is active, pending IO completions won't happen, and we would block forever (unkillable drbdsetup process). Fix this by skipping the activity log adjustment if the "al-extents" setting did not change. Also, in case it did change, fail early without blocking if it looks like we would block forever. Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-12-20	drbd: fix comment typos	Lars Ellenberg
	Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-12-20	drbd: reject attach of unsuitable uuids even if connected	Lars Ellenberg
	Multiple failure scenario: a) all good Connected Primary/Secondary UpToDate/UpToDate b) lose disk on Primary, Connected Primary/Secondary Diskless/UpToDate c) continue to write to the device, changes only make it to the Secondary storage. d) lose disk on Secondary, Connected Primary/Secondary Diskless/Diskless e) now try to re-attach on Primary This would have succeeded before, even though that is clearly the wrong data set to attach to (missing the modifications from c). Because we only compared our "effective" and the "to-be-attached" data generation uuid tags if (device->state.conn < C_CONNECTED). Fix: change that constraint to (device->state.pdsk != D_UP_TO_DATE) compare the uuids, and reject the attach. This patch also tries to improve the reverse scenario: first lose Secondary, then Primary disk, then try to attach the disk on Secondary. Before this patch, the attach on the Secondary succeeds, but since commit drbd: disconnect, if the wrong UUIDs are attached on a connected peer the Primary will notice unsuitable data, and drop the connection hard. Though unfortunately at a point in time during the handshake where we cannot easily abort the attach on the peer without more refactoring of the handshake. We now reject any attach to "unsuitable" uuids, as long as we can see a Primary role, unless we already have access to "good" data. Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-12-20	drbd: attach on connected diskless peer must not shrink a consistent device	Lars Ellenberg
	If we would reject a new handshake, if the peer had attached first, and then connected, we should force disconnect if the peer first connects, and only then attaches. Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-12-20	drbd: fix confusing error message during attach	Lars Ellenberg
	If we attach a (consistent) backing device, which knows about a last-agreed effective size, and that effective size is larger than the currently requested size, we refused to attach with ERR_DISK_TOO_SMALL Failure: (111) Low.dev. smaller than requested DRBD-dev. size. which is confusing to say the least. This patch changes the error code in that case to ERR_IMPLICIT_SHRINK Failure: (170) Implicit device shrinking not allowed. See kernel log. additional info from kernel: To-be-attached device has last effective > current size, and is consistent (9999 > 7777 sectors). Refusing to attach. It also allows to attach with an explicit size. Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-12-20	drbd: disconnect, if the wrong UUIDs are attached on a connected peer	Lars Ellenberg
	With "on-no-data-accessible suspend-io", DRBD requires the next attach or connect to be to the very same data generation uuid tag it lost last. If we first lost connection to the peer, then later lost connection to our own disk, we would usually refuse to re-connect to the peer, because it presents the wrong data set. However, if the peer first connects without a disk, and then attached its disk, we accepted that same wrong data set, which would be "unexpected" by any user of that DRBD and cause "undefined results" (read: very likely data corruption). The fix is to forcefully disconnect as soon as we notice that the peer attached to the "wrong" dataset. Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-12-20	drbd: ignore "all zero" peer volume sizes in handshake	Lars Ellenberg
	During handshake, if we are diskless ourselves, we used to accept any size presented by the peer. Which could be zero if that peer was just brought up and connected to us without having a disk attached first, in which case both peers would just "flip" their volume sizes. Now, even a diskless node will ignore "zero" sizes presented by a diskless peer. Also a currently Diskless Primary will refuse to shrink during handshake: it may be frozen, and waiting for a "suitable" local disk or peer to re-appear (on-no-data-accessible suspend-io). If the peer is smaller than what we used to be, it is not suitable. The logic for a diskless node during handshake is now supposed to be: believe the peer, if - I don't have a current size myself - we agree on the size anyways - I do have a current size, am Secondary, and he has the only disk - I do have a current size, am Primary, and he has the only disk, which is larger than my current size Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-12-20	drbd: centralize printk reporting of new size into drbd_set_my_capacity()	Lars Ellenberg
	Previously, some implicit resizes that happend during handshake have not been reported as prominently as explicit resize. Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-12-20	drbd: must not use connection after kref_put(&connection->kref)	Lars Ellenberg
	Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-12-20	drbd: narrow rcu_read_lock in drbd_sync_handshake	Roland Kammerer
	So far there was the possibility that we called genlmsg_new(GFP_NOIO)/mutex_lock() while holding an rcu_read_lock(). This included cases like: drbd_sync_handshake (acquire the RCU lock) drbd_asb_recover_1p drbd_khelper drbd_bcast_event genlmsg_new(GFP_NOIO) --> may sleep drbd_sync_handshake (acquire the RCU lock) drbd_asb_recover_1p drbd_khelper notify_helper genlmsg_new(GFP_NOIO) --> may sleep drbd_sync_handshake (acquire the RCU lock) drbd_asb_recover_1p drbd_khelper notify_helper mutex_lock --> may sleep While using GFP_ATOMIC whould have been possible in the first two cases, the real fix is to narrow the rcu_read_lock. Reported-by: Jia-Ju Bai <baijiaju1990@163.com> Reviewed-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Roland Kammerer <roland.kammerer@linbit.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-12-20	ath10k: fix kernel panic due to use after free	Karthikeyan Periyasamy
	This issue arise in a race condition between ath10k_sta_state() and ath10k_htt_fetch_peer_stats(), explained in below scenario Steps: 1. In ath10k_sta_state(), arsta->tx_stats get deallocated before peer deletion when the station moves from IEEE80211_STA_NONE to IEEE80211_STA_NOTEXIST state. 2. Meanwhile ath10k receive HTT_T2H_MSG_TYPE_PEER_STATS message. In ath10k_htt_fetch_peer_stats(), arsta->tx_stats get accessed after the peer validation check. Since arsta->tx_stats get freed before the peer deletion [1]. ath10k_htt_fetch_peer_stats() ended up in "use after free" situation. Fixed this issue by moving the arsta->tx_stats free handling after the peer deletion. so that ath10k_htt_fetch_peer_stats() will not end up in "use after free" situation. Kernel Panic: Unable to handle kernel NULL pointer dereference at virtual address 00000286 pgd = d8754000 [00000286] *pgd=00000000 Internal error: Oops: 5 [#1] PREEMPT SMP ARM ... CPU: 0 PID: 6245 Comm: hostapd Not tainted task: dc44cac0 ti: d4a38000 task.ti: d4a38000 PC is at kmem_cache_alloc+0x7c/0x114 LR is at ath10k_sta_state+0x190/0xd58 [ath10k_core] pc : [<c02bdc50>] lr : [<bf916b78>] psr: 20000013 sp : d4a39b88 ip : 00000000 fp : 00000001 r10: 00000000 r9 : 1d3bc000 r8 : 00000dc0 r7 : 000080d0 r6 : d4a38000 r5 : dd401b00 r4 : 00000286 r3 : 00000000 r2 : d4a39ba0 r1 : 000080d0 r0 : dd401b00 Flags: nzCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment user Control: 10c5787d Table: 5a75406a DAC: 00000015 Process hostapd (pid: 6245, stack limit = 0xd4a38238) Stack: (0xd4a39b88 to 0xd4a3a000) ... [<c02bdc50>] (kmem_cache_alloc) from [<bf916b78>] (ath10k_sta_state+0x190/0xd58 [ath10k_core]) [<bf916b78>] (ath10k_sta_state [ath10k_core]) from [<bf870d4c>] (sta_info_insert_rcu+0x418/0x61c [mac80211]) [<bf870d4c>] (sta_info_insert_rcu [mac80211]) from [<bf88634c>] (ieee80211_add_station+0xf0/0x134 [mac80211]) [<bf88634c>] (ieee80211_add_station [mac80211]) from [<bf83f3c4>] (nl80211_new_station+0x330/0x36c [cfg80211]) [<bf83f3c4>] (nl80211_new_station [cfg80211]) from [<bf6c4040>] (extack_doit+0x2c/0x74 [compat]) [<bf6c4040>] (extack_doit [compat]) from [<c05c285c>] (genl_rcv_msg+0x274/0x30c) [<c05c285c>] (genl_rcv_msg) from [<c05c1d98>] (netlink_rcv_skb+0x58/0xac) [<c05c1d98>] (netlink_rcv_skb) from [<c05c25d4>] (genl_rcv+0x20/0x34) [<c05c25d4>] (genl_rcv) from [<c05c1750>] (netlink_unicast+0x11c/0x204) [<c05c1750>] (netlink_unicast) from [<c05c1be0>] (netlink_sendmsg+0x30c/0x370) [<c05c1be0>] (netlink_sendmsg) from [<c0587e90>] (sock_sendmsg+0x70/0x84) [<c0587e90>] (sock_sendmsg) from [<c058970c>] (___sys_sendmsg.part.3+0x188/0x228) [<c058970c>] (___sys_sendmsg.part.3) from [<c058a594>] (__sys_sendmsg+0x4c/0x70) [<c058a594>] (__sys_sendmsg) from [<c0208c80>] (ret_fast_syscall+0x0/0x44) Code: ebfffec1 e1a04000 ea00001b e5953014 (e7940003) ath10k_pci 0000:01:00.0: SWBA overrun on vdev 0, skipped old beacon Hardware tested: QCA9984 Firmware tested: 10.4-3.6.0.1-00004 Fixes: a904417fc ("ath10k: add extended per sta tx statistics support") Signed-off-by: Karthikeyan Periyasamy <periyasa@codeaurora.org> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2018-12-20	dma-mapping: fix inverted logic in dma_supported	Thierry Reding
	The cleanup in commit 356da6d0cde3 ("dma-mapping: bypass indirect calls for dma-direct") accidentally inverted the logic in the check for the presence of a ->dma_supported() callback. Switch this back to the way it was to prevent a crash on boot. Fixes: 356da6d0cde3 ("dma-mapping: bypass indirect calls for dma-direct") Signed-off-by: Thierry Reding <treding@nvidia.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-12-20	ath10k: remove set but not used variable 'num_tdls_vifs'	YueHaibing
	Fixes gcc '-Wunused-but-set-variable' warning: drivers/net/wireless/ath/ath10k/mac.c: In function 'ath10k_sta_state': drivers/net/wireless/ath/ath10k/mac.c:6238:7: warning: variable 'num_tdls_vifs' set but not used [-Wunused-but-set-variable] 'num_tdls_vifs' not used any more after 9a993cc1ea95 ("ath10k: fix the logic of limiting tdls peer counts") Also, remove the single called function ath10k_mac_tdls_vifs_count and ath10k_mac_tdls_vifs_count_iter. Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2018-12-20	Revert "arm64: defconfig: Enable FSL_MC_BUS and FSL_MC_DPIO"	Horia Geantă
	This reverts commit d9678adbe733a770428a98651beaa2817d503ed3. Received below report from Stefan. Revert the commit until CAAM driver dependency cycles are fixed. this patch in next-20181214 breaks "make modules_install" for arm64/defconfig on my Ubuntu machine: DEPMOD 4.20.0-rc6-next-20181214 depmod: ERROR: Found 6 modules in dependency cycles! depmod: ERROR: Cycle detected: caamalg_desc -> dpaa2_caam -> authenc depmod: ERROR: Cycle detected: caamalg_desc -> dpaa2_caam -> fsl_mc_dpio depmod: ERROR: Cycle detected: dpaa2_caam -> caamhash_desc -> dpaa2_caam depmod: ERROR: Cycle detected: caamalg_desc -> dpaa2_caam -> caamhash_desc -> error depmod: ERROR: Cycle detected: caamalg_desc -> dpaa2_caam -> caamhash_desc -> caamalg_desc Reported-by: Stefan Wahren <stefan.wahren@i2se.com> Signed-off-by: Horia Geantă <horia.geanta@nxp.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2018-12-20	Merge tag 'renesas-fixes2-for-v4.20' of ↵	Arnd Bergmann
	git://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas into fixes Second Round of Renesas ARM Based SoC Fixes for v4.20 * R-Car D3 (r8a77995) SoC based Draak board - Correct CVBS input to allow probing of the VIN * tag 'renesas-fixes2-for-v4.20' of git://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas: arm64: dts: renesas: draak: Fix CVBS input
2018-12-20	selinux: expand superblock_doinit() calls	Al Viro
	Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Reviewed-by: David Howells <dhowells@redhat.com>
2018-12-20	vfs: Separate changing mount flags full remount	David Howells
	Separate just the changing of mount flags (MS_REMOUNT\|MS_BIND) from full remount because the mount data will get parsed with the new fs_context stuff prior to doing a remount - and this causes the syscall to fail under some circumstances. To quote Eric's explanation: [...] mount(..., MS_REMOUNT\|MS_BIND, ...) now validates the mount options string, which breaks systemd unit files with ProtectControlGroups=yes (e.g. systemd-networkd.service) when systemd does the following to change a cgroup (v1) mount to read-only: mount(NULL, "/run/systemd/unit-root/sys/fs/cgroup/systemd", NULL, MS_RDONLY\|MS_NOSUID\|MS_NODEV\|MS_NOEXEC\|MS_REMOUNT\|MS_BIND, NULL) ... when the kernel has CONFIG_CGROUPS=y but no cgroup subsystems enabled, since in that case the error "cgroup1: Need name or subsystem set" is hit when the mount options string is empty. Probably it doesn't make sense to validate the mount options string at all in the MS_REMOUNT\|MS_BIND case, though maybe you had something else in mind. This is also worthwhile doing because we will need to add a mount_setattr() syscall to take over the remount-bind function. Reported-by: Eric Biggers <ebiggers@google.com> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Reviewed-by: David Howells <dhowells@redhat.com>
2018-12-20	vfs: Suppress MS_* flag defs within the kernel unless explicitly enabled	David Howells
	Only the mount namespace code that implements mount(2) should be using the MS_* flags. Suppress them inside the kernel unless uapi/linux/mount.h is included. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Reviewed-by: David Howells <dhowells@redhat.com>