linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2020-07-08	dm writecache: reject asynchronous pmem devices	Michal Suchanek
	DM writecache does not handle asynchronous pmem. Reject it when supplied as cache. Link: https://lore.kernel.org/linux-nvdimm/87lfk5hahc.fsf@linux.ibm.com/ Fixes: 6e84200c0a29 ("virtio-pmem: Add virtio pmem driver") Signed-off-by: Michal Suchanek <msuchanek@suse.de> Acked-by: Mikulas Patocka <mpatocka@redhat.com> Cc: stable@vger.kernel.org # 5.3+ Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2020-07-08	dm: use bio_uninit instead of bio_disassociate_blkg	Christoph Hellwig
	bio_uninit is the proper API to clean up a BIO that has been allocated on stack or inside a structure that doesn't come from the BIO allocator. Switch dm to use that instead of bio_disassociate_blkg, which really is an implementation detail. Note that the bio_uninit calls are also moved to the two callers of __send_empty_flush, so that they better pair with the bio_init calls used to initialize them. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2020-07-08	regmap: add missing dependency on SoundWire	Pierre-Louis Bossart
	CONFIG_REGMAP is not selected when no other serial bus is supported. It's largely academic since CONFIG_I2C is usually selected e.g. by DRM, but still this can break randconfig so let's be explicit. Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Link: https://lore.kernel.org/r/20200707202628.113142-1-pierre-louis.bossart@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org>
2020-07-08	mmc: sdhci-msm: Override DLL_CONFIG only if the valid value is supplied	Veerabhadrarao Badiganti
	During DLL initialization, the DLL_CONFIG register value would be updated with the value supplied from the device-tree. Override this register only if a valid value is supplied. Fixes: 03591160ca19 ("mmc: sdhci-msm: Read and use DLL Config property from device tree file") Signed-off-by: Veerabhadrarao Badiganti <vbadigan@codeaurora.org> Link: https://lore.kernel.org/r/1594213888-2780-1-git-send-email-vbadigan@codeaurora.org Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2020-07-08	RDMA/siw: Fix reporting vendor_part_id	Kamal Heib
	Move the initialization of the vendor_part_id to be before calling ib_register_device(), this is needed because the query_device() callback is called from the context of ib_register_device() before initializing the vendor_part_id, so the reported value is wrong. Fixes: bdcf26bf9b3a ("rdma/siw: network and RDMA core interface") Link: https://lore.kernel.org/r/20200707130931.444724-1-kamalheib1@gmail.com Signed-off-by: Kamal Heib <kamalheib1@gmail.com> Reviewed-by: Bernard Metzler <bmt@zurich.ibm.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-08	powerpc/64s/exception: Fix 0x1500 interrupt handler crash	Nicholas Piggin
	A typo caused the interrupt handler to branch immediately to the common "unknown interrupt" handler and skip the special case test for denormal cause. This does not affect KVM softpatch handling (e.g., for POWER9 TM assist) because the KVM test was moved to common code by commit 9600f261acaa ("powerpc/64s/exception: Move KVM test to common code") just before this bug was introduced. Fixes: 3f7fbd97d07d ("powerpc/64s/exception: Clean up SRR specifiers") Reported-by: Paul Menzel <pmenzel@molgen.mpg.de> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Tested-by: Paul Menzel <pmenzel@molgen.mpg.de> [mpe: Split selftest into a separate patch] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200708074942.1713396-1-npiggin@gmail.com
2020-07-08	sched: Fix unreliable rseq cpu_id for new tasks	Mathieu Desnoyers
	While integrating rseq into glibc and replacing glibc's sched_getcpu implementation with rseq, glibc's tests discovered an issue with incorrect __rseq_abi.cpu_id field value right after the first time a newly created process issues sched_setaffinity. For the records, it triggers after building glibc and running tests, and then issuing: for x in {1..2000} ; do posix/tst-affinity-static & done and shows up as: error: Unexpected CPU 2, expected 0 error: Unexpected CPU 2, expected 0 error: Unexpected CPU 2, expected 0 error: Unexpected CPU 2, expected 0 error: Unexpected CPU 138, expected 0 error: Unexpected CPU 138, expected 0 error: Unexpected CPU 138, expected 0 error: Unexpected CPU 138, expected 0 This is caused by the scheduler invoking __set_task_cpu() directly from sched_fork() and wake_up_new_task(), thus bypassing rseq_migrate() which is done by set_task_cpu(). Add the missing rseq_migrate() to both functions. The only other direct use of __set_task_cpu() is done by init_idle(), which does not involve a user-space task. Based on my testing with the glibc test-case, just adding rseq_migrate() to wake_up_new_task() is sufficient to fix the observed issue. Also add it to sched_fork() to keep things consistent. The reason why this never triggered so far with the rseq/basic_test selftest is unclear. The current use of sched_getcpu(3) does not typically require it to be always accurate. However, use of the __rseq_abi.cpu_id field within rseq critical sections requires it to be accurate. If it is not accurate, it can cause corruption in the per-cpu data targeted by rseq critical sections in user-space. Reported-By: Florian Weimer <fweimer@redhat.com> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-By: Florian Weimer <fweimer@redhat.com> Cc: stable@vger.kernel.org # v4.18+ Link: https://lkml.kernel.org/r/20200707201505.2632-1-mathieu.desnoyers@efficios.com
2020-07-08	sched: Fix loadavg accounting race	Peter Zijlstra
	The recent commit: c6e7bd7afaeb ("sched/core: Optimize ttwu() spinning on p->on_cpu") moved these lines in ttwu(): p->sched_contributes_to_load = !!task_contributes_to_load(p); p->state = TASK_WAKING; up before: smp_cond_load_acquire(&p->on_cpu, !VAL); into the 'p->on_rq == 0' block, with the thinking that once we hit schedule() the current task cannot change it's ->state anymore. And while this is true, it is both incorrect and flawed. It is incorrect in that we need at least an ACQUIRE on 'p->on_rq == 0' to avoid weak hardware from re-ordering things for us. This can fairly easily be achieved by relying on the control-dependency already in place. The second problem, which makes the flaw in the original argument, is that while schedule() will not change prev->state, it will read it a number of times (arguably too many times since it's marked volatile). The previous condition 'p->on_cpu == 0' was sufficient because that indicates schedule() has completed, and will no longer read prev->state. So now the trick is to make this same true for the (much) earlier 'prev->on_rq == 0' case. Furthermore, in order to make the ordering stick, the 'prev->on_rq = 0' assignment needs to he a RELEASE, but adding additional ordering to schedule() is an unwelcome proposition at the best of times, doubly so for mere accounting. Luckily we can push the prev->state load up before rq->lock, with the only caveat that we then have to re-read the state after. However, we know that if it changed, we no longer have to worry about the blocking path. This gives us the required ordering, if we block, we did the prev->state load before an (effective) smp_mb() and the p->on_rq store needs not change. With this we end up with the effective ordering: LOAD p->state LOAD-ACQUIRE p->on_rq == 0 MB STORE p->on_rq, 0 STORE p->state, TASK_WAKING which ensures the TASK_WAKING store happens after the prev->state load, and all is well again. Fixes: c6e7bd7afaeb ("sched/core: Optimize ttwu() spinning on p->on_cpu") Reported-by: Dave Jones <davej@codemonkey.org.uk> Reported-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Dave Jones <davej@codemonkey.org.uk> Tested-by: Paul Gortmaker <paul.gortmaker@windriver.com> Link: https://lkml.kernel.org/r/20200707102957.GN117543@hirez.programming.kicks-ass.net
2020-07-08	drm/hisilicon/hibmc: Move drm_fbdev_generic_setup() down to avoid the splat	Zenghui Yu
	The HiSilicon hibmc driver triggers a splat at boot time as below [ 14.137806] ------------[ cut here ]------------ [ 14.142405] hibmc-drm 0000:0a:00.0: Device has not been registered. [ 14.148661] WARNING: CPU: 0 PID: 496 at drivers/gpu/drm/drm_fb_helper.c:2233 drm_fbdev_generic_setup+0x15c/0x1b8 [ 14.158787] [...] [ 14.278307] Call trace: [ 14.280742] drm_fbdev_generic_setup+0x15c/0x1b8 [ 14.285337] hibmc_pci_probe+0x354/0x418 [ 14.289242] local_pci_probe+0x44/0x98 [ 14.292974] work_for_cpu_fn+0x20/0x30 [ 14.296708] process_one_work+0x1c4/0x4e0 [ 14.300698] worker_thread+0x2c8/0x528 [ 14.304431] kthread+0x138/0x140 [ 14.307646] ret_from_fork+0x10/0x18 [ 14.311205] ---[ end trace a2000ec2d838af4d ]--- This turned out to be due to the fbdev device hasn't been registered when drm_fbdev_generic_setup() is invoked. Let's fix the splat by moving it down after drm_dev_register() which will follow the "Display driver example" documented by commit de99f0600a79 ("drm/drv: DOC: Add driver example code"). Signed-off-by: Zenghui Yu <yuzenghui@huawei.com> Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de> Signed-off-by: Xinliang Liu <xinliang.liu@linaro.org> Link: https://patchwork.freedesktop.org/patch/msgid/20200706144713.1123-1-yuzenghui@huawei.com
2020-07-08	smb3: fix unneeded error message on change notify	Steve French
	We should not be logging a warning repeatedly on change notify. CC: Stable <stable@vger.kernel.org> # v5.6+ Signed-off-by: Steve French <stfrench@microsoft.com> Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
2020-07-08	Merge tag 'iio-fixes-for-5.8a' of ↵	Greg Kroah-Hartman
	git://git.kernel.org/pub/scm/linux/kernel/git/jic23/iio into staging-linus Jonathan writes: First set of IIO and counter fixes in the 5.8 cycle. The buffer alignment fixes continue to trickle through as we get reviews in. The rest are the standard mixed bag of long term issues just discovered an things we missed in this cycle. IIO fixes * core - Add missing IIO_MOD_H2 and ETHANOL strings. Somehow got missed when drivers were added using these in attribute names. * afe4403, afe4404, ak8974, hdc100x, hts221, ms5611 - Fix a recently identified issue with alignment when using iio_push_to_buffers_with_timestamp which assumes the timestamp is 8 byte aligned. * ad7780 - Fix a some premature / excess cleanup in an error path. * adi-axi-adc - Fix reference counting on the wrong object. * ak8974 - Fix unbalance runtime pm. * mma8452 - Fix missing iio_device_unregister in error path. * zp2326 - Error handling for pm_runtime_get_sync failing. counter fixes * Add lock guards in 104-quad-8 to protect against races - done in 2 patches to allow easy back porting. * tag 'iio-fixes-for-5.8a' of git://git.kernel.org/pub/scm/linux/kernel/git/jic23/iio: iio: adc: ad7780: Fix a resource handling path in 'ad7780_probe()' iio:pressure:ms5611 Fix buffer element alignment iio:humidity:hts221 Fix alignment and data leak issues iio:humidity:hdc100x Fix alignment and data leak issues iio:magnetometer:ak8974: Fix alignment and data leak issues iio: adc: adi-axi-adc: Fix object reference counting iio: pressure: zpa2326: handle pm_runtime_get_sync failure counter: 104-quad-8: Add lock guards - filter clock prescaler counter: 104-quad-8: Add lock guards - differential encoder iio: core: add missing IIO_MOD_H2/ETHANOL string identifiers iio: magnetometer: ak8974: Fix runtime PM imbalance on error iio: mma8452: Add missed iio_device_unregister() call in mma8452_probe() iio:health:afe4404 Fix timestamp alignment and prevent data leak. iio:health:afe4403 Fix timestamp alignment and prevent data leak.
2020-07-08	xtensa: simplify xtensa_pmu_irq_handler	Xu Wang
	Use for_each_set_bit() instead of open-coding it to simplify the code. Signed-off-by: Xu Wang <vulab@iscas.ac.cn> Message-Id: <20200708062023.7986-1-vulab@iscas.ac.cn> Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
2020-07-08	fs: remove __vfs_read	Christoph Hellwig
	Fold it into the two callers. Signed-off-by: Christoph Hellwig <hch@lst.de>
2020-07-08	fs: implement kernel_read using __kernel_read	Christoph Hellwig
	Consolidate the two in-kernel read helpers to make upcoming changes easier. The only difference are the missing call to rw_verify_area in kernel_read, and an access_ok check that doesn't make sense for kernel buffers to start with. Signed-off-by: Christoph Hellwig <hch@lst.de>
2020-07-08	integrity/ima: switch to using __kernel_read	Christoph Hellwig
	__kernel_read has a bunch of additional sanity checks, and this moves the set_fs out of non-core code. Signed-off-by: Christoph Hellwig <hch@lst.de>
2020-07-08	fs: add a __kernel_read helper	Christoph Hellwig
	This is the counterpart to __kernel_write, and skip the rw_verify_area call compared to kernel_read. Signed-off-by: Christoph Hellwig <hch@lst.de>
2020-07-08	fs: remove __vfs_write	Christoph Hellwig
	Fold it into the two callers. Signed-off-by: Christoph Hellwig <hch@lst.de>
2020-07-08	fs: implement kernel_write using __kernel_write	Christoph Hellwig
	Consolidate the two in-kernel write helpers to make upcoming changes easier. The only difference are the missing call to rw_verify_area in kernel_write, and an access_ok check that doesn't make sense for kernel buffers to start with. Signed-off-by: Christoph Hellwig <hch@lst.de>
2020-07-08	fs: check FMODE_WRITE in __kernel_write	Christoph Hellwig
	Add a WARN_ON_ONCE if the file isn't actually open for write. This matches the check done in vfs_write, but actually warn warns as a kernel user calling write on a file not opened for writing is a pretty obvious programming error. Signed-off-by: Christoph Hellwig <hch@lst.de>
2020-07-08	fs: unexport __kernel_write	Christoph Hellwig
	This is a very special interface that skips sb_writes protection, and not used by modules anymore. Signed-off-by: Christoph Hellwig <hch@lst.de>
2020-07-08	bpfilter: switch to kernel_write	Christoph Hellwig
	While pipes don't really need sb_writers projection, __kernel_write is an interface better kept private, and the additional rw_verify_area does not hurt here. Signed-off-by: Christoph Hellwig <hch@lst.de>
2020-07-08	autofs: switch to kernel_write	Christoph Hellwig
	While pipes don't really need sb_writers projection, __kernel_write is an interface better kept private, and the additional rw_verify_area does not hurt here. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Ian Kent <raven@themaw.net>
2020-07-08	cachefiles: switch to kernel_write	Christoph Hellwig
	__kernel_write doesn't take a sb_writers references, which we need here. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: David Howells <dhowells@redhat.com>
2020-07-08	scsi: dh: Add Fujitsu device to devinfo and dh lists	Steve Schremmer
	Add FUJITSU ETERNUS_AHB Link: https://lore.kernel.org/r/DM6PR06MB5276CCA765336BD312C4282E8C660@DM6PR06MB5276.namprd06.prod.outlook.com Signed-off-by: Steve Schremmer <steve.schremmer@netapp.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-07-07	cifs: remove the retry in cifs_poxis_lock_set	yangerkun
	The caller of cifs_posix_lock_set will do retry(like fcntl_setlk64->do_lock_file_wait) if we will wait for any file_lock. So the retry in cifs_poxis_lock_set seems duplicated, remove it to make a cleanup. Signed-off-by: yangerkun <yangerkun@huawei.com> Signed-off-by: Steve French <stfrench@microsoft.com> Reviewed-by: NeilBrown <neilb@suse.de>
2020-07-07	scsi: mpt3sas: Fix error returns in BRM_status_show	Johannes Thumshirn
	BRM_status_show() has several error branches, but none of them record the error in the error return. Also while at it remove the manual mutex_unlock() of the pci_access_mutex in case of an ongoing pci error recovery or host removal and jump to the cleanup label instead. Note: We can safely jump to out from here as io_unit_pg3 is initialized to NULL and if it hasn't been allocated, kfree() skips the NULL pointer. [mkp: compilation warning] Link: https://lore.kernel.org/r/20200701131454.5255-1-johannes.thumshirn@wdc.com Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com> Acked-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com> Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-07-08	drm/nouveau/nouveau: fix page fault on device private memory	Ralph Campbell
	If system memory is migrated to device private memory and no GPU MMU page table entry exists, the GPU will fault and call hmm_range_fault() to get the PFN for the page. Since the .dev_private_owner pointer in struct hmm_range is not set, hmm_range_fault returns an error which results in the GPU program stopping with a fatal fault. Fix this by setting .dev_private_owner appropriately. Fixes: 08ddddda667b ("mm/hmm: check the device private page owner in hmm_range_fault()") Cc: stable@vger.kernel.org Signed-off-by: Ralph Campbell <rcampbell@nvidia.com> Reviewed-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-07-08	drm/nouveau/svm: fix migrate page regression	Ralph Campbell
	The patch to add zero page migration to GPU memory inadvertently included part of a future change which broke normal page migration to GPU memory by copying too much data and corrupting GPU memory. Fix this by only copying one page instead of a byte count. Fixes: 9d4296a7d4b3 ("drm/nouveau/nouveau/hmm: fix migrate zero page to GPU") Signed-off-by: Ralph Campbell <rcampbell@nvidia.com> Reviewed-by: John Hubbard <jhubbard@nvidia.com> Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-07-08	drm/nouveau/i2c/g94-: increase NV_PMGR_DP_AUXCTL_TRANSACTREQ timeout	Ben Skeggs
	Tegra TRM says worst-case reply time is 1216us, and this should fix some spurious timeouts that have been popping up. Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-07-08	drm/nouveau/kms/nv50-: bail from nv50_audio_disable() early if audio not enabled	Ben Skeggs
	Prevents "snd_hda_codec_hdmi hdaudioC1D0: HDMI: pin nid 5 not registered" that occur on some configurations. Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-07-07	drm/i915/gt: Pin the rings before marking active	Chris Wilson
	On eviction, we acquire the vm->mutex and then wait on the vma->active. Therefore when binding and pinning the vma, we must follow the same sequence, lock/pin the vma then mark it active. Otherwise, we mark the vma as active, then wait for the vm->mutex, and meanwhile the evictor holding the mutex waits upon us to complete our activity. Fixes: 8ccfc20a7d56 ("drm/i915/gt: Mark ring->vma as active while pinned") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: <stable@vger.kernel.org> # v5.6+ Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200706170138.8993-1-chris@chris-wilson.co.uk (cherry picked from commit 8567774e87e23a57155e5102f81208729b992ae6) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2020-07-07	smb3: fix access denied on change notify request to some servers	Steve French
	read permission, not just read attributes permission, is required on the directory. See MS-SMB2 (protocol specification) section 3.3.5.19. Signed-off-by: Steve French <stfrench@microsoft.com> CC: Stable <stable@vger.kernel.org> # v5.6+ Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
2020-07-07	ionic: centralize queue reset code	Shannon Nelson
	The queue reset pattern is used in a couple different places, only slightly different from each other, and could cause issues if one gets changed and the other didn't. This puts them together so that only one version is needed, yet each can have slighty different effects by passing in a pointer to a work function to do whatever configuration twiddling is needed in the middle of the reset. This specifically addresses issues seen where under loops of changing ring size or queue count parameters we could occasionally bump into the netdev watchdog. v2: added more commit message commentary Fixes: 4d03e00a2140 ("ionic: Add initial ethtool support") Signed-off-by: Shannon Nelson <snelson@pensando.io> Acked-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-07	vlan: consolidate VLAN parsing code and limit max parsing depth	Toke Høiland-Jørgensen
	Toshiaki pointed out that we now have two very similar functions to extract the L3 protocol number in the presence of VLAN tags. And Daniel pointed out that the unbounded parsing loop makes it possible for maliciously crafted packets to loop through potentially hundreds of tags. Fix both of these issues by consolidating the two parsing functions and limiting the VLAN tag parsing to a max depth of 8 tags. As part of this, switch over __vlan_get_protocol() to use skb_header_pointer() instead of pskb_may_pull(), to avoid the possible side effects of the latter and keep the skb pointer 'const' through all the parsing functions. v2: - Use limit of 8 tags instead of 32 (matching XMIT_RECURSION_LIMIT) Reported-by: Toshiaki Makita <toshiaki.makita1@gmail.com> Reported-by: Daniel Borkmann <daniel@iogearbox.net> Fixes: d7bf2ebebc2b ("sched: consistently handle layer3 header accesses in the presence of VLANs") Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-07	net: qed: fix buffer overflow on ethtool -d	Alexander Lobakin
	When generating debug dump, driver firstly collects all data in binary form, and then performs per-feature formatting to human-readable if it is supported. For ethtool -d, this is roughly incorrect for two reasons. First of all, drivers should always provide only original raw dumps to Ethtool without any changes. The second, and more critical, is that Ethtool's output buffer size is strictly determined by ethtool_ops::get_regs_len(), and all data must fit in it. The current version of driver always returns the size of raw data, but the size of the formatted buffer exceeds it in most cases. This leads to out-of-bound writes and memory corruption. Address both issues by adding an option to return original, non-formatted debug data, and using it for Ethtool case. v2: - Expand commit message to make it more clear; - No functional changes. Fixes: c965db444629 ("qed: Add support for debug data collection") Signed-off-by: Alexander Lobakin <alobakin@marvell.com> Signed-off-by: Igor Russkikh <irusskikh@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-07	Merge tag 'perf-tools-fixes-2020-07-07' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux Pull perf tooling fixes from Arnaldo Carvalho de Melo: - Intel PT fixes for PEBS-via-PT with registers - Fixes for Intel PT python based GUI - Avoid duplicated sideband events with Intel PT in system wide tracing - Remove needless 'dummy' event from TUI menu, used when synthesizing meta data events for pre-existing processes - Fix corner case segfault when pressing enter in a screen without entries in the TUI for report/top - Fixes for time stamp handling in libtraceevent - Explicitly set utf-8 encoding in perf flamegraph - Update arch/x86/lib/memcpy_64.S copy used in 'perf bench mem memcpy', silencing perf build warning * tag 'perf-tools-fixes-2020-07-07' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: perf report TUI: Remove needless 'dummy' event from menu perf intel-pt: Fix PEBS sample for XMM registers perf intel-pt: Fix displaying PEBS-via-PT with registers perf intel-pt: Fix recording PEBS-via-PT with registers perf report TUI: Fix segmentation fault in perf_evsel__hists_browse() tools lib traceevent: Add proper KBUFFER_TYPE_TIME_STAMP handling tools lib traceevent: Add API to read time information from kbuffer perf scripts python: exported-sql-viewer.py: Fix time chart call tree perf scripts python: exported-sql-viewer.py: Fix zero id in call tree 'Find' result perf scripts python: exported-sql-viewer.py: Fix zero id in call graph 'Find' result perf scripts python: exported-sql-viewer.py: Fix unexpanded 'Find' result perf record: Fix duplicated sideband events with Intel PT system wide tracing perf scripts python: export-to-postgresql.py: Fix struct.pack() int argument tools arch: Update arch/x86/lib/memcpy_64.S copy used in 'perf bench mem memcpy' perf flamegraph: Explicitly set utf-8 encoding
2020-07-07	bridge: mcast: Fix MLD2 Report IPv6 payload length check	Linus Lüssing
	Commit e57f61858b7c ("net: bridge: mcast: fix stale nsrcs pointer in igmp3/mld2 report handling") introduced a bug in the IPv6 header payload length check which would potentially lead to rejecting a valid MLD2 Report: The check needs to take into account the 2 bytes for the "Number of Sources" field in the "Multicast Address Record" before reading it. And not the size of a pointer to this field. Fixes: e57f61858b7c ("net: bridge: mcast: fix stale nsrcs pointer in igmp3/mld2 report handling") Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: Linus Lüssing <linus.luessing@c0d3.blue> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-07	net: Added pointer check for dst->ops->neigh_lookup in dst_neigh_lookup_skb	Martin Varghese
	The packets from tunnel devices (eg bareudp) may have only metadata in the dst pointer of skb. Hence a pointer check of neigh_lookup is needed in dst_neigh_lookup_skb Kernel crashes when packets from bareudp device is processed in the kernel neighbour subsytem. [ 133.384484] BUG: kernel NULL pointer dereference, address: 0000000000000000 [ 133.385240] #PF: supervisor instruction fetch in kernel mode [ 133.385828] #PF: error_code(0x0010) - not-present page [ 133.386603] PGD 0 P4D 0 [ 133.386875] Oops: 0010 [#1] SMP PTI [ 133.387275] CPU: 0 PID: 5045 Comm: ping Tainted: G W 5.8.0-rc2+ #15 [ 133.388052] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011 [ 133.391076] RIP: 0010:0x0 [ 133.392401] Code: Bad RIP value. [ 133.394029] RSP: 0018:ffffb79980003d50 EFLAGS: 00010246 [ 133.396656] RAX: 0000000080000102 RBX: ffff9de2fe0d6600 RCX: ffff9de2fe5e9d00 [ 133.399018] RDX: 0000000000000000 RSI: ffff9de2fe5e9d00 RDI: ffff9de2fc21b400 [ 133.399685] RBP: ffff9de2fe5e9d00 R08: 0000000000000000 R09: 0000000000000000 [ 133.400350] R10: ffff9de2fbc6be22 R11: ffff9de2fe0d6600 R12: ffff9de2fc21b400 [ 133.401010] R13: ffff9de2fe0d6628 R14: 0000000000000001 R15: 0000000000000003 [ 133.401667] FS: 00007fe014918740(0000) GS:ffff9de2fec00000(0000) knlGS:0000000000000000 [ 133.402412] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 133.402948] CR2: ffffffffffffffd6 CR3: 000000003bb72000 CR4: 00000000000006f0 [ 133.403611] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 133.404270] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 133.404933] Call Trace: [ 133.405169] <IRQ> [ 133.405367] __neigh_update+0x5a4/0x8f0 [ 133.405734] arp_process+0x294/0x820 [ 133.406076] ? __netif_receive_skb_core+0x866/0xe70 [ 133.406557] arp_rcv+0x129/0x1c0 [ 133.406882] __netif_receive_skb_one_core+0x95/0xb0 [ 133.407340] process_backlog+0xa7/0x150 [ 133.407705] net_rx_action+0x2af/0x420 [ 133.408457] __do_softirq+0xda/0x2a8 [ 133.408813] asm_call_on_stack+0x12/0x20 [ 133.409290] </IRQ> [ 133.409519] do_softirq_own_stack+0x39/0x50 [ 133.410036] do_softirq+0x50/0x60 [ 133.410401] __local_bh_enable_ip+0x50/0x60 [ 133.410871] ip_finish_output2+0x195/0x530 [ 133.411288] ip_output+0x72/0xf0 [ 133.411673] ? __ip_finish_output+0x1f0/0x1f0 [ 133.412122] ip_send_skb+0x15/0x40 [ 133.412471] raw_sendmsg+0x853/0xab0 [ 133.412855] ? insert_pfn+0xfe/0x270 [ 133.413827] ? vvar_fault+0xec/0x190 [ 133.414772] sock_sendmsg+0x57/0x80 [ 133.415685] __sys_sendto+0xdc/0x160 [ 133.416605] ? syscall_trace_enter+0x1d4/0x2b0 [ 133.417679] ? __audit_syscall_exit+0x1d9/0x280 [ 133.418753] ? __prepare_exit_to_usermode+0x5d/0x1a0 [ 133.419819] __x64_sys_sendto+0x24/0x30 [ 133.420848] do_syscall_64+0x4d/0x90 [ 133.421768] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 133.422833] RIP: 0033:0x7fe013689c03 [ 133.423749] Code: Bad RIP value. [ 133.424624] RSP: 002b:00007ffc7288f418 EFLAGS: 00000246 ORIG_RAX: 000000000000002c [ 133.425940] RAX: ffffffffffffffda RBX: 000056151fc63720 RCX: 00007fe013689c03 [ 133.427225] RDX: 0000000000000040 RSI: 000056151fc63720 RDI: 0000000000000003 [ 133.428481] RBP: 00007ffc72890b30 R08: 000056151fc60500 R09: 0000000000000010 [ 133.429757] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000040 [ 133.431041] R13: 000056151fc636e0 R14: 000056151fc616bc R15: 0000000000000080 [ 133.432481] Modules linked in: mpls_iptunnel act_mirred act_tunnel_key cls_flower sch_ingress veth mpls_router ip_tunnel bareudp ip6_udp_tunnel udp_tunnel macsec udp_diag inet_diag unix_diag af_packet_diag netlink_diag binfmt_misc xt_MASQUERADE iptable_nat xt_addrtype xt_conntrack nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 br_netfilter bridge stp llc ebtable_filter ebtables overlay ip6table_filter ip6_tables iptable_filter sunrpc ext4 mbcache jbd2 pcspkr i2c_piix4 virtio_balloon joydev ip_tables xfs libcrc32c ata_generic qxl pata_acpi drm_ttm_helper ttm drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops drm ata_piix libata virtio_net net_failover virtio_console failover virtio_blk i2c_core virtio_pci virtio_ring serio_raw floppy virtio dm_mirror dm_region_hash dm_log dm_mod [ 133.444045] CR2: 0000000000000000 [ 133.445082] ---[ end trace f4aeee1958fd1638 ]--- [ 133.446236] RIP: 0010:0x0 [ 133.447180] Code: Bad RIP value. [ 133.448152] RSP: 0018:ffffb79980003d50 EFLAGS: 00010246 [ 133.449363] RAX: 0000000080000102 RBX: ffff9de2fe0d6600 RCX: ffff9de2fe5e9d00 [ 133.450835] RDX: 0000000000000000 RSI: ffff9de2fe5e9d00 RDI: ffff9de2fc21b400 [ 133.452237] RBP: ffff9de2fe5e9d00 R08: 0000000000000000 R09: 0000000000000000 [ 133.453722] R10: ffff9de2fbc6be22 R11: ffff9de2fe0d6600 R12: ffff9de2fc21b400 [ 133.455149] R13: ffff9de2fe0d6628 R14: 0000000000000001 R15: 0000000000000003 [ 133.456520] FS: 00007fe014918740(0000) GS:ffff9de2fec00000(0000) knlGS:0000000000000000 [ 133.458046] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 133.459342] CR2: ffffffffffffffd6 CR3: 000000003bb72000 CR4: 00000000000006f0 [ 133.460782] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 133.462240] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 133.463697] Kernel panic - not syncing: Fatal exception in interrupt [ 133.465226] Kernel Offset: 0xfa00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff) [ 133.467025] ---[ end Kernel panic - not syncing: Fatal exception in interrupt ]--- Fixes: aaa0c23cb901 ("Fix dst_neigh_lookup/dst_neigh_lookup_skb return value handling bug") Signed-off-by: Martin Varghese <martin.varghese@nokia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-07	net/sched: act_ct: add miss tcf_lastuse_update.	wenxu
	When tcf_ct_act execute the tcf_lastuse_update should be update or the used stats never update filter protocol ip pref 3 flower chain 0 filter protocol ip pref 3 flower chain 0 handle 0x1 eth_type ipv4 dst_ip 1.1.1.1 ip_flags frag/firstfrag skip_hw not_in_hw action order 1: ct zone 1 nat pipe index 1 ref 1 bind 1 installed 103 sec used 103 sec Action statistics: Sent 151500 bytes 101 pkt (dropped 0, overlimits 0 requeues 0) backlog 0b 0p requeues 0 cookie 4519c04dc64a1a295787aab13b6a50fb Signed-off-by: wenxu <wenxu@ucloud.cn> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-07	net/mlx5e: Do not include rwlock.h directly	Sebastian Andrzej Siewior
	rwlock.h should not be included directly. Instead linux/splinlock.h should be included. Including it directly will break the RT build. Fixes: 549c243e4e010 ("net/mlx5e: Extract neigh-specific code from en_rep.c to rep/neigh.c") Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Acked-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-07	mptcp: fix DSS map generation on fin retransmission	Paolo Abeni
	The RFC 8684 mandates that no-data DATA FIN packets should carry a DSS with 0 sequence number and data len equal to 1. Currently, on FIN retransmission we re-use the existing mapping; if the previous fin transmission was part of a partially acked data packet, we could end-up writing in the egress packet a non-compliant DSS. The above will be detected by a "Bad mapping" warning on the receiver side. This change addresses the issue explicitly checking for 0 len packet when adding the DATA_FIN option. Fixes: 6d0060f600ad ("mptcp: Write MPTCP DSS headers to outgoing data packets") Reported-by: syzbot+42a07faa5923cfaeb9c9@syzkaller.appspotmail.com Tested-by: Christoph Paasch <cpaasch@apple.com> Reviewed-by: Christoph Paasch <cpaasch@apple.com> Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-07	ipv4: fill fl4_icmp_{type,code} in ping_v4_sendmsg	Sabrina Dubroca
	IPv4 ping sockets don't set fl4.fl4_icmp_{type,code}, which leads to incomplete IPsec ACQUIRE messages being sent to userspace. Currently, both raw sockets and IPv6 ping sockets set those fields. Expected output of "ip xfrm monitor": acquire proto esp sel src 10.0.2.15/32 dst 8.8.8.8/32 proto icmp type 8 code 0 dev ens4 policy src 10.0.2.15/32 dst 8.8.8.8/32 <snip> Currently with ping sockets: acquire proto esp sel src 10.0.2.15/32 dst 8.8.8.8/32 proto icmp type 0 code 0 dev ens4 policy src 10.0.2.15/32 dst 8.8.8.8/32 <snip> The Libreswan test suite found this problem after Fedora changed the value for the sysctl net.ipv4.ping_group_range. Fixes: c319b4d76b9e ("net: ipv4: add IPPROTO_ICMP socket kind") Reported-by: Paul Wouters <pwouters@redhat.com> Tested-by: Paul Wouters <pwouters@redhat.com> Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-07	net: ethernet: fec: prevent tx starvation under high rx load	Tobias Waldekranz
	In the ISR, we poll the event register for the queues in need of service and then enter polled mode. After this point, the event register will never be read again until we exit polled mode. In a scenario where a UDP flow is routed back out through the same interface, i.e. "router-on-a-stick" we'll typically only see an rx queue event initially. Once we start to process the incoming flow we'll be locked polled mode, but we'll never clean the tx rings since that event is never caught. Eventually the netdev watchdog will trip, causing all buffers to be dropped and then the process starts over again. Rework the NAPI poll to keep trying to consome the entire budget as long as new events are coming in, making sure to service all rx/tx queues, in priority order, on each pass. Fixes: 4d494cdc92b3 ("net: fec: change data structure to support multiqueue") Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com> Tested-by: Fugang Duan <fugang.duan@nxp.com> Reviewed-by: Fugang Duan <fugang.duan@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-07	net: sky2: initialize return of gm_phy_read	Tom Rix
	clang static analysis flags this garbage return drivers/net/ethernet/marvell/sky2.c:208:2: warning: Undefined or garbage value returned to caller [core.uninitialized.UndefReturn] return v; ^~~~~~~~ static inline u16 gm_phy_read( ... { u16 v; __gm_phy_read(hw, port, reg, &v); return v; } __gm_phy_read can return without setting v. So handle similar to skge.c's gm_phy_read, initialize v. Signed-off-by: Tom Rix <trix@redhat.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-07	Merge tag 'mtd/fixes-for-5.8-rc5' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux Pull MTD fixes from Miquel Raynal: "MTD: - Set a missing master partition panic write flag Raw NAND: - Fix build issue in the xway driver - Fix a wrong return code" * tag 'mtd/fixes-for-5.8-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux: mtd: rawnand: xway: Fix build issue mtd: set master partition panic write flag nandsim: Fix return code testing of ns_find_operation()
2020-07-07	gfs2: Rework read and page fault locking	Andreas Gruenbacher
	So far, gfs2 has taken the inode glocks inside the ->readpage and ->readahead address space operations. Since commit d4388340ae0b ("fs: convert mpage_readpages to mpage_readahead"), gfs2_readahead is passed the pages to read ahead locked. With that, the current holder of the inode glock may be trying to lock one of those pages while gfs2_readahead is trying to take the inode glock, resulting in a deadlock. Fix that by moving the lock taking to the higher-level ->read_iter file and ->fault vm operations. This also gets rid of an ugly lock inversion workaround in gfs2_readpage. The cache consistency model of filesystems like gfs2 is such that if data is found in the page cache, the data is up to date and can be used without taking any filesystem locks. If a page is not cached, filesystem locks must be taken before populating the page cache. To avoid taking the inode glock when the data is already cached, gfs2_file_read_iter first tries to read the data with the IOCB_NOIO flag set. If that fails, the inode glock is taken and the operation is retried with the IOCB_NOIO flag cleared. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2020-07-07	fs: Add IOCB_NOIO flag for generic_file_read_iter	Andreas Gruenbacher
	Add an IOCB_NOIO flag that indicates to generic_file_read_iter that it shouldn't trigger any filesystem I/O for the actual request or for readahead. This allows to do tentative reads out of the page cache as some filesystems allow, and to take the appropriate locks and retry the reads only if the requested pages are not cached. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2020-07-07	Merge tag 'for-5.8-rc4-tag' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs fixes from David Sterba: - regression fix of a leak in global block reserve accounting - fix a (hard to hit) race of readahead vs releasepage that could lead to crash - convert all remaining uses of comment fall through annotations to the pseudo keyword - fix crash when mounting a fuzzed image with -o recovery * tag 'for-5.8-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: btrfs: reset tree root pointer after error in init_tree_roots btrfs: fix reclaim_size counter leak after stealing from global reserve btrfs: fix fatal extent_buffer readahead vs releasepage race btrfs: convert comments to fallthrough annotations
2020-07-07	Merge tag 'arc-5.8-rc5' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc Pull ARC fixes from Vineet Gupta: - User build systems to pass -mcpu - Fix potential EFA clobber in syscall handler - Fix ARCompact 2 levels of interrupts build - Detect newer HS CPU releases - misc other fixes * tag 'arc-5.8-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc: ARCv2: support loop buffer (LPB) disabling ARC: build: remove deprecated toggle for arc700 builds ARC: build: allow users to specify -mcpu ARCv2: boot log: detect newer/upconing HS3x/HS4x releases ARC: elf: use right ELF_ARCH ARC: [arcompact] fix bitrot with 2 levels of interrupt ARC: entry: fix potential EFA clobber when TIF_SYSCALL_TRACE
2020-07-07	cgroup: fix cgroup_sk_alloc() for sk_clone_lock()	Cong Wang
	When we clone a socket in sk_clone_lock(), its sk_cgrp_data is copied, so the cgroup refcnt must be taken too. And, unlike the sk_alloc() path, sock_update_netprioidx() is not called here. Therefore, it is safe and necessary to grab the cgroup refcnt even when cgroup_sk_alloc is disabled. sk_clone_lock() is in BH context anyway, the in_interrupt() would terminate this function if called there. And for sk_alloc() skcd->val is always zero. So it's safe to factor out the code to make it more readable. The global variable 'cgroup_sk_alloc_disabled' is used to determine whether to take these reference counts. It is impossible to make the reference counting correct unless we save this bit of information in skcd->val. So, add a new bit there to record whether the socket has already taken the reference counts. This obviously relies on kmalloc() to align cgroup pointers to at least 4 bytes, ARCH_KMALLOC_MINALIGN is certainly larger than that. This bug seems to be introduced since the beginning, commit d979a39d7242 ("cgroup: duplicate cgroup reference when cloning sockets") tried to fix it but not compeletely. It seems not easy to trigger until the recent commit 090e28b229af ("netprio_cgroup: Fix unlimited memory leak of v2 cgroups") was merged. Fixes: bd1060a1d671 ("sock, cgroup: add sock->sk_cgroup") Reported-by: Cameron Berkenpas <cam@neo-zeon.de> Reported-by: Peter Geis <pgwipeout@gmail.com> Reported-by: Lu Fengqi <lufq.fnst@cn.fujitsu.com> Reported-by: Daniël Sonck <dsonck92@gmail.com> Reported-by: Zhang Qiang <qiang.zhang@windriver.com> Tested-by: Cameron Berkenpas <cam@neo-zeon.de> Tested-by: Peter Geis <pgwipeout@gmail.com> Tested-by: Thomas Lamprecht <t.lamprecht@proxmox.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Zefan Li <lizefan@huawei.com> Cc: Tejun Heo <tj@kernel.org> Cc: Roman Gushchin <guro@fb.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>