Age | Commit message (Collapse) | Author |
|
This addresses the following sparse warning:
lib/memregion.c:8:5: warning: symbol 'memregion_alloc' was not declared. Should it be static?
lib/memregion.c:14:6: warning: symbol 'memregion_free' was not declared. Should it be static?
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Jason Yan <yanaijie@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Link: https://lkml.kernel.org/r/20200921142852.875312-1-yanaijie@huawei.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
LLVM implemented a recent "libcall optimization" that lowers calls to
`sprintf(dest, "%s", str)` where the return value is used to
`stpcpy(dest, str) - dest`.
This generally avoids the machinery involved in parsing format strings.
`stpcpy` is just like `strcpy` except it returns the pointer to the new
tail of `dest`. This optimization was introduced into clang-12.
Implement this so that we don't observe linkage failures due to missing
symbol definitions for `stpcpy`.
Similar to last year's fire drill with: commit 5f074f3e192f
("lib/string.c: implement a basic bcmp")
The kernel is somewhere between a "freestanding" environment (no full
libc) and "hosted" environment (many symbols from libc exist with the
same type, function signature, and semantics).
As Peter Anvin notes, there's not really a great way to inform the
compiler that you're targeting a freestanding environment but would like
to opt-in to some libcall optimizations (see pr/47280 below), rather
than opt-out.
Arvind notes, -fno-builtin-* behaves slightly differently between GCC
and Clang, and Clang is missing many __builtin_* definitions, which I
consider a bug in Clang and am working on fixing.
Masahiro summarizes the subtle distinction between compilers justly:
To prevent transformation from foo() into bar(), there are two ways in
Clang to do that; -fno-builtin-foo, and -fno-builtin-bar. There is
only one in GCC; -fno-buitin-foo.
(Any difference in that behavior in Clang is likely a bug from a missing
__builtin_* definition.)
Masahiro also notes:
We want to disable optimization from foo() to bar(),
but we may still benefit from the optimization from
foo() into something else. If GCC implements the same transform, we
would run into a problem because it is not -fno-builtin-bar, but
-fno-builtin-foo that disables that optimization.
In this regard, -fno-builtin-foo would be more future-proof than
-fno-built-bar, but -fno-builtin-foo is still potentially overkill. We
may want to prevent calls from foo() being optimized into calls to
bar(), but we still may want other optimization on calls to foo().
It seems that compilers today don't quite provide the fine grain control
over which libcall optimizations pseudo-freestanding environments would
prefer.
Finally, Kees notes that this interface is unsafe, so we should not
encourage its use. As such, I've removed the declaration from any
header, but it still needs to be exported to avoid linkage errors in
modules.
Reported-by: Sami Tolvanen <samitolvanen@google.com>
Suggested-by: Andy Lavr <andy.lavr@gmail.com>
Suggested-by: Arvind Sankar <nivedita@alum.mit.edu>
Suggested-by: Joe Perches <joe@perches.com>
Suggested-by: Kees Cook <keescook@chromium.org>
Suggested-by: Masahiro Yamada <masahiroy@kernel.org>
Suggested-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Tested-by: Nathan Chancellor <natechancellor@gmail.com>
Cc: <stable@vger.kernel.org>
Link: https://lkml.kernel.org/r/20200914161643.938408-1-ndesaulniers@google.com
Link: https://bugs.llvm.org/show_bug.cgi?id=47162
Link: https://bugs.llvm.org/show_bug.cgi?id=47280
Link: https://github.com/ClangBuiltLinux/linux/issues/1126
Link: https://man7.org/linux/man-pages/man3/stpcpy.3.html
Link: https://pubs.opengroup.org/onlinepubs/9699919799/functions/stpcpy.html
Link: https://reviews.llvm.org/D85963
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
PageTransHuge returns true for both thp and hugetlb, so thp stats was
counting both thp and hugetlb migrations. Exclude hugetlb migration by
setting is_thp variable right.
Clean up thp handling code too when we are there.
Fixes: 1a5bae25e3cf ("mm/vmstat: add events for THP migration without split")
Signed-off-by: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Link: https://lkml.kernel.org/r/20200917210413.1462975-1-zi.yan@sent.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Currently to make sure that every page table entry is read just once
gup_fast walks perform READ_ONCE and pass pXd value down to the next
gup_pXd_range function by value e.g.:
static int gup_pud_range(p4d_t p4d, unsigned long addr, unsigned long end,
unsigned int flags, struct page **pages, int *nr)
...
pudp = pud_offset(&p4d, addr);
This function passes a reference on that local value copy to pXd_offset,
and might get the very same pointer in return. This happens when the
level is folded (on most arches), and that pointer should not be
iterated.
On s390 due to the fact that each task might have different 5,4 or
3-level address translation and hence different levels folded the logic
is more complex and non-iteratable pointer to a local copy leads to
severe problems.
Here is an example of what happens with gup_fast on s390, for a task
with 3-level paging, crossing a 2 GB pud boundary:
// addr = 0x1007ffff000, end = 0x10080001000
static int gup_pud_range(p4d_t p4d, unsigned long addr, unsigned long end,
unsigned int flags, struct page **pages, int *nr)
{
unsigned long next;
pud_t *pudp;
// pud_offset returns &p4d itself (a pointer to a value on stack)
pudp = pud_offset(&p4d, addr);
do {
// on second iteratation reading "random" stack value
pud_t pud = READ_ONCE(*pudp);
// next = 0x10080000000, due to PUD_SIZE/MASK != PGDIR_SIZE/MASK on s390
next = pud_addr_end(addr, end);
...
} while (pudp++, addr = next, addr != end); // pudp++ iterating over stack
return 1;
}
This happens since s390 moved to common gup code with commit
d1874a0c2805 ("s390/mm: make the pxd_offset functions more robust") and
commit 1a42010cdc26 ("s390/mm: convert to the generic
get_user_pages_fast code").
s390 tried to mimic static level folding by changing pXd_offset
primitives to always calculate top level page table offset in pgd_offset
and just return the value passed when pXd_offset has to act as folded.
What is crucial for gup_fast and what has been overlooked is that
PxD_SIZE/MASK and thus pXd_addr_end should also change correspondingly.
And the latter is not possible with dynamic folding.
To fix the issue in addition to pXd values pass original pXdp pointers
down to gup_pXd_range functions. And introduce pXd_offset_lockless
helpers, which take an additional pXd entry value parameter. This has
already been discussed in
https://lkml.kernel.org/r/20190418100218.0a4afd51@mschwideX1
Fixes: 1a42010cdc26 ("s390/mm: convert to the generic get_user_pages_fast code")
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Mike Rapoport <rppt@linux.ibm.com>
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Claudio Imbrenda <imbrenda@linux.ibm.com>
Cc: <stable@vger.kernel.org> [5.2+]
Link: https://lkml.kernel.org/r/patch.git-943f1e5dcff2.your-ad-here.call-01599856292-ext-8676@work.hours
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
We forget to add the suffix to the workingset_restore string, so fix it.
And also update the documentation of cgroup-v2.rst.
Fixes: 170b04b7ae49 ("mm/workingset: prepare the workingset detection infrastructure for anon LRU")
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Tejun Heo <tj@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: Roman Gushchin <guro@fb.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Link: https://lkml.kernel.org/r/20200916100030.71698-1-songmuchun@bytedance.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
SWP_FS is used to make swap_{read,write}page() go through the
filesystem, and it's only used for swap files over NFS. So, !SWP_FS
means non NFS for now, it could be either file backed or device backed.
Something similar goes with legacy SWP_FILE.
So in order to achieve the goal of the original patch, SWP_BLKDEV should
be used instead.
FS corruption can be observed with SSD device + XFS + fragmented
swapfile due to CONFIG_THP_SWAP=y.
I reproduced the issue with the following details:
Environment:
QEMU + upstream kernel + buildroot + NVMe (2 GB)
Kernel config:
CONFIG_BLK_DEV_NVME=y
CONFIG_THP_SWAP=y
Some reproducible steps:
mkfs.xfs -f /dev/nvme0n1
mkdir /tmp/mnt
mount /dev/nvme0n1 /tmp/mnt
bs="32k"
sz="1024m" # doesn't matter too much, I also tried 16m
xfs_io -f -c "pwrite -R -b $bs 0 $sz" -c "fdatasync" /tmp/mnt/sw
xfs_io -f -c "pwrite -R -b $bs 0 $sz" -c "fdatasync" /tmp/mnt/sw
xfs_io -f -c "pwrite -R -b $bs 0 $sz" -c "fdatasync" /tmp/mnt/sw
xfs_io -f -c "pwrite -F -S 0 -b $bs 0 $sz" -c "fdatasync" /tmp/mnt/sw
xfs_io -f -c "pwrite -R -b $bs 0 $sz" -c "fsync" /tmp/mnt/sw
mkswap /tmp/mnt/sw
swapon /tmp/mnt/sw
stress --vm 2 --vm-bytes 600M # doesn't matter too much as well
Symptoms:
- FS corruption (e.g. checksum failure)
- memory corruption at: 0xd2808010
- segfault
Fixes: f0eea189e8e9 ("mm, THP, swap: Don't allocate huge cluster for file backed swap device")
Fixes: 38d8b4e6bdc8 ("mm, THP, swap: delay splitting THP during swap out")
Signed-off-by: Gao Xiang <hsiangkao@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: "Huang, Ying" <ying.huang@intel.com>
Reviewed-by: Yang Shi <shy828301@gmail.com>
Acked-by: Rafael Aquini <aquini@redhat.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Carlos Maiolino <cmaiolino@redhat.com>
Cc: Eric Sandeen <esandeen@redhat.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: <stable@vger.kernel.org>
Link: https://lkml.kernel.org/r/20200820045323.7809-1-hsiangkao@redhat.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
With the commit 10befea91b61 ("mm: memcg/slab: use a single set of
kmem_caches for all allocations"), it becomes possible to call kfree()
from the slabs_destroy().
The functions cache_flusharray() and do_drain() calls slabs_destroy() on
array_cache of the local CPU without updating the size of the
array_cache. This enables the kfree() call from the slabs_destroy() to
recursively call cache_flusharray() which can potentially call
free_block() on the same elements of the array_cache of the local CPU
and causing double free and memory corruption.
To fix the issue, simply update the local CPU array_cache cache before
calling slabs_destroy().
Fixes: 10befea91b61 ("mm: memcg/slab: use a single set of kmem_caches for all allocations")
Signed-off-by: Shakeel Butt <shakeelb@google.com>
Reviewed-by: Roman Gushchin <guro@fb.com>
Tested-by: Ming Lei <ming.lei@redhat.com>
Reported-by: kernel test robot <rong.a.chen@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Ted Ts'o <tytso@mit.edu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
clang --target=<triple> is how we can specify a particular toolchain
triple to be use, fix the two occurences in the documentation.
Fixes: fcf1b6a35c16 ("Documentation/llvm: add documentation on building w/ Clang/LLVM")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Reviewed-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
|
|
Pull more kvm fixes from Paolo Bonzini:
"Five small fixes.
The nested migration bug will be fixed with a better API in 5.10 or
5.11, for now this is a fix that works with existing userspace but
keeps the current ugly API"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: SVM: Add a dedicated INVD intercept routine
KVM: x86: Reset MMU context if guest toggles CR4.SMAP or CR4.PKE
KVM: x86: fix MSR_IA32_TSC read for nested migration
selftests: kvm: Fix assert failure in single-step test
KVM: x86: VMX: Make smaller physical guest address space support user-configurable
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux
Pull MIPS fixes from Thomas Bogendoerfer:
- fixed FP register access on Loongsoon-3
- added missing 1074 cpu handling
- fixed Loongson2ef build error
* tag 'mips_fixes_5.9_3' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux:
MIPS: BCM47XX: Remove the needless check with the 1074K
MIPS: Add the missing 'CPU_1074K' into __get_cpu_type()
MIPS: Loongson2ef: Disable Loongson MMI instructions
MIPS: Loongson-3: Fix fp register access if MSA enabled
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi
Pull spi fixes from Mark Brown:
"A small collection of driver specific fixes, the fsl-espi and bcm-qspi
changes in particular have been causing breakage for users"
* tag 'spi-fix-v5.9-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
spi: bcm-qspi: Fix probe regression on iProc platforms
spi: fsl-dspi: fix use-after-free in remove path
spi: fsl-espi: Only process interrupts for expected events
spi: bcm2835: Make polling_limit_us static
spi: spi-fsl-dspi: use XSPI mode instead of DMA for DPAA2 SoCs
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator
Pull regulator fix from Mark Brown:
"A single fix for incorrect specification of some of the register
fields on axp20x devices which would break voltage setting on affected
systems"
* tag 'regulator-fix-v5.9-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
regulator: axp20x: fix LDO2/4 description
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap
Pull regmap fixes from Mark Brown:
"Two issues here - one is a fix for use after free issues in the case
where a regmap overrides its name using something dynamically
generated, the other is that we weren't handling access checks
non-incrementing I/O on registers within paged register regions
correctly resulting in spurious errors.
Both of these are quite rare but serious if they occur"
* tag 'regmap-fix-v5.9-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap:
regmap: fix page selection for noinc writes
regmap: fix page selection for noinc reads
regmap: debugfs: Add back in erroneously removed initialisation of ret
regmap: debugfs: Fix handling of name string for debugfs init delays
|
|
A previous commit for fixing up short reads botched the async retry
path, so we ended up going to worker threads more often than we should.
Fix this up, so retries work the way they originally were intended to.
Fixes: 227c0c9673d8 ("io_uring: internally retry short reads")
Reported-by: Hao_Xu <haoxu@linux.alibaba.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Pull NFS server fix from Chuck Lever:
"Fix incorrect calculation on platforms that implement
flush_dcache_page()"
* tag 'nfsd-5.9-2' of git://git.linux-nfs.org/projects/cel/cel-2.6:
SUNRPC: Fix svc_flush_dcache()
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management fixes from Rafael Wysocki:
"These fix more fallout of recent RCU-lockdep changes in CPU idle code
and two devfreq issues.
Specifics:
- Export rcu_idle_{enter,exit} to modules to fix build issues
introduced by recent RCU-lockdep fixes (Borislav Petkov)
- Add missing return statement to a stub function in the ACPI
processor driver to fix a build issue introduced by recent
RCU-lockdep fixes (Rafael Wysocki)
- Fix recently introduced suspicious RCU usage warnings in the PSCI
cpuidle driver and drop stale comments regarding RCU_NONIDLE()
usage from enter_s2idle_proper() (Ulf Hansson)
- Fix error code path in the tegra30 devfreq driver (Dan Carpenter)
- Add missing information to devfreq_summary debugfs (Chanwoo Choi)"
* tag 'pm-5.9-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPI: processor: Fix build for ARCH_APICTIMER_STOPS_ON_C3 unset
PM / devfreq: tegra30: Disable clock on error in probe
PM / devfreq: Add timer type to devfreq_summary debugfs
cpuidle: Drop misleading comments about RCU usage
cpuidle: psci: Fix suspicious RCU usage
rcu/tree: Export rcu_idle_{enter,exit} to modules
|
|
The INVD instruction intercept performs emulation. Emulation can't be done
on an SEV guest because the guest memory is encrypted.
Provide a dedicated intercept routine for the INVD intercept. And since
the instruction is emulated as a NOP, just skip it instead.
Fixes: 1654efcbc431 ("KVM: SVM: Add KVM_SEV_INIT command")
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Message-Id: <a0b9a19ffa7fef86a3cc700c7ea01cb2731e04e5.1600972918.git.thomas.lendacky@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Pull rdma fix from Jason Gunthorpe:
"One fix for a bug that blktests hits when using rxe: tear down the CQ
pool before waiting for all references to go away"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
RDMA/core: Fix ordering of CQ pool destruction
|
|
Pull drm fixes from Dave Airlie:
"Fairly quiet, a couple of i915 fixes, one dma-buf fix, one vc4 and two
sun4i changes
dma-buf:
- Single null pointer deref fix
i915:
- Fix selftest reference to stack data out of scope
- Fix GVT null pointer dereference
vc4:
- fill asoc card owner
sun4i:
- program secondary CSC correctly"
* tag 'drm-fixes-2020-09-25' of git://anongit.freedesktop.org/drm/drm:
drm/i915/selftests: Push the fake iommu device from the stack to data
dmabuf: fix NULL pointer dereference in dma_buf_release()
drm/i915/gvt: Fix port number for BDW on EDID region setup
drm/sun4i: mixer: Extend regmap max_register
drm/sun4i: sun8i-csc: Secondary CSC register correction
drm/vc4/vc4_hdmi: fill ASoC card owner
|
|
* pm-cpuidle:
ACPI: processor: Fix build for ARCH_APICTIMER_STOPS_ON_C3 unset
cpuidle: Drop misleading comments about RCU usage
cpuidle: psci: Fix suspicious RCU usage
rcu/tree: Export rcu_idle_{enter,exit} to modules
|
|
This causes all the bios to be submitted with REQ_NOWAIT, which can be
problematic on either btrfs or on file systems that otherwise use a mix
of block devices where only some of them support it.
For now, just remove the setting of plug->nowait = true.
Reported-by: Dan Melnic <dmm@fb.com>
Reported-by: Brian Foster <bfoster@redhat.com>
Fixes: b63534c41e20 ("io_uring: re-issue block requests that failed because of resources")
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Commit 88b7381a939d ("USB: Select better matching USB drivers when
available") inadvertently broke usbip functionality. The commit in
question allows USB device drivers to be explicitly matched with
USB devices via the use of driver-provided identifier tables and
match functions, which is useful for a specialised device driver
to be chosen for a device that can also be handled by another,
more generic, device driver.
Prior, the USB device section of usb_device_match() had an
unconditional "return 1" statement, which allowed user-space to bind
USB devices to the usbip_host device driver, if desired. However,
the aforementioned commit changed the default/fallback return
value to zero. This breaks device drivers such as usbip_host, so
this commit restores the legacy behaviour, but only if a device
driver does not have an id_table and a match() function.
In addition, if usb_device_match is called for a device driver
and device pair where the device does not match the id_table of the
device driver in question, then the device driver will be disqualified
for the device. This allows avoiding the default case of "return 1",
which prevents undesirable probe() calls to a driver even though
its id_table did not match the device.
Finally, this commit changes the specialised-driver-to-generic-driver
transition code so that when a device driver returns -ENODEV, a more
generic device driver is only considered if the current device driver
does not have an id_table and a match() function. This ensures that
"generic" drivers such as usbip_host will not be considered specialised
device drivers and will not cause the device to be locked in to the
generic device driver, when a more specialised device driver could be
tried.
All of these changes restore usbip functionality without regressions,
ensure that the specialised/generic device driver selection logic works
as expected with the usb and apple-mfi-fastcharge drivers, and do not
negatively affect the use of devices provided by dummy_hcd.
Fixes: 88b7381a939d ("USB: Select better matching USB drivers when available")
Cc: <stable@vger.kernel.org> # 5.8
Cc: Bastien Nocera <hadess@hadess.net>
Cc: Valentina Manea <valentina.manea.m@gmail.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: <syzkaller@googlegroups.com>
Tested-by: Andrey Konovalov <andreyknvl@google.com>
Acked-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: M. Vefa Bicakci <m.v.b@runbox.com>
Link: https://lore.kernel.org/r/20200922110703.720960-5-m.v.b@runbox.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
This commit resolves a minor bug in the selection/discovery of more
specific USB device drivers for devices that are currently bound to
generic USB device drivers.
The bug is related to the way a candidate USB device driver is
compared against the generic USB device driver. The code in
is_dev_usb_generic_driver() assumes that the device driver in question
is a USB device driver by calling to_usb_device_driver(dev->driver)
to downcast; however I have observed that this assumption is not always
true, through code instrumentation.
This commit avoids the incorrect downcast altogether by comparing
the USB device's driver (i.e., dev->driver) to the generic USB
device driver directly. This method was suggested by Alan Stern.
This bug was found while investigating Andrey Konovalov's report
indicating usbip device driver misbehaviour with the recently merged
generic USB device driver selection feature. The report is linked
below.
Fixes: d5643d2249b2 ("USB: Fix device driver race")
Cc: <stable@vger.kernel.org> # 5.8
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: Bastien Nocera <hadess@hadess.net>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Valentina Manea <valentina.manea.m@gmail.com>
Cc: <syzkaller@googlegroups.com>
Tested-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: M. Vefa Bicakci <m.v.b@runbox.com>
Link: https://lore.kernel.org/r/20200922110703.720960-4-m.v.b@runbox.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
This commit resolves a bug in the selection/discovery of more
specific USB device drivers for devices that are currently bound to
generic USB device drivers.
The bug is in the logic that determines whether a device currently
bound to a generic USB device driver should be re-probed by a
more specific USB device driver or not. The code in
__usb_bus_reprobe_drivers() used to have the following lines:
if (usb_device_match_id(udev, new_udriver->id_table) == NULL &&
(!new_udriver->match || new_udriver->match(udev) != 0))
return 0;
ret = device_reprobe(dev);
As the reader will notice, the code checks whether the USB device in
consideration matches the identifier table (id_table) of a specific
USB device_driver (new_udriver), followed by a similar check, but this
time with the USB device driver's match function. However, the match
function's return value is not checked correctly. When match() returns
zero, it means that the specific USB device driver is *not* applicable
to the USB device in question, but the code then goes on to reprobe the
device with the new USB device driver under consideration. All this to
say, the logic is inverted.
This bug was found by code inspection and instrumentation while
investigating the root cause of the issue reported by Andrey Konovalov,
where usbip took over syzkaller's virtual USB devices in an undesired
manner. The report is linked below.
Fixes: d5643d2249b2 ("USB: Fix device driver race")
Cc: <stable@vger.kernel.org> # 5.8
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: Bastien Nocera <hadess@hadess.net>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Valentina Manea <valentina.manea.m@gmail.com>
Cc: <syzkaller@googlegroups.com>
Tested-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: M. Vefa Bicakci <m.v.b@runbox.com>
Link: https://lore.kernel.org/r/20200922110703.720960-3-m.v.b@runbox.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
This commit reverts commit 7a2f2974f265 ("usbip: Implement a match
function to fix usbip").
In summary, commit d5643d2249b2 ("USB: Fix device driver race")
inadvertently broke usbip functionality, which I resolved in an incorrect
manner by introducing a match function to usbip, usbip_match(), that
unconditionally returns true.
However, the usbip_match function, as is, causes usbip to take over
virtual devices used by syzkaller for USB fuzzing, which is a regression
reported by Andrey Konovalov.
Furthermore, in conjunction with the fix of another bug, handled by another
patch titled "usbcore/driver: Fix specific driver selection" in this patch
set, the usbip_match function causes unexpected USB subsystem behaviour
when the usbip_host driver is loaded. The unexpected behaviour can be
qualified as follows:
- If commit 41160802ab8e ("USB: Simplify USB ID table match") is included
in the kernel, then all USB devices are bound to the usbip_host
driver, which appears to the user as if all USB devices were
disconnected.
- If the same commit (41160802ab8e) is not in the kernel (as is the case
with v5.8.10) then all USB devices are re-probed and re-bound to their
original device drivers, which appears to the user as a disconnection
and re-connection of USB devices.
Please note that this commit will make usbip non-operational again,
until yet another patch in this patch set is merged, titled
"usbcore/driver: Accommodate usbip".
Cc: <stable@vger.kernel.org> # 5.8: 41160802ab8e: USB: Simplify USB ID table match
Cc: <stable@vger.kernel.org> # 5.8
Cc: Bastien Nocera <hadess@hadess.net>
Cc: Valentina Manea <valentina.manea.m@gmail.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: <syzkaller@googlegroups.com>
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Tested-by: Andrey Konovalov <andreyknvl@google.com>
Acked-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: M. Vefa Bicakci <m.v.b@runbox.com>
Link: https://lore.kernel.org/r/20200922110703.720960-2-m.v.b@runbox.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
We need to move the closing of the src_device out of all the device
replace locking, but we definitely want to zero out the superblock
before we commit the last time to make sure the device is properly
removed. Handle this by pushing btrfs_scratch_superblocks into
btrfs_dev_replace_finishing, and then later on we'll move the src_device
closing and freeing stuff where we need it to be.
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/chanwoo/linux
Pull devfreq updates for 5.9-rc7 from Chanwoo Choi:
"1. Update devfreq core
- Add missing timer type to devfreq_summary debugfs node.
2. Fix devfreq device driver
- Fix the exception handling about clock on tegra30-devfreq.c"
* tag 'devfreq-fixes-for-5.9-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/chanwoo/linux:
PM / devfreq: tegra30: Disable clock on error in probe
PM / devfreq: Add timer type to devfreq_summary debugfs
|
|
commit 7b6620d7db56 ("block: remove REQ_NOWAIT_INLINE") removed the
REQ_NOWAIT_INLINE related code, but the diff wasn't applied to
blk_types.h somehow.
Then commit 2771cefeac49 ("block: remove the REQ_NOWAIT_INLINE flag")
removed the REQ_NOWAIT_INLINE flag while the BLK_QC_T_EAGAIN flag still
remains.
Fixes: 7b6620d7db56 ("block: remove REQ_NOWAIT_INLINE")
Signed-off-by: Jeffle Xu <jefflexu@linux.alibaba.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
If we cancel these requests, we'll leak the memory associated with the
filename. Add them to the table of ops that need cleaning, if
REQ_F_NEED_CLEANUP is set.
Cc: stable@vger.kernel.org
Fixes: e62753e4e292 ("io_uring: call statx directly")
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Reset the MMU context during kvm_set_cr4() if SMAP or PKE is toggled.
Recent commits to (correctly) not reload PDPTRs when SMAP/PKE are
toggled inadvertantly skipped the MMU context reset due to the mask
of bits that triggers PDPTR loads also being used to trigger MMU context
resets.
Fixes: 427890aff855 ("kvm: x86: Toggling CR4.SMAP does not load PDPTEs in PAE mode")
Fixes: cb957adb4ea4 ("kvm: x86: Toggling CR4.PKE does not load PDPTEs in PAE mode")
Cc: Jim Mattson <jmattson@google.com>
Cc: Peter Shier <pshier@google.com>
Cc: Oliver Upton <oupton@google.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Message-Id: <20200923215352.17756-1-sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
sizeof() when applied to a pointer typed expression should give the size
of the pointed data, even if the data is a pointer.
Fixes: e1f24a79f424 ("IB/mlx5: Support congestion related counters")
Link: https://lore.kernel.org/r/20200917081354.2083293-1-liushixin2@huawei.com
Signed-off-by: Liu Shixin <liushixin2@huawei.com>
Acked-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
git://anongit.freedesktop.org/drm/drm-misc into drm-fixes
drm-misc-fixes for v5.9:
- Single null pointer deref fix for dma-buf.
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/4106c21e-f52c-4c05-6cdb-daa743bb8617@linux.intel.com
|
|
git://anongit.freedesktop.org/drm/drm-intel into drm-fixes
drm/i915 fixes for v5.9-rc7:
- Fix selftest reference to stack data out of scope
- Fix GVT null pointer dereference
- Backmerge from Linus' master to fix build
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/87zh5fpmha.fsf@intel.com
|
|
The dax mess had some fallout, and i915 used a later base to fix their CI.
Signed-off-by: Dave Airlie <airlied@redhat.com>
|
|
or get freed, for that matter, if it's a long (separately stored)
name.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Pull NVMe fixes from Christoph:
"nvme fixes for 5.9
- fix error during controller probe that cause double free irqs
(Keith Busch)
- FC connection establishment fix (James Smart)
- properly handle completions for invalid tags (Xianting Tian)
- pass the correct nsid to the command effects and supported log
(Chaitanya Kulkarni)"
* tag 'nvme-5.9-2020-09-24' of git://git.infradead.org/nvme:
nvme-core: don't use NVME_NSID_ALL for command effects and supported log
nvme-fc: fail new connections to a deleted host or remote port
nvme-pci: fix NULL req in completion handler
nvme: return errors for hwmon init
|
|
HIP08 supports RC inline up to size of 32 Bytes, and all data should be
put into SQWQE. For HIP09, this capability is extended to 1024 Bytes, if
length of data is longer than 32 Bytes, they will be filled into extended
sge space.
Link: https://lore.kernel.org/r/1599744069-9968-1-git-send-email-liweihang@huawei.com
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
The sq_sig_type field should be filled when querying QP, or the users may
get a wrong value.
Fixes: 926a01dc000d ("RDMA/hns: Add QP operations support for hip08 SoC")
Link: https://lore.kernel.org/r/1600509802-44382-9-git-send-email-liweihang@huawei.com
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
The hardware will add AckReq flag in BTH header according to the value of
ack_req_freq to request ACK from responder for the packets with this flag.
It should be greater than or equal to lp_pktn_ini instead of using a fixed
value.
Fixes: 7b9bd73ed13d ("RDMA/hns: Fix wrong assignment of lp_pktn_ini in QPC")
Link: https://lore.kernel.org/r/1600509802-44382-8-git-send-email-liweihang@huawei.com
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
The rnr_retry returned to the user is not correct, it should be got from
another fields in QPC.
Fixes: bfe860351e31 ("RDMA/hns: Fix cast from or to restricted __le32 for driver")
Link: https://lore.kernel.org/r/1600509802-44382-7-git-send-email-liweihang@huawei.com
Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
calc_pg_sz() may gets a data calculation overflow if the PAGE_SIZE is 64 KB
and hop_num is 2. It is because that all variables involved in calculation
are defined in type of int. So change the type of bt_chunk_size,
buf_chunk_size and obj_per_chunk_default to u64.
Fixes: ba6bb7e97421 ("RDMA/hns: Add interfaces to get pf capabilities from firmware")
Link: https://lore.kernel.org/r/1600509802-44382-6-git-send-email-liweihang@huawei.com
Signed-off-by: Jiaran Zhang <zhangjiaran@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
According to the RoCE v1 specification, the sl (service level) 0-7 are
mapped directly to priorities 0-7 respectively, sl 8-15 are reserved. The
driver should verify whether the the value of sl is larger than 7, if so,
an exception should be returned.
Fixes: 926a01dc000d ("RDMA/hns: Add QP operations support for hip08 SoC")
Link: https://lore.kernel.org/r/1600509802-44382-5-git-send-email-liweihang@huawei.com
Signed-off-by: Jiaran Zhang <zhangjiaran@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
Change "initialze" to "initialize".
Fixes: 8f3e9f3ea087 ("IB/hns: Add code for refreshing CQ CI using TPTR")
Link: https://lore.kernel.org/r/1600509802-44382-4-git-send-email-liweihang@huawei.com
Signed-off-by: Lang Cheng <chenglang@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
HIP08 doesn't support modifying the maximum number of outstanding WR in an
SRQ. However, the driver does not return a failure message, and users may
mistakenly think that the resizing is executed successfully. So the driver
needs to intercept this operation.
Fixes: ffb1308b88b6 ("RDMA/hns: Move SRQ code to the reasonable place")
Link: https://lore.kernel.org/r/1600509802-44382-3-git-send-email-liweihang@huawei.com
Signed-off-by: Yangyang Li <liyangyang20@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
According to the IB specifications, the verbs should return an immediate
error when the users set an unsupported opcode. Furthermore, refactor codes
about opcode in process of post_send to make the difference between opcodes
clearer.
Link: https://lore.kernel.org/r/1600509802-44382-2-git-send-email-liweihang@huawei.com
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
For HIP09, size of SCCC (Soft Congestion Control Context) is increased to
64 Bytes from 32 Bytes. The hardware will get the configuration of SCCC
from driver instead of using a fixed value.
Link: https://lore.kernel.org/r/1600245806-56321-5-git-send-email-liweihang@huawei.com
Signed-off-by: Yangyang Li <liyangyang20@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
The new version of RoCEE supports using QPC in size of 256B or 512B, so
that HIP09 can supports new congestion control algorithms by using QPC in
larger size.
Link: https://lore.kernel.org/r/1600245806-56321-4-git-send-email-liweihang@huawei.com
Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com>
Signed-off-by: Yangyang Li <liyangyang20@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
The new version of RoCEE supports using CQE in size of 32B or 64B. The
performance of bus can be improved by using larger size of CQE.
Link: https://lore.kernel.org/r/1600245806-56321-3-git-send-email-liweihang@huawei.com
Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
The new version of RoCEE supports using CEQE in size of 4B or 64B, AEQE in
size of 16B or 64B. The performance of bus can be improved by using larger
size of EQE.
Link: https://lore.kernel.org/r/1600245806-56321-2-git-send-email-liweihang@huawei.com
Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
MSR reads/writes should always access the L1 state, since the (nested)
hypervisor should intercept all the msrs it wants to adjust, and these
that it doesn't should be read by the guest as if the host had read it.
However IA32_TSC is an exception. Even when not intercepted, guest still
reads the value + TSC offset.
The write however does not take any TSC offset into account.
This is documented in Intel's SDM and seems also to happen on AMD as well.
This creates a problem when userspace wants to read the IA32_TSC value and then
write it. (e.g for migration)
In this case it reads L2 value but write is interpreted as an L1 value.
To fix this make the userspace initiated reads of IA32_TSC return L1 value
as well.
Huge thanks to Dave Gilbert for helping me understand this very confusing
semantic of MSR writes.
Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Message-Id: <20200921103805.9102-2-mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|