summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2022-05-13random: vary jitter iterations based on cycle counter speedJason A. Donenfeld
Currently, we do the jitter dance if two consecutive reads to the cycle counter return different values. If they do, then we consider the cycle counter to be fast enough that one trip through the scheduler will yield one "bit" of credited entropy. If those two reads return the same value, then we assume the cycle counter is too slow to show meaningful differences. This methodology is flawed for a variety of reasons, one of which Eric posted a patch to fix in [1]. The issue that patch solves is that on a system with a slow counter, you might be [un]lucky and read the counter _just_ before it changes, so that the second cycle counter you read differs from the first, even though there's usually quite a large period of time in between the two. For example: | real time | cycle counter | | --------- | ------------- | | 3 | 5 | | 4 | 5 | | 5 | 5 | | 6 | 5 | | 7 | 5 | <--- a | 8 | 6 | <--- b | 9 | 6 | <--- c If we read the counter at (a) and compare it to (b), we might be fooled into thinking that it's a fast counter, when in reality it is not. The solution in [1] is to also compare counter (b) to counter (c), on the theory that if the counter is _actually_ slow, and (a)!=(b), then certainly (b)==(c). This helps solve this particular issue, in one sense, but in another sense, it mostly functions to disallow jitter entropy on these systems, rather than simply taking more samples in that case. Instead, this patch takes a different approach. Right now we assume that a difference in one set of consecutive samples means one "bit" of credited entropy per scheduler trip. We can extend this so that a difference in two sets of consecutive samples means one "bit" of credited entropy per /two/ scheduler trips, and three for three, and four for four. In other words, we can increase the amount of jitter "work" we require for each "bit", depending on how slow the cycle counter is. So this patch takes whole bunch of samples, sees how many of them are different, and divides to find the amount of work required per "bit", and also requires that at least some minimum of them are different in order to attempt any jitter entropy. Note that this approach is still far from perfect. It's not a real statistical estimate on how much these samples vary; it's not a real-time analysis of the relevant input data. That remains a project for another time. However, it makes the same (partly flawed) assumptions as the code that's there now, so it's probably not worse than the status quo, and it handles the issue Eric mentioned in [1]. But, again, it's probably a far cry from whatever a really robust version of this would be. [1] https://lore.kernel.org/lkml/20220421233152.58522-1-ebiggers@kernel.org/ https://lore.kernel.org/lkml/20220421192939.250680-1-ebiggers@kernel.org/ Cc: Eric Biggers <ebiggers@google.com> Cc: Theodore Ts'o <tytso@mit.edu> Cc: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2022-05-13random: insist on random_get_entropy() existing in order to simplifyJason A. Donenfeld
All platforms are now guaranteed to provide some value for random_get_entropy(). In case some bug leads to this not being so, we print a warning, because that indicates that something is really very wrong (and likely other things are impacted too). This should never be hit, but it's a good and cheap way of finding out if something ever is problematic. Since we now have viable fallback code for random_get_entropy() on all platforms, which is, in the worst case, not worse than jiffies, we can count on getting the best possible value out of it. That means there's no longer a use for using jiffies as entropy input. It also means we no longer have a reason for doing the round-robin register flow in the IRQ handler, which was always of fairly dubious value. Instead we can greatly simplify the IRQ handler inputs and also unify the construction between 64-bits and 32-bits. We now collect the cycle counter and the return address, since those are the two things that matter. Because the return address and the irq number are likely related, to the extent we mix in the irq number, we can just xor it into the top unchanging bytes of the return address, rather than the bottom changing bytes of the cycle counter as before. Then, we can do a fixed 2 rounds of SipHash/HSipHash. Finally, we use the same construction of hashing only half of the [H]SipHash state on 32-bit and 64-bit. We're not actually discarding any entropy, since that entropy is carried through until the next time. And more importantly, it lets us do the same sponge-like construction everywhere. Cc: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2022-05-13xtensa: use fallback for random_get_entropy() instead of zeroJason A. Donenfeld
In the event that random_get_entropy() can't access a cycle counter or similar, falling back to returning 0 is really not the best we can do. Instead, at least calling random_get_entropy_fallback() would be preferable, because that always needs to return _something_, even falling back to jiffies eventually. It's not as though random_get_entropy_fallback() is super high precision or guaranteed to be entropic, but basically anything that's not zero all the time is better than returning zero all the time. This is accomplished by just including the asm-generic code like on other architectures, which means we can get rid of the empty stub function here. Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Arnd Bergmann <arnd@arndb.de> Acked-by: Max Filippov <jcmvbkbc@gmail.com> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2022-05-13sparc: use fallback for random_get_entropy() instead of zeroJason A. Donenfeld
In the event that random_get_entropy() can't access a cycle counter or similar, falling back to returning 0 is really not the best we can do. Instead, at least calling random_get_entropy_fallback() would be preferable, because that always needs to return _something_, even falling back to jiffies eventually. It's not as though random_get_entropy_fallback() is super high precision or guaranteed to be entropic, but basically anything that's not zero all the time is better than returning zero all the time. This is accomplished by just including the asm-generic code like on other architectures, which means we can get rid of the empty stub function here. Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Arnd Bergmann <arnd@arndb.de> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2022-05-13um: use fallback for random_get_entropy() instead of zeroJason A. Donenfeld
In the event that random_get_entropy() can't access a cycle counter or similar, falling back to returning 0 is really not the best we can do. Instead, at least calling random_get_entropy_fallback() would be preferable, because that always needs to return _something_, even falling back to jiffies eventually. It's not as though random_get_entropy_fallback() is super high precision or guaranteed to be entropic, but basically anything that's not zero all the time is better than returning zero all the time. This is accomplished by just including the asm-generic code like on other architectures, which means we can get rid of the empty stub function here. Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Richard Weinberger <richard@nod.at> Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com> Acked-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2022-05-13x86/tsc: Use fallback for random_get_entropy() instead of zeroJason A. Donenfeld
In the event that random_get_entropy() can't access a cycle counter or similar, falling back to returning 0 is suboptimal. Instead, fallback to calling random_get_entropy_fallback(), which isn't extremely high precision or guaranteed to be entropic, but is certainly better than returning zero all the time. If CONFIG_X86_TSC=n, then it's possible for the kernel to run on systems without RDTSC, such as 486 and certain 586, so the fallback code is only required for that case. As well, fix up both the new function and the get_cycles() function from which it was derived to use cpu_feature_enabled() rather than boot_cpu_has(), and use !IS_ENABLED() instead of #ifndef. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Borislav Petkov <bp@alien8.de> Cc: x86@kernel.org
2022-05-13nios2: use fallback for random_get_entropy() instead of zeroJason A. Donenfeld
In the event that random_get_entropy() can't access a cycle counter or similar, falling back to returning 0 is really not the best we can do. Instead, at least calling random_get_entropy_fallback() would be preferable, because that always needs to return _something_, even falling back to jiffies eventually. It's not as though random_get_entropy_fallback() is super high precision or guaranteed to be entropic, but basically anything that's not zero all the time is better than returning zero all the time. Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Arnd Bergmann <arnd@arndb.de> Acked-by: Dinh Nguyen <dinguyen@kernel.org> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2022-05-13arm: use fallback for random_get_entropy() instead of zeroJason A. Donenfeld
In the event that random_get_entropy() can't access a cycle counter or similar, falling back to returning 0 is really not the best we can do. Instead, at least calling random_get_entropy_fallback() would be preferable, because that always needs to return _something_, even falling back to jiffies eventually. It's not as though random_get_entropy_fallback() is super high precision or guaranteed to be entropic, but basically anything that's not zero all the time is better than returning zero all the time. Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2022-05-13mips: use fallback for random_get_entropy() instead of just c0 randomJason A. Donenfeld
For situations in which we don't have a c0 counter register available, we've been falling back to reading the c0 "random" register, which is usually bounded by the amount of TLB entries and changes every other cycle or so. This means it wraps extremely often. We can do better by combining this fast-changing counter with a potentially slower-changing counter from random_get_entropy_fallback() in the more significant bits. This commit combines the two, taking into account that the changing bits are in a different bit position depending on the CPU model. In addition, we previously were falling back to 0 for ancient CPUs that Linux does not support anyway; remove that dead path entirely. Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Arnd Bergmann <arnd@arndb.de> Tested-by: Maciej W. Rozycki <macro@orcam.me.uk> Acked-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2022-05-13riscv: use fallback for random_get_entropy() instead of zeroJason A. Donenfeld
In the event that random_get_entropy() can't access a cycle counter or similar, falling back to returning 0 is really not the best we can do. Instead, at least calling random_get_entropy_fallback() would be preferable, because that always needs to return _something_, even falling back to jiffies eventually. It's not as though random_get_entropy_fallback() is super high precision or guaranteed to be entropic, but basically anything that's not zero all the time is better than returning zero all the time. Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Paul Walmsley <paul.walmsley@sifive.com> Acked-by: Palmer Dabbelt <palmer@rivosinc.com> Reviewed-by: Palmer Dabbelt <palmer@rivosinc.com> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2022-05-13m68k: use fallback for random_get_entropy() instead of zeroJason A. Donenfeld
In the event that random_get_entropy() can't access a cycle counter or similar, falling back to returning 0 is really not the best we can do. Instead, at least calling random_get_entropy_fallback() would be preferable, because that always needs to return _something_, even falling back to jiffies eventually. It's not as though random_get_entropy_fallback() is super high precision or guaranteed to be entropic, but basically anything that's not zero all the time is better than returning zero all the time. Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Arnd Bergmann <arnd@arndb.de> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2022-05-13timekeeping: Add raw clock fallback for random_get_entropy()Jason A. Donenfeld
The addition of random_get_entropy_fallback() provides access to whichever time source has the highest frequency, which is useful for gathering entropy on platforms without available cycle counters. It's not necessarily as good as being able to quickly access a cycle counter that the CPU has, but it's still something, even when it falls back to being jiffies-based. In the event that a given arch does not define get_cycles(), falling back to the get_cycles() default implementation that returns 0 is really not the best we can do. Instead, at least calling random_get_entropy_fallback() would be preferable, because that always needs to return _something_, even falling back to jiffies eventually. It's not as though random_get_entropy_fallback() is super high precision or guaranteed to be entropic, but basically anything that's not zero all the time is better than returning zero all the time. Finally, since random_get_entropy_fallback() is used during extremely early boot when randomizing freelists in mm_init(), it can be called before timekeeping has been initialized. In that case there really is nothing we can do; jiffies hasn't even started ticking yet. So just give up and return 0. Suggested-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Theodore Ts'o <tytso@mit.edu>
2022-05-13openrisc: start CPU timer early in bootJason A. Donenfeld
In order to measure the boot process, the timer should be switched on as early in boot as possible. As well, the commit defines the get_cycles macro, like the previous patches in this series, so that generic code is aware that it's implemented by the platform, as is done on other archs. Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Jonas Bonn <jonas@southpole.se> Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi> Acked-by: Stafford Horne <shorne@gmail.com> Reported-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2022-05-13powerpc: define get_cycles macro for arch-overrideJason A. Donenfeld
PowerPC defines a get_cycles() function, but it does not do the usual `#define get_cycles get_cycles` dance, making it impossible for generic code to see if an arch-specific function was defined. While the get_cycles() ifdef is not currently used, the following timekeeping patch in this series will depend on the macro existing (or not existing) when defining random_get_entropy(). Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Benjamin Herrenschmidt <benh@ozlabs.org> Cc: Paul Mackerras <paulus@samba.org> Acked-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2022-05-13alpha: define get_cycles macro for arch-overrideJason A. Donenfeld
Alpha defines a get_cycles() function, but it does not do the usual `#define get_cycles get_cycles` dance, making it impossible for generic code to see if an arch-specific function was defined. While the get_cycles() ifdef is not currently used, the following timekeeping patch in this series will depend on the macro existing (or not existing) when defining random_get_entropy(). Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Richard Henderson <rth@twiddle.net> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Acked-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2022-05-13parisc: define get_cycles macro for arch-overrideJason A. Donenfeld
PA-RISC defines a get_cycles() function, but it does not do the usual `#define get_cycles get_cycles` dance, making it impossible for generic code to see if an arch-specific function was defined. While the get_cycles() ifdef is not currently used, the following timekeeping patch in this series will depend on the macro existing (or not existing) when defining random_get_entropy(). Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Arnd Bergmann <arnd@arndb.de> Acked-by: Helge Deller <deller@gmx.de> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2022-05-13s390: define get_cycles macro for arch-overrideJason A. Donenfeld
S390x defines a get_cycles() function, but it does not do the usual `#define get_cycles get_cycles` dance, making it impossible for generic code to see if an arch-specific function was defined. While the get_cycles() ifdef is not currently used, the following timekeeping patch in this series will depend on the macro existing (or not existing) when defining random_get_entropy(). Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Christian Borntraeger <borntraeger@linux.ibm.com> Cc: Sven Schnelle <svens@linux.ibm.com> Acked-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2022-05-13ia64: define get_cycles macro for arch-overrideJason A. Donenfeld
Itanium defines a get_cycles() function, but it does not do the usual `#define get_cycles get_cycles` dance, making it impossible for generic code to see if an arch-specific function was defined. While the get_cycles() ifdef is not currently used, the following timekeeping patch in this series will depend on the macro existing (or not existing) when defining random_get_entropy(). Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2022-05-13init: call time_init() before rand_initialize()Jason A. Donenfeld
Currently time_init() is called after rand_initialize(), but rand_initialize() makes use of the timer on various platforms, and sometimes this timer needs to be initialized by time_init() first. In order for random_get_entropy() to not return zero during early boot when it's potentially used as an entropy source, reverse the order of these two calls. The block doing random initialization was right before time_init() before, so changing the order shouldn't have any complicated effects. Cc: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Stafford Horne <shorne@gmail.com> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2022-05-13random: fix sysctl documentation nitsJason A. Donenfeld
A semicolon was missing, and the almost-alphabetical-but-not ordering was confusing, so regroup these by category instead. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2022-05-13Merge tag 'gfs2-v5.18-rc4-fix3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2 Pull gfs2 fixes from Andreas Gruenbacher: "We've finally identified commit dc732906c245 ("gfs2: Introduce flag for glock holder auto-demotion") to be the other cause of the filesystem corruption we've been seeing. This feature isn't strictly necessary anymore, so we've decided to stop using it for now. With this and the gfs_iomap_end rounding fix you've already seen ("gfs2: Fix filesystem block deallocation for short writes" in this pull request), we're corruption free again now. - Fix filesystem block deallocation for short writes. - Stop using glock holder auto-demotion for now. - Get rid of buffered writes inefficiencies due to page faults being disabled. - Minor other cleanups" * tag 'gfs2-v5.18-rc4-fix3' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2: gfs2: Stop using glock holder auto-demotion for now gfs2: buffered write prefaulting gfs2: Align read and write chunks to the page cache gfs2: Pull return value test out of should_fault_in_pages gfs2: Clean up use of fault_in_iov_iter_{read,write}able gfs2: Variable rename gfs2: Fix filesystem block deallocation for short writes
2022-05-13Merge tag 'v5.18-next-dts64' of ↵Arnd Bergmann
git://git.kernel.org/pub/scm/linux/kernel/git/matthias.bgg/linux into arm/dt MT8195: - add evaluation and demo board MT8192: - add new nodes: pwrap, PMIC, scp, USB, efuse, IOMMU, smi, DPI, PCIe, SPMI, audio system, MMC and video enconder - add evaluation board MT8183: - fix dtschema issues - update compatible for the display ambient light processor (disp-aal) - fix dtschema warning for the pumpki board MT8173: - add power domains to the video enconder nodes - add GCE support to the display mutex node MT7622: - specify number of DMA requests explicitely - specify level 2 cache topology - add SPI-NAND flash device - fix dtschema warnings for the System Companion Processor (SCP) * tag 'v5.18-next-dts64' of git://git.kernel.org/pub/scm/linux/kernel/git/matthias.bgg/linux: (37 commits) arm64: dts: mt8192: Follow binding order for SCP registers arm64: dts: mediatek: add mtk-snfi for mt7622 arm64: dts: mediatek: mt8195-demo: enable uart1 arm64: dts: mediatek: mt8195-demo: Remove input-name property arm64: dts: mediatek: mt8183-pumpkin: fix bad thermistor node name arm64: dts: mt7622: specify the L2 cache topology arm64: dts: mt7622: specify the number of DMA requests arm64: dts: mediatek: pumpkin: Remove input-name property arm64: dts: mediatek: mt8173: Add gce-client-reg handle to disp-mutex arm64: dts: mediatek: Add device-tree for MT8195 Demo board dt-bindings: arm64: dts: mediatek: Add mt8195-demo board arm64: dts: Add mediatek SoC mt8195 and evaluation board arm64: dts: mt8192: Add mmc device nodes arm64: dts: mt8183: Update disp_aal node compatible arm64: dts: mt8192: Add audio-related nodes arm64: dts: mt8192: Add spmi node dt-bindings: arm: Add compatible for Mediatek MT8192 arm64: dts: mt6359: add PMIC MT6359 related nodes arm64: dts: mediatek: mt8173: Add power domain to encoder nodes arm64: dts: mediatek: Get rid of mediatek, larb for MM nodes ... Link: https://lore.kernel.org/r/2cd90ca7-7541-d47a-fec6-b0c64cf74fa3@gmail.com Like the 32-bit branch, this contains an incompatible binding change by removing the mediatek,larb properties from the dts files, so these no longer work with kernels prior to 5.18. Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2022-05-13ext4: add unmount filesystem messageZhang Yi
Now that we have kernel message at mount time, system administrator could acquire the mount time, device and options easily. But we don't have corresponding unmounting message at umount time, so we cannot know if someone umount a filesystem easily. Some of the modern filesystems (e.g. xfs) have the umounting kernel message, so add one for ext4 filesystem for convenience. EXT4-fs (sdb): mounted filesystem with ordered data mode. Quota mode: none. EXT4-fs (sdb): unmounting filesystem. Signed-off-by: Zhang Yi <yi.zhang@huawei.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20220412145320.2669897-1-yi.zhang@huawei.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2022-05-13Merge tag 'v5.18-next-dts32' of ↵Arnd Bergmann
git://git.kernel.org/pub/scm/linux/kernel/git/matthias.bgg/linux into arm/dt Delete no longer needed properties of MediaTek Larbs for MT2701. * tag 'v5.18-next-dts32' of git://git.kernel.org/pub/scm/linux/kernel/git/matthias.bgg/linux: arm: dts: mediatek: Get rid of mediatek, larb for MM nodes Link: https://lore.kernel.org/r/b4383f23-0adc-b9de-a1d9-abd1c2f82b27@gmail.com This concludes a cleanup that was started back in 2019, with an incompatible DT binding change. Kernels before 5.18 can no longer use the updated dtb from 5.19, and the drivers no longer parse the old properties, which breaks compatibility with older dtb files. Link: https://lore.kernel.org/lkml/1546318276-18993-2-git-send-email-yong.wu@mediatek.com/ Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2022-05-13io_uring: only wake when the correct events are setDylan Yudaken
The check for waking up a request compares the poll_t bits, however this will always contain some common flags so this always wakes up. For files with single wait queues such as sockets this can cause the request to be sent to the async worker unnecesarily. Further if it is non-blocking will complete the request with EAGAIN which is not desired. Here exclude these common events, making sure to not exclude POLLERR which might be important. Fixes: d7718a9d25a6 ("io_uring: use poll driven retry for files that support it") Signed-off-by: Dylan Yudaken <dylany@fb.com> Link: https://lore.kernel.org/r/20220512091834.728610-3-dylany@fb.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-05-13gfs2: Stop using glock holder auto-demotion for nowAndreas Gruenbacher
We're having unresolved issues with the glock holder auto-demotion mechanism introduced in commit dc732906c245. This mechanism was assumed to be essential for avoiding frequent short reads and writes until commit 296abc0d91d8 ("gfs2: No short reads or writes upon glock contention"). Since then, when the inode glock is lost, it is simply re-acquired and the operation is resumed. This means that apart from the performance penalty, we might as well drop the inode glock before faulting in pages, and re-acquire it afterwards. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2022-05-13gfs2: buffered write prefaultingAndreas Gruenbacher
In gfs2_file_buffered_write, to increase the likelihood that all the user memory we're trying to write will be resident in memory, carry out the write in chunks and fault in each chunk of user memory before trying to write it. Otherwise, some workloads will trigger frequent short "internal" writes, causing filesystem blocks to be allocated and then partially deallocated again when writing into holes, which is wasteful and breaks reservations. Neither the chunked writes nor any of the short "internal" writes are user visible. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2022-05-13ext4: remove unnecessary conditionalsLv Ruyi
iput() has already handled null and non-null parameter, so it is no need to use if(). Reported-by: Zeal Robot <zealci@zte.com.cn> Signed-off-by: Lv Ruyi <lv.ruyi@zte.com.cn> Link: https://lore.kernel.org/r/20220411032337.2517465-1-lv.ruyi@zte.com.cn Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2022-05-13Merge tag 'scsi-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "Four fixes, all in drivers. These patches mosly fix error legs and exceptional conditions (scsi_dh_alua, qla2xxx). The lpfc fixes are for coding issues with lpfc features" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: lpfc: Correct BDE DMA address assignment for GEN_REQ_WQE scsi: lpfc: Fix split code for FLOGI on FCoE scsi: qla2xxx: Fix missed DMA unmap for aborted commands scsi: scsi_dh_alua: Properly handle the ALUA transitioning state
2022-05-13selftests/bpf: Fix usdt_400 test caseAndrii Nakryiko
usdt_400 test case relies on compiler using the same arg spec for usdt_400 USDT. This assumption breaks with Clang (Clang generates different arg specs with varying offsets relative to %rbp), so simplify this further and hard-code the constant which will guarantee that arg spec is the same across all 400 inlinings. Fixes: 630301b0d59d ("selftests/bpf: Add basic USDT selftests") Reported-by: Mykola Lysenko <mykolal@fb.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20220513173703.89271-1-andrii@kernel.org
2022-05-13gfs2: Align read and write chunks to the page cacheAndreas Gruenbacher
Align the chunks that reads and writes are carried out in to the page cache rather than the user buffers. This will be more efficient in general, especially for allocating writes. Optimizing the case that the user buffer is gfs2 backed isn't very useful; we only need to make sure we won't deadlock. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2022-05-13gfs2: Pull return value test out of should_fault_in_pagesAndreas Gruenbacher
Pull the return value test of the previous read or write operation out of should_fault_in_pages(). In a following patch, we'll fault in pages before the I/O and there will be no return value to check. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2022-05-13gfs2: Clean up use of fault_in_iov_iter_{read,write}ableAndreas Gruenbacher
No need to store the return value of the fault_in functions in separate variables. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2022-05-13gfs2: Variable renameAndreas Gruenbacher
Instead of counting the number of bytes read from the filesystem, functions gfs2_file_direct_read and gfs2_file_read_iter count the number of bytes written into the user buffer. Conversely, functions gfs2_file_direct_write and gfs2_file_buffered_write count the number of bytes read from the user buffer. This is nothing but confusing, so change the read functions to count how many bytes they have read, and the write functions to count how many bytes they have written. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2022-05-13gfs2: Fix filesystem block deallocation for short writesAndreas Gruenbacher
When a write cannot be carried out in full, gfs2_iomap_end() releases blocks that have been allocated for this write but haven't been used. To compute the end of the allocation, gfs2_iomap_end() incorrectly rounded the end of the attempted write down to the next block boundary to arrive at the end of the allocation. It would have to round up, but the end of the allocation is also available as iomap->offset + iomap->length, so just use that instead. In addition, use round_up() for computing the start of the unused range. Fixes: 64bc06bb32ee ("gfs2: iomap buffered write support") Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2022-05-13Merge tag 'at91-dt-5.19' of ↵Arnd Bergmann
git://git.kernel.org/pub/scm/linux/kernel/git/at91/linux into arm/dt AT91 & LAN966 DT #1 for 5.19: - at91: DT compliance updates to gic and dataflash nodes - lan966: addition to many basic nodes for various peripherals - lan966: Kontron KSwitch D10: support for this new board and its network switch * tag 'at91-dt-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/at91/linux: ARM: dts: kswitch-d10: enable networking ARM: dts: lan966x: add switch node ARM: dts: lan966x: add serdes node ARM: dts: lan966x: add reset switch reset node ARM: dts: lan966x: add MIIM nodes ARM: dts: lan966x: add hwmon node ARM: dts: lan966x: add basic Kontron KSwitch D10 support ARM: dts: lan966x: add flexcom I2C nodes ARM: dts: lan966x: add flexcom SPI nodes ARM: dts: lan966x: add all flexcom usart nodes ARM: dts: lan966x: add missing uart DMA channel ARM: dts: lan966x: add sgpio node ARM: dts: lan966x: swap dma channels for crypto node ARM: dts: lan966x: rename pinctrl nodes ARM: dts: at91: sama7g5: remove interrupt-parent from gic node ARM: dts: at91: use generic node name for dataflash Link: https://lore.kernel.org/r/20220513162338.87717-1-nicolas.ferre@microchip.com Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2022-05-13kseltest/cgroup: Make test_stress.sh work if run interactivelyWaiman Long
Commit 54de76c01239 ("kselftest/cgroup: fix test_stress.sh to use OUTPUT dir") changes the test_core command path from . to $OUTPUT. However, variable OUTPUT may not be defined if the command is run interactively. Fix that by using ${OUTPUT:-.} to cover both cases. Signed-off-by: Waiman Long <longman@redhat.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2022-05-13security: declare member holding string literal constChristian Göttsche
The struct security_hook_list member lsm is assigned in security_add_hooks() with string literals passed from the individual security modules. Declare the function parameter and the struct member const to signal their immutability. Reported by Clang [-Wwrite-strings]: security/selinux/hooks.c:7388:63: error: passing 'const char [8]' to parameter of type 'char *' discards qualifiers [-Werror,-Wincompatible-pointer-types-discards-qualifiers] security_add_hooks(selinux_hooks, ARRAY_SIZE(selinux_hooks), selinux); ^~~~~~~~~ ./include/linux/lsm_hooks.h:1629:11: note: passing argument to parameter 'lsm' here char *lsm); ^ Signed-off-by: Christian Göttsche <cgzones@googlemail.com> Reviewed-by: Paul Moore <paul@paul-moore.com> Reviewed-by: Casey Schaufler <casey@schaufler-ca.com> Signed-off-by: Paul Moore <paul@paul-moore.com>
2022-05-13Merge tag 'ceph-for-5.18-rc7' of https://github.com/ceph/ceph-clientLinus Torvalds
Pull ceph fix from Ilya Dryomov: "Two fixes to properly maintain xattrs on async creates and thus preserve SELinux context on newly created files and to avoid improper usage of folio->private field which triggered BUG_ONs. Both marked for stable" * tag 'ceph-for-5.18-rc7' of https://github.com/ceph/ceph-client: ceph: check folio PG_private bit instead of folio->private ceph: fix setting of xattrs on async created inodes
2022-05-13Merge tag 'nfs-for-5.18-4' of git://git.linux-nfs.org/projects/trondmy/linux-nfsLinus Torvalds
Pull NFS client bugfixes from Trond Myklebust: "One more pull request. There was a bug in the fix to ensure that gss- proxy continues to work correctly after we fixed the AF_LOCAL socket leak in the RPC code. This therefore reverts that broken patch, and replaces it with one that works correctly. Stable fixes: - SUNRPC: Ensure that the gssproxy client can start in a connected state Bugfixes: - Revert "SUNRPC: Ensure gss-proxy connects on setup" - nfs: fix broken handling of the softreval mount option" * tag 'nfs-for-5.18-4' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: nfs: fix broken handling of the softreval mount option SUNRPC: Ensure that the gssproxy client can start in a connected state Revert "SUNRPC: Ensure gss-proxy connects on setup"
2022-05-13Merge branch 'master' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next Steffen Klassert says: ==================== pull request (net-next): ipsec-next 2022-05-13 1) Cleanups for the code behind the XFRM offload API. This is a preparation for the extension of the API for policy offload. From Leon Romanovsky. * 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next: xfrm: drop not needed flags variable in XFRM offload struct net/mlx5e: Use XFRM state direction instead of flags netdevsim: rely on XFRM state direction instead of flags ixgbe: propagate XFRM offload state direction instead of flags xfrm: store and rely on direction to construct offload flags xfrm: rename xfrm_state_offload struct to allow reuse xfrm: delete not used number of external headers xfrm: free not used XFRM_ESP_NO_TRAILER flag ==================== Link: https://lore.kernel.org/r/20220513151218.4010119-1-steffen.klassert@secunet.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-13Merge tag 'mm-hotfixes-stable-2022-05-11' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull misc fixes from Andrew Morton: "Seven MM fixes, three of which address issues added in the most recent merge window, four of which are cc:stable. Three non-MM fixes, none very serious" [ And yes, that's a real pull request from Andrew, not me creating a branch from emailed patches. Woo-hoo! ] * tag 'mm-hotfixes-stable-2022-05-11' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: MAINTAINERS: add a mailing list for DAMON development selftests: vm: Makefile: rename TARGETS to VMTARGETS mm/kfence: reset PG_slab and memcg_data before freeing __kfence_pool mailmap: add entry for martyna.szapar-mudlaw@intel.com arm[64]/memremap: don't abuse pfn_valid() to ensure presence of linear map procfs: prevent unprivileged processes accessing fdinfo dir mm: mremap: fix sign for EFAULT error return value mm/hwpoison: use pr_err() instead of dump_page() in get_any_page() mm/huge_memory: do not overkill when splitting huge_zero_page Revert "mm/memory-failure.c: skip huge_zero_page in memory_failure()"
2022-05-13sfc: siena: Fix Kconfig dependenciesRen Zhijie
If CONFIG_PTP_1588_CLOCK=m and CONFIG_SFC_SIENA=y, the siena driver will fail to link: drivers/net/ethernet/sfc/siena/ptp.o: In function `efx_ptp_remove_channel': ptp.c:(.text+0xa28): undefined reference to `ptp_clock_unregister' drivers/net/ethernet/sfc/siena/ptp.o: In function `efx_ptp_probe_channel': ptp.c:(.text+0x13a0): undefined reference to `ptp_clock_register' ptp.c:(.text+0x1470): undefined reference to `ptp_clock_unregister' drivers/net/ethernet/sfc/siena/ptp.o: In function `efx_ptp_pps_worker': ptp.c:(.text+0x1d29): undefined reference to `ptp_clock_event' drivers/net/ethernet/sfc/siena/ptp.o: In function `efx_siena_ptp_get_ts_info': ptp.c:(.text+0x301b): undefined reference to `ptp_clock_index' To fix this build error, make SFC_SIENA depends on PTP_1588_CLOCK. Reported-by: Hulk Robot <hulkci@huawei.com> Fixes: d48523cb88e0 ("sfc: Copy shared files needed for Siena (part 2)") Signed-off-by: Ren Zhijie <renzhijie2@huawei.com> Acked-by: Martin Habets <habetsm.xilinx@gmail.com> Link: https://lore.kernel.org/r/20220513012721.140871-1-renzhijie2@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-13Merge tag 'arm64-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 fixes from Will Deacon: - TLB invalidation workaround for Qualcomm Kryo-4xx "gold" CPUs - Fix broken dependency in the vDSO Makefile - Fix pointer authentication overrides in ISAR2 ID register * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: arm64: Enable repeat tlbi workaround on KRYO4XX gold CPUs arm64: cpufeature: remove duplicate ID_AA64ISAR2_EL1 entry arm64: vdso: fix makefile dependency on vdso.so
2022-05-13drm/amdgpu: clean up some inconsistent indentingJiapeng Chong
Eliminate the follow smatch warning: drivers/gpu/drm/amd/amdgpu/nbio_v7_7.c:35 nbio_v7_7_get_rev_id() warn: inconsistent indenting. drivers/gpu/drm/amd/amdgpu/nbio_v7_7.c:214 nbio_v7_7_init_registers() warn: inconsistent indenting. Reported-by: Abaci Robot <abaci@linux.alibaba.com> Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-05-13Merge tag 'hwmon-for-v5.18-rc7' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging Pull hwmon fixes from Guenter Roeck: - Restrict ltq-cputemp to SOC_XWAY to fix build failure - Add OF device ID table to tmp401 driver to enable auto-load * tag 'hwmon-for-v5.18-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging: hwmon: (ltq-cputemp) restrict it to SOC_XWAY hwmon: (tmp401) Add OF device ID table
2022-05-13Merge tag 'drm-fixes-2022-05-13' of git://anongit.freedesktop.org/drm/drmLinus Torvalds
Pull drm fixes from Dave Airlie: "Pretty quiet week on the fixes front, 4 amdgpu and one i915 fix. I think there might be a few misc fbdev ones outstanding, but I'll see if they are necessary and pass them on if so. amdgpu: - Disable ASPM for VI boards on ADL platforms - S0ix DCN3.1 display fix - Resume regression fix - Stable pstate fix i915: - fix for kernel memory corruption when running a lot of OpenCL tests in parallel" * tag 'drm-fixes-2022-05-13' of git://anongit.freedesktop.org/drm/drm: drm/amdgpu/ctx: only reset stable pstate if the user changed it (v2) Revert "drm/amd/pm: keep the BACO feature enabled for suspend" drm/i915: Fix race in __i915_vma_remove_closed drm/amd/display: undo clearing of z10 related function pointers drm/amdgpu: vi: disable ASPM on Intel Alder Lake based systems
2022-05-13netfilter: conntrack: skip verification of zero UDP checksumKevin Mitchell
The checksum is optional for UDP packets. However nf_reject would previously require a valid checksum to elicit a response such as ICMP_DEST_UNREACH. Add some logic to nf_reject_verify_csum to determine if a UDP packet has a zero checksum and should therefore not be verified. Signed-off-by: Kevin Mitchell <kevmitch@arista.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2022-05-13netfilter: flowtable: nft_flow_route use more data for reverse routeSven Auhagen
When creating a flow table entry, the reverse route is looked up based on the current packet. There can be scenarios where the user creates a custom ip rule to route the traffic differently. In order to support those scenarios, the lookup needs to add more information based on the current packet. The patch adds multiple new information to the route lookup. Signed-off-by: Sven Auhagen <sven.auhagen@voleatech.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2022-05-13netfilter: prefer extension check to pointer checkFlorian Westphal
The pointer check usually results in a 'false positive': its likely that the ctnetlink module is loaded but no event monitoring is enabled. After recent change to autodetect ctnetlink usage and only allocate the ecache extension if a listener is active, check if the extension is present on a given conntrack. If its not there, there is nothing to report and calls to the notification framework can be elided. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>