summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2022-07-21Merge tag 'mtd/fixes-for-5.19-final' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux Pull MTD fix from Richard Weinberger: "A aingle NAND controller fix: - gpmi: Fix busy timeout setting (wrong calculation, yes again)" * tag 'mtd/fixes-for-5.19-final' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux: mtd: rawnand: gpmi: Set WAIT_FOR_READY timeout based on program/erase times
2022-07-21Merge tag 'net-5.19-rc8' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Paolo Abeni: "Including fixes from can. Still no major regressions, most of the changes are still due to data races fixes, plus the usual bunch of drivers fixes. Previous releases - regressions: - tcp/udp: make early_demux back namespacified. - dsa: fix issues with vlan_filtering_is_global Previous releases - always broken: - ip: fix data-races around ipv4_net_table (round 2, 3 & 4) - amt: fix validation and synchronization bugs - can: fix detection of mcp251863 - eth: iavf: fix handling of dummy receive descriptors - eth: lan966x: fix issues with MAC table - eth: stmmac: dwmac-mediatek: fix clock issue Misc: - dsa: update documentation" * tag 'net-5.19-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (107 commits) mlxsw: spectrum_router: Fix IPv4 nexthop gateway indication net/sched: cls_api: Fix flow action initialization tcp: Fix data-races around sysctl_tcp_max_reordering. tcp: Fix a data-race around sysctl_tcp_abort_on_overflow. tcp: Fix a data-race around sysctl_tcp_rfc1337. tcp: Fix a data-race around sysctl_tcp_stdurg. tcp: Fix a data-race around sysctl_tcp_retrans_collapse. tcp: Fix data-races around sysctl_tcp_slow_start_after_idle. tcp: Fix a data-race around sysctl_tcp_thin_linear_timeouts. tcp: Fix data-races around sysctl_tcp_recovery. tcp: Fix a data-race around sysctl_tcp_early_retrans. tcp: Fix data-races around sysctl knobs related to SYN option. udp: Fix a data-race around sysctl_udp_l3mdev_accept. ip: Fix data-races around sysctl_ip_prot_sock. ipv4: Fix data-races around sysctl_fib_multipath_hash_fields. ipv4: Fix data-races around sysctl_fib_multipath_hash_policy. ipv4: Fix a data-race around sysctl_fib_multipath_use_neigh. can: rcar_canfd: Add missing of_node_put() in rcar_canfd_probe() can: mcp251xfd: fix detection of mcp251863 Documentation: fix udp_wmem_min in ip-sysctl.rst ...
2022-07-21s390/archrandom: prevent CPACF trng invocations in interrupt contextHarald Freudenberger
This patch slightly reworks the s390 arch_get_random_seed_{int,long} implementation: Make sure the CPACF trng instruction is never called in any interrupt context. This is done by adding an additional condition in_task(). Justification: There are some constrains to satisfy for the invocation of the arch_get_random_seed_{int,long}() functions: - They should provide good random data during kernel initialization. - They should not be called in interrupt context as the TRNG instruction is relatively heavy weight and may for example make some network loads cause to timeout and buck. However, it was not clear what kind of interrupt context is exactly encountered during kernel init or network traffic eventually calling arch_get_random_seed_long(). After some days of investigations it is clear that the s390 start_kernel function is not running in any interrupt context and so the trng is called: Jul 11 18:33:39 t35lp54 kernel: [<00000001064e90ca>] arch_get_random_seed_long.part.0+0x32/0x70 Jul 11 18:33:39 t35lp54 kernel: [<000000010715f246>] random_init+0xf6/0x238 Jul 11 18:33:39 t35lp54 kernel: [<000000010712545c>] start_kernel+0x4a4/0x628 Jul 11 18:33:39 t35lp54 kernel: [<000000010590402a>] startup_continue+0x2a/0x40 The condition in_task() is true and the CPACF trng provides random data during kernel startup. The network traffic however, is more difficult. A typical call stack looks like this: Jul 06 17:37:07 t35lp54 kernel: [<000000008b5600fc>] extract_entropy.constprop.0+0x23c/0x240 Jul 06 17:37:07 t35lp54 kernel: [<000000008b560136>] crng_reseed+0x36/0xd8 Jul 06 17:37:07 t35lp54 kernel: [<000000008b5604b8>] crng_make_state+0x78/0x340 Jul 06 17:37:07 t35lp54 kernel: [<000000008b5607e0>] _get_random_bytes+0x60/0xf8 Jul 06 17:37:07 t35lp54 kernel: [<000000008b56108a>] get_random_u32+0xda/0x248 Jul 06 17:37:07 t35lp54 kernel: [<000000008aefe7a8>] kfence_guarded_alloc+0x48/0x4b8 Jul 06 17:37:07 t35lp54 kernel: [<000000008aeff35e>] __kfence_alloc+0x18e/0x1b8 Jul 06 17:37:07 t35lp54 kernel: [<000000008aef7f10>] __kmalloc_node_track_caller+0x368/0x4d8 Jul 06 17:37:07 t35lp54 kernel: [<000000008b611eac>] kmalloc_reserve+0x44/0xa0 Jul 06 17:37:07 t35lp54 kernel: [<000000008b611f98>] __alloc_skb+0x90/0x178 Jul 06 17:37:07 t35lp54 kernel: [<000000008b6120dc>] __napi_alloc_skb+0x5c/0x118 Jul 06 17:37:07 t35lp54 kernel: [<000000008b8f06b4>] qeth_extract_skb+0x13c/0x680 Jul 06 17:37:07 t35lp54 kernel: [<000000008b8f6526>] qeth_poll+0x256/0x3f8 Jul 06 17:37:07 t35lp54 kernel: [<000000008b63d76e>] __napi_poll.constprop.0+0x46/0x2f8 Jul 06 17:37:07 t35lp54 kernel: [<000000008b63dbec>] net_rx_action+0x1cc/0x408 Jul 06 17:37:07 t35lp54 kernel: [<000000008b937302>] __do_softirq+0x132/0x6b0 Jul 06 17:37:07 t35lp54 kernel: [<000000008abf46ce>] __irq_exit_rcu+0x13e/0x170 Jul 06 17:37:07 t35lp54 kernel: [<000000008abf531a>] irq_exit_rcu+0x22/0x50 Jul 06 17:37:07 t35lp54 kernel: [<000000008b922506>] do_io_irq+0xe6/0x198 Jul 06 17:37:07 t35lp54 kernel: [<000000008b935826>] io_int_handler+0xd6/0x110 Jul 06 17:37:07 t35lp54 kernel: [<000000008b9358a6>] psw_idle_exit+0x0/0xa Jul 06 17:37:07 t35lp54 kernel: ([<000000008ab9c59a>] arch_cpu_idle+0x52/0xe0) Jul 06 17:37:07 t35lp54 kernel: [<000000008b933cfe>] default_idle_call+0x6e/0xd0 Jul 06 17:37:07 t35lp54 kernel: [<000000008ac59f4e>] do_idle+0xf6/0x1b0 Jul 06 17:37:07 t35lp54 kernel: [<000000008ac5a28e>] cpu_startup_entry+0x36/0x40 Jul 06 17:37:07 t35lp54 kernel: [<000000008abb0d90>] smp_start_secondary+0x148/0x158 Jul 06 17:37:07 t35lp54 kernel: [<000000008b935b9e>] restart_int_handler+0x6e/0x90 which confirms that the call is in softirq context. So in_task() covers exactly the cases where we want to have CPACF trng called: not in nmi, not in hard irq, not in soft irq but in normal task context and during kernel init. Signed-off-by: Harald Freudenberger <freude@linux.ibm.com> Acked-by: Jason A. Donenfeld <Jason@zx2c4.com> Reviewed-by: Juergen Christ <jchrist@linux.ibm.com> Link: https://lore.kernel.org/r/20220713131721.257907-1-freude@linux.ibm.com Fixes: e4f74400308c ("s390/archrandom: simplify back to earlier design and initialize earlier") [agordeev@linux.ibm.com changed desc, added Fixes and Link, removed -stable] Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2022-07-21mmu_gather: Force tlb-flush VM_PFNMAP vmasPeter Zijlstra
Jann reported a race between munmap() and unmap_mapping_range(), where unmap_mapping_range() will no-op once unmap_vmas() has unlinked the VMA; however munmap() will not yet have invalidated the TLBs. Therefore unmap_mapping_range() will complete while there are still (stale) TLB entries for the specified range. Mitigate this by force flushing TLBs for VM_PFNMAP ranges. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Will Deacon <will@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-07-21mmu_gather: Let there be one tlb_{start,end}_vma() implementationPeter Zijlstra
Now that architectures are no longer allowed to override tlb_{start,end}_vma() re-arrange code so that there is only one implementation for each of these functions. This much simplifies trying to figure out what they actually do. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Will Deacon <will@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-07-21csky/tlb: Remove tlb_flush() definePeter Zijlstra
The previous patch removed the tlb_flush_end() implementation which used tlb_flush_range(). This means: - csky did double invalidates, a range invalidate per vma and a full invalidate at the end - csky actually has range invalidates and as such the generic tlb_flush implementation is more efficient for it. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Will Deacon <will@kernel.org> Tested-by: Guo Ren <guoren@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-07-21mmu_gather: Remove per arch tlb_{start,end}_vma()Peter Zijlstra
Scattered across the archs are 3 basic forms of tlb_{start,end}_vma(). Provide two new MMU_GATHER_knobs to enumerate them and remove the per arch tlb_{start,end}_vma() implementations. - MMU_GATHER_NO_FLUSH_CACHE indicates the arch has flush_cache_range() but does *NOT* want to call it for each VMA. - MMU_GATHER_MERGE_VMAS indicates the arch wants to merge the invalidate across multiple VMAs if possible. With these it is possible to capture the three forms: 1) empty stubs; select MMU_GATHER_NO_FLUSH_CACHE and MMU_GATHER_MERGE_VMAS 2) start: flush_cache_range(), end: empty; select MMU_GATHER_MERGE_VMAS 3) start: flush_cache_range(), end: flush_tlb_range(); default Obviously, if the architecture does not have flush_cache_range() then it also doesn't need to select MMU_GATHER_NO_FLUSH_CACHE. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Will Deacon <will@kernel.org> Cc: David Miller <davem@davemloft.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-07-21scripts/gdb: Fix gdb 'lx-symbols' commandKhalid Masum
Currently the command 'lx-symbols' in gdb exits with the error`Function "do_init_module" not defined in "kernel/module.c"`. This occurs because the file kernel/module.c was moved to kernel/module/main.c. Fix this breakage by changing the path to "kernel/module/main.c" in LoadModuleBreakpoint. Signed-off-by: Khalid Masum <khalid.masum.92@gmail.com> Acked-by: Luis Chamberlain <mcgrof@kernel.org> Fixes: cfc1d277891e ("module: Move all into module/") Reviewed-by: Douglas Anderson <dianders@chromium.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-07-21watch-queue: remove spurious double semicolonLinus Torvalds
Sedat Dilek noticed that I had an extraneous semicolon at the end of a line in the previous patch. It's harmless, but unintentional, and while compilers just treat it as an extra empty statement, for all I know some other tooling might warn about it. So clean it up before other people notice too ;) Fixes: 353f7988dd84 ("watchqueue: make sure to serialize 'wqueue->defunct' properly") Reported-by: Sedat Dilek <sedat.dilek@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Reported-by: Sedat Dilek <sedat.dilek@gmail.com>
2022-07-21spi: spi-rspi: Fix PIO fallback on RZ platformsBiju Das
RSPI IP on RZ/{A, G2L} SoC's has the same signal for both interrupt and DMA transfer request. Setting DMARS register for DMA transfer makes the signal to work as a DMA transfer request signal and subsequent interrupt requests to the interrupt controller are masked. PIO fallback does not work as interrupt signal is disabled. This patch fixes this issue by re-enabling the interrupts by calling dmaengine_synchronize(). Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Link: https://lore.kernel.org/r/20220721143449.879257-1-biju.das.jz@bp.renesas.com Signed-off-by: Mark Brown <broonie@kernel.org>
2022-07-21erofs: record the longest decompressed size in this roundGao Xiang
Currently, `pcl->length' records the longest decompressed length as long as the pcluster itself isn't reclaimed. However, such number is unneeded for the general cases since it doesn't indicate the exact decompressed size in this round. Instead, let's record the decompressed size for this round instead, thus `pcl->nr_pages' can be completely dropped and pageofs_out is also designed to be kept in sync with `pcl->length'. Acked-by: Chao Yu <chao@kernel.org> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com> Link: https://lore.kernel.org/r/20220715154203.48093-16-hsiangkao@linux.alibaba.com
2022-07-21erofs: introduce z_erofs_do_decompressed_bvec()Gao Xiang
Both out_bvecs and in_bvecs share the common logic for decompressed buffers. So let's make a helper for this. Acked-by: Chao Yu <chao@kernel.org> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com> Link: https://lore.kernel.org/r/20220715154203.48093-15-hsiangkao@linux.alibaba.com
2022-07-21erofs: try to leave (de)compressed_pages on stack if possibleGao Xiang
For the most cases, small pclusters can be decompressed with page arrays on stack. Try to leave both (de)compressed_pages on stack if possible as before. Acked-by: Chao Yu <chao@kernel.org> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com> Link: https://lore.kernel.org/r/20220715154203.48093-14-hsiangkao@linux.alibaba.com
2022-07-21erofs: introduce struct z_erofs_decompress_backendGao Xiang
Let's introduce struct z_erofs_decompress_backend in order to pass on the decompression backend context between helper functions more easier and avoid too many arguments. Acked-by: Chao Yu <chao@kernel.org> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com> Link: https://lore.kernel.org/r/20220715154203.48093-13-hsiangkao@linux.alibaba.com
2022-07-21erofs: get rid of `z_pagemap_global'Gao Xiang
In order to introduce multi-reference pclusters for compressed data deduplication, let's get rid of the global page array for now since it needs to be re-designed then at least. Acked-by: Chao Yu <chao@kernel.org> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com> Link: https://lore.kernel.org/r/20220715154203.48093-12-hsiangkao@linux.alibaba.com
2022-07-21erofs: clean up `enum z_erofs_collectmode'Gao Xiang
`enum z_erofs_collectmode' is really ambiguous, but I'm not quite sure if there are better naming, basically it's used to judge whether inplace I/O can be used due to the current status of pclusters in the chain. Rename it as `enum z_erofs_pclustermode' instead. Acked-by: Chao Yu <chao@kernel.org> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com> Link: https://lore.kernel.org/r/20220715154203.48093-11-hsiangkao@linux.alibaba.com
2022-07-21erofs: get rid of `enum z_erofs_page_type'Gao Xiang
Remove it since pagevec[] is no longer used. Acked-by: Chao Yu <chao@kernel.org> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com> Link: https://lore.kernel.org/r/20220715154203.48093-10-hsiangkao@linux.alibaba.com
2022-07-21erofs: rework online page handlingGao Xiang
Since all decompressed offsets have been integrated to bvecs[], this patch avoids all sub-indexes so that page->private only includes a part count and an eio flag, thus in the future folio->private can have the same meaning. In addition, PG_error will not be used anymore after this patch and we're heading to use page->private (later folio->private) and page->mapping (later folio->mapping) only. Acked-by: Chao Yu <chao@kernel.org> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com> Link: https://lore.kernel.org/r/20220715154203.48093-9-hsiangkao@linux.alibaba.com
2022-07-21erofs: switch compressed_pages[] to bufvecGao Xiang
Convert compressed_pages[] to bufvec in order to avoid using page->private to keep onlinepage_index (decompressed offset) for inplace I/O pages. In the future, we only rely on folio->private to keep a countdown to unlock folios and set folio_uptodate. Acked-by: Chao Yu <chao@kernel.org> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com> Link: https://lore.kernel.org/r/20220715154203.48093-8-hsiangkao@linux.alibaba.com
2022-07-21erofs: introduce `z_erofs_parse_in_bvecs'Gao Xiang
`z_erofs_decompress_pcluster()' is too long therefore it'd be better to introduce another helper to parse compressed pages (or laterly, compressed bvecs.) BTW, since `compressed_bvecs' is too long as a part of the function name, `in_bvecs' is used here instead. Acked-by: Chao Yu <chao@kernel.org> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com> Link: https://lore.kernel.org/r/20220715154203.48093-7-hsiangkao@linux.alibaba.com
2022-07-21erofs: drop the old pagevec approachGao Xiang
Remove the old pagevec approach but keep z_erofs_page_type for now. It will be reworked in the following commits as well. Also rename Z_EROFS_NR_INLINE_PAGEVECS as Z_EROFS_INLINE_BVECS with the new value 2 since it's actually enough to bootstrap. Acked-by: Chao Yu <chao@kernel.org> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com> Link: https://lore.kernel.org/r/20220715154203.48093-6-hsiangkao@linux.alibaba.com
2022-07-21erofs: introduce bufvec to store decompressed buffersGao Xiang
For each pcluster, the total compressed buffers are determined in advance, yet the number of decompressed buffers actually vary. Too many decompressed pages can be recorded if one pcluster is highly compressed or its pcluster size is large. That takes extra memory footprints compared to uncompressed filesystems, especially a lot of I/O in flight on low-ended devices. Therefore, similar to inplace I/O, pagevec was introduced to reuse page cache to store these pointers in the time-sharing way since these pages are actually unused before decompressing. In order to make it more flexable, a cleaner bufvec is used to replace the old pagevec stuffs so that - Decompressed offsets can be stored inline, thus it can be used for the upcoming feature like compressed data deduplication. It's calculated by `page_offset(page) - map->m_la'; - Towards supporting large folios for compressed inodes since our final goal is to completely avoid page->private but use folio->private only for all page cache pages. Acked-by: Chao Yu <chao@kernel.org> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com> Link: https://lore.kernel.org/r/20220715154203.48093-5-hsiangkao@linux.alibaba.com
2022-07-21erofs: introduce `z_erofs_parse_out_bvecs()'Gao Xiang
`z_erofs_decompress_pcluster()' is too long therefore it'd be better to introduce another helper to parse decompressed pages (or laterly, decompressed bvecs.) BTW, since `decompressed_bvecs' is too long as a part of the function name, `out_bvecs' is used instead. Reviewed-by: Yue Hu <huyue2@coolpad.com> Acked-by: Chao Yu <chao@kernel.org> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com> Link: https://lore.kernel.org/r/20220715154203.48093-4-hsiangkao@linux.alibaba.com
2022-07-21erofs: clean up z_erofs_collector_begin()Gao Xiang
Rearrange the code and get rid of all gotos. Reviewed-by: Yue Hu <huyue2@coolpad.com> Acked-by: Chao Yu <chao@kernel.org> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com> Link: https://lore.kernel.org/r/20220715154203.48093-3-hsiangkao@linux.alibaba.com
2022-07-21erofs: get rid of unneeded `inode', `map' and `sb'Gao Xiang
Since commit 5c6dcc57e2e5 ("erofs: get rid of `struct z_erofs_collector'"), these arguments can be dropped as well. No logic changes. Reviewed-by: Yue Hu <huyue2@coolpad.com> Acked-by: Chao Yu <chao@kernel.org> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com> Link: https://lore.kernel.org/r/20220715154203.48093-2-hsiangkao@linux.alibaba.com
2022-07-21io_uring: do not recycle buffer in READVDylan Yudaken
READV cannot recycle buffers as it would lose some of the data required to reimport that buffer. Reported-by: Ammar Faizi <ammarfaizi2@gnuweeb.org> Fixes: b66e65f41426 ("io_uring: never call io_buffer_select() for a buffer re-select") Signed-off-by: Dylan Yudaken <dylany@fb.com> Link: https://lore.kernel.org/r/20220721131325.624788-1-dylany@fb.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-07-21io_uring: fix free of unallocated buffer listDylan Yudaken
in the error path of io_register_pbuf_ring, only free bl if it was allocated. Reported-by: Dipanjan Das <mail.dipanjan.das@gmail.com> Fixes: c7fb19428d67 ("io_uring: add support for ring mapped supplied buffers") Signed-off-by: Dylan Yudaken <dylany@fb.com> Reviewed-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/all/CANX2M5bXKw1NaHdHNVqssUUaBCs8aBpmzRNVEYEvV0n44P7ioA@mail.gmail.com/ Link: https://lore.kernel.org/all/CANX2M5YiZBXU3L6iwnaLs-HHJXRvrxM8mhPDiMDF9Y9sAvOHUA@mail.gmail.com/ Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-07-21Merge tag 'at91-fixes-5.19-3' of ↵Arnd Bergmann
git://git.kernel.org/pub/scm/linux/kernel/git/at91/linux into arm/fixes AT91 fixes for 5.19 #3 It contains one fix for LAN966 based SoCs fixing the frequency of sys_clk. sys_clk is feeding different IPs so having proper frequency for it in DT is necessary for proper working of different drivers. * tag 'at91-fixes-5.19-3' of git://git.kernel.org/pub/scm/linux/kernel/git/at91/linux: ARM: dts: lan966x: fix sys_clk frequency Link: https://lore.kernel.org/r/20220721075705.1739915-1-claudiu.beznea@microchip.com Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2022-07-21sched/deadline: Fix BUG_ON condition for deboosted tasksJuri Lelli
Tasks the are being deboosted from SCHED_DEADLINE might enter enqueue_task_dl() one last time and hit an erroneous BUG_ON condition: since they are not boosted anymore, the if (is_dl_boosted()) branch is not taken, but the else if (!dl_prio) is and inside this one we BUG_ON(!is_dl_boosted), which is of course false (BUG_ON triggered) otherwise we had entered the if branch above. Long story short, the current condition doesn't make sense and always leads to triggering of a BUG. Fix this by only checking enqueue flags, properly: ENQUEUE_REPLENISH has to be present, but additional flags are not a problem. Fixes: 64be6f1f5f71 ("sched/deadline: Don't replenish from a !SCHED_DEADLINE entity") Signed-off-by: Juri Lelli <juri.lelli@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20220714151908.533052-1-juri.lelli@redhat.com
2022-07-21Merge tag 'amd-drm-fixes-5.19-2022-07-20' of ↵Dave Airlie
https://gitlab.freedesktop.org/agd5f/linux into drm-fixes amd-drm-fixes-5.19-2022-07-20: amdgpu: - Drop redundant buffer cleanup that can lead to a segfault - Add a bo_list mutex to avoid possible list corruption in CS Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220720210917.6202-1-alexander.deucher@amd.com
2022-07-21Merge tag 'drm-intel-fixes-2022-07-20-1' of ↵Dave Airlie
git://anongit.freedesktop.org/drm/drm-intel into drm-fixes - Fix the regression caused by the lack of GuC v70. Let's accept the fallback to v69. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/YtgguaR5JYK083oZ@intel.com
2022-07-20drm/amdgpu: Protect the amdgpu_bo_list list with a mutex v2Luben Tuikov
Protect the struct amdgpu_bo_list with a mutex. This is used during command submission in order to avoid buffer object corruption as recorded in the link below. v2 (chk): Keep the mutex looked for the whole CS to avoid using the list from multiple CS threads at the same time. Suggested-by: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <Alexander.Deucher@amd.com> Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com> Cc: Vitaly Prosyak <Vitaly.Prosyak@amd.com> Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2048 Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com> Tested-by: Luben Tuikov <luben.tuikov@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2022-07-20watchqueue: make sure to serialize 'wqueue->defunct' properlyLinus Torvalds
When the pipe is closed, we mark the associated watchqueue defunct by calling watch_queue_clear(). However, while that is protected by the watchqueue lock, new watchqueue entries aren't actually added under that lock at all: they use the pipe->rd_wait.lock instead, and looking up that pipe happens without any locking. The watchqueue code uses the RCU read-side section to make sure that the wqueue entry itself hasn't disappeared, but that does not protect the pipe_info in any way. So make sure to actually hold the wqueue lock when posting watch events, properly serializing against the pipe being torn down. Reported-by: Noam Rathaus <noamr@ssd-disclosure.com> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: David Howells <dhowells@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-07-20spi: spi-cadence: Fix SPI NO Slave Select macro definitionSai Krishna Potthuri
Fix SPI NO Slave Select macro definition, when all the SPI CS bits are high which means no slave is selected. Fixes: 21b511ddee09 ("spi: spi-cadence: Fix SPI CS gets toggling sporadically") Signed-off-by: Sai Krishna Potthuri <lakshmi.sai.krishna.potthuri@xilinx.com> Signed-off-by: Amit Kumar Mahapatra <amit.kumar-mahapatra@xilinx.com> Link: https://lore.kernel.org/r/20220713164529.28444-1-amit.kumar-mahapatra@xilinx.com Signed-off-by: Mark Brown <broonie@kernel.org>
2022-07-20perf/x86/intel/lbr: Fix unchecked MSR access error on HSWKan Liang
The fuzzer triggers the below trace. [ 7763.384369] unchecked MSR access error: WRMSR to 0x689 (tried to write 0x1fffffff8101349e) at rIP: 0xffffffff810704a4 (native_write_msr+0x4/0x20) [ 7763.397420] Call Trace: [ 7763.399881] <TASK> [ 7763.401994] intel_pmu_lbr_restore+0x9a/0x1f0 [ 7763.406363] intel_pmu_lbr_sched_task+0x91/0x1c0 [ 7763.410992] __perf_event_task_sched_in+0x1cd/0x240 On a machine with the LBR format LBR_FORMAT_EIP_FLAGS2, when the TSX is disabled, a TSX quirk is required to access LBR from registers. The lbr_from_signext_quirk_needed() is introduced to determine whether the TSX quirk should be applied. However, the lbr_from_signext_quirk_needed() is invoked before the intel_pmu_lbr_init(), which parses the LBR format information. Without the correct LBR format information, the TSX quirk never be applied. Move the lbr_from_signext_quirk_needed() into the intel_pmu_lbr_init(). Checking x86_pmu.lbr_has_tsx in the lbr_from_signext_quirk_needed() is not required anymore. Both LBR_FORMAT_EIP_FLAGS2 and LBR_FORMAT_INFO have LBR_TSX flag, but only the LBR_FORMAT_EIP_FLAGS2 requirs the quirk. Update the comments accordingly. Fixes: 1ac7fd8159a8 ("perf/x86/intel/lbr: Support LBR format V7") Reported-by: Vince Weaver <vincent.weaver@maine.edu> Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20220714182630.342107-1-kan.liang@linux.intel.com
2022-07-20lkdtm: Disable return thunks in rodata.cJosh Poimboeuf
The following warning was seen: WARNING: CPU: 0 PID: 0 at arch/x86/kernel/alternative.c:557 apply_returns (arch/x86/kernel/alternative.c:557 (discriminator 1)) Modules linked in: CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.19.0-rc4-00008-gee88d363d156 #1 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.0-debian-1.16.0-4 04/01/2014 RIP: 0010:apply_returns (arch/x86/kernel/alternative.c:557 (discriminator 1)) Code: ff ff 74 cb 48 83 c5 04 49 39 ee 0f 87 81 fe ff ff e9 22 ff ff ff 0f 0b 48 83 c5 04 49 39 ee 0f 87 6d fe ff ff e9 0e ff ff ff <0f> 0b 48 83 c5 04 49 39 ee 0f 87 59 fe ff ff e9 fa fe ff ff 48 89 The warning happened when apply_returns() failed to convert "JMP __x86_return_thunk" to RET. It was instead a JMP to nowhere, due to the thunk relocation not getting resolved. That rodata.o code is objcopy'd to .rodata, and later memcpy'd, so relocations don't work (and are apparently silently ignored). LKDTM is only used for testing, so the naked RET should be fine. So just disable return thunks for that file. While at it, disable objtool and KCSAN for the file. Fixes: 0b53c374b9ef ("x86/retpoline: Use -mfunction-return") Reported-by: kernel test robot <oliver.sang@intel.com> Debugged-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/lkml/Ys58BxHxoDZ7rfpr@xsang-OptiPlex-9020/
2022-07-20x86/bugs: Warn when "ibrs" mitigation is selected on Enhanced IBRS partsPawan Gupta
IBRS mitigation for spectre_v2 forces write to MSR_IA32_SPEC_CTRL at every kernel entry/exit. On Enhanced IBRS parts setting MSR_IA32_SPEC_CTRL[IBRS] only once at boot is sufficient. MSR writes at every kernel entry/exit incur unnecessary performance loss. When Enhanced IBRS feature is present, print a warning about this unnecessary performance loss. Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/2a5eaf54583c2bfe0edc4fea64006656256cca17.1657814857.git.pawan.kumar.gupta@linux.intel.com
2022-07-20x86/alternative: Report missing return thunk detailsKees Cook
Debugging missing return thunks is easier if we can see where they're happening. Suggested-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/lkml/Ys66hwtFcGbYmoiZ@hirez.programming.kicks-ass.net/
2022-07-20lockdown: Fix kexec lockdown bypass with ima policyEric Snowberg
The lockdown LSM is primarily used in conjunction with UEFI Secure Boot. This LSM may also be used on machines without UEFI. It can also be enabled when UEFI Secure Boot is disabled. One of lockdown's features is to prevent kexec from loading untrusted kernels. Lockdown can be enabled through a bootparam or after the kernel has booted through securityfs. If IMA appraisal is used with the "ima_appraise=log" boot param, lockdown can be defeated with kexec on any machine when Secure Boot is disabled or unavailable. IMA prevents setting "ima_appraise=log" from the boot param when Secure Boot is enabled, but this does not cover cases where lockdown is used without Secure Boot. To defeat lockdown, boot without Secure Boot and add ima_appraise=log to the kernel command line; then: $ echo "integrity" > /sys/kernel/security/lockdown $ echo "appraise func=KEXEC_KERNEL_CHECK appraise_type=imasig" > \ /sys/kernel/security/ima/policy $ kexec -ls unsigned-kernel Add a call to verify ima appraisal is set to "enforce" whenever lockdown is enabled. This fixes CVE-2022-21505. Cc: stable@vger.kernel.org Fixes: 29d3c1c8dfe7 ("kexec: Allow kexec_file() with appropriate IMA policy when locked down") Signed-off-by: Eric Snowberg <eric.snowberg@oracle.com> Acked-by: Mimi Zohar <zohar@linux.ibm.com> Reviewed-by: John Haxby <john.haxby@oracle.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-07-20spi: bcm2835: bcm2835_spi_handle_err(): fix NULL pointer deref for non DMA ↵Marc Kleine-Budde
transfers In case a IRQ based transfer times out the bcm2835_spi_handle_err() function is called. Since commit 1513ceee70f2 ("spi: bcm2835: Drop dma_pending flag") the TX and RX DMA transfers are unconditionally canceled, leading to NULL pointer derefs if ctlr->dma_tx or ctlr->dma_rx are not set. Fix the NULL pointer deref by checking that ctlr->dma_tx and ctlr->dma_rx are valid pointers before accessing them. Fixes: 1513ceee70f2 ("spi: bcm2835: Drop dma_pending flag") Cc: Lukas Wunner <lukas@wunner.de> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Link: https://lore.kernel.org/r/20220719072234.2782764-1-mkl@pengutronix.de Signed-off-by: Mark Brown <broonie@kernel.org>
2022-07-20selftests: gpio: fix include path to kernel headers for out of tree buildsKent Gibson
When building selftests out of the kernel tree the gpio.h the include path is incorrect and the build falls back to the system includes which may be outdated. Add the KHDR_INCLUDES to the CFLAGS to include the gpio.h from the build tree. Fixes: 4f4d0af7b2d9 ("selftests: gpio: restore CFLAGS options") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Kent Gibson <warthog618@gmail.com> Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl>
2022-07-20tools: Fixed MIPS builds due to struct flock re-definitionFlorian Fainelli
Building perf for MIPS failed after 9f79b8b72339 ("uapi: simplify __ARCH_FLOCK{,64}_PAD a little") with the following error: CC /home/fainelli/work/buildroot/output/bmips/build/linux-custom/tools/perf/trace/beauty/fcntl.o In file included from ../../../../host/mipsel-buildroot-linux-gnu/sysroot/usr/include/asm/fcntl.h:77, from ../include/uapi/linux/fcntl.h:5, from trace/beauty/fcntl.c:10: ../include/uapi/asm-generic/fcntl.h:188:8: error: redefinition of 'struct flock' struct flock { ^~~~~ In file included from ../include/uapi/linux/fcntl.h:5, from trace/beauty/fcntl.c:10: ../../../../host/mipsel-buildroot-linux-gnu/sysroot/usr/include/asm/fcntl.h:63:8: note: originally defined here struct flock { ^~~~~ This is due to the local copy under tools/include/uapi/asm-generic/fcntl.h including the toolchain's kernel headers which already define 'struct flock' and define HAVE_ARCH_STRUCT_FLOCK to future inclusions make a decision as to whether re-defining 'struct flock' is appropriate or not. Make sure what do not re-define 'struct flock' when HAVE_ARCH_STRUCT_FLOCK is already defined. Fixes: 9f79b8b72339 ("uapi: simplify __ARCH_FLOCK{,64}_PAD a little") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Christoph Hellwig <hch@lst.de> [arnd: sync with include/uapi/asm-generic/fcntl.h as well] Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2022-07-20Merge tag 'linux-can-fixes-for-5.19-20220720' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can Marc Kleine-Budde says: ==================== this is a pull request of 2 patches for net/master. The first patch is by me and fixes the detection of the mcp251863 in the mcp251xfd driver. The last patch is by Liang He and adds a missing of_node_put() in the rcar_canfd driver. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-20mlxsw: spectrum_router: Fix IPv4 nexthop gateway indicationIdo Schimmel
mlxsw needs to distinguish nexthops with a gateway from connected nexthops in order to write the former to the adjacency table of the device. The check used to rely on the fact that nexthops with a gateway have a 'link' scope whereas connected nexthops have a 'host' scope. This is no longer correct after commit 747c14307214 ("ip: fix dflt addr selection for connected nexthop"). Fix that by instead checking the address family of the gateway IP. This is a more direct way and also consistent with the IPv6 counterpart in mlxsw_sp_rt6_is_gateway(). Cc: stable@vger.kernel.org Fixes: 747c14307214 ("ip: fix dflt addr selection for connected nexthop") Fixes: 597cfe4fc339 ("nexthop: Add support for IPv4 nexthops") Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-20net/sched: cls_api: Fix flow action initializationOz Shlomo
The cited commit refactored the flow action initialization sequence to use an interface method when translating tc action instances to flow offload objects. The refactored version skips the initialization of the generic flow action attributes for tc actions, such as pedit, that allocate more than one offload entry. This can cause potential issues for drivers mapping flow action ids. Populate the generic flow action fields for all the flow action entries. Fixes: c54e1d920f04 ("flow_offload: add ops to tc_action_ops for flow action setup") Signed-off-by: Oz Shlomo <ozsh@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> ---- v1 -> v2: - coalese the generic flow action fields initialization to a single loop Reviewed-by: Baowen Zheng <baowen.zheng@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-20Merge branch 'net-sysctl-races-round-4'David S. Miller
Kuniyuki Iwashima says: ==================== sysctl: Fix data-races around ipv4_net_table (Round 4). This series fixes data-races around 17 knobs after fib_multipath_use_neigh in ipv4_net_table. tcp_fack was skipped because it's obsolete and there's no readers. So, round 5 will start with tcp_dsack, 2 rounds left for 27 knobs. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-20tcp: Fix data-races around sysctl_tcp_max_reordering.Kuniyuki Iwashima
While reading sysctl_tcp_max_reordering, it can be changed concurrently. Thus, we need to add READ_ONCE() to its readers. Fixes: dca145ffaa8d ("tcp: allow for bigger reordering level") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-20tcp: Fix a data-race around sysctl_tcp_abort_on_overflow.Kuniyuki Iwashima
While reading sysctl_tcp_abort_on_overflow, it can be changed concurrently. Thus, we need to add READ_ONCE() to its reader. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-20tcp: Fix a data-race around sysctl_tcp_rfc1337.Kuniyuki Iwashima
While reading sysctl_tcp_rfc1337, it can be changed concurrently. Thus, we need to add READ_ONCE() to its reader. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-20tcp: Fix a data-race around sysctl_tcp_stdurg.Kuniyuki Iwashima
While reading sysctl_tcp_stdurg, it can be changed concurrently. Thus, we need to add READ_ONCE() to its reader. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>