summaryrefslogtreecommitdiff
path: root/arch/powerpc
AgeCommit message (Collapse)Author
2023-08-28Merge tag 'v6.6-vfs.fchmodat2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull fchmodat2 system call from Christian Brauner: "This adds the fchmodat2() system call. It is a revised version of the fchmodat() system call, adding a missing flag argument. Support for both AT_SYMLINK_NOFOLLOW and AT_EMPTY_PATH are included. Adding this system call revision has been a longstanding request but so far has always fallen through the cracks. While the kernel implementation of fchmodat() does not have a flag argument the libc provided POSIX-compliant fchmodat(3) version does. Both glibc and musl have to implement a workaround in order to support AT_SYMLINK_NOFOLLOW (see [1] and [2]). The workaround is brittle because it relies not just on O_PATH and O_NOFOLLOW semantics and procfs magic links but also on our rather inconsistent symlink semantics. This gives userspace a proper fchmodat2() system call that libcs can use to properly implement fchmodat(3) and allows them to get rid of their hacks. In this case it will immediately benefit them as the current workaround is already defunct because of aformentioned inconsistencies. In addition to AT_SYMLINK_NOFOLLOW, give userspace the ability to use AT_EMPTY_PATH with fchmodat2(). This is already possible with fchownat() so there's no reason to not also support it for fchmodat2(). The implementation is simple and comes with selftests. Implementation of the system call and wiring up the system call are done as separate patches even though they could arguably be one patch. But in case there are merge conflicts from other system call additions it can be beneficial to have separate patches" Link: https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/unix/sysv/linux/fchmodat.c;h=17eca54051ee28ba1ec3f9aed170a62630959143;hb=a492b1e5ef7ab50c6fdd4e4e9879ea5569ab0a6c#l35 [1] Link: https://git.musl-libc.org/cgit/musl/tree/src/stat/fchmodat.c?id=718f363bc2067b6487900eddc9180c84e7739f80#n28 [2] * tag 'v6.6-vfs.fchmodat2' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: selftests: fchmodat2: remove duplicate unneeded defines fchmodat2: add support for AT_EMPTY_PATH selftests: Add fchmodat2 selftest arch: Register fchmodat2, usually as syscall 452 fs: Add fchmodat2() Non-functional cleanup of a "__user * filename"
2023-08-28Merge tag 'v6.6-vfs.ctime' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull vfs timestamp updates from Christian Brauner: "This adds VFS support for multi-grain timestamps and converts tmpfs, xfs, ext4, and btrfs to use them. This carries acks from all relevant filesystems. The VFS always uses coarse-grained timestamps when updating the ctime and mtime after a change. This has the benefit of allowing filesystems to optimize away a lot of metadata updates, down to around 1 per jiffy, even when a file is under heavy writes. Unfortunately, this has always been an issue when we're exporting via NFSv3, which relies on timestamps to validate caches. A lot of changes can happen in a jiffy, so timestamps aren't sufficient to help the client decide to invalidate the cache. Even with NFSv4, a lot of exported filesystems don't properly support a change attribute and are subject to the same problems with timestamp granularity. Other applications have similar issues with timestamps (e.g., backup applications). If we were to always use fine-grained timestamps, that would improve the situation, but that becomes rather expensive, as the underlying filesystem would have to log a lot more metadata updates. This introduces fine-grained timestamps that are used when they are actively queried. This uses the 31st bit of the ctime tv_nsec field to indicate that something has queried the inode for the mtime or ctime. When this flag is set, on the next mtime or ctime update, the kernel will fetch a fine-grained timestamp instead of the usual coarse-grained one. As POSIX generally mandates that when the mtime changes, the ctime must also change the kernel always stores normalized ctime values, so only the first 30 bits of the tv_nsec field are ever used. Filesytems can opt into this behavior by setting the FS_MGTIME flag in the fstype. Filesystems that don't set this flag will continue to use coarse-grained timestamps. Various preparatory changes, fixes and cleanups are included: - Fixup all relevant places where POSIX requires updating ctime together with mtime. This is a wide-range of places and all maintainers provided necessary Acks. - Add new accessors for inode->i_ctime directly and change all callers to rely on them. Plain accesses to inode->i_ctime are now gone and it is accordingly rename to inode->__i_ctime and commented as requiring accessors. - Extend generic_fillattr() to pass in a request mask mirroring in a sense the statx() uapi. This allows callers to pass in a request mask to only get a subset of attributes filled in. - Rework timestamp updates so it's possible to drop the @now parameter the update_time() inode operation and associated helpers. - Add inode_update_timestamps() and convert all filesystems to it removing a bunch of open-coding" * tag 'v6.6-vfs.ctime' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (107 commits) btrfs: convert to multigrain timestamps ext4: switch to multigrain timestamps xfs: switch to multigrain timestamps tmpfs: add support for multigrain timestamps fs: add infrastructure for multigrain timestamps fs: drop the timespec64 argument from update_time xfs: have xfs_vn_update_time gets its own timestamp fat: make fat_update_time get its own timestamp fat: remove i_version handling from fat_update_time ubifs: have ubifs_update_time use inode_update_timestamps btrfs: have it use inode_update_timestamps fs: drop the timespec64 arg from generic_update_time fs: pass the request_mask to generic_fillattr fs: remove silly warning from current_time gfs2: fix timestamp handling on quota inodes fs: rename i_ctime field to __i_ctime selinux: convert to ctime accessor functions security: convert to ctime accessor functions apparmor: convert to ctime accessor functions sunrpc: convert to ctime accessor functions ...
2023-08-28powerpc/mm/book3s64: Use 256M as the upper limit with coherent device memory ↵Aneesh Kumar K.V
attached Commit 4d15721177d5 ("powerpc/mm: Cleanup memory block size probing") used 256MB as the memory block size when we have ibm,coherent-device-memory device tree node present. Instead of returning with 256MB memory block size, continue to check the rest of the memory regions and make sure we can still map them using a 256MB memory block size. Fixes: 4d15721177d5 ("powerpc/mm: Cleanup memory block size probing") Signed-off-by: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20230828074658.59553-2-aneesh.kumar@linux.ibm.com
2023-08-28powerpc/mm/book3s64: Fix build error with SPARSEMEM disabledAneesh Kumar K.V
With CONFIG_SPARSEMEM disabled the below kernel build error is observed. arch/powerpc/mm/init_64.c:477:38: error: use of undeclared identifier 'SECTION_SIZE_BITS' CONFIG_MEMORY_HOTPLUG depends on CONFIG_SPARSEMEM and it is more clear to describe the code dependency in terms of MEMORY_HOTPLUG. Outside memory hotplug the kernel uses memory_block_size for kernel directmap. Instead of depending on SECTION_SIZE_BITS to compute the direct map page size, add a new #define which defaults to 16M(same as existing SECTION_SIZE) Fixes: 4d15721177d5 ("powerpc/mm: Cleanup memory block size probing") Signed-off-by: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com> Closes: https://lore.kernel.org/oe-kbuild-all/202308251532.k9PpWEAD-lkp@intel.com/ Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20230828074658.59553-1-aneesh.kumar@linux.ibm.com
2023-08-25Merge tag 'mm-hotfixes-stable-2023-08-25-11-07' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull misc fixes from Andrew Morton: "18 hotfixes. 13 are cc:stable and the remainder pertain to post-6.4 issues or aren't considered suitable for a -stable backport" * tag 'mm-hotfixes-stable-2023-08-25-11-07' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: shmem: fix smaps BUG sleeping while atomic selftests: cachestat: catch failing fsync test on tmpfs selftests: cachestat: test for cachestat availability maple_tree: disable mas_wr_append() when other readers are possible madvise:madvise_free_pte_range(): don't use mapcount() against large folio for sharing check madvise:madvise_free_huge_pmd(): don't use mapcount() against large folio for sharing check madvise:madvise_cold_or_pageout_pte_range(): don't use mapcount() against large folio for sharing check mm: multi-gen LRU: don't spin during memcg release mm: memory-failure: fix unexpected return value in soft_offline_page() radix tree: remove unused variable mm: add a call to flush_cache_vmap() in vmap_pfn() selftests/mm: FOLL_LONGTERM need to be updated to 0x100 nilfs2: fix general protection fault in nilfs_lookup_dirty_data_buffers() mm/gup: handle cont-PTE hugetlb pages correctly in gup_must_unshare() via GUP-fast selftests: cgroup: fix test_kmem_basic less than error mm: enable page walking API to lock vmas during the walk smaps: use vm_normal_page_pmd() instead of follow_trans_huge_pmd() mm/gup: reintroduce FOLL_NUMA as FOLL_HONOR_NUMA_FAULT
2023-08-25powerpc/iommu: Fix notifiers being shared by PCI and VIO busesRussell Currey
fail_iommu_setup() registers the fail_iommu_bus_notifier struct to both PCI and VIO buses. struct notifier_block is a linked list node, so this causes any notifiers later registered to either bus type to also be registered to the other since they share the same node. This causes issues in (at least) the vgaarb code, which registers a notifier for PCI buses. pci_notify() ends up being called on a vio device, converted with to_pci_dev() even though it's not a PCI device, and finally makes a bad access in vga_arbiter_add_pci_device() as discovered with KASAN: BUG: KASAN: slab-out-of-bounds in vga_arbiter_add_pci_device+0x60/0xe00 Read of size 4 at addr c000000264c26fdc by task swapper/0/1 Call Trace: dump_stack_lvl+0x1bc/0x2b8 (unreliable) print_report+0x3f4/0xc60 kasan_report+0x244/0x698 __asan_load4+0xe8/0x250 vga_arbiter_add_pci_device+0x60/0xe00 pci_notify+0x88/0x444 notifier_call_chain+0x104/0x320 blocking_notifier_call_chain+0xa0/0x140 device_add+0xac8/0x1d30 device_register+0x58/0x80 vio_register_device_node+0x9ac/0xce0 vio_bus_scan_register_devices+0xc4/0x13c __machine_initcall_pseries_vio_device_init+0x94/0xf0 do_one_initcall+0x12c/0xaa8 kernel_init_freeable+0xa48/0xba8 kernel_init+0x64/0x400 ret_from_kernel_thread+0x5c/0x64 Fix this by creating separate notifier_block structs for each bus type. Fixes: d6b9a81b2a45 ("powerpc: IOMMU fault injection") Reported-by: Nageswara R Sastry <rnsastry@linux.ibm.com> Signed-off-by: Russell Currey <ruscur@russell.cc> Tested-by: Nageswara R Sastry <rnsastry@linux.ibm.com> Reviewed-by: Andrew Donnellan <ajd@linux.ibm.com> [mpe: Add #ifdef to fix CONFIG_IBMVIO=n build] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20230322035322.328709-1-ruscur@russell.cc
2023-08-24powerpc: implement the new page table range APIMatthew Wilcox (Oracle)
Add set_ptes(), update_mmu_cache_range() and flush_dcache_folio(). Change the PG_arch_1 (aka PG_dcache_dirty) flag from being per-page to per-folio. [willy@infradead.org: re-export flush_dcache_icache_folio()] Link: https://lkml.kernel.org/r/ZMx1daYwvD9EM7Cv@casper.infradead.org Link: https://lkml.kernel.org/r/20230802151406.3735276-22-willy@infradead.org Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Acked-by: Mike Rapoport (IBM) <rppt@kernel.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-08-24mm: drop per-VMA lock when returning VM_FAULT_RETRY or VM_FAULT_COMPLETEDSuren Baghdasaryan
handle_mm_fault returning VM_FAULT_RETRY or VM_FAULT_COMPLETED means mmap_lock has been released. However with per-VMA locks behavior is different and the caller should still release it. To make the rules consistent for the caller, drop the per-VMA lock when returning VM_FAULT_RETRY or VM_FAULT_COMPLETED. Currently the only path returning VM_FAULT_RETRY under per-VMA locks is do_swap_page and no path returns VM_FAULT_COMPLETED for now. [willy@infradead.org: fix riscv] Link: https://lkml.kernel.org/r/CAJuCfpE6GWEx1rPBmNpUfoD5o-gNFz9-UFywzCE2PbEGBiVz7g@mail.gmail.com Link: https://lkml.kernel.org/r/20230630211957.1341547-4-surenb@google.com Signed-off-by: Suren Baghdasaryan <surenb@google.com> Acked-by: Peter Xu <peterx@redhat.com> Tested-by: Conor Dooley <conor.dooley@microchip.com> Cc: Alistair Popple <apopple@nvidia.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Christian Brauner <brauner@kernel.org> Cc: Christoph Hellwig <hch@lst.de> Cc: David Hildenbrand <david@redhat.com> Cc: David Howells <dhowells@redhat.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Hillf Danton <hdanton@sina.com> Cc: "Huang, Ying" <ying.huang@intel.com> Cc: Hugh Dickins <hughd@google.com> Cc: Jan Kara <jack@suse.cz> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Josef Bacik <josef@toxicpanda.com> Cc: Laurent Dufour <ldufour@linux.ibm.com> Cc: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Lorenzo Stoakes <lstoakes@gmail.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Michel Lespinasse <michel@lespinasse.org> Cc: Minchan Kim <minchan@google.com> Cc: Pavel Tatashin <pasha.tatashin@soleen.com> Cc: Punit Agrawal <punit.agrawal@bytedance.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Yu Zhao <yuzhao@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-08-25powerpc/mpc5xxx: Add missing fwnode_handle_put()Liang He
In mpc5xxx_fwnode_get_bus_frequency(), we should add fwnode_handle_put() when break out of the iteration fwnode_for_each_parent_node() as it will automatically increase and decrease the refcounter. Fixes: de06fba62af6 ("powerpc/mpc5xxx: Switch mpc5xxx_get_bus_frequency() to use fwnode") Signed-off-by: Liang He <windhl@126.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20230322030423.1855440-1-windhl@126.com
2023-08-25powerpc/config: Disable SLAB_DEBUG_ON in skirootJoel Stanley
In 5.10 commit 5e84dd547bce ("powerpc/configs/skiroot: Enable some more hardening options") set SLUB_DEBUG_ON. When 5.14 came around, commit 792702911f58 ("slub: force on no_hash_pointers when slub_debug is enabled") print all the pointers when SLUB_DEBUG_ON is set. This was fine, but in 5.12 commit 5ead723a20e0 ("lib/vsprintf: no_hash_pointers prints all addresses as unhashed") added the warning at boot. Disable SLAB_DEBUG_ON as we don't want the nasty warning. We have CONFIG_EXPERT so SLAB_DEBUG is enabled. We do lose the settings in DEBUG_DEFAULT_FLAGS, but it's not clear that these should have been always-on anyway. Signed-off-by: Joel Stanley <joel@jms.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20230705023056.16273-1-joel@jms.id.au
2023-08-25powerpc/pseries: Remove unused hcall tracing instructionNicholas Piggin
When JUMP_LABEL=n, the tracepoint refcount test in the pre-call stores the refcount value to the stack, so the same value can be used for the post-call (presumably to avoid racing with the value concurrently changing). On little-endian (ELFv2) that might have just worked by luck, because 32(r1) is STK_PARAM(R3) there and so the value save gets clobbered by the tracing code when it's non-zero, but fortunately r3 is the hcall number and 0 is an invalid hcall number so it should get clobbered by another non-zero value. In any case, commit cc1adb5f32557 ("powerpc/pseries: Use jump labels for hcall tracepoints") removed the code that actually used the value stored, so now it's just dead code. It's fragile to be storing to the stack like this, and confusing. Better remove it. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20230509091600.70994-2-npiggin@gmail.com
2023-08-25powerpc/pseries: Fix hcall tracepoints with JUMP_LABEL=nNicholas Piggin
With JUMP_LABEL=n, hcall_tracepoint_refcount's address is being tested instead of its value. This results in the tracing slowpath always being taken unnecessarily. Fixes: 9a10ccb29c0a2 ("powerpc/pseries: move hcall_tracepoint_refcount out of .toc") Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20230509091600.70994-1-npiggin@gmail.com
2023-08-25powerpc: dts: add missing space before {Krzysztof Kozlowski
Add missing whitespace between node name/label and opening {. Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20230705145743.292855-1-krzysztof.kozlowski@linaro.org
2023-08-25powerpc/eeh: Use pci_dev_id() to simplify the codeJialin Zhang
PCI core API pci_dev_id() can be used to get the BDF number for a pci device. We don't need to compose it mannually. Use pci_dev_id() to simplify the code a little bit. Signed-off-by: Jialin Zhang <zhangjialin11@huawei.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20230815023303.3515503-1-zhangjialin11@huawei.com
2023-08-25powerpc/64s: Move CPU -mtune options into KconfigMichael Ellerman
Currently the -mtune options are set in the Makefile, depending on what the compiler supports. One downside of doing it that way is that the chosen -mtune option is not recorded in the .config. Another downside is that if there's ever a need to do more complicated logic to calculate the correct option, that gets messy in the Makefile. So move the determination of which -mtune option to use into Kconfig logic. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20230329234308.2215833-1-mpe@ellerman.id.au
2023-08-25powerpc/powermac: Fix unused function warningMichael Ellerman
Clang reports: arch/powerpc/platforms/powermac/feature.c:137:19: error: unused function 'simple_feature_tweak' It's only used inside the #ifndef CONFIG_PPC64 block, so move it in there to fix the warning. While at it drop the inline, the compiler will decide whether it should be inlined or not. Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202308181501.AR5HMDWC-lkp@intel.com/ Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20230821140949.491881-1-mpe@ellerman.id.au
2023-08-24Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
Cross-merge networking fixes after downstream PR. Conflicts: include/net/inet_sock.h f866fbc842de ("ipv4: fix data-races around inet->inet_id") c274af224269 ("inet: introduce inet->inet_flags") https://lore.kernel.org/all/679ddff6-db6e-4ff6-b177-574e90d0103d@tessares.net/ Adjacent changes: drivers/net/bonding/bond_alb.c e74216b8def3 ("bonding: fix macvlan over alb bond support") f11e5bd159b0 ("bonding: support balance-alb with openvswitch") drivers/net/ethernet/broadcom/bgmac.c d6499f0b7c7c ("net: bgmac: Return PTR_ERR() for fixed_phy_register()") 23a14488ea58 ("net: bgmac: Fix return value check for fixed_phy_register()") drivers/net/ethernet/broadcom/genet/bcmmii.c 32bbe64a1386 ("net: bcmgenet: Fix return value check for fixed_phy_register()") acf50d1adbf4 ("net: bcmgenet: Return PTR_ERR() for fixed_phy_register()") net/sctp/socket.c f866fbc842de ("ipv4: fix data-races around inet->inet_id") b09bde5c3554 ("inet: move inet->mc_loop to inet->inet_frags") Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-24powerpc/pseries: Rework lppaca_shared_proc() to avoid DEBUG_PREEMPTRussell Currey
lppaca_shared_proc() takes a pointer to the lppaca which is typically accessed through get_lppaca(). With DEBUG_PREEMPT enabled, this leads to checking if preemption is enabled, for example: BUG: using smp_processor_id() in preemptible [00000000] code: grep/10693 caller is lparcfg_data+0x408/0x19a0 CPU: 4 PID: 10693 Comm: grep Not tainted 6.5.0-rc3 #2 Call Trace: dump_stack_lvl+0x154/0x200 (unreliable) check_preemption_disabled+0x214/0x220 lparcfg_data+0x408/0x19a0 ... This isn't actually a problem however, as it does not matter which lppaca is accessed, the shared proc state will be the same. vcpudispatch_stats_procfs_init() already works around this by disabling preemption, but the lparcfg code does not, erroring any time /proc/powerpc/lparcfg is accessed with DEBUG_PREEMPT enabled. Instead of disabling preemption on the caller side, rework lppaca_shared_proc() to not take a pointer and instead directly access the lppaca, bypassing any potential preemption checks. Fixes: f13c13a00512 ("powerpc: Stop using non-architected shared_proc field in lppaca") Signed-off-by: Russell Currey <ruscur@russell.cc> [mpe: Rework to avoid needing a definition in paca.h and lppaca.h] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20230823055317.751786-4-mpe@ellerman.id.au
2023-08-24powerpc: Don't include lppaca.h in paca.hMichael Ellerman
By adding a forward declaration for struct lppaca we can untangle paca.h and lppaca.h. Also move get_lppaca() into lppaca.h for consistency. Add includes of lppaca.h to some files that need it. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20230823055317.751786-3-mpe@ellerman.id.au
2023-08-24powerpc/pseries: Move hcall_vphn() prototype into vphn.hMichael Ellerman
Consolidate the two prototypes for hcall_vphn() into vphn.h. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20230823055317.751786-2-mpe@ellerman.id.au
2023-08-24powerpc/pseries: Move VPHN constants into vphn.hMichael Ellerman
These don't have any particularly good reason to belong in lppaca.h, move them into their own header. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20230823055317.751786-1-mpe@ellerman.id.au
2023-08-24powerpc: Drop zalloc_maybe_bootmem()Michael Ellerman
The only callers of zalloc_maybe_bootmem() are PCI setup routines. These used to be called early during boot before slab setup, and also during runtime due to hotplug. But commit 5537fcb319d0 ("powerpc/pci: Add ppc_md.discover_phbs()") moved the boot-time calls later, after slab setup, meaning there's no longer any need for zalloc_maybe_bootmem(), kzalloc() can be used in all cases. Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20230823055430.752550-1-mpe@ellerman.id.au
2023-08-24powerpc/powernv: Use struct opal_prd_msg in more placesMichael Ellerman
Use the newly added struct opal_prd_msg in some other functions that operate on opal_prd messages, rather than using other types. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20230821142820.497107-2-mpe@ellerman.id.au
2023-08-24powerpc/powernv: Fix fortify source warnings in opal-prd.cMichael Ellerman
As reported by Mahesh & Aneesh, opal_prd_msg_notifier() triggers a FORTIFY_SOURCE warning: memcpy: detected field-spanning write (size 32) of single field "&item->msg" at arch/powerpc/platforms/powernv/opal-prd.c:355 (size 4) WARNING: CPU: 9 PID: 660 at arch/powerpc/platforms/powernv/opal-prd.c:355 opal_prd_msg_notifier+0x174/0x188 [opal_prd] NIP opal_prd_msg_notifier+0x174/0x188 [opal_prd] LR opal_prd_msg_notifier+0x170/0x188 [opal_prd] Call Trace: opal_prd_msg_notifier+0x170/0x188 [opal_prd] (unreliable) notifier_call_chain+0xc0/0x1b0 atomic_notifier_call_chain+0x2c/0x40 opal_message_notify+0xf4/0x2c0 This happens because the copy is targeting item->msg, which is only 4 bytes in size, even though the enclosing item was allocated with extra space following the msg. To fix the warning define struct opal_prd_msg with a union of the header and a flex array, and have the memcpy target the flex array. Reported-by: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com> Reported-by: Mahesh Salgaonkar <mahesh@linux.ibm.com> Tested-by: Mahesh Salgaonkar <mahesh@linux.ibm.com> Reviewed-by: Mahesh Salgaonkar <mahesh@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20230821142820.497107-1-mpe@ellerman.id.au
2023-08-24powerpc: Remove <asm/ide.h>Geert Uytterhoeven
As of commit b7fb14d3ac63117e ("ide: remove the legacy ide driver") in v5.14, there are no more generic users of <asm/ide.h>. Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc) Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
2023-08-23powerpc/85xx: Mark some functions static and add missing includes to fix no ↵Christophe Leroy
previous prototype error corenet{32/64}_smp_defconfig leads to: CC arch/powerpc/sysdev/ehv_pic.o arch/powerpc/sysdev/ehv_pic.c:45:6: error: no previous prototype for 'ehv_pic_unmask_irq' [-Werror=missing-prototypes] 45 | void ehv_pic_unmask_irq(struct irq_data *d) | ^~~~~~~~~~~~~~~~~~ arch/powerpc/sysdev/ehv_pic.c:52:6: error: no previous prototype for 'ehv_pic_mask_irq' [-Werror=missing-prototypes] 52 | void ehv_pic_mask_irq(struct irq_data *d) | ^~~~~~~~~~~~~~~~ arch/powerpc/sysdev/ehv_pic.c:59:6: error: no previous prototype for 'ehv_pic_end_irq' [-Werror=missing-prototypes] 59 | void ehv_pic_end_irq(struct irq_data *d) | ^~~~~~~~~~~~~~~ arch/powerpc/sysdev/ehv_pic.c:66:6: error: no previous prototype for 'ehv_pic_direct_end_irq' [-Werror=missing-prototypes] 66 | void ehv_pic_direct_end_irq(struct irq_data *d) | ^~~~~~~~~~~~~~~~~~~~~~ arch/powerpc/sysdev/ehv_pic.c:71:5: error: no previous prototype for 'ehv_pic_set_affinity' [-Werror=missing-prototypes] 71 | int ehv_pic_set_affinity(struct irq_data *d, const struct cpumask *dest, | ^~~~~~~~~~~~~~~~~~~~ arch/powerpc/sysdev/ehv_pic.c:112:5: error: no previous prototype for 'ehv_pic_set_irq_type' [-Werror=missing-prototypes] 112 | int ehv_pic_set_irq_type(struct irq_data *d, unsigned int flow_type) | ^~~~~~~~~~~~~~~~~~~~ CC arch/powerpc/sysdev/fsl_rio.o arch/powerpc/sysdev/fsl_rio.c:102:5: error: no previous prototype for 'fsl_rio_mcheck_exception' [-Werror=missing-prototypes] 102 | int fsl_rio_mcheck_exception(struct pt_regs *regs) | ^~~~~~~~~~~~~~~~~~~~~~~~ arch/powerpc/sysdev/fsl_rio.c:306:5: error: no previous prototype for 'fsl_map_inb_mem' [-Werror=missing-prototypes] 306 | int fsl_map_inb_mem(struct rio_mport *mport, dma_addr_t lstart, | ^~~~~~~~~~~~~~~ arch/powerpc/sysdev/fsl_rio.c:357:6: error: no previous prototype for 'fsl_unmap_inb_mem' [-Werror=missing-prototypes] 357 | void fsl_unmap_inb_mem(struct rio_mport *mport, dma_addr_t lstart) | ^~~~~~~~~~~~~~~~~ arch/powerpc/sysdev/fsl_rio.c:445:5: error: no previous prototype for 'fsl_rio_setup' [-Werror=missing-prototypes] 445 | int fsl_rio_setup(struct platform_device *dev) | ^~~~~~~~~~~~~ CC arch/powerpc/sysdev/fsl_rmu.o arch/powerpc/sysdev/fsl_rmu.c:362:6: error: no previous prototype for 'msg_unit_error_handler' [-Werror=missing-prototypes] 362 | void msg_unit_error_handler(void) | ^~~~~~~~~~~~~~~~~~~~~~ CC arch/powerpc/platforms/85xx/corenet_generic.o arch/powerpc/platforms/85xx/corenet_generic.c:33:13: error: no previous prototype for 'corenet_gen_pic_init' [-Werror=missing-prototypes] 33 | void __init corenet_gen_pic_init(void) | ^~~~~~~~~~~~~~~~~~~~ arch/powerpc/platforms/85xx/corenet_generic.c:51:13: error: no previous prototype for 'corenet_gen_setup_arch' [-Werror=missing-prototypes] 51 | void __init corenet_gen_setup_arch(void) | ^~~~~~~~~~~~~~~~~~~~~~ arch/powerpc/platforms/85xx/corenet_generic.c:104:12: error: no previous prototype for 'corenet_gen_publish_devices' [-Werror=missing-prototypes] 104 | int __init corenet_gen_publish_devices(void) | ^~~~~~~~~~~~~~~~~~~~~~~~~~~ CC arch/powerpc/platforms/85xx/qemu_e500.o arch/powerpc/platforms/85xx/qemu_e500.c:28:13: error: no previous prototype for 'qemu_e500_pic_init' [-Werror=missing-prototypes] 28 | void __init qemu_e500_pic_init(void) | ^~~~~~~~~~~~~~~~~~ CC arch/powerpc/kernel/pmc.o arch/powerpc/kernel/pmc.c:78:6: error: no previous prototype for 'power4_enable_pmcs' [-Werror=missing-prototypes] 78 | void power4_enable_pmcs(void) | ^~~~~~~~~~~~~~~~~~ Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/c90780017b624b91771a3e4240dcbadc68137915.1692684784.git.christophe.leroy@csgroup.eu
2023-08-23powerpc/64e: Fix circular dependency with CONFIG_SMP disabledChristophe Leroy
asm/percpu.h includes asm/paca.h which needs struct tlb_core_data which is defined in mmu-e500.h asm/percpu.h is included from asm/mmu.h in a #ifdef CONFIG_E500 before the inclusion of mmu-e500.h To fix that, move the inclusion of asm/percpu.h into mmu-e500.h after the definition of struct tlb_core_data Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202308220708.nRf5AUAe-lkp@intel.com/ Closes: https://lore.kernel.org/oe-kbuild-all/202308220857.uFq2oAxM-lkp@intel.com/ Closes: https://lore.kernel.org/oe-kbuild-all/202308221055.lw3UzJIL-lkp@intel.com/ Fixes: 3a24ea0df83e ("powerpc/kuap: Use ASM feature fixups instead of static branches") Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/5e0f97d5cbcd05238b56b4424ab096468296824d.1692684461.git.christophe.leroy@csgroup.eu
2023-08-23powerpc/powernv: fix debugfs_create_dir() error checkingImmad Mir
The debugfs_create_dir returns ERR_PTR incase of an error and the correct way of checking it by using the IS_ERR inline function, and not the simple null comparision. This patch fixes this. Suggested-by: Ivan Orlov <ivan.orlov0322@gmail.com> Signed-off-by: Immad Mir <mirimmad17@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/CY5PR12MB64553EE96EBB3927311DB598C6459@CY5PR12MB6455.namprd12.prod.outlook.com
2023-08-21merge mm-hotfixes-stable into mm-stable to pick up depended-upon changesAndrew Morton
2023-08-21treewide: drop CONFIG_EMBEDDEDRandy Dunlap
There is only one Kconfig user of CONFIG_EMBEDDED and it can be switched to EXPERT or "if !ARCH_MULTIPLATFORM" (suggested by Arnd). Link: https://lkml.kernel.org/r/20230816055010.31534-1-rdunlap@infradead.org Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Acked-by: Geert Uytterhoeven <geert+renesas@glider.be> Acked-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Palmer Dabbelt <palmer@rivosinc.com> [RISC-V] Acked-by: Greg Ungerer <gerg@linux-m68k.org> Acked-by: Jason A. Donenfeld <Jason@zx2c4.com> Acked-by: Michael Ellerman <mpe@ellerman.id.au> [powerpc] Cc: Russell King <linux@armlinux.org.uk> Cc: Vineet Gupta <vgupta@kernel.org> Cc: Brian Cain <bcain@quicinc.com> Cc: Michal Simek <monstr@monstr.eu> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Dinh Nguyen <dinguyen@kernel.org> Cc: Jonas Bonn <jonas@southpole.se> Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi> Cc: Stafford Horne <shorne@gmail.com> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Rich Felker <dalias@libc.org> Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Josh Triplett <josh@joshtriplett.org> Cc: Masahiro Yamada <masahiroy@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-08-21powerpc: convert various functions to use ptdescsVishal Moola (Oracle)
In order to split struct ptdesc from struct page, convert various functions to use ptdescs. Link: https://lkml.kernel.org/r/20230807230513.102486-13-vishal.moola@gmail.com Signed-off-by: Vishal Moola (Oracle) <vishal.moola@gmail.com> Acked-by: Mike Rapoport (IBM) <rppt@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Claudio Imbrenda <imbrenda@linux.ibm.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Hildenbrand <david@redhat.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Dinh Nguyen <dinguyen@kernel.org> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Geert Uytterhoeven <geert+renesas@glider.be> Cc: Guo Ren <guoren@kernel.org> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Hugh Dickins <hughd@google.com> Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Cc: Jonas Bonn <jonas@southpole.se> Cc: Matthew Wilcox <willy@infradead.org> Cc: Palmer Dabbelt <palmer@rivosinc.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Richard Weinberger <richard@nod.at> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-08-21powerpc/book3s64/memhotplug: enable memmap on memory for radixAneesh Kumar K.V
Radix vmemmap mapping can map things correctly at the PMD level or PTE level based on different device boundary checks. Hence we skip the restrictions w.r.t vmemmap size to be multiple of PMD_SIZE. This also makes the feature widely useful because to use PMD_SIZE vmemmap area we require a memory block size of 2GiB We can also use MHP_RESERVE_PAGES_MEMMAP_ON_MEMORY to that the feature can work with a memory block size of 256MB. Using altmap.reserve feature to align things correctly at pageblock granularity. We can end up losing some pages in memory with this. For ex: with a 256MiB memory block size, we require 4 pages to map vmemmap pages, In order to align things correctly we end up adding a reserve of 28 pages. ie, for every 4096 pages 28 pages get reserved. Link: https://lkml.kernel.org/r/20230808091501.287660-6-aneesh.kumar@linux.ibm.com Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Reviewed-by: David Hildenbrand <david@redhat.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Hocko <mhocko@suse.com> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: Vishal Verma <vishal.l.verma@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-08-21mm: lock vma explicitly before doing vm_flags_reset and vm_flags_reset_onceSuren Baghdasaryan
Implicit vma locking inside vm_flags_reset() and vm_flags_reset_once() is not obvious and makes it hard to understand where vma locking is happening. Also in some cases (like in dup_userfaultfd()) vma should be locked earlier than vma_flags modification. To make locking more visible, change these functions to assert that the vma write lock is taken and explicitly lock the vma beforehand. Fix userfaultfd functions which should lock the vma earlier. Link: https://lkml.kernel.org/r/20230804152724.3090321-5-surenb@google.com Suggested-by: Linus Torvalds <torvalds@linuxfoundation.org> Signed-off-by: Suren Baghdasaryan <surenb@google.com> Cc: Jann Horn <jannh@google.com> Cc: Liam R. Howlett <Liam.Howlett@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-08-21mm: enable page walking API to lock vmas during the walkSuren Baghdasaryan
walk_page_range() and friends often operate under write-locked mmap_lock. With introduction of vma locks, the vmas have to be locked as well during such walks to prevent concurrent page faults in these areas. Add an additional member to mm_walk_ops to indicate locking requirements for the walk. The change ensures that page walks which prevent concurrent page faults by write-locking mmap_lock, operate correctly after introduction of per-vma locks. With per-vma locks page faults can be handled under vma lock without taking mmap_lock at all, so write locking mmap_lock would not stop them. The change ensures vmas are properly locked during such walks. A sample issue this solves is do_mbind() performing queue_pages_range() to queue pages for migration. Without this change a concurrent page can be faulted into the area and be left out of migration. Link: https://lkml.kernel.org/r/20230804152724.3090321-2-surenb@google.com Signed-off-by: Suren Baghdasaryan <surenb@google.com> Suggested-by: Linus Torvalds <torvalds@linuxfoundation.org> Suggested-by: Jann Horn <jannh@google.com> Cc: David Hildenbrand <david@redhat.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Hugh Dickins <hughd@google.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Laurent Dufour <ldufour@linux.ibm.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Michel Lespinasse <michel@lespinasse.org> Cc: Peter Xu <peterx@redhat.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-08-22powerpc/ftrace: Add support for -fpatchable-function-entryNaveen N Rao
GCC v13.1 updated support for -fpatchable-function-entry on ppc64le to emit nops after the local entry point, rather than before it. This allows us to use this in the kernel for ftrace purposes. A new script is added under arch/powerpc/tools/ to help detect if nops are emitted after the function local entry point, or before the global entry point. With -fpatchable-function-entry, we no longer have the profiling instructions generated at function entry, so we only need to validate the presence of two nops at the ftrace location in ftrace_init_nop(). We patch the preceding instruction with 'mflr r0' to match the -mprofile-kernel ABI for subsequent ftrace use. This changes the profiling instructions used on ppc32. The default -pg option emits an additional 'stw' instruction after 'mflr r0' and before the branch to _mcount 'bl _mcount'. This is very similar to the original -mprofile-kernel implementation on ppc64le, where an additional 'std' instruction was used to save LR to its save location in the caller's stackframe. Subsequently, this additional store was removed in later compiler versions for performance reasons. The same reasons apply for ppc32 so we only patch in a 'mflr r0'. Signed-off-by: Naveen N Rao <naveen@kernel.org> Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/68586d22981a2c3bb45f27a2b621173d10a7d092.1687166935.git.naveen@kernel.org
2023-08-22powerpc/ftrace: Implement ftrace_replace_code()Naveen N Rao
Implement ftrace_replace_code() to consolidate logic from the different ftrace patching routines: ftrace_make_nop(), ftrace_make_call() and ftrace_modify_call(). Note that ftrace_make_call() is still required primarily to handle patching modules during their load time. The other two routines should no longer be called. This lays the groundwork to enable better control in patching ftrace locations, including the ability to nop-out preceding profiling instructions when ftrace is disabled. Signed-off-by: Naveen N Rao <naveen@kernel.org> Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/c28f852225646b0561bbf3c1d22d03f041ace8e0.1687166935.git.naveen@kernel.org
2023-08-22powerpc/ftrace: Replace use of ftrace_call_replace() with ↵Naveen N Rao
ftrace_create_branch_inst() ftrace_create_branch_inst() is clearer about its intent than ftrace_call_replace(). Signed-off-by: Naveen N Rao <naveen@kernel.org> Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/953513b88fa922ba7a66d772dc1310710efe9177.1687166935.git.naveen@kernel.org
2023-08-22powerpc/ftrace: Simplify ftrace_modify_call()Naveen N Rao
Now that we validate the ftrace location during initialization in ftrace_init_nop(), we can simplify ftrace_modify_call() to patch-in the updated branch instruction without worrying about the instructions surrounding the ftrace location. Note that we continue to ensure we have the expected branch instruction at the ftrace location before patching it with the updated branch destination. Signed-off-by: Naveen N Rao <naveen@kernel.org> Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/06275720939f8ee4c2f61c9e9a3e89b1fa3c441d.1687166935.git.naveen@kernel.org
2023-08-22powerpc/ftrace: Simplify ftrace_make_call()Naveen N Rao
Now that we validate the ftrace location during initialization in ftrace_init_nop(), we can simplify ftrace_make_call() to replace the nop without worrying about the instructions surrounding the ftrace location. Note that we continue to ensure that we have a nop at the ftrace location before patching it. Signed-off-by: Naveen N Rao <naveen@kernel.org> Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/2d28866d2f556488a663981abe5621511efb207b.1687166935.git.naveen@kernel.org
2023-08-22powerpc/ftrace: Simplify ftrace_make_nop()Naveen N Rao
Now that we validate the ftrace location during initialization in ftrace_init_nop(), we can simplify ftrace_make_nop() to patch-in the nop without worrying about the instructions surrounding the ftrace location. Note that we continue to ensure that we have a bl to ftrace_[regs_]caller at the ftrace location before nop-ing it out. Signed-off-by: Naveen N Rao <naveen@kernel.org> Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/e12ccbf28c50c3a07fb614f4d392e55f7098a729.1687166935.git.naveen@kernel.org
2023-08-22powerpc/ftrace: Add separate ftrace_init_nop() with additional validationNaveen N Rao
Currently, we validate instructions around the ftrace location every time we have to enable/disable ftrace. Introduce ftrace_init_nop() to instead perform all the validation during ftrace initialization. This allows us to simply patch the necessary instructions during enabling/disabling ftrace. Signed-off-by: Naveen N Rao <naveen@kernel.org> Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/f373684081e8e98be09b7f44d2d93069768324dc.1687166935.git.naveen@kernel.org
2023-08-22powerpc/ftrace: Stop re-purposing linker generated long branches for ftraceNaveen N Rao
Commit 67361cf8071286 ("powerpc/ftrace: Handle large kernel configs") added ftrace support for ppc64 kernel images with a text section larger than 32MB. The patch did two things: 1. Add stubs at the end of .text to branch into ftrace_[regs_]caller for functions that were out of branch range. 2. Re-purpose linker-generated long branches to _mcount to instead branch to ftrace_[regs_]caller. Before that, we only supported kernel .text up to ~32MB. With the above, we now support up to ~96MB: - The first 32MB of kernel text can branch directly into ftrace_[regs_]caller since that symbol is usually at the beginning. - The modified long_branch from (2) above is used by the next 32MB of kernel text. - The next 32MB of kernel text can use the stub at the end of text to branch back to ftrace_[regs_]caller. While re-purposing the long branch works in practice, it still restricts ftrace to kernel text up to ~96MB. The stub at the end of kernel text from (1) already enables us to extend ftrace support for kernel text up to 64MB, which fulfils the original requirement. Further, once we switch to -fpatchable-function-entry, there will not be a long branch that we can use. Stop re-purposing the linker-generated long branches for ftrace to simplify the code. If there are good reasons to support ftrace on kernels beyond 64MB, we can consider adding support by using -fpatchable-function-entry. Signed-off-by: Naveen N Rao <naveen@kernel.org> Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/33fa3be97f8e1f2171254ef2e1b0d5c8836c11fd.1687166935.git.naveen@kernel.org
2023-08-22powerpc/ftrace: Refactor ftrace_modify_code()Naveen N Rao
Split up ftrace_modify_code() into a few helpers for future use. Also update error messages accordingly. Signed-off-by: Naveen N Rao <naveen@kernel.org> Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/a8daa49712b44ff539e6c22a2ea649a540386798.1687166935.git.naveen@kernel.org
2023-08-22powerpc/ftrace: Consolidate ftrace support into fewer filesNaveen N Rao
ftrace_low.S has just the _mcount stub and return_to_handler(). Merge this back into ftrace_mprofile.S and ftrace_64_pg.S to keep all ftrace code together, and to allow those to evolve independently. ftrace_mprofile.S is also not an entirely accurate name since this also holds ppc32 code. This will be all the more incorrect once support for -fpatchable-function-entry is added. Rename files here to more accurately describe the code: - ftrace_mprofile.S is renamed to ftrace_entry.S - ftrace_pg.c is renamed to ftrace_64_pg.c - ftrace_64_pg.S is rename to ftrace_64_pg_entry.S Signed-off-by: Naveen N Rao <naveen@kernel.org> Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/b900c9a8bba9d6c3c295e0f99886acf3e5bf6f7b.1687166935.git.naveen@kernel.org
2023-08-22powerpc/ftrace: Extend ftrace support for large kernels to ppc32Naveen N Rao
Commit 67361cf8071286 ("powerpc/ftrace: Handle large kernel configs") added ftrace support for ppc64 kernel images with a text section larger than 32MB. The approach itself isn't specific to ppc64, so extend the same to also work on ppc32. While at it, reduce the space reserved for the stub from 64 bytes to 32 bytes since the different stub variants are all less than 8 instructions. To reduce use of #ifdef, a stub implementation is provided for kernel_toc_address() and -SZ_2G is cast to 'long long' to prevent errors on ppc32. Signed-off-by: Naveen N Rao <naveen@kernel.org> Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/9fa3258cbb9105cf8a0a8135214d44ffbc75fe84.1687166935.git.naveen@kernel.org
2023-08-22powerpc/ftrace: Use FTRACE_REGS_ADDR to identify the correct ftrace trampolineNaveen N Rao
Instead of keying off DYNAMIC_FTRACE_WITH_REGS, use FTRACE_REGS_ADDR to identify the proper ftrace trampoline address to use. Signed-off-by: Naveen N Rao <naveen@kernel.org> Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/6045a280a57a7ea937a5bb13ccac747026dbfb07.1687166935.git.naveen@kernel.org
2023-08-22powerpc/ftrace: Simplify function_graph support in ftrace.cNaveen N Rao
Since we now support DYNAMIC_FTRACE_WITH_ARGS across ppc32 and ppc64 ELFv2, we can simplify function_graph tracer support code in ftrace.c Signed-off-by: Naveen N Rao <naveen@kernel.org> Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/4dc92c4b1ed444dc62b748ae7327acdb9e096864.1687166935.git.naveen@kernel.org
2023-08-22powerpc64/ftrace: Move ELFv1 and -pg support code into a separate fileNaveen N Rao
ELFv1 support is deprecated and on the way out. Pre -mprofile-kernel ftrace support (-pg only) is very limited and is retained primarily for clang builds. It won't be necessary once clang lands support for -fpatchable-function-entry. Copy the existing ftrace code supporting these into ftrace_pg.c. ftrace.c can then be refactored and enhanced with a focus on ppc32 and ppc64 ELFv2. Signed-off-by: Naveen N Rao <naveen@kernel.org> Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/1eb6cc6c3141ddb77a2a25f8a9e83d83ff312b02.1687166935.git.naveen@kernel.org
2023-08-22powerpc/module: Remove unused .ftrace.tramp sectionNaveen N Rao
.ftrace.tramp section is not used for any purpose. This code was added all the way back in the original commit introducing support for dynamic ftrace on ppc64 modules. Remove it. Signed-off-by: Naveen N Rao <naveen@kernel.org> Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/9cf6d7f37ba82f7cb6dafecf660f44925c526d8d.1687166935.git.naveen@kernel.org
2023-08-22powerpc/ftrace: Fix dropping weak symbols with older toolchainsNaveen N Rao
The minimum level of gcc supported for building the kernel is v5.1. v5.x releases of gcc emitted a three instruction sequence for -mprofile-kernel: mflr r0 std r0, 16(r1) bl _mcount It is only with the v6.x releases that gcc started emitting the two instruction sequence for -mprofile-kernel, omitting the second store instruction. With the older three instruction sequence, the actual ftrace location can be the 5th instruction into a function. Update the allowed offset for ftrace location from 12 to 16 to accommodate the same. Cc: stable@vger.kernel.org Fixes: 7af82ff90a2b06 ("powerpc/ftrace: Ignore weak functions") Signed-off-by: Naveen N Rao <naveen@kernel.org> Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/7b265908a9461e38fc756ef9b569703860a80621.1687166935.git.naveen@kernel.org