summaryrefslogtreecommitdiff
path: root/arch/mips
AgeCommit message (Collapse)Author
2024-02-12MIPS: Clear Cause.BD in instruction_pointer_setJiaxun Yang
Clear Cause.BD after we use instruction_pointer_set to override EPC. This can prevent exception_epc check against instruction code at new return address. It won't be considered as "in delay slot" after epc being overridden anyway. Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2024-02-12ptrace: Introduce exception_ip arch hookJiaxun Yang
On architectures with delay slot, architecture level instruction pointer (or program counter) in pt_regs may differ from where exception was triggered. Introduce exception_ip hook to invoke architecture code and determine actual instruction pointer to the exception. Link: https://lore.kernel.org/lkml/00d1b813-c55f-4365-8d81-d70258e10b16@app.fastmail.com/ Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2024-02-12MIPS: Add 'memory' clobber to csum_ipv6_magic() inline assemblerGuenter Roeck
After 'lib: checksum: Use aligned accesses for ip_fast_csum and csum_ipv6_magic tests' was applied, the test_csum_ipv6_magic unit test started failing for all mips platforms, both little and bit endian. Oddly enough, adding debug code into test_csum_ipv6_magic() made the problem disappear. The gcc manual says: "The "memory" clobber tells the compiler that the assembly code performs memory reads or writes to items other than those listed in the input and output operands (for example, accessing the memory pointed to by one of the input parameters) " This is definitely the case for csum_ipv6_magic(). Indeed, adding the 'memory' clobber fixes the problem. Cc: Charlie Jenkins <charlie@rivosinc.com> Cc: Palmer Dabbelt <palmer@rivosinc.com> Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Guenter Roeck <linux@roeck-us.net> Reviewed-by: Charlie Jenkins <charlie@rivosinc.com> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2024-02-09work around gcc bugs with 'asm goto' with outputsLinus Torvalds
We've had issues with gcc and 'asm goto' before, and we created a 'asm_volatile_goto()' macro for that in the past: see commits 3f0116c3238a ("compiler/gcc4: Add quirk for 'asm goto' miscompilation bug") and a9f180345f53 ("compiler/gcc4: Make quirk for asm_volatile_goto() unconditional"). Then, much later, we ended up removing the workaround in commit 43c249ea0b1e ("compiler-gcc.h: remove ancient workaround for gcc PR 58670") because we no longer supported building the kernel with the affected gcc versions, but we left the macro uses around. Now, Sean Christopherson reports a new version of a very similar problem, which is fixed by re-applying that ancient workaround. But the problem in question is limited to only the 'asm goto with outputs' cases, so instead of re-introducing the old workaround as-is, let's rename and limit the workaround to just that much less common case. It looks like there are at least two separate issues that all hit in this area: (a) some versions of gcc don't mark the asm goto as 'volatile' when it has outputs: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98619 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110420 which is easy to work around by just adding the 'volatile' by hand. (b) Internal compiler errors: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110422 which are worked around by adding the extra empty 'asm' as a barrier, as in the original workaround. but the problem Sean sees may be a third thing since it involves bad code generation (not an ICE) even with the manually added 'volatile'. but the same old workaround works for this case, even if this feels a bit like voodoo programming and may only be hiding the issue. Reported-and-tested-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/all/20240208220604.140859-1-seanjc@google.com/ Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Uros Bizjak <ubizjak@gmail.com> Cc: Jakub Jelinek <jakub@redhat.com> Cc: Andrew Pinski <quic_apinski@quicinc.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2024-01-27mips: Call lose_fpu(0) before initializing fcr31 in mips_set_personality_nanXi Ruoyao
If we still own the FPU after initializing fcr31, when we are preempted the dirty value in the FPU will be read out and stored into fcr31, clobbering our setting. This can cause an improper floating-point environment after execve(). For example: zsh% cat measure.c #include <fenv.h> int main() { return fetestexcept(FE_INEXACT); } zsh% cc measure.c -o measure -lm zsh% echo $((1.0/3)) # raising FE_INEXACT 0.33333333333333331 zsh% while ./measure; do ; done (stopped in seconds) Call lose_fpu(0) before setting fcr31 to prevent this. Closes: https://lore.kernel.org/linux-mips/7a6aa1bbdbbe2e63ae96ff163fab0349f58f1b9e.camel@xry111.site/ Fixes: 9b26616c8d9d ("MIPS: Respect the ISA level in FCSR handling") Cc: stable@vger.kernel.org Signed-off-by: Xi Ruoyao <xry111@xry111.site> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2024-01-27MIPS: loongson64: set nid for reserved memblock regionHuang Pei
Commit 61167ad5fecd("mm: pass nid to reserve_bootmem_region()") reveals that reserved memblock regions have no valid node id set, just set it right since loongson64 firmware makes it clear in memory layout info. This works around booting failure on 3A1000+ since commit 61167ad5fecd ("mm: pass nid to reserve_bootmem_region()") under CONFIG_DEFERRED_STRUCT_PAGE_INIT. Signed-off-by: Huang Pei <huangpei@loongson.cn> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2024-01-27Revert "MIPS: loongson64: set nid for reserved memblock region"Thomas Bogendoerfer
This reverts commit ce7b1b97776ec0b068c4dd6b6dbb48ae09a23519. Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2024-01-26MIPS: lantiq: register smp_ops on non-smp platformsAleksander Jan Bajkowski
Lantiq uses a common kernel config for devices with 24Kc and 34Kc cores. The changes made previously to add support for interrupts on all cores work on 24Kc platforms with SMP disabled and 34Kc platforms with SMP enabled. This patch fixes boot issues on Danube (single core 24Kc) with SMP enabled. Fixes: 730320fd770d ("MIPS: lantiq: enable all hardware interrupts on second VPE") Signed-off-by: Aleksander Jan Bajkowski <olek2@wp.pl> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2024-01-26MIPS: loongson64: set nid for reserved memblock regionHuang Pei
Commit 61167ad5fecd("mm: pass nid to reserve_bootmem_region()") reveals that reserved memblock regions have no valid node id set, just set it right since loongson64 firmware makes it clear in memory layout info. This works around booting failure on 3A1000+ since commit 61167ad5fecd ("mm: pass nid to reserve_bootmem_region()") under CONFIG_DEFERRED_STRUCT_PAGE_INIT. Signed-off-by: Huang Pei <huangpei@loongson.cn> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2024-01-26MIPS: reserve exception vector space ONLY ONCEHuang Pei
"cpu_probe" is called both by BP and APs, but reserving exception vector (like 0x0-0x1000) called by "cpu_probe" need once and calling on APs is too late since memblock is unavailable at that time. So, reserve exception vector ONLY by BP. Suggested-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Signed-off-by: Huang Pei <huangpei@loongson.cn> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2024-01-26MIPS: BCM63XX: Fix missing prototypesFlorian Fainelli
Most of the symbols for which we do not have a prototype can actually be made static and for the few that cannot, there is already a declaration in a header for it. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2024-01-22MIPS: sgi-ip32: Fix missing prototypesThomas Bogendoerfer
Fix interrupt function prototypes, move all prototypes into a new file ip32-common.h and include it where needed. Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
2024-01-22MIPS: sgi-ip30: Fix missing prototypesThomas Bogendoerfer
Include needed header files. Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
2024-01-22MIPS: fw arc: Fix missing prototypesThomas Bogendoerfer
Make ArcGetMemoryDescriptor() static since it's only needed internally. Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
2024-01-22MIPS: sgi-ip27: Fix missing prototypesThomas Bogendoerfer
Fix missing prototypes by making not shared functions static and adding others to ip27-common.h. Also drop ip27-hubio.c as it's not used for a long time. Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
2024-01-22MIPS: Alchemy: Fix missing prototypesFlorian Fainelli
We have a number of missing prototypes warnings for board_setup(), alchemy_set_lpj() and prom_init_cmdline(), prom_getenv() and prom_get_ethernet_addr(). Fix those by providing definitions for the first two functions in au1000.h which is included everywhere relevant, and including prom.h for the last three. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2024-01-22MIPS: Cobalt: Fix missing prototypesFlorian Fainelli
Fix missing prototypes warnings for cobalt_machine_halt() and cobalt_machine_restart() by moving their prototypes to cobalt.h which is included by setup.c. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2024-01-19Merge tag 'vfs-6.8.netfs' of ↵Linus Torvalds
gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs Pull netfs updates from Christian Brauner: "This extends the netfs helper library that network filesystems can use to replace their own implementations. Both afs and 9p are ported. cifs is ready as well but the patches are way bigger and will be routed separately once this is merged. That will remove lots of code as well. The overal goal is to get high-level I/O and knowledge of the page cache and ouf of the filesystem drivers. This includes knowledge about the existence of pages and folios The pull request converts afs and 9p. This removes about 800 lines of code from afs and 300 from 9p. For 9p it is now possible to do writes in larger than a page chunks. Additionally, multipage folio support can be turned on for 9p. Separate patches exist for cifs removing another 2000+ lines. I've included detailed information in the individual pulls I took. Summary: - Add NFS-style (and Ceph-style) locking around DIO vs buffered I/O calls to prevent these from happening at the same time. - Support for direct and unbuffered I/O. - Support for write-through caching in the page cache. - O_*SYNC and RWF_*SYNC writes use write-through rather than writing to the page cache and then flushing afterwards. - Support for write-streaming. - Support for write grouping. - Skip reads for which the server could only return zeros or EOF. - The fscache module is now part of the netfs library and the corresponding maintainer entry is updated. - Some helpers from the fscache subsystem are renamed to mark them as belonging to the netfs library. - Follow-up fixes for the netfs library. - Follow-up fixes for the 9p conversion" * tag 'vfs-6.8.netfs' of gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs: (50 commits) netfs: Fix wrong #ifdef hiding wait cachefiles: Fix signed/unsigned mixup netfs: Fix the loop that unmarks folios after writing to the cache netfs: Fix interaction between write-streaming and cachefiles culling netfs: Count DIO writes netfs: Mark netfs_unbuffered_write_iter_locked() static netfs: Fix proc/fs/fscache symlink to point to "netfs" not "../netfs" netfs: Rearrange netfs_io_subrequest to put request pointer first 9p: Use length of data written to the server in preference to error 9p: Do a couple of cleanups 9p: Fix initialisation of netfs_inode for 9p cachefiles: Fix __cachefiles_prepare_write() 9p: Use netfslib read/write_iter afs: Use the netfs write helpers netfs: Export the netfs_sreq tracepoint netfs: Optimise away reads above the point at which there can be no data netfs: Implement a write-through caching option netfs: Provide a launder_folio implementation netfs: Provide a writepages implementation netfs, cachefiles: Pass upper bound length to allow expansion ...
2024-01-18Merge tag 'iommu-updates-v6.8' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu Pull iommu updates from Joerg Roedel: "Core changes: - Fix race conditions in device probe path - Retire IOMMU bus_ops - Support for passing custom allocators to page table drivers - Clean up Kconfig around IOMMU_SVA - Support for sharing SVA domains with all devices bound to a mm - Firmware data parsing cleanup - Tracing improvements for iommu-dma code - Some smaller fixes and cleanups ARM-SMMU drivers: - Device-tree binding updates: - Add additional compatible strings for Qualcomm SoCs - Document Adreno clocks for Qualcomm's SM8350 SoC - SMMUv2: - Implement support for the ->domain_alloc_paging() callback - Ensure Secure context is restored following suspend of Qualcomm SMMU implementation - SMMUv3: - Disable stalling mode for the "quiet" context descriptor - Minor refactoring and driver cleanups Intel VT-d driver: - Cleanup and refactoring AMD IOMMU driver: - Improve IO TLB invalidation logic - Small cleanups and improvements Rockchip IOMMU driver: - DT binding update to add Rockchip RK3588 Apple DART driver: - Apple M1 USB4/Thunderbolt DART support - Cleanups Virtio IOMMU driver: - Add support for iotlb_sync_map - Enable deferred IO TLB flushes" * tag 'iommu-updates-v6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (66 commits) iommu: Don't reserve 0-length IOVA region iommu/vt-d: Move inline helpers to header files iommu/vt-d: Remove unused vcmd interfaces iommu/vt-d: Remove unused parameter of intel_pasid_setup_pass_through() iommu/vt-d: Refactor device_to_iommu() to retrieve iommu directly iommu/sva: Fix memory leak in iommu_sva_bind_device() dt-bindings: iommu: rockchip: Add Rockchip RK3588 iommu/dma: Trace bounce buffer usage when mapping buffers iommu/arm-smmu: Convert to domain_alloc_paging() iommu/arm-smmu: Pass arm_smmu_domain to internal functions iommu/arm-smmu: Implement IOMMU_DOMAIN_BLOCKED iommu/arm-smmu: Convert to a global static identity domain iommu/arm-smmu: Reorganize arm_smmu_domain_add_master() iommu/arm-smmu-v3: Remove ARM_SMMU_DOMAIN_NESTED iommu/arm-smmu-v3: Master cannot be NULL in arm_smmu_write_strtab_ent() iommu/arm-smmu-v3: Add a type for the STE iommu/arm-smmu-v3: disable stall for quiet_cd iommu/qcom: restore IOMMU state if needed iommu/arm-smmu-qcom: Add QCM2290 MDSS compatible iommu/arm-smmu-qcom: Add missing GMU entry to match table ...
2024-01-18Merge tag 'percpu-for-6.8' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/dennis/percpu Pull percpu updates from Dennis Zhou: "Enable percpu page allocator for RISC-V. There are RISC-V configurations with sparse NUMA configurations and small vmalloc space causing dynamic percpu allocations to fail as the backing chunk stride is too far apart" * tag 'percpu-for-6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/dennis/percpu: riscv: Enable pcpu page first chunk allocator mm: Introduce flush_cache_vmap_early()
2024-01-17Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull kvm updates from Paolo Bonzini: "Generic: - Use memdup_array_user() to harden against overflow. - Unconditionally advertise KVM_CAP_DEVICE_CTRL for all architectures. - Clean up Kconfigs that all KVM architectures were selecting - New functionality around "guest_memfd", a new userspace API that creates an anonymous file and returns a file descriptor that refers to it. guest_memfd files are bound to their owning virtual machine, cannot be mapped, read, or written by userspace, and cannot be resized. guest_memfd files do however support PUNCH_HOLE, which can be used to switch a memory area between guest_memfd and regular anonymous memory. - New ioctl KVM_SET_MEMORY_ATTRIBUTES allowing userspace to specify per-page attributes for a given page of guest memory; right now the only attribute is whether the guest expects to access memory via guest_memfd or not, which in Confidential SVMs backed by SEV-SNP, TDX or ARM64 pKVM is checked by firmware or hypervisor that guarantees confidentiality (AMD PSP, Intel TDX module, or EL2 in the case of pKVM). x86: - Support for "software-protected VMs" that can use the new guest_memfd and page attributes infrastructure. This is mostly useful for testing, since there is no pKVM-like infrastructure to provide a meaningfully reduced TCB. - Fix a relatively benign off-by-one error when splitting huge pages during CLEAR_DIRTY_LOG. - Fix a bug where KVM could incorrectly test-and-clear dirty bits in non-leaf TDP MMU SPTEs if a racing thread replaces a huge SPTE with a non-huge SPTE. - Use more generic lockdep assertions in paths that don't actually care about whether the caller is a reader or a writer. - let Xen guests opt out of having PV clock reported as "based on a stable TSC", because some of them don't expect the "TSC stable" bit (added to the pvclock ABI by KVM, but never set by Xen) to be set. - Revert a bogus, made-up nested SVM consistency check for TLB_CONTROL. - Advertise flush-by-ASID support for nSVM unconditionally, as KVM always flushes on nested transitions, i.e. always satisfies flush requests. This allows running bleeding edge versions of VMware Workstation on top of KVM. - Sanity check that the CPU supports flush-by-ASID when enabling SEV support. - On AMD machines with vNMI, always rely on hardware instead of intercepting IRET in some cases to detect unmasking of NMIs - Support for virtualizing Linear Address Masking (LAM) - Fix a variety of vPMU bugs where KVM fail to stop/reset counters and other state prior to refreshing the vPMU model. - Fix a double-overflow PMU bug by tracking emulated counter events using a dedicated field instead of snapshotting the "previous" counter. If the hardware PMC count triggers overflow that is recognized in the same VM-Exit that KVM manually bumps an event count, KVM would pend PMIs for both the hardware-triggered overflow and for KVM-triggered overflow. - Turn off KVM_WERROR by default for all configs so that it's not inadvertantly enabled by non-KVM developers, which can be problematic for subsystems that require no regressions for W=1 builds. - Advertise all of the host-supported CPUID bits that enumerate IA32_SPEC_CTRL "features". - Don't force a masterclock update when a vCPU synchronizes to the current TSC generation, as updating the masterclock can cause kvmclock's time to "jump" unexpectedly, e.g. when userspace hotplugs a pre-created vCPU. - Use RIP-relative address to read kvm_rebooting in the VM-Enter fault paths, partly as a super minor optimization, but mostly to make KVM play nice with position independent executable builds. - Guard KVM-on-HyperV's range-based TLB flush hooks with an #ifdef on CONFIG_HYPERV as a minor optimization, and to self-document the code. - Add CONFIG_KVM_HYPERV to allow disabling KVM support for HyperV "emulation" at build time. ARM64: - LPA2 support, adding 52bit IPA/PA capability for 4kB and 16kB base granule sizes. Branch shared with the arm64 tree. - Large Fine-Grained Trap rework, bringing some sanity to the feature, although there is more to come. This comes with a prefix branch shared with the arm64 tree. - Some additional Nested Virtualization groundwork, mostly introducing the NV2 VNCR support and retargetting the NV support to that version of the architecture. - A small set of vgic fixes and associated cleanups. Loongarch: - Optimization for memslot hugepage checking - Cleanup and fix some HW/SW timer issues - Add LSX/LASX (128bit/256bit SIMD) support RISC-V: - KVM_GET_REG_LIST improvement for vector registers - Generate ISA extension reg_list using macros in get-reg-list selftest - Support for reporting steal time along with selftest s390: - Bugfixes Selftests: - Fix an annoying goof where the NX hugepage test prints out garbage instead of the magic token needed to run the test. - Fix build errors when a header is delete/moved due to a missing flag in the Makefile. - Detect if KVM bugged/killed a selftest's VM and print out a helpful message instead of complaining that a random ioctl() failed. - Annotate the guest printf/assert helpers with __printf(), and fix the various bugs that were lurking due to lack of said annotation" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (185 commits) x86/kvm: Do not try to disable kvmclock if it was not enabled KVM: x86: add missing "depends on KVM" KVM: fix direction of dependency on MMU notifiers KVM: introduce CONFIG_KVM_COMMON KVM: arm64: Add missing memory barriers when switching to pKVM's hyp pgd KVM: arm64: vgic-its: Avoid potential UAF in LPI translation cache RISC-V: KVM: selftests: Add get-reg-list test for STA registers RISC-V: KVM: selftests: Add steal_time test support RISC-V: KVM: selftests: Add guest_sbi_probe_extension RISC-V: KVM: selftests: Move sbi_ecall to processor.c RISC-V: KVM: Implement SBI STA extension RISC-V: KVM: Add support for SBI STA registers RISC-V: KVM: Add support for SBI extension registers RISC-V: KVM: Add SBI STA info to vcpu_arch RISC-V: KVM: Add steal-update vcpu request RISC-V: KVM: Add SBI STA extension skeleton RISC-V: paravirt: Implement steal-time support RISC-V: Add SBI STA extension definitions RISC-V: paravirt: Add skeleton for pv-time support RISC-V: KVM: Fix indentation in kvm_riscv_vcpu_set_reg_csr() ...
2024-01-17Merge tag 'mips_6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linuxLinus Torvalds
Pull MIPS updates from Thomas Bogendoerfer: "Just cleanups and fixes" * tag 'mips_6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux: MIPS: Alchemy: Fix an out-of-bound access in db1550_dev_setup() MIPS: Alchemy: Fix an out-of-bound access in db1200_dev_setup() MIPS: Fix typos MIPS: Remove unused shadow GPR support from vector irq setup MIPS: Allow vectored interrupt handler to reside everywhere for 64bit mips: Set dump-stack arch description mips: mm: add slab availability checking in ioremap_prot mips: Optimize max_mapnr init procedure mips: Fix max_mapnr being uninitialized on early stages mips: Fix incorrect max_low_pfn adjustment mips: dmi: Fix early remap on MIPS32 MIPS: compressed: Use correct instruction for 64 bit code MIPS: SGI-IP27: hubio: fix nasid kernel-doc warning MAINTAINERS: Add myself as maintainer of the Ralink architecture
2024-01-11Merge tag 'net-next-6.8' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next Pull networking updates from Paolo Abeni: "The most interesting thing is probably the networking structs reorganization and a significant amount of changes is around self-tests. Core & protocols: - Analyze and reorganize core networking structs (socks, netdev, netns, mibs) to optimize cacheline consumption and set up build time warnings to safeguard against future header changes This improves TCP performances with many concurrent connections up to 40% - Add page-pool netlink-based introspection, exposing the memory usage and recycling stats. This helps indentify bad PP users and possible leaks - Refine TCP/DCCP source port selection to no longer favor even source port at connect() time when IP_LOCAL_PORT_RANGE is set. This lowers the time taken by connect() for hosts having many active connections to the same destination - Refactor the TCP bind conflict code, shrinking related socket structs - Refactor TCP SYN-Cookie handling, as a preparation step to allow arbitrary SYN-Cookie processing via eBPF - Tune optmem_max for 0-copy usage, increasing the default value to 128KB and namespecifying it - Allow coalescing for cloned skbs coming from page pools, improving RX performances with some common configurations - Reduce extension header parsing overhead at GRO time - Add bridge MDB bulk deletion support, allowing user-space to request the deletion of matching entries - Reorder nftables struct members, to keep data accessed by the datapath first - Introduce TC block ports tracking and use. This allows supporting multicast-like behavior at the TC layer - Remove UAPI support for retired TC qdiscs (dsmark, CBQ and ATM) and classifiers (RSVP and tcindex) - More data-race annotations - Extend the diag interface to dump TCP bound-only sockets - Conditional notification of events for TC qdisc class and actions - Support for WPAN dynamic associations with nearby devices, to form a sub-network using a specific PAN ID - Implement SMCv2.1 virtual ISM device support - Add support for Batman-avd mulicast packet type BPF: - Tons of verifier improvements: - BPF register bounds logic and range support along with a large test suite - log improvements - complete precision tracking support for register spills - track aligned STACK_ZERO cases as imprecise spilled registers. This improves the verifier "instructions processed" metric from single digit to 50-60% for some programs - support for user's global BPF subprogram arguments with few commonly requested annotations for a better developer experience - support tracking of BPF_JNE which helps cases when the compiler transforms (unsigned) "a > 0" into "if a == 0 goto xxx" and the like - several fixes - Add initial TX metadata implementation for AF_XDP with support in mlx5 and stmmac drivers. Two types of offloads are supported right now, that is, TX timestamp and TX checksum offload - Fix kCFI bugs in BPF all forms of indirect calls from BPF into kernel and from kernel into BPF work with CFI enabled. This allows BPF to work with CONFIG_FINEIBT=y - Change BPF verifier logic to validate global subprograms lazily instead of unconditionally before the main program, so they can be guarded using BPF CO-RE techniques - Support uid/gid options when mounting bpffs - Add a new kfunc which acquires the associated cgroup of a task within a specific cgroup v1 hierarchy where the latter is identified by its id - Extend verifier to allow bpf_refcount_acquire() of a map value field obtained via direct load which is a use-case needed in sched_ext - Add BPF link_info support for uprobe multi link along with bpftool integration for the latter - Support for VLAN tag in XDP hints - Remove deprecated bpfilter kernel leftovers given the project is developed in user-space (https://github.com/facebook/bpfilter) Misc: - Support for parellel TC self-tests execution - Increase MPTCP self-tests coverage - Updated the bridge documentation, including several so-far undocumented features - Convert all the net self-tests to run in unique netns, to avoid random failures due to conflict and allow concurrent runs - Add TCP-AO self-tests - Add kunit tests for both cfg80211 and mac80211 - Autogenerate Netlink families documentation from YAML spec - Add yml-gen support for fixed headers and recursive nests, the tool can now generate user-space code for all genetlink families for which we have specs - A bunch of additional module descriptions fixes - Catch incorrect freeing of pages belonging to a page pool Driver API: - Rust abstractions for network PHY drivers; do not cover yet the full C API, but already allow implementing functional PHY drivers in rust - Introduce queue and NAPI support in the netdev Netlink interface, allowing complete access to the device <> NAPIs <> queues relationship - Introduce notifications filtering for devlink to allow control application scale to thousands of instances - Improve PHY validation, requesting rate matching information for each ethtool link mode supported by both the PHY and host - Add support for ethtool symmetric-xor RSS hash - ACPI based Wifi band RFI (WBRF) mitigation feature for the AMD platform - Expose pin fractional frequency offset value over new DPLL generic netlink attribute - Convert older drivers to platform remove callback returning void - Add support for PHY package MMD read/write New hardware / drivers: - Ethernet: - Octeon CN10K devices - Broadcom 5760X P7 - Qualcomm SM8550 SoC - Texas Instrument DP83TG720S PHY - Bluetooth: - IMC Networks Bluetooth radio Removed: - WiFi: - libertas 16-bit PCMCIA support - Atmel at76c50x drivers - HostAP ISA/PCMCIA style 802.11b driver - zd1201 802.11b USB dongles - Orinoco ISA/PCMCIA 802.11b driver - Aviator/Raytheon driver - Planet WL3501 driver - RNDIS USB 802.11b driver Driver updates: - Ethernet high-speed NICs: - Intel (100G, ice, idpf): - allow one by one port representors creation and removal - add temperature and clock information reporting - add get/set for ethtool's header split ringparam - add again FW logging - adds support switchdev hardware packet mirroring - iavf: implement symmetric-xor RSS hash - igc: add support for concurrent physical and free-running timers - i40e: increase the allowable descriptors - nVidia/Mellanox: - Preparation for Socket-Direct multi-dev netdev. That will allow in future releases combining multiple PFs devices attached to different NUMA nodes under the same netdev - Broadcom (bnxt): - TX completion handling improvements - add basic ntuple filter support - reduce MSIX vectors usage for MQPRIO offload - add VXLAN support, USO offload and TX coalesce completion for P7 - Marvell Octeon EP: - xmit-more support - add PF-VF mailbox support and use it for FW notifications for VFs - Wangxun (ngbe/txgbe): - implement ethtool functions to operate pause param, ring param, coalesce channel number and msglevel - Netronome/Corigine (nfp): - add flow-steering support - support UDP segmentation offload - Ethernet NICs embedded, slower, virtual: - Xilinx AXI: remove duplicate DMA code adopting the dma engine driver - stmmac: add support for HW-accelerated VLAN stripping - TI AM654x sw: add mqprio, frame preemption & coalescing - gve: add support for non-4k page sizes. - virtio-net: support dynamic coalescing moderation - nVidia/Mellanox Ethernet datacenter switches: - allow firmware upgrade without a reboot - more flexible support for bridge flooding via the compressed FID flooding mode - Ethernet embedded switches: - Microchip: - fine-tune flow control and speed configurations in KSZ8xxx - KSZ88X3: enable setting rmii reference - Renesas: - add jumbo frames support - Marvell: - 88E6xxx: add "eth-mac" and "rmon" stats support - Ethernet PHYs: - aquantia: add firmware load support - at803x: refactor the driver to simplify adding support for more chip variants - NXP C45 TJA11xx: Add MACsec offload support - Wifi: - MediaTek (mt76): - NVMEM EEPROM improvements - mt7996 Extremely High Throughput (EHT) improvements - mt7996 Wireless Ethernet Dispatcher (WED) support - mt7996 36-bit DMA support - Qualcomm (ath12k): - support for a single MSI vector - WCN7850: support AP mode - Intel (iwlwifi): - new debugfs file fw_dbg_clear - allow concurrent P2P operation on DFS channels - Bluetooth: - QCA2066: support HFP offload - ISO: more broadcast-related improvements - NXP: better recovery in case receiver/transmitter get out of sync" * tag 'net-next-6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1714 commits) lan78xx: remove redundant statement in lan78xx_get_eee lan743x: remove redundant statement in lan743x_ethtool_get_eee bnxt_en: Fix RCU locking for ntuple filters in bnxt_rx_flow_steer() bnxt_en: Fix RCU locking for ntuple filters in bnxt_srxclsrldel() bnxt_en: Remove unneeded variable in bnxt_hwrm_clear_vnic_filter() tcp: Revert no longer abort SYN_SENT when receiving some ICMP Revert "mlx5 updates 2023-12-20" Revert "net: stmmac: Enable Per DMA Channel interrupt" ipvlan: Remove usage of the deprecated ida_simple_xx() API ipvlan: Fix a typo in a comment net/sched: Remove ipt action tests net: stmmac: Use interrupt mode INTM=1 for per channel irq net: stmmac: Add support for TX/RX channel interrupt net: stmmac: Make MSI interrupt routine generic dt-bindings: net: snps,dwmac: per channel irq net: phy: at803x: make read_status more generic net: phy: at803x: add support for cdt cross short test for qca808x net: phy: at803x: refactor qca808x cable test get status function net: phy: at803x: generalize cdt fault length function net: ethernet: cortina: Drop TSO support ...
2024-01-11MIPS: Alchemy: Fix an out-of-bound access in db1550_dev_setup()Christophe JAILLET
When calling spi_register_board_info(), Fixes: f869d42e580f ("MIPS: Alchemy: Improved DB1550 support, with audio and serial busses.") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2024-01-11MIPS: Alchemy: Fix an out-of-bound access in db1200_dev_setup()Christophe JAILLET
When calling spi_register_board_info(), we should pass the number of elements in 'db1200_spi_devs', not 'db1200_i2c_devs'. Fixes: 63323ec54a7e ("MIPS: Alchemy: Extended DB1200 board support.") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2024-01-10Merge tag 'asm-generic-6.8' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic Pull asm-generic cleanups from Arnd Bergmann: "A series from Baoquan He cleans up the asm-generic/io.h to remove the ioremap_uc() definition from everything except x86, which still needs it for pre-PAT systems. This series notably contains a patch from Jiaxun Yang that converts MIPS to use asm-generic/io.h like every other architecture does, enabling future cleanups. Some of my own patches fix -Wmissing-prototype warnings in architecture specific code across several architectures. This is now needed as the warning is enabled by default. There are still some remaining warnings in minor platforms, but the series should catch most of the widely used ones make them more consistent with one another. David McKay fixes a bug in __generic_cmpxchg_local() when this is used on 64-bit architectures. This could currently only affect parisc64 and sparc64. Additional cleanups address from Linus Walleij, Uwe Kleine-König, Thomas Huth, and Kefeng Wang help reduce unnecessary inconsistencies between architectures" * tag 'asm-generic-6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic: asm-generic: Fix 32 bit __generic_cmpxchg_local Hexagon: Make pfn accessors statics inlines ARC: mm: Make virt_to_pfn() a static inline mips: remove extraneous asm-generic/iomap.h include sparc: Use $(kecho) to announce kernel images being ready arm64: vdso32: Define BUILD_VDSO32_64 to correct prototypes csky: fix arch_jump_label_transform_static override arch: add do_page_fault prototypes arch: add missing prepare_ftrace_return() prototypes arch: vdso: consolidate gettime prototypes arch: include linux/cpu.h for trap_init() prototype arch: fix asm-offsets.c building with -Wmissing-prototypes arch: consolidate arch_irq_work_raise prototypes hexagon: Remove CONFIG_HEXAGON_ARCH_VERSION from uapi header asm/io: remove unnecessary xlate_dev_mem_ptr() and unxlate_dev_mem_ptr() mips: io: remove duplicated codes arch/*/io.h: remove ioremap_uc in some architectures mips: add <asm-generic/io.h> including
2024-01-09Merge tag 'lsm-pr-20240105' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm Pull security module updates from Paul Moore: - Add three new syscalls: lsm_list_modules(), lsm_get_self_attr(), and lsm_set_self_attr(). The first syscall simply lists the LSMs enabled, while the second and third get and set the current process' LSM attributes. Yes, these syscalls may provide similar functionality to what can be found under /proc or /sys, but they were designed to support multiple, simultaneaous (stacked) LSMs from the start as opposed to the current /proc based solutions which were created at a time when only one LSM was allowed to be active at a given time. We have spent considerable time discussing ways to extend the existing /proc interfaces to support multiple, simultaneaous LSMs and even our best ideas have been far too ugly to support as a kernel API; after +20 years in the kernel, I felt the LSM layer had established itself enough to justify a handful of syscalls. Support amongst the individual LSM developers has been nearly unanimous, with a single objection coming from Tetsuo (TOMOYO) as he is worried that the LSM_ID_XXX token concept will make it more difficult for out-of-tree LSMs to survive. Several members of the LSM community have demonstrated the ability for out-of-tree LSMs to continue to exist by picking high/unused LSM_ID values as well as pointing out that many kernel APIs rely on integer identifiers, e.g. syscalls (!), but unfortunately Tetsuo's objections remain. My personal opinion is that while I have no interest in penalizing out-of-tree LSMs, I'm not going to penalize in-tree development to support out-of-tree development, and I view this as a necessary step forward to support the push for expanded LSM stacking and reduce our reliance on /proc and /sys which has occassionally been problematic for some container users. Finally, we have included the linux-api folks on (all?) recent revisions of the patchset and addressed all of their concerns. - Add a new security_file_ioctl_compat() LSM hook to handle the 32-bit ioctls on 64-bit systems problem. This patch includes support for all of the existing LSMs which provide ioctl hooks, although it turns out only SELinux actually cares about the individual ioctls. It is worth noting that while Casey (Smack) and Tetsuo (TOMOYO) did not give explicit ACKs to this patch, they did both indicate they are okay with the changes. - Fix a potential memory leak in the CALIPSO code when IPv6 is disabled at boot. While it's good that we are fixing this, I doubt this is something users are seeing in the wild as you need to both disable IPv6 and then attempt to configure IPv6 labeled networking via NetLabel/CALIPSO; that just doesn't make much sense. Normally this would go through netdev, but Jakub asked me to take this patch and of all the trees I maintain, the LSM tree seemed like the best fit. - Update the LSM MAINTAINERS entry with additional information about our process docs, patchwork, bug reporting, etc. I also noticed that the Lockdown LSM is missing a dedicated MAINTAINERS entry so I've added that to the pull request. I've been working with one of the major Lockdown authors/contributors to see if they are willing to step up and assume a Lockdown maintainer role; hopefully that will happen soon, but in the meantime I'll continue to look after it. - Add a handful of mailmap entries for Serge Hallyn and myself. * tag 'lsm-pr-20240105' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm: (27 commits) lsm: new security_file_ioctl_compat() hook lsm: Add a __counted_by() annotation to lsm_ctx.ctx calipso: fix memory leak in netlbl_calipso_add_pass() selftests: remove the LSM_ID_IMA check in lsm/lsm_list_modules_test MAINTAINERS: add an entry for the lockdown LSM MAINTAINERS: update the LSM entry mailmap: add entries for Serge Hallyn's dead accounts mailmap: update/replace my old email addresses lsm: mark the lsm_id variables are marked as static lsm: convert security_setselfattr() to use memdup_user() lsm: align based on pointer length in lsm_fill_user_ctx() lsm: consolidate buffer size handling into lsm_fill_user_ctx() lsm: correct error codes in security_getselfattr() lsm: cleanup the size counters in security_getselfattr() lsm: don't yet account for IMA in LSM_CONFIG_COUNT calculation lsm: drop LSM_ID_IMA LSM: selftests for Linux Security Module syscalls SELinux: Add selfattr hooks AppArmor: Add selfattr hooks Smack: implement setselfattr and getselfattr hooks ...
2024-01-09Merge tag 'mm-nonmm-stable-2024-01-09-10-33' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull non-MM updates from Andrew Morton: "Quite a lot of kexec work this time around. Many singleton patches in many places. The notable patch series are: - nilfs2 folio conversion from Matthew Wilcox in 'nilfs2: Folio conversions for file paths'. - Additional nilfs2 folio conversion from Ryusuke Konishi in 'nilfs2: Folio conversions for directory paths'. - IA64 remnant removal in Heiko Carstens's 'Remove unused code after IA-64 removal'. - Arnd Bergmann has enabled the -Wmissing-prototypes warning everywhere in 'Treewide: enable -Wmissing-prototypes'. This had some followup fixes: - Nathan Chancellor has cleaned up the hexagon build in the series 'hexagon: Fix up instances of -Wmissing-prototypes'. - Nathan also addressed some s390 warnings in 's390: A couple of fixes for -Wmissing-prototypes'. - Arnd Bergmann addresses the same warnings for MIPS in his series 'mips: address -Wmissing-prototypes warnings'. - Baoquan He has made kexec_file operate in a top-down-fitting manner similar to kexec_load in the series 'kexec_file: Load kernel at top of system RAM if required' - Baoquan He has also added the self-explanatory 'kexec_file: print out debugging message if required'. - Some checkstack maintenance work from Tiezhu Yang in the series 'Modify some code about checkstack'. - Douglas Anderson has disentangled the watchdog code's logging when multiple reports are occurring simultaneously. The series is 'watchdog: Better handling of concurrent lockups'. - Yuntao Wang has contributed some maintenance work on the crash code in 'crash: Some cleanups and fixes'" * tag 'mm-nonmm-stable-2024-01-09-10-33' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (157 commits) crash_core: fix and simplify the logic of crash_exclude_mem_range() x86/crash: use SZ_1M macro instead of hardcoded value x86/crash: remove the unused image parameter from prepare_elf_headers() kdump: remove redundant DEFAULT_CRASH_KERNEL_LOW_SIZE scripts/decode_stacktrace.sh: strip unexpected CR from lines watchdog: if panicking and we dumped everything, don't re-enable dumping watchdog/hardlockup: use printk_cpu_sync_get_irqsave() to serialize reporting watchdog/softlockup: use printk_cpu_sync_get_irqsave() to serialize reporting watchdog/hardlockup: adopt softlockup logic avoiding double-dumps kexec_core: fix the assignment to kimage->control_page x86/kexec: fix incorrect end address passed to kernel_ident_mapping_init() lib/trace_readwrite.c:: replace asm-generic/io with linux/io nilfs2: cpfile: fix some kernel-doc warnings stacktrace: fix kernel-doc typo scripts/checkstack.pl: fix no space expression between sp and offset x86/kexec: fix incorrect argument passed to kexec_dprintk() x86/kexec: use pr_err() instead of kexec_dprintk() when an error occurs nilfs2: add missing set_freezable() for freezable kthread kernel: relay: remove relay_file_splice_read dead code, doesn't work docs: submit-checklist: remove all of "make namespacecheck" ...
2024-01-09Merge tag 'mm-stable-2024-01-08-15-31' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull MM updates from Andrew Morton: "Many singleton patches against the MM code. The patch series which are included in this merge do the following: - Peng Zhang has done some mapletree maintainance work in the series 'maple_tree: add mt_free_one() and mt_attr() helpers' 'Some cleanups of maple tree' - In the series 'mm: use memmap_on_memory semantics for dax/kmem' Vishal Verma has altered the interworking between memory-hotplug and dax/kmem so that newly added 'device memory' can more easily have its memmap placed within that newly added memory. - Matthew Wilcox continues folio-related work (including a few fixes) in the patch series 'Add folio_zero_tail() and folio_fill_tail()' 'Make folio_start_writeback return void' 'Fix fault handler's handling of poisoned tail pages' 'Convert aops->error_remove_page to ->error_remove_folio' 'Finish two folio conversions' 'More swap folio conversions' - Kefeng Wang has also contributed folio-related work in the series 'mm: cleanup and use more folio in page fault' - Jim Cromie has improved the kmemleak reporting output in the series 'tweak kmemleak report format'. - In the series 'stackdepot: allow evicting stack traces' Andrey Konovalov to permits clients (in this case KASAN) to cause eviction of no longer needed stack traces. - Charan Teja Kalla has fixed some accounting issues in the page allocator's atomic reserve calculations in the series 'mm: page_alloc: fixes for high atomic reserve caluculations'. - Dmitry Rokosov has added to the samples/ dorectory some sample code for a userspace memcg event listener application. See the series 'samples: introduce cgroup events listeners'. - Some mapletree maintanance work from Liam Howlett in the series 'maple_tree: iterator state changes'. - Nhat Pham has improved zswap's approach to writeback in the series 'workload-specific and memory pressure-driven zswap writeback'. - DAMON/DAMOS feature and maintenance work from SeongJae Park in the series 'mm/damon: let users feed and tame/auto-tune DAMOS' 'selftests/damon: add Python-written DAMON functionality tests' 'mm/damon: misc updates for 6.8' - Yosry Ahmed has improved memcg's stats flushing in the series 'mm: memcg: subtree stats flushing and thresholds'. - In the series 'Multi-size THP for anonymous memory' Ryan Roberts has added a runtime opt-in feature to transparent hugepages which improves performance by allocating larger chunks of memory during anonymous page faults. - Matthew Wilcox has also contributed some cleanup and maintenance work against eh buffer_head code int he series 'More buffer_head cleanups'. - Suren Baghdasaryan has done work on Andrea Arcangeli's series 'userfaultfd move option'. UFFDIO_MOVE permits userspace heap compaction algorithms to move userspace's pages around rather than UFFDIO_COPY'a alloc/copy/free. - Stefan Roesch has developed a 'KSM Advisor', in the series 'mm/ksm: Add ksm advisor'. This is a governor which tunes KSM's scanning aggressiveness in response to userspace's current needs. - Chengming Zhou has optimized zswap's temporary working memory use in the series 'mm/zswap: dstmem reuse optimizations and cleanups'. - Matthew Wilcox has performed some maintenance work on the writeback code, both code and within filesystems. The series is 'Clean up the writeback paths'. - Andrey Konovalov has optimized KASAN's handling of alloc and free stack traces for secondary-level allocators, in the series 'kasan: save mempool stack traces'. - Andrey also performed some KASAN maintenance work in the series 'kasan: assorted clean-ups'. - David Hildenbrand has gone to town on the rmap code. Cleanups, more pte batching, folio conversions and more. See the series 'mm/rmap: interface overhaul'. - Kinsey Ho has contributed some maintenance work on the MGLRU code in the series 'mm/mglru: Kconfig cleanup'. - Matthew Wilcox has contributed lruvec page accounting code cleanups in the series 'Remove some lruvec page accounting functions'" * tag 'mm-stable-2024-01-08-15-31' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (361 commits) mm, treewide: rename MAX_ORDER to MAX_PAGE_ORDER mm, treewide: introduce NR_PAGE_ORDERS selftests/mm: add separate UFFDIO_MOVE test for PMD splitting selftests/mm: skip test if application doesn't has root privileges selftests/mm: conform test to TAP format output selftests: mm: hugepage-mmap: conform to TAP format output selftests/mm: gup_test: conform test to TAP format output mm/selftests: hugepage-mremap: conform test to TAP format output mm/vmstat: move pgdemote_* out of CONFIG_NUMA_BALANCING mm: zsmalloc: return -ENOSPC rather than -EINVAL in zs_malloc while size is too large mm/memcontrol: remove __mod_lruvec_page_state() mm/khugepaged: use a folio more in collapse_file() slub: use a folio in __kmalloc_large_node slub: use folio APIs in free_large_kmalloc() slub: use alloc_pages_node() in alloc_slab_page() mm: remove inc/dec lruvec page state functions mm: ratelimit stat flush from workingset shrinker kasan: stop leaking stack trace handles mm/mglru: remove CONFIG_TRANSPARENT_HUGEPAGE mm/mglru: add dummy pmd_dirty() ...
2024-01-08Merge tag 'vfs-6.8.mount' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull vfs mount updates from Christian Brauner: "This contains the work to retrieve detailed information about mounts via two new system calls. This is hopefully the beginning of the end of the saga that started with fsinfo() years ago. The LWN articles in [1] and [2] can serve as a summary so we can avoid rehashing everything here. At LSFMM in May 2022 we got into a room and agreed on what we want to do about fsinfo(). Basically, split it into pieces. This is the first part of that agreement. Specifically, it is concerned with retrieving information about mounts. So this only concerns the mount information retrieval, not the mount table change notification, or the extended filesystem specific mount option work. That is separate work. Currently mounts have a 32bit id. Mount ids are already in heavy use by libmount and other low-level userspace but they can't be relied upon because they're recycled very quickly. We agreed that mounts should carry a unique 64bit id by which they can be referenced directly. This is now implemented as part of this work. The new 64bit mount id is exposed in statx() through the new STATX_MNT_ID_UNIQUE flag. If the flag isn't raised the old mount id is returned. If it is raised and the kernel supports the new 64bit mount id the flag is raised in the result mask and the new 64bit mount id is returned. New and old mount ids do not overlap so they cannot be conflated. Two new system calls are introduced that operate on the 64bit mount id: statmount() and listmount(). A summary of the api and usage can be found on LWN as well (cf. [3]) but of course, I'll provide a summary here as well. Both system calls rely on struct mnt_id_req. Which is the request struct used to pass the 64bit mount id identifying the mount to operate on. It is extensible to allow for the addition of new parameters and for future use in other apis that make use of mount ids. statmount() mimicks the semantics of statx() and exposes a set flags that userspace may raise in mnt_id_req to request specific information to be retrieved. A statmount() call returns a struct statmount filled in with information about the requested mount. Supported requests are indicated by raising the request flag passed in struct mnt_id_req in the @mask argument in struct statmount. Currently we do support: - STATMOUNT_SB_BASIC: Basic filesystem info - STATMOUNT_MNT_BASIC Mount information (mount id, parent mount id, mount attributes etc) - STATMOUNT_PROPAGATE_FROM Propagation from what mount in current namespace - STATMOUNT_MNT_ROOT Path of the root of the mount (e.g., mount --bind /bla /mnt returns /bla) - STATMOUNT_MNT_POINT Path of the mount point (e.g., mount --bind /bla /mnt returns /mnt) - STATMOUNT_FS_TYPE Name of the filesystem type as the magic number isn't enough due to submounts The string options STATMOUNT_MNT_{ROOT,POINT} and STATMOUNT_FS_TYPE are appended to the end of the struct. Userspace can use the offsets in @fs_type, @mnt_root, and @mnt_point to reference those strings easily. The struct statmount reserves quite a bit of space currently for future extensibility. This isn't really a problem and if this bothers us we can just send a follow-up pull request during this cycle. listmount() is given a 64bit mount id via mnt_id_req just as statmount(). It takes a buffer and a size to return an array of the 64bit ids of the child mounts of the requested mount. Userspace can thus choose to either retrieve child mounts for a mount in batches or iterate through the child mounts. For most use-cases it will be sufficient to just leave space for a few child mounts. But for big mount tables having an iterator is really helpful. Iterating through a mount table works by setting @param in mnt_id_req to the mount id of the last child mount retrieved in the previous listmount() call" Link: https://lwn.net/Articles/934469 [1] Link: https://lwn.net/Articles/829212 [2] Link: https://lwn.net/Articles/950569 [3] * tag 'vfs-6.8.mount' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: add selftest for statmount/listmount fs: keep struct mnt_id_req extensible wire up syscalls for statmount/listmount add listmount(2) syscall statmount: simplify string option retrieval statmount: simplify numeric option retrieval add statmount(2) syscall namespace: extract show_path() helper mounts: keep list of mounts in an rbtree add unique mount ID
2024-01-08KVM: introduce CONFIG_KVM_COMMONPaolo Bonzini
CONFIG_HAVE_KVM is currently used by some architectures to either enabled the KVM config proper, or to enable host-side code that is not part of the KVM module. However, CONFIG_KVM's "select" statement in virt/kvm/Kconfig corresponds to a third meaning, namely to enable common Kconfigs required by all architectures that support KVM. These three meanings can be replaced respectively by an architecture-specific Kconfig, by IS_ENABLED(CONFIG_KVM), or by a new Kconfig symbol that is in turn selected by the architecture-specific "config KVM". Start by introducing such a new Kconfig symbol, CONFIG_KVM_COMMON. Unlike CONFIG_HAVE_KVM, it is selected by CONFIG_KVM, not by architecture code, and it brings in all dependencies of common KVM code. In particular, INTERVAL_TREE was missing in loongarch and riscv, so that is another thing that is fixed. Fixes: 8132d887a702 ("KVM: remove CONFIG_HAVE_KVM_EVENTFD", 2023-12-08) Reported-by: Randy Dunlap <rdunlap@infradead.org> Closes: https://lore.kernel.org/all/44907c6b-c5bd-4e4a-a921-e4d3825539d8@infradead.org/ Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-01-08MIPS: Fix typosBjorn Helgaas
Fix typos, most reported by "codespell arch/mips". Only touches comments, no code changes. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: linux-mips@vger.kernel.org Reviewed-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2024-01-05mm/mglru: add dummy pmd_dirty()Kinsey Ho
Add dummy pmd_dirty() for architectures that don't provide it. This is similar to commit 6617da8fb565 ("mm: add dummy pmd_young() for architectures not having it"). Link: https://lkml.kernel.org/r/20231227141205.2200125-5-kinseyho@google.com Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202312210606.1Etqz3M4-lkp@intel.com/ Closes: https://lore.kernel.org/oe-kbuild-all/202312210042.xQEiqlEh-lkp@intel.com/ Signed-off-by: Kinsey Ho <kinseyho@google.com> Suggested-by: Yu Zhao <yuzhao@google.com> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Cc: Donet Tom <donettom@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-01-03Merge branches 'apple/dart', 'arm/rockchip', 'arm/smmu', 'virtio', ↵Joerg Roedel
'x86/vt-d', 'x86/amd' and 'core' into next
2024-01-02Merge tag 'loongarch-kvm-6.8' of ↵Paolo Bonzini
git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson into HEAD LoongArch KVM changes for v6.8 1. Optimization for memslot hugepage checking. 2. Cleanup and fix some HW/SW timer issues. 3. Add LSX/LASX (128bit/256bit SIMD) support.
2024-01-02net/sched: Remove CONFIG_NET_ACT_IPT from default configsJamal Hadi Salim
Now that we are retiring the IPT action. Reviewed-by: Victor Noguiera <victor@mojatatu.com> Reviewed-by: Pedro Tammela <pctammela@mojatatu.com> Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-30MIPS: Remove unused shadow GPR support from vector irq setupThomas Bogendoerfer
Using shadow GPRs for vectored interrupts has never been used, time to remove it. Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2023-12-30MIPS: Allow vectored interrupt handler to reside everywhere for 64bitThomas Bogendoerfer
Setting up vector interrupts worked only with handlers, which resided in CKSEG0 space. This limits the kernel placement for 64bit platforms. By patching in the offset into vi_handlers[] instead of the full handler address, the vectored exception handler can load the address by itself and jump to it. Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Reviewed-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
2023-12-24netfs, fscache: Combine fscache with netfsDavid Howells
Now that the fscache code is moved to be colocated with the netfslib code so that they combined into one module, do the combining. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> cc: Christian Brauner <christian@brauner.io> cc: linux-fsdevel@vger.kernel.org cc: linux-cachefs@redhat.com cc: linux-nfs@vger.kernel.org, cc: linux-erofs@lists.ozlabs.org
2023-12-21mips: Set dump-stack arch descriptionSerge Semin
In the framework of the MIPS architecture the mips_set_machine_name() method is defined to set the machine name. The name currently is only used in the /proc/cpuinfo file content generation. Let's have it utilized to mach-personalize the dump-stack data too in a way it's done on ARM, ARM64, RISC-V, etc. Signed-off-by: Serge Semin <fancer.lancer@gmail.com> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2023-12-21mips: mm: add slab availability checking in ioremap_protSerge Semin
Recent commit a5f616483110 ("mm/ioremap: add slab availability checking in ioremap_prot") added the slab availability check to the generic ioremap_prot() implementation. It is reasonable to be done for the MIPS32-specific method too since it also relies on the get_vm_area_caller() function (by means of get_vm_area()) which requires the slab allocator being up and running before being called. Signed-off-by: Serge Semin <fancer.lancer@gmail.com> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2023-12-21mips: Optimize max_mapnr init procedureSerge Semin
max_mapnr defines the upper boundary of the pages space in the system. Currently in case if HIGHMEM is available it's calculated based on the upper high memory PFN limit value. Seeing there is a case when it isn't fully correct let's optimize out the max_mapnr variable initialization procedure to cover all the handled in the paging_init() method cases: 1. If CPU has DC-aliases, then high memory is unavailable so the PFNs upper boundary is determined by max_low_pfn. 2. Otherwise if high memory is available, use highend_pfn value representing the upper high memory PFNs limit. 3. Otherwise no high memory is available so set max_mapnr with the low-memory upper limit. Signed-off-by: Serge Semin <fancer.lancer@gmail.com> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2023-12-21mips: Fix max_mapnr being uninitialized on early stagesSerge Semin
max_mapnr variable is utilized in the pfn_valid() method in order to determine the upper PFN space boundary. Having it uninitialized effectively makes any PFN passed to that method invalid. That in its turn causes the kernel mm-subsystem occasion malfunctions even after the max_mapnr variable is actually properly updated. For instance, pfn_valid() is called in the init_unavailable_range() method in the framework of the calls-chain on MIPS: setup_arch() +-> paging_init() +-> free_area_init() +-> memmap_init() +-> memmap_init_zone_range() +-> init_unavailable_range() Since pfn_valid() always returns "false" value before max_mapnr is initialized in the mem_init() method, any flatmem page-holes will be left in the poisoned/uninitialized state including the IO-memory pages. Thus any further attempts to map/remap the IO-memory by using MMU may fail. In particular it happened in my case on attempt to map the SRAM region. The kernel bootup procedure just crashed on the unhandled unaligned access bug raised in the __update_cache() method: > Unhandled kernel unaligned access[#1]: > CPU: 0 PID: 1 Comm: swapper/0 Not tainted 6.7.0-rc1-XXX-dirty #2056 > ... > Call Trace: > [<8011ef9c>] __update_cache+0x88/0x1bc > [<80385944>] ioremap_page_range+0x110/0x2a4 > [<80126948>] ioremap_prot+0x17c/0x1f4 > [<80711b80>] __devm_ioremap+0x8c/0x120 > [<80711e0c>] __devm_ioremap_resource+0xf4/0x218 > [<808bf244>] sram_probe+0x4f4/0x930 > [<80889d20>] platform_probe+0x68/0xec > ... Let's fix the problem by initializing the max_mapnr variable as soon as the required data is available. In particular it can be done right in the paging_init() method before free_area_init() is called since all the PFN zone boundaries have already been calculated by that time. Cc: stable@vger.kernel.org Signed-off-by: Serge Semin <fancer.lancer@gmail.com> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2023-12-21mips: Fix incorrect max_low_pfn adjustmentSerge Semin
max_low_pfn variable is incorrectly adjusted if the kernel is built with high memory support and the later is detected in a running system, so the memory which actually can be directly mapped is getting into the highmem zone. See the ZONE_NORMAL range on my MIPS32r5 system: > Zone ranges: > DMA [mem 0x0000000000000000-0x0000000000ffffff] > Normal [mem 0x0000000001000000-0x0000000007ffffff] > HighMem [mem 0x0000000008000000-0x000000020fffffff] while the zones are supposed to look as follows: > Zone ranges: > DMA [mem 0x0000000000000000-0x0000000000ffffff] > Normal [mem 0x0000000001000000-0x000000001fffffff] > HighMem [mem 0x0000000020000000-0x000000020fffffff] Even though the physical memory within the range [0x08000000;0x20000000] belongs to MMIO on our system, we don't really want it to be considered as high memory since on MIPS32 that range still can be directly mapped. Note there might be other problems caused by the max_low_pfn variable misconfiguration. For instance high_memory variable is initialize with virtual address corresponding to the max_low_pfn PFN, and by design it must define the upper bound on direct map memory, then end of the normal zone. That in its turn potentially may cause problems in accessing the memory by means of the /dev/mem and /dev/kmem devices. Let's fix the discovered misconfiguration then. It turns out the commit a94e4f24ec83 ("MIPS: init: Drop boot_mem_map") didn't introduce the max_low_pfn adjustment quite correct. If the kernel is built with high memory support and the system is equipped with high memory, the max_low_pfn variable will need to be initialized with PFN of the most upper directly reachable memory address so the zone normal would be correctly setup. On MIPS that PFN corresponds to PFN_DOWN(HIGHMEM_START). If the system is built with no high memory support and one is detected in the running system, we'll just need to adjust the max_pfn variable to discard the found high memory from the system and leave the max_low_pfn as is, since the later will be less than PFN_DOWN(HIGHMEM_START) anyway by design of the for_each_memblock() loop performed a bit early in the bootmem_init() method. Fixes: a94e4f24ec83 ("MIPS: init: Drop boot_mem_map") Signed-off-by: Serge Semin <fancer.lancer@gmail.com> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2023-12-21mips: dmi: Fix early remap on MIPS32Serge Semin
dmi_early_remap() has been defined as ioremap_cache() which on MIPS32 gets to be converted to the VM-based mapping. DMI early remapping is performed at the setup_arch() stage with no VM available. So calling the dmi_early_remap() for MIPS32 causes the system to crash at the early boot time. Fix that by converting dmi_early_remap() to the uncached remapping which is always available on both 32 and 64-bits MIPS systems. Note this change shall not cause any regressions on the current DMI support implementation because on the early boot-up stage neither MIPS32 nor MIPS64 has the cacheable ioremapping support anyway. Fixes: be8fa1cb444c ("MIPS: Add support for Desktop Management Interface (DMI)") Signed-off-by: Serge Semin <fancer.lancer@gmail.com> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2023-12-21MIPS: compressed: Use correct instruction for 64 bit codeGregory CLEMENT
The code clearing BSS already use macro or use correct instruction depending if the CPU is 32 bits or 64 bits. However, a few instructions remained 32 bits only. By using the accurate MACRO, it is now possible to deal with memory address beyond 32 bits. As a side effect, when using 64bits processor, it also divides the loop number needed to clear the BSS by 2. Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Signed-off-by: Gregory CLEMENT <gregory.clement@bootlin.com> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2023-12-20mips: fix r3k_cache_init build regressionArnd Bergmann
My earlier patch removed __weak function declarations that used to be turned into wild branches by the linker, instead causing a link failure when the called functions are unavailable: mips-linux-ld: arch/mips/mm/cache.o: in function `cpu_cache_init': cache.c:(.text+0x670): undefined reference to `r3k_cache_init' The __weak method seems suboptimal, so rather than putting that back, make the function calls conditional on the Kconfig symbol that controls the compilation. [akpm@linux-foundation.org: fix whitespace while we're in there] Link: https://lkml.kernel.org/r/20231214205506.310402-1-arnd@kernel.org Fixes: 66445677f01e ("mips: move cache declarations into header") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reported-by: kernelci.org bot <bot@kernelci.org> Cc: Jiaxun Yang <jiaxun.yang@flygoat.com> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-15Merge tag 'mm-hotfixes-stable-2023-12-15-07-11' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull misc fixes from Andrew Morton: "17 hotfixes. 8 are cc:stable and the other 9 pertain to post-6.6 issues" * tag 'mm-hotfixes-stable-2023-12-15-07-11' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: mm/mglru: reclaim offlined memcgs harder mm/mglru: respect min_ttl_ms with memcgs mm/mglru: try to stop at high watermarks mm/mglru: fix underprotected page cache mm/shmem: fix race in shmem_undo_range w/THP Revert "selftests: error out if kernel header files are not yet built" crash_core: fix the check for whether crashkernel is from high memory x86, kexec: fix the wrong ifdeffery CONFIG_KEXEC sh, kexec: fix the incorrect ifdeffery and dependency of CONFIG_KEXEC mips, kexec: fix the incorrect ifdeffery and dependency of CONFIG_KEXEC m68k, kexec: fix the incorrect ifdeffery and build dependency of CONFIG_KEXEC loongarch, kexec: change dependency of object files mm/damon/core: make damon_start() waits until kdamond_fn() starts selftests/mm: cow: print ksft header before printing anything else mm: fix VMA heap bounds checking riscv: fix VMALLOC_START definition kexec: drop dependency on ARCH_SUPPORTS_KEXEC from CRASH_DUMP
2023-12-14Merge tag 'net-6.7-rc6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Paolo Abeni: "Current release - regressions: - tcp: fix tcp_disordered_ack() vs usec TS resolution Current release - new code bugs: - dpll: sanitize possible null pointer dereference in dpll_pin_parent_pin_set() - eth: octeon_ep: initialise control mbox tasks before using APIs Previous releases - regressions: - io_uring/af_unix: disable sending io_uring over sockets - eth: mlx5e: - TC, don't offload post action rule if not supported - fix possible deadlock on mlx5e_tx_timeout_work - eth: iavf: fix iavf_shutdown to call iavf_remove instead iavf_close - eth: bnxt_en: fix skb recycling logic in bnxt_deliver_skb() - eth: ena: fix DMA syncing in XDP path when SWIOTLB is on - eth: team: fix use-after-free when an option instance allocation fails Previous releases - always broken: - neighbour: don't let neigh_forced_gc() disable preemption for long - net: prevent mss overflow in skb_segment() - ipv6: support reporting otherwise unknown prefix flags in RTM_NEWPREFIX - tcp: remove acked SYN flag from packet in the transmit queue correctly - eth: octeontx2-af: - fix a use-after-free in rvu_nix_register_reporters - fix promisc mcam entry action - eth: dwmac-loongson: make sure MDIO is initialized before use - eth: atlantic: fix double free in ring reinit logic" * tag 'net-6.7-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (62 commits) net: atlantic: fix double free in ring reinit logic appletalk: Fix Use-After-Free in atalk_ioctl net: stmmac: Handle disabled MDIO busses from devicetree net: stmmac: dwmac-qcom-ethqos: Fix drops in 10M SGMII RX dpaa2-switch: do not ask for MDB, VLAN and FDB replay dpaa2-switch: fix size of the dma_unmap net: prevent mss overflow in skb_segment() vsock/virtio: Fix unsigned integer wrap around in virtio_transport_has_space() Revert "tcp: disable tcp_autocorking for socket when TCP_NODELAY flag is set" MIPS: dts: loongson: drop incorrect dwmac fallback compatible stmmac: dwmac-loongson: drop useless check for compatible fallback stmmac: dwmac-loongson: Make sure MDIO is initialized before use tcp: disable tcp_autocorking for socket when TCP_NODELAY flag is set dpll: sanitize possible null pointer dereference in dpll_pin_parent_pin_set() net: ena: Fix XDP redirection error net: ena: Fix DMA syncing in XDP path when SWIOTLB is on net: ena: Fix xdp drops handling due to multibuf packets net: ena: Destroy correct number of xdp queues upon failure net: Remove acked SYN flag from packet in the transmit queue correctly qed: Fix a potential use-after-free in qed_cxt_tables_alloc ...
2023-12-14wire up syscalls for statmount/listmountMiklos Szeredi
Wire up all archs. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Link: https://lore.kernel.org/r/20231025140205.3586473-7-mszeredi@redhat.com Reviewed-by: Ian Kent <raven@themaw.net> Signed-off-by: Christian Brauner <brauner@kernel.org>