summaryrefslogtreecommitdiff
path: root/include
AgeCommit message (Collapse)Author
2022-07-29mm: vmpressure: don't count proactive reclaim in vmpressureYosry Ahmed
memory.reclaim is a cgroup v2 interface that allows users to proactively reclaim memory from a memcg, without real memory pressure. Reclaim operations invoke vmpressure, which is used: (a) To notify userspace of reclaim efficiency in cgroup v1, and (b) As a signal for a memcg being under memory pressure for networking (see mem_cgroup_under_socket_pressure()). For (a), vmpressure notifications in v1 are not affected by this change since memory.reclaim is a v2 feature. For (b), the effects of the vmpressure signal (according to Shakeel [1]) are as follows: 1. Reducing send and receive buffers of the current socket. 2. May drop packets on the rx path. 3. May throttle current thread on the tx path. Since proactive reclaim is invoked directly by userspace, not by memory pressure, it makes sense not to throttle networking. Hence, this change makes sure that proactive reclaim caused by memory.reclaim does not trigger vmpressure. [1] https://lore.kernel.org/lkml/CALvZod68WdrXEmBpOkadhB5GPYmCXaDZzXH=yyGOCAjFRn4NDQ@mail.gmail.com/ [yosryahmed@google.com: update documentation] Link: https://lkml.kernel.org/r/20220721173015.2643248-1-yosryahmed@google.com Link: https://lkml.kernel.org/r/20220714064918.2576464-1-yosryahmed@google.com Signed-off-by: Yosry Ahmed <yosryahmed@google.com> Acked-by: Shakeel Butt <shakeelb@google.com> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: David Rientjes <rientjes@google.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Roman Gushchin <roman.gushchin@linux.dev> Cc: Muchun Song <songmuchun@bytedance.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: David Hildenbrand <david@redhat.com> Cc: Miaohe Lin <linmiaohe@huawei.com> Cc: NeilBrown <neilb@suse.de> Cc: Alistair Popple <apopple@nvidia.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Peter Xu <peterx@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-07-29writeback: remove inode_to_wb_is_valid()Xiu Jianfeng
inode_to_wb_is_valid() is no longer used since commit fe55d563d417 ("remove inode_congested()"), remove it. Link: https://lkml.kernel.org/r/20220714084147.140324-1-xiujianfeng@huawei.com Signed-off-by: Xiu Jianfeng <xiujianfeng@huawei.com> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-07-29NFSD: Fix strncpy() fortify warningChuck Lever
In function ‘strncpy’, inlined from ‘nfsd4_ssc_setup_dul’ at /home/cel/src/linux/manet/fs/nfsd/nfs4proc.c:1392:3, inlined from ‘nfsd4_interssc_connect’ at /home/cel/src/linux/manet/fs/nfsd/nfs4proc.c:1489:11: /home/cel/src/linux/manet/include/linux/fortify-string.h:52:33: warning: ‘__builtin_strncpy’ specified bound 63 equals destination size [-Wstringop-truncation] 52 | #define __underlying_strncpy __builtin_strncpy | ^ /home/cel/src/linux/manet/include/linux/fortify-string.h:89:16: note: in expansion of macro ‘__underlying_strncpy’ 89 | return __underlying_strncpy(p, q, size); | ^~~~~~~~~~~~~~~~~~~~ Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-07-29SUNRPC: Expand the svc_alloc_arg_err tracepointChuck Lever
Record not only the number of pages requested, but the number of pages that were actually allocated, to get a measure of progress (or lack thereof). Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-07-29NLM: Defend against file_lock changes after vfs_test_lock()Benjamin Coddington
Instead of trusting that struct file_lock returns completely unchanged after vfs_test_lock() when there's no conflicting lock, stash away our nlm_lockowner reference so we can properly release it for all cases. This defends against another file_lock implementation overwriting fl_owner when the return type is F_UNLCK. Reported-by: Roberto Bergantinos Corpas <rbergant@redhat.com> Tested-by: Roberto Bergantinos Corpas <rbergant@redhat.com> Signed-off-by: Benjamin Coddington <bcodding@redhat.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-07-29SUNRPC: Fix xdr_encode_bool()Chuck Lever
I discovered that xdr_encode_bool() was returning the same address that was passed in the @p parameter. The documenting comment states that the intent is to return the address of the next buffer location, just like the other "xdr_encode_*" helpers. The result was the encoded results of NFSv3 PATHCONF operations were not formed correctly. Fixes: ded04a587f6c ("NFSD: Update the NFSv3 PATHCONF3res encoder to use struct xdr_stream") Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Jeff Layton <jlayton@kernel.org>
2022-07-29clk: fixed-factor: Introduce *clk_hw_register_fixed_factor_parent_hw()Marijn Suijten
Add the devres and non-devres variant of clk_hw_register_fixed_factor_parent_hw() for registering a fixed factor clock with clk_hw parent pointer instead of parent name. Signed-off-by: Marijn Suijten <marijn.suijten@somainline.org> Link: https://lore.kernel.org/r/20220629225331.357308-4-marijn.suijten@somainline.org Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2022-07-29clk: mux: Introduce devm_clk_hw_register_mux_parent_hws()Marijn Suijten
Add the devres variant of clk_hw_register_mux_hws() for registering a mux clock with clk_hw parent pointers instead of parent names. Signed-off-by: Marijn Suijten <marijn.suijten@somainline.org> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Link: https://lore.kernel.org/r/20220629225331.357308-3-marijn.suijten@somainline.org Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2022-07-29clk: divider: Introduce devm_clk_hw_register_divider_parent_hw()Marijn Suijten
Add the devres variant of clk_hw_register_divider_parent_hw() for registering a divider clock with clk_hw parent pointer instead of parent name. Signed-off-by: Marijn Suijten <marijn.suijten@somainline.org> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Link: https://lore.kernel.org/r/20220629225331.357308-2-marijn.suijten@somainline.org Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2022-07-29Merge branches 'acpi-pm', 'acpi-soc', 'acpi-tables' and 'acpi-resource'Rafael J. Wysocki
Merge ACPI power management changes, ACPI LPSS driver changes, ACPI table parsing code changes and ACPI resource handling changes for v5.20-rc1: - Save NVS memory during transitions into S3 on Lenovo G40-45 (Manyi Li). - Add support for upcoming AMD uPEP device ID AMDI008 to the ACPI suspend-to-idle driver for x86 platforms (Shyam Sundar S K). - Clean up checks related to the ACPI_FADT_LOW_POWER_S0 platform flag in the LPIT table driver and the suspend-to-idle driver for x86 platforms (Rafael Wysocki). - Print information messages regarding declared LPS0 idle support in the platform firmware (Rafael Wysocki). - Fix missing check in register_device_clock() in the ACPI driver for Intel SoCs (huhai). - Fix ACS setup in the VIOT table parser (Eric Auger). - Skip IRQ override on AMD Zen platforms where it's harmful (Chuanhong Guo). * acpi-pm: ACPI: PM: x86: Print messages regarding LPS0 idle support ACPI: PM: s2idle: Use LPS0 idle if ACPI_FADT_LOW_POWER_S0 is unset Revert "ACPI / PM: LPIT: Register sysfs attributes based on FADT" ACPI: PM: s2idle: Add support for upcoming AMD uPEP HID AMDI008 ACPI: PM: save NVS memory for Lenovo G40-45 * acpi-soc: ACPI: LPSS: Fix missing check in register_device_clock() * acpi-tables: ACPI: VIOT: Fix ACS setup * acpi-resource: ACPI: resource: skip IRQ override on AMD Zen platforms
2022-07-29Merge branches 'acpi-processor', 'acpi-apei' and 'acpi-ec'Rafael J. Wysocki
Merge ACPI processor driver changes, APEI changes and ACPI EC driver changes for v5.20-rc1: - Drop leftover acpi_processor_get_limit_info() declaration (Riwen Lu). - Split out thermal initialization from ACPI PSS (Riwen Lu). - Annotate more functions in the ACPI CPU idle driver to live in the cpuidle section (Guilherme G. Piccoli). - Fix _EINJ vs "special purpose" EFI memory regions (Dan Williams). - Implement a better fix to avoid spamming the console with old error logs (Tony Luck). - Fix typo in a comment in the APEI code (Xiang wangx). - Clean up the ACPI EC driver after previous changes in it (Hans de Goede). * acpi-processor: ACPI: processor: Drop leftover acpi_processor_get_limit_info() declaration ACPI: processor: Split out thermal initialization from ACPI PSS ACPI: processor/idle: Annotate more functions to live in cpuidle section * acpi-apei: ACPI: APEI: Fix _EINJ vs EFI_MEMORY_SP ACPI: APEI: Better fix to avoid spamming the console with old error logs ACPI: APEI: Fix double word in a comment * acpi-ec: ACPI: EC: Drop unused ident initializers from dmi_system_id tables ACPI: EC: Re-use boot_ec when possible even when EC_FLAGS_TRUST_DSDT_GPE is set ACPI: EC: Drop the EC_FLAGS_IGNORE_DSDT_GPE quirk ACPI: EC: Remove duplicate ThinkPad X1 Carbon 6th entry from DMI quirks
2022-07-29Merge branch 'acpi-bus'Rafael J. Wysocki
Merge ACPI device object management changes for v5.20-rc1. - Use the facilities provided by the driver core and some additional helpers to handle the children of a given ACPI device object in multiple places instead of using the children and node list heads in struct acpi_device which is error prone (Rafael Wysocki). - Fix ACPI-related device reference counting issue in the hisi_lpc bus driver (Yang Yingliang). - Drop the children and node list heads that are not needed any more from struct acpi_device (Rafael Wysocki). - Drop driver member from struct acpi_device (Uwe Kleine-König). - Drop redundant check from acpi_device_remove() (Uwe Kleine-König). * acpi-bus: ACPI: bus: Drop unused list heads from struct acpi_device hisi_lpc: Use acpi_dev_for_each_child() bus: hisi_lpc: fix missing platform_device_put() in hisi_lpc_acpi_probe() ACPI: bus: Drop driver member of struct acpi_device ACPI: bus: Drop redundant check in acpi_device_remove() mfd: core: Use acpi_dev_for_each_child() ACPI / MMC: PM: Unify fixing up device power soundwire: Use acpi_dev_for_each_child() platform/x86/thinkpad_acpi: Use acpi_dev_for_each_child() ACPI: scan: Walk ACPI device's children using driver core ACPI: bus: Introduce acpi_dev_for_each_child_reverse() ACPI: video: Use acpi_dev_for_each_child() ACPI: bus: Export acpi_dev_for_each_child() to modules ACPI: property: Use acpi_dev_for_each_child() for child lookup ACPI: container: Use acpi_dev_for_each_child() USB: ACPI: Replace usb_acpi_find_port() with acpi_find_child_by_adr() thunderbolt: ACPI: Replace tb_acpi_find_port() with acpi_find_child_by_adr() ACPI: glue: Introduce acpi_find_child_by_adr() ACPI: glue: Introduce acpi_dev_has_children() ACPI: glue: Use acpi_dev_for_each_child()
2022-07-29Merge branches 'pm-core', 'pm-sleep', 'powercap', 'pm-domains' and 'pm-em'Rafael J. Wysocki
Merge core device power management changes for v5.20-rc1: - Extend support for wakeirq to callback wrappers used during system suspend and resume (Ulf Hansson). - Defer waiting for device probe before loading a hibernation image till the first actual device access to avoid possible deadlocks reported by syzbot (Tetsuo Handa). - Unify device_init_wakeup() for PM_SLEEP and !PM_SLEEP (Bjorn Helgaas). - Add Raptor Lake-P to the list of processors supported by the Intel RAPL driver (George D Sworo). - Add Alder Lake-N and Raptor Lake-P to the list of processors for which Power Limit4 is supported in the Intel RAPL driver (Sumeet Pawnikar). - Make pm_genpd_remove() check genpd_debugfs_dir against NULL before attempting to remove it (Hsin-Yi Wang). - Change the Energy Model code to represent power in micro-Watts and adjust its users accordingly (Lukasz Luba). * pm-core: PM: runtime: Extend support for wakeirq for force_suspend|resume * pm-sleep: PM: hibernate: defer device probing when resuming from hibernation PM: wakeup: Unify device_init_wakeup() for PM_SLEEP and !PM_SLEEP * powercap: powercap: RAPL: Add Power Limit4 support for Alder Lake-N and Raptor Lake-P powercap: intel_rapl: Add support for RAPTORLAKE_P * pm-domains: PM: domains: Ensure genpd_debugfs_dir exists before remove * pm-em: cpufreq: scmi: Support the power scale in micro-Watts in SCMI v3.1 firmware: arm_scmi: Get detailed power scale from perf Documentation: EM: Switch to micro-Watts scale PM: EM: convert power field to micro-Watts precision and align drivers
2022-07-29Merge tag 'thermal-v5.20-rc1' of ↵Rafael J. Wysocki
git://git.kernel.org/pub/scm/linux/kernel/git/thermal/linux Pull thermal control changes for 5.20-rc1 from Daniel Lezcano: "- Make per cpufreq / devfreq cooling device ops instead of using a global variable, fix comments and rework the trace information (Lukasz Luba) - Add the include/dt-bindings/thermal.h under the area covered by the thermal maintainer in the MAINTAINERS file (Lukas Bulwahn) - Improve the error output by giving the sensor identification when a thermal zone failed to initialize, the DT bindings by changing the positive logic and adding the r8a779f0 support on the rcar3 (Wolfram Sang) - Convert the QCom tsens DT binding to the dtsformat format (Krzysztof Kozlowski) - Remove the pointless get_trend() function in the QCom, Ux500 and tegra thermal drivers, along with the unused DROP_FULL and RAISE_FULL trends definitions. Simplify the code by using clamp() macros (Daniel Lezcano) - Fix ref_table memory leak at probe time on the k3_j72xx bandgap (Bryan Brattlof) - Fix array underflow in prep_lookup_table (Dan Carpenter) - Add static annotation to the k3_j72xx_bandgap_j7* data structure (Jin Xiaoyun) - Fix typos in comments detected on sun8i by Coccinelle (Julia Lawall) - Fix typos in comments on rzg2l (Biju Das) - Remove as unnecessary call to dev_err() as the error is already printed by the failing function on u8500 (Yang Li) - Register the thermal zones as hwmon sensors for the Qcom thermal sensors (Dmitry Baryshkov) - Fix 'tmon' tool compilation issue by adding phtread.h include (Markus Mayer) - Fix typo in the comments for the 'tmon' tool (Slark Xiao) - Consolidate the thermal core code by beginning to move the thermal trip structure from the thermal OF code as a generic structure to be used by the different sensors when registering a thermal zone (Daniel Lezcano)" * tag 'thermal-v5.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/thermal/linux: (36 commits) thermal/of: Initialize trip points separately thermal/of: Use thermal trips stored in the thermal zone thermal/core: Add thermal_trip in thermal_zone thermal/core: Rename 'trips' to 'num_trips' thermal/core: Move thermal_set_delay_jiffies to static thermal/core: Remove unneeded EXPORT_SYMBOLS thermal/of: Move thermal_trip structure to thermal.h thermal/of: Remove the device node pointer for thermal_trip thermal/of: Replace device node match with device node search thermal/core: Remove duplicate information when an error occurs thermal/core: Avoid calling ->get_trip_temp() unnecessarily thermal/tools/tmon: Fix typo 'the the' in comment thermal/tools/tmon: Include pthread and time headers in tmon.h thermal/ti-soc-thermal: Fix comment typo thermal/drivers/qcom/spmi-adc-tm5: Register thermal zones as hwmon sensors thermal/drivers/qcom/temp-alarm: Register thermal zones as hwmon sensors thermal/drivers/u8500: Remove unnecessary print function dev_err() thermal/drivers/rzg2l: Fix comments thermal/drivers/sun8i: Fix typo in comment thermal/drivers/k3_j72xx_bandgap: Make k3_j72xx_bandgap_j721e_data and k3_j72xx_bandgap_j7200_data static ...
2022-07-29Merge tag 'loongarch-fixes-5.19-5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson Pull LoongArch fixes from Huacai Chen: - Fix cache size calculation, stack protection attributes, ptrace's fpr_set and "ROM Size" in boardinfo - Some cleanups and improvements of assembly - Some cleanups of unused code and useless code * tag 'loongarch-fixes-5.19-5' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson: LoongArch: Fix wrong "ROM Size" of boardinfo LoongArch: Fix missing fcsr in ptrace's fpr_set LoongArch: Fix shared cache size calculation LoongArch: Disable executable stack by default LoongArch: Remove unused variables LoongArch: Remove clock setting during cpu hotplug stage LoongArch: Remove useless header compiler.h LoongArch: Remove several syntactic sugar macros for branches LoongArch: Re-tab the assembly files LoongArch: Simplify "BGT foo, zero" with BGTZ LoongArch: Simplify "BLT foo, zero" with BLTZ LoongArch: Simplify "BEQ/BNE foo, zero" with BEQZ/BNEZ LoongArch: Use the "move" pseudo-instruction where applicable LoongArch: Use the "jr" pseudo-instruction where applicable LoongArch: Use ABI names of registers where appropriate
2022-07-29PCI: Remove pci_mmap_page_range() wrapperArnd Bergmann
The ARCH_GENERIC_PCI_MMAP_RESOURCE symbol came up in a recent discussion, and I noticed that this was left behind by an unfinished cleanup from 2017. The only architecture that still relies on providing its own pci_mmap_page_range() helper instead of using the generic pci_mmap_resource_range() is sparc. Presumably the reasons for this have not changed, but at least this can be simplified by converting sparc to use the same interface as the others. The only difference between the two is the device-specific offset that gets added to or subtracted from vma->vm_pgoff. Change the only caller of pci_mmap_page_range() in common code to subtract this offset and call the modern interface, while adding it back in the sparc implementation to preserve the existing behavior. This removes the complexities of the dual interfaces from the common code, and keeps it all specific to the sparc architecture code. According to David Miller, the sparc code lets user space poke into the VGA I/O port registers by mmapping the I/O space of the parent bridge device, which is something that the generic pci_mmap_resource_range() code apparently does not. Link: https://lore.kernel.org/lkml/1519887203.622.3.camel@infradead.org/t/ Link: https://lore.kernel.org/lkml/20220714214657.2402250-3-shorne@gmail.com/ Link: https://lore.kernel.org/r/20220715153617.3393420-1-arnd@kernel.org Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: David Woodhouse <dwmw2@infradead.org> Cc: David S. Miller <davem@davemloft.net> Cc: Stafford Horne <shorne@gmail.com>
2022-07-29PCI: Stub __pci_ioport_map() for arches that don't support it at allStafford Horne
When building OpenRISC PCI, which has no ioport_map(), we get the following build error: lib/pci_iomap.c: In function 'pci_iomap_range': CC drivers/i2c/i2c-core-base.o ./include/asm-generic/pci_iomap.h:29:41: error: implicit declaration of function 'ioport_map'; did you mean 'ioremap'? [-Werror=implicit-function-declaration] 29 | #define __pci_ioport_map(dev, port, nr) ioport_map((port), (nr)) | ^~~~~~~~~~ lib/pci_iomap.c:44:24: note: in expansion of macro '__pci_ioport_map' 44 | return __pci_ioport_map(dev, start, len); | ^~~~~~~~~~~~~~~~ Add a NULL definition of __pci_ioport_map() for architectures that do not support ioport_map(). Suggested-by: Arnd Bergmann <arnd@arndb.de> Link: https://lore.kernel.org/r/20220722212248.802500-1-shorne@gmail.com Signed-off-by: Stafford Horne <shorne@gmail.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2022-07-29Merge back cpuidle material for 5.20.Rafael J. Wysocki
2022-07-29KVM: Add gfp_custom flag in struct kvm_mmu_memory_cacheAnup Patel
The kvm_mmu_topup_memory_cache() always uses GFP_KERNEL_ACCOUNT for memory allocation which prevents it's use in atomic context. To address this limitation of kvm_mmu_topup_memory_cache(), we add gfp_custom flag in struct kvm_mmu_memory_cache. When the gfp_custom flag is set to some GFP_xyz flags, the kvm_mmu_topup_memory_cache() will use that instead of GFP_KERNEL_ACCOUNT. Signed-off-by: Anup Patel <apatel@ventanamicro.com> Reviewed-by: Atish Patra <atishp@rivosinc.com> Signed-off-by: Anup Patel <anup@brainfault.org>
2022-07-29RISC-V: KVM: Add extensible CSR emulation frameworkAnup Patel
We add an extensible CSR emulation framework which is based upon the existing system instruction emulation. This will be useful to upcoming AIA, PMU, Nested and other virtualization features. The CSR emulation framework also has provision to emulate CSR in user space but this will be used only in very specific cases such as AIA IMSIC CSR emulation in user space or vendor specific CSR emulation in user space. By default, all CSRs not handled by KVM RISC-V will be redirected back to Guest VCPU as illegal instruction trap. Signed-off-by: Anup Patel <apatel@ventanamicro.com> Signed-off-by: Anup Patel <anup@brainfault.org>
2022-07-29seg6: add support for SRv6 H.L2Encaps.Red behaviorAndrea Mayer
The SRv6 H.L2Encaps.Red behavior described in [1] is an optimization of the SRv6 H.L2Encaps behavior [2]. H.L2Encaps.Red reduces the length of the SRH by excluding the first segment (SID) in the SRH of the pushed IPv6 header. The first SID is only placed in the IPv6 Destination Address field of the pushed IPv6 header. When the SRv6 Policy only contains one SID the SRH is omitted, unless there is an HMAC TLV to be carried. [1] - https://datatracker.ietf.org/doc/html/rfc8986#section-5.4 [2] - https://datatracker.ietf.org/doc/html/rfc8986#section-5.3 Signed-off-by: Andrea Mayer <andrea.mayer@uniroma2.it> Signed-off-by: Anton Makarov <anton.makarov11235@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-29seg6: add support for SRv6 H.Encaps.Red behaviorAndrea Mayer
The SRv6 H.Encaps.Red behavior described in [1] is an optimization of the SRv6 H.Encaps behavior [2]. H.Encaps.Red reduces the length of the SRH by excluding the first segment (SID) in the SRH of the pushed IPv6 header. The first SID is only placed in the IPv6 Destination Address field of the pushed IPv6 header. When the SRv6 Policy only contains one SID the SRH is omitted, unless there is an HMAC TLV to be carried. [1] - https://datatracker.ietf.org/doc/html/rfc8986#section-5.2 [2] - https://datatracker.ietf.org/doc/html/rfc8986#section-5.1 Signed-off-by: Andrea Mayer <andrea.mayer@uniroma2.it> Signed-off-by: Anton Makarov <anton.makarov11235@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-29net: allow unbound socket for packets in VRF when tcp_l3mdev_accept setMike Manning
The commit 3c82a21f4320 ("net: allow binding socket in a VRF when there's an unbound socket") changed the inet socket lookup to avoid packets in a VRF from matching an unbound socket. This is to ensure the necessary isolation between the default and other VRFs for routing and forwarding. VRF-unaware processes running in the default VRF cannot access another VRF and have to be run with 'ip vrf exec <vrf>'. This is to be expected with tcp_l3mdev_accept disabled, but could be reallowed when this sysctl option is enabled. So instead of directly checking dif and sdif in inet[6]_match, here call inet_sk_bound_dev_eq(). This allows a match on unbound socket for non-zero sdif i.e. for packets in a VRF, if tcp_l3mdev_accept is enabled. Fixes: 3c82a21f4320 ("net: allow binding socket in a VRF when there's an unbound socket") Signed-off-by: Mike Manning <mvrmanning@gmail.com> Link: https://lore.kernel.org/netdev/a54c149aed38fded2d3b5fdb1a6c89e36a083b74.camel@lasnet.de/ Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-29ALSA: control: Use deferred fasync helperTakashi Iwai
For avoiding the potential deadlock via kill_fasync() call, use the new fasync helpers to defer the invocation from the control API. Note that it's merely a workaround. Another note: although we haven't received reports about the deadlock with the control API, the deadlock is still potentially possible, and it's better to align the behavior with other core APIs (PCM and timer); so let's move altogether. Link: https://lore.kernel.org/r/20220728125945.29533-5-tiwai@suse.de Signed-off-by: Takashi Iwai <tiwai@suse.de>
2022-07-29ALSA: pcm: Use deferred fasync helperTakashi Iwai
For avoiding the potential deadlock via kill_fasync() call, use the new fasync helpers to defer the invocation from timer API. Note that it's merely a workaround. Reported-by: syzbot+8285e973a41b5aa68902@syzkaller.appspotmail.com Reported-by: syzbot+669c9abf11a6a011dd09@syzkaller.appspotmail.com Link: https://lore.kernel.org/r/20220728125945.29533-4-tiwai@suse.de Signed-off-by: Takashi Iwai <tiwai@suse.de>
2022-07-29ALSA: core: Add async signal helpersTakashi Iwai
Currently the call of kill_fasync() from an interrupt handler might lead to potential spin deadlocks, as spotted by syzkaller. Unfortunately, it's not so trivial to fix this lock chain as it's involved with the tasklist_lock that is touched in allover places. As a temporary workaround, this patch provides the way to defer the async signal notification in a work. The new helper functions, snd_fasync_helper() and snd_kill_faync() are replacements for fasync_helper() and kill_fasync(), respectively. In addition, snd_fasync_free() needs to be called at the destructor of the relevant file object. Link: https://lore.kernel.org/r/20220728125945.29533-2-tiwai@suse.de Signed-off-by: Takashi Iwai <tiwai@suse.de>
2022-07-29LoongArch: Remove clock setting during cpu hotplug stageBibo Mao
On physical machine we can save power by disabling clock of hot removed cpu. However as different platforms require different methods to configure clocks, the code is platform-specific, and probably belongs to firmware/pmu or cpu regulator, rather than generic arch/loongarch code. Also, there is no such register on QEMU virt machine since the clock/frequency regulation is not emulated. This patch removes the hard-coded clock register accesses in generic LoongArch cpu hotplug flow. Reviewed-by: WANG Xuerui <git@xen0n.name> Signed-off-by: Bibo Mao <maobibo@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-07-29Merge branches 'arm/exynos', 'arm/mediatek', 'arm/msm', 'arm/smmu', ↵Joerg Roedel
'virtio', 'x86/vt-d', 'x86/amd' and 'core' into next
2022-07-28firewire: net: Make use of get_unaligned_be48(), put_unaligned_be48()Andy Shevchenko
Since we have a proper endianness converters for BE 48-bit data use them. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Link: https://lore.kernel.org/r/20220726144906.5217-1-andriy.shevchenko@linux.intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-28ax25: fix incorrect dev_tracker usageEric Dumazet
While investigating a separate rose issue [1], and enabling CONFIG_NET_DEV_REFCNT_TRACKER=y, Bernard reported an orthogonal ax25 issue [2] An ax25_dev can be used by one (or many) struct ax25_cb. We thus need different dev_tracker, one per struct ax25_cb. After this patch is applied, we are able to focus on rose. [1] https://lore.kernel.org/netdev/fb7544a1-f42e-9254-18cc-c9b071f4ca70@free.fr/ [2] [ 205.798723] reference already released. [ 205.798732] allocated in: [ 205.798734] ax25_bind+0x1a2/0x230 [ax25] [ 205.798747] __sys_bind+0xea/0x110 [ 205.798753] __x64_sys_bind+0x18/0x20 [ 205.798758] do_syscall_64+0x5c/0x80 [ 205.798763] entry_SYSCALL_64_after_hwframe+0x44/0xae [ 205.798768] freed in: [ 205.798770] ax25_release+0x115/0x370 [ax25] [ 205.798778] __sock_release+0x42/0xb0 [ 205.798782] sock_close+0x15/0x20 [ 205.798785] __fput+0x9f/0x260 [ 205.798789] ____fput+0xe/0x10 [ 205.798792] task_work_run+0x64/0xa0 [ 205.798798] exit_to_user_mode_prepare+0x18b/0x190 [ 205.798804] syscall_exit_to_user_mode+0x26/0x40 [ 205.798808] do_syscall_64+0x69/0x80 [ 205.798812] entry_SYSCALL_64_after_hwframe+0x44/0xae [ 205.798827] ------------[ cut here ]------------ [ 205.798829] WARNING: CPU: 2 PID: 2605 at lib/ref_tracker.c:136 ref_tracker_free.cold+0x60/0x81 [ 205.798837] Modules linked in: rose netrom mkiss ax25 rfcomm cmac algif_hash algif_skcipher af_alg bnep snd_hda_codec_hdmi nls_iso8859_1 i915 rtw88_8821ce rtw88_8821c x86_pkg_temp_thermal rtw88_pci intel_powerclamp rtw88_core snd_hda_codec_realtek snd_hda_codec_generic ledtrig_audio coretemp snd_hda_intel kvm_intel snd_intel_dspcfg mac80211 snd_hda_codec kvm i2c_algo_bit drm_buddy drm_dp_helper btusb drm_kms_helper snd_hwdep btrtl snd_hda_core btbcm joydev crct10dif_pclmul btintel crc32_pclmul ghash_clmulni_intel mei_hdcp btmtk intel_rapl_msr aesni_intel bluetooth input_leds snd_pcm crypto_simd syscopyarea processor_thermal_device_pci_legacy sysfillrect cryptd intel_soc_dts_iosf snd_seq sysimgblt ecdh_generic fb_sys_fops rapl libarc4 processor_thermal_device intel_cstate processor_thermal_rfim cec snd_timer ecc snd_seq_device cfg80211 processor_thermal_mbox mei_me processor_thermal_rapl mei rc_core at24 snd intel_pch_thermal intel_rapl_common ttm soundcore int340x_thermal_zone video [ 205.798948] mac_hid acpi_pad sch_fq_codel ipmi_devintf ipmi_msghandler drm msr parport_pc ppdev lp parport ramoops pstore_blk reed_solomon pstore_zone efi_pstore ip_tables x_tables autofs4 hid_generic usbhid hid i2c_i801 i2c_smbus r8169 xhci_pci ahci libahci realtek lpc_ich xhci_pci_renesas [last unloaded: ax25] [ 205.798992] CPU: 2 PID: 2605 Comm: ax25ipd Not tainted 5.18.11-F6BVP #3 [ 205.798996] Hardware name: To be filled by O.E.M. To be filled by O.E.M./CK3, BIOS 5.011 09/16/2020 [ 205.798999] RIP: 0010:ref_tracker_free.cold+0x60/0x81 [ 205.799005] Code: e8 d2 01 9b ff 83 7b 18 00 74 14 48 c7 c7 2f d7 ff 98 e8 10 6e fc ff 8b 7b 18 e8 b8 01 9b ff 4c 89 ee 4c 89 e7 e8 5d fd 07 00 <0f> 0b b8 ea ff ff ff e9 30 05 9b ff 41 0f b6 f7 48 c7 c7 a0 fa 4e [ 205.799008] RSP: 0018:ffffaf5281073958 EFLAGS: 00010286 [ 205.799011] RAX: 0000000080000000 RBX: ffff9a0bd687ebe0 RCX: 0000000000000000 [ 205.799014] RDX: 0000000000000001 RSI: 0000000000000282 RDI: 00000000ffffffff [ 205.799016] RBP: ffffaf5281073a10 R08: 0000000000000003 R09: fffffffffffd5618 [ 205.799019] R10: 0000000000ffff10 R11: 000000000000000f R12: ffff9a0bc53384d0 [ 205.799022] R13: 0000000000000282 R14: 00000000ae000001 R15: 0000000000000001 [ 205.799024] FS: 0000000000000000(0000) GS:ffff9a0d0f300000(0000) knlGS:0000000000000000 [ 205.799028] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 205.799031] CR2: 00007ff6b8311554 CR3: 000000001ac10004 CR4: 00000000001706e0 [ 205.799033] Call Trace: [ 205.799035] <TASK> [ 205.799038] ? ax25_dev_device_down+0xd9/0x1b0 [ax25] [ 205.799047] ? ax25_device_event+0x9f/0x270 [ax25] [ 205.799055] ? raw_notifier_call_chain+0x49/0x60 [ 205.799060] ? call_netdevice_notifiers_info+0x52/0xa0 [ 205.799065] ? dev_close_many+0xc8/0x120 [ 205.799070] ? unregister_netdevice_many+0x13d/0x890 [ 205.799073] ? unregister_netdevice_queue+0x90/0xe0 [ 205.799076] ? unregister_netdev+0x1d/0x30 [ 205.799080] ? mkiss_close+0x7c/0xc0 [mkiss] [ 205.799084] ? tty_ldisc_close+0x2e/0x40 [ 205.799089] ? tty_ldisc_hangup+0x137/0x210 [ 205.799092] ? __tty_hangup.part.0+0x208/0x350 [ 205.799098] ? tty_vhangup+0x15/0x20 [ 205.799103] ? pty_close+0x127/0x160 [ 205.799108] ? tty_release+0x139/0x5e0 [ 205.799112] ? __fput+0x9f/0x260 [ 205.799118] ax25_dev_device_down+0xd9/0x1b0 [ax25] [ 205.799126] ax25_device_event+0x9f/0x270 [ax25] [ 205.799135] raw_notifier_call_chain+0x49/0x60 [ 205.799140] call_netdevice_notifiers_info+0x52/0xa0 [ 205.799146] dev_close_many+0xc8/0x120 [ 205.799152] unregister_netdevice_many+0x13d/0x890 [ 205.799157] unregister_netdevice_queue+0x90/0xe0 [ 205.799161] unregister_netdev+0x1d/0x30 [ 205.799165] mkiss_close+0x7c/0xc0 [mkiss] [ 205.799170] tty_ldisc_close+0x2e/0x40 [ 205.799173] tty_ldisc_hangup+0x137/0x210 [ 205.799178] __tty_hangup.part.0+0x208/0x350 [ 205.799184] tty_vhangup+0x15/0x20 [ 205.799188] pty_close+0x127/0x160 [ 205.799193] tty_release+0x139/0x5e0 [ 205.799199] __fput+0x9f/0x260 [ 205.799203] ____fput+0xe/0x10 [ 205.799208] task_work_run+0x64/0xa0 [ 205.799213] do_exit+0x33b/0xab0 [ 205.799217] ? __handle_mm_fault+0xc4f/0x15f0 [ 205.799224] do_group_exit+0x35/0xa0 [ 205.799228] __x64_sys_exit_group+0x18/0x20 [ 205.799232] do_syscall_64+0x5c/0x80 [ 205.799238] ? handle_mm_fault+0xba/0x290 [ 205.799242] ? debug_smp_processor_id+0x17/0x20 [ 205.799246] ? fpregs_assert_state_consistent+0x26/0x50 [ 205.799251] ? exit_to_user_mode_prepare+0x49/0x190 [ 205.799256] ? irqentry_exit_to_user_mode+0x9/0x20 [ 205.799260] ? irqentry_exit+0x33/0x40 [ 205.799263] ? exc_page_fault+0x87/0x170 [ 205.799268] ? asm_exc_page_fault+0x8/0x30 [ 205.799273] entry_SYSCALL_64_after_hwframe+0x44/0xae [ 205.799277] RIP: 0033:0x7ff6b80eaca1 [ 205.799281] Code: Unable to access opcode bytes at RIP 0x7ff6b80eac77. [ 205.799283] RSP: 002b:00007fff6dfd4738 EFLAGS: 00000246 ORIG_RAX: 00000000000000e7 [ 205.799287] RAX: ffffffffffffffda RBX: 00007ff6b8215a00 RCX: 00007ff6b80eaca1 [ 205.799290] RDX: 000000000000003c RSI: 00000000000000e7 RDI: 0000000000000001 [ 205.799293] RBP: 0000000000000001 R08: ffffffffffffff80 R09: 0000000000000028 [ 205.799295] R10: 0000000000000000 R11: 0000000000000246 R12: 00007ff6b8215a00 [ 205.799298] R13: 0000000000000000 R14: 00007ff6b821aee8 R15: 00007ff6b821af00 [ 205.799304] </TASK> Fixes: feef318c855a ("ax25: fix UAF bugs of net_device caused by rebinding operation") Reported-by: Bernard F6BVP <f6bvp@free.fr> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Duoming Zhou <duoming@zju.edu.cn> Link: https://lore.kernel.org/r/20220728051821.3160118-1-eric.dumazet@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-28devlink: introduce framework for selftestsVikas Gupta
Add a framework for running selftests. Framework exposes devlink commands and test suite(s) to the user to execute and query the supported tests by the driver. Below are new entries in devlink_nl_ops devlink_nl_cmd_selftests_show_doit/dumpit: To query the supported selftests by the drivers. devlink_nl_cmd_selftests_run: To execute selftests. Users can provide a test mask for executing group tests or standalone tests. Documentation/networking/devlink/ path is already part of MAINTAINERS & the new files come under this path. Hence no update needed to the MAINTAINERS Signed-off-by: Vikas Gupta <vikas.gupta@broadcom.com> Reviewed-by: Andy Gospodarek <gospo@broadcom.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-28net/tls: Multi-threaded calls to TX tls_dev_delTariq Toukan
Multiple TLS device-offloaded contexts can be added in parallel via concurrent calls to .tls_dev_add, while calls to .tls_dev_del are sequential in tls_device_gc_task. This is not a sustainable behavior. This creates a rate gap between add and del operations (addition rate outperforms the deletion rate). When running for enough time, the TLS device resources could get exhausted, failing to offload new connections. Replace the single-threaded garbage collector work with a per-context alternative, so they can be handled on several cores in parallel. Use a new dedicated destruct workqueue for this. Tested with mlx5 device: Before: 22141 add/sec, 103 del/sec After: 11684 add/sec, 11684 del/sec Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Maxim Mikityanskiy <maximmi@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-28Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
No conflicts. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-28dm bufio: Add DM_BUFIO_CLIENT_NO_SLEEP flagNathan Huckleberry
Add an optional flag that ensures dm_bufio_client does not sleep (primary focus is to service dm_bufio_get without sleeping). This allows the dm-bufio cache to be queried from interrupt context. To ensure that dm-bufio does not sleep, dm-bufio must use a spinlock instead of a mutex. Additionally, to avoid deadlocks, special care must be taken so that dm-bufio does not sleep while holding the spinlock. But again: the scope of this no_sleep is initially confined to dm_bufio_get, so __alloc_buffer_wait_no_callback is _not_ changed to avoid sleeping because __bufio_new avoids allocation for NF_GET. Signed-off-by: Nathan Huckleberry <nhuck@google.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2022-07-28dm bufio: Add flags argument to dm_bufio_client_createNathan Huckleberry
Add a flags argument to dm_bufio_client_create and update all the callers. This is in preparation to add the DM_BUFIO_NO_SLEEP flag. Signed-off-by: Nathan Huckleberry <nhuck@google.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2022-07-28dm: fix dm-raid crash if md_handle_request() splits bioMike Snitzer
Commit ca522482e3eaf ("dm: pass NULL bdev to bio_alloc_clone") introduced the optimization to _not_ perform bio_associate_blkg()'s relatively costly work when DM core clones its bio. But in doing so it exposed the possibility for DM's cloned bio to alter DM target behavior (e.g. crash) if a target were to issue IO without first calling bio_set_dev(). The DM raid target can trigger an MD crash due to its need to split the DM bio that is passed to md_handle_request(). The split will recurse to submit_bio_noacct() using a bio with an uninitialized ->bi_blkg. This NULL bio->bi_blkg causes blk_throtl_bio() to dereference a NULL blkg_to_tg(bio->bi_blkg). Fix this in DM core by adding a new 'needs_bio_set_dev' target flag that will make alloc_tio() call bio_set_dev() on behalf of the target. dm-raid is the only target that requires this flag. bio_set_dev() initializes the DM cloned bio's ->bi_blkg, using bio_associate_blkg, before passing the bio to md_handle_request(). Long-term fix would be to audit and refactor MD code to rely on DM to split its bio, using dm_accept_partial_bio(), but there are MD raid personalities (e.g. raid1 and raid10) whose implementation are tightly coupled to handling the bio splitting inline. Fixes: ca522482e3eaf ("dm: pass NULL bdev to bio_alloc_clone") Cc: stable@vger.kernel.org Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2022-07-28Merge tag 'net-5.19-final' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from bluetooth and netfilter, no known blockers for the release. Current release - regressions: - wifi: mac80211: do not abuse fq.lock in ieee80211_do_stop(), fix taking the lock before its initialized - Bluetooth: mgmt: fix double free on error path Current release - new code bugs: - eth: ice: fix tunnel checksum offload with fragmented traffic Previous releases - regressions: - tcp: md5: fix IPv4-mapped support after refactoring, don't take the pure v6 path - Revert "tcp: change pingpong threshold to 3", improving detection of interactive sessions - mld: fix netdev refcount leak in mld_{query | report}_work() due to a race - Bluetooth: - always set event mask on suspend, avoid early wake ups - L2CAP: fix use-after-free caused by l2cap_chan_put - bridge: do not send empty IFLA_AF_SPEC attribute Previous releases - always broken: - ping6: fix memleak in ipv6_renew_options() - sctp: prevent null-deref caused by over-eager error paths - virtio-net: fix the race between refill work and close, resulting in NAPI scheduled after close and a BUG() - macsec: - fix three netlink parsing bugs - avoid breaking the device state on invalid change requests - fix a memleak in another error path Misc: - dt-bindings: net: ethernet-controller: rework 'fixed-link' schema - two more batches of sysctl data race adornment" * tag 'net-5.19-final' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (67 commits) stmmac: dwmac-mediatek: fix resource leak in probe ipv6/addrconf: fix a null-ptr-deref bug for ip6_ptr net: ping6: Fix memleak in ipv6_renew_options(). net/funeth: Fix fun_xdp_tx() and XDP packet reclaim sctp: leave the err path free in sctp_stream_init to sctp_stream_free sfc: disable softirqs for ptp TX ptp: ocp: Select CRC16 in the Kconfig. tcp: md5: fix IPv4-mapped support virtio-net: fix the race between refill work and close mptcp: Do not return EINPROGRESS when subflow creation succeeds Bluetooth: L2CAP: Fix use-after-free caused by l2cap_chan_put Bluetooth: Always set event mask on suspend Bluetooth: mgmt: Fix double free on error path wifi: mac80211: do not abuse fq.lock in ieee80211_do_stop() ice: do not setup vlan for loopback VSI ice: check (DD | EOF) bits on Rx descriptor rather than (EOP | RS) ice: Fix VSIs unable to share unicast MAC ice: Fix tunnel checksum offload with fragmented traffic ice: Fix max VLANs available for VF netfilter: nft_queue: only allow supported familes and hooks ...
2022-07-28platform/x86: pmc_atom: Fix comment typoXin Gao
The double `of' is duplicated in line 50, remove one. Signed-off-by: Xin Gao <gaoxin@cdjrlc.com> Link: https://lore.kernel.org/r/20220722022337.15903-1-gaoxin@cdjrlc.com Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2022-07-28ipv6/addrconf: fix a null-ptr-deref bug for ip6_ptrZiyang Xuan
Change net device's MTU to smaller than IPV6_MIN_MTU or unregister device while matching route. That may trigger null-ptr-deref bug for ip6_ptr probability as following. ========================================================= BUG: KASAN: null-ptr-deref in find_match.part.0+0x70/0x134 Read of size 4 at addr 0000000000000308 by task ping6/263 CPU: 2 PID: 263 Comm: ping6 Not tainted 5.19.0-rc7+ #14 Call trace: dump_backtrace+0x1a8/0x230 show_stack+0x20/0x70 dump_stack_lvl+0x68/0x84 print_report+0xc4/0x120 kasan_report+0x84/0x120 __asan_load4+0x94/0xd0 find_match.part.0+0x70/0x134 __find_rr_leaf+0x408/0x470 fib6_table_lookup+0x264/0x540 ip6_pol_route+0xf4/0x260 ip6_pol_route_output+0x58/0x70 fib6_rule_lookup+0x1a8/0x330 ip6_route_output_flags_noref+0xd8/0x1a0 ip6_route_output_flags+0x58/0x160 ip6_dst_lookup_tail+0x5b4/0x85c ip6_dst_lookup_flow+0x98/0x120 rawv6_sendmsg+0x49c/0xc70 inet_sendmsg+0x68/0x94 Reproducer as following: Firstly, prepare conditions: $ip netns add ns1 $ip netns add ns2 $ip link add veth1 type veth peer name veth2 $ip link set veth1 netns ns1 $ip link set veth2 netns ns2 $ip netns exec ns1 ip -6 addr add 2001:0db8:0:f101::1/64 dev veth1 $ip netns exec ns2 ip -6 addr add 2001:0db8:0:f101::2/64 dev veth2 $ip netns exec ns1 ifconfig veth1 up $ip netns exec ns2 ifconfig veth2 up $ip netns exec ns1 ip -6 route add 2000::/64 dev veth1 metric 1 $ip netns exec ns2 ip -6 route add 2001::/64 dev veth2 metric 1 Secondly, execute the following two commands in two ssh windows respectively: $ip netns exec ns1 sh $while true; do ip -6 addr add 2001:0db8:0:f101::1/64 dev veth1; ip -6 route add 2000::/64 dev veth1 metric 1; ping6 2000::2; done $ip netns exec ns1 sh $while true; do ip link set veth1 mtu 1000; ip link set veth1 mtu 1500; sleep 5; done It is because ip6_ptr has been assigned to NULL in addrconf_ifdown() firstly, then ip6_ignore_linkdown() accesses ip6_ptr directly without NULL check. cpu0 cpu1 fib6_table_lookup __find_rr_leaf addrconf_notify [ NETDEV_CHANGEMTU ] addrconf_ifdown RCU_INIT_POINTER(dev->ip6_ptr, NULL) find_match ip6_ignore_linkdown So we can add NULL check for ip6_ptr before using in ip6_ignore_linkdown() to fix the null-ptr-deref bug. Fixes: dcd1f572954f ("net/ipv6: Remove fib6_idev") Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://lore.kernel.org/r/20220728013307.656257-1-william.xuanziyang@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-28s390: add ELF note type for encrypted CPU state of a PV VCPUJanosch Frank
The type NT_S390_PV_CPU_DATA note contains the encrypted CPU state of a PV VCPU. It's only relevant in dumps of s390 PV VMs and can't be decrypted without a second block of encrypted data which provides key parts. Therefore we only reserve the note type here. The zgetdump tool from the s390-tools package can, together with a Customer Communication Key, be used to convert a PV VM dump into a normal VM dump. zgetdump will decrypt the CPU data and overwrite the other respective notes to make the data accessible for crash and other debugging tools. Signed-off-by: Janosch Frank <frankja@linux.ibm.com> Acked-by: Heiko Carstens <hca@linux.ibm.com> [agordeev@linux.ibm.com changed desctiption] Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2022-07-28thermal/core: Add thermal_trip in thermal_zoneDaniel Lezcano
The thermal trip points are properties of a thermal zone and the different sub systems should be able to save them in the thermal zone structure instead of having their own definition. Give the opportunity to the drivers to create a thermal zone with thermal trips which will be accessible directly from the thermal core framework. As we added the thermal trip points structure in the thermal zone, let's extend the thermal zone register function to have the thermal trip structures as a parameter and store it in the 'trips' field of the thermal zone structure. The thermal zone contains the trip point, we can store them directly when registering the thermal zone. That will allow another step forward to remove the duplicate thermal zone structure we find in the thermal_of code. Cc: Alexandre Bailon <abailon@baylibre.com> Cc: Kevin Hilman <khilman@baylibre.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linexp.org> Link: https://lore.kernel.org/r/20220722200007.1839356-9-daniel.lezcano@linexp.org Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2022-07-28thermal/core: Rename 'trips' to 'num_trips'Daniel Lezcano
In order to use thermal trips defined in the thermal structure, rename the 'trips' field to 'num_trips' to have the 'trips' field containing the thermal trip points. Cc: Alexandre Bailon <abailon@baylibre.com> Cc: Kevin Hilman <khilman@baylibre.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linexp.org> Link: https://lore.kernel.org/r/20220722200007.1839356-8-daniel.lezcano@linexp.org Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2022-07-28thermal/of: Move thermal_trip structure to thermal.hDaniel Lezcano
The structure thermal_trip is now generic and will be usable by the different sensor drivers in place of their own structure. Move its definition to thermal.h to make it accessible. Cc: Alexandre Bailon <abailon@baylibre.com> Cc: Kevin Hilman <khilman@baylibre.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linexp.org> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com> Link: https://lore.kernel.org/r/20220722200007.1839356-5-daniel.lezcano@linexp.org Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2022-07-28thermal/core: Remove DROP_FULL and RAISE_FULLDaniel Lezcano
The trends DROP_FULL and RAISE_FULL are not used and were never used in the past AFAICT. Remove these conditions as they seems to not be handled anywhere. Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20220629151012.3115773-2-daniel.lezcano@linaro.org Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2022-07-28drivers/thermal/cpufreq_cooling : Refactor thermal_power_cpu_get_power tracingLukasz Luba
Simplify the thermal_power_cpu_get_power trace event by removing complicated cpumask and variable length array. Now the tools parsing trace output don't have to hassle to get this power data. The simplified format version uses 'policy->cpu'. Remove also the 'load' information completely since there is very little value of it in this trace event. To get the CPUs' load (or utilization) there are other dedicated trace hooks in the kernel. This patch also simplifies and speeds-up the main cooling code when that trace event is enabled. Rename the trace event to avoid confusion of tools which parse the trace file. Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Lukasz Luba <lukasz.luba@arm.com> Link: https://lore.kernel.org/r/20220613124327.30766-3-lukasz.luba@arm.com Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2022-07-28wait: Fix __wait_event_hrtimeout for RT/DL tasksJuri Lelli
Changes to hrtimer mode (potentially made by __hrtimer_init_sleeper on PREEMPT_RT) are not visible to hrtimer_start_range_ns, thus not accounted for by hrtimer_start_expires call paths. In particular, __wait_event_hrtimeout suffers from this problem as we have, for example: fs/aio.c::read_events wait_event_interruptible_hrtimeout __wait_event_hrtimeout hrtimer_init_sleeper_on_stack <- this might "mode |= HRTIMER_MODE_HARD" on RT if task runs at RT/DL priority hrtimer_start_range_ns WARN_ON_ONCE(!(mode & HRTIMER_MODE_HARD) ^ !timer->is_hard) fires since the latter doesn't see the change of mode done by init_sleeper Fix it by making __wait_event_hrtimeout call hrtimer_sleeper_start_expires, which is aware of the special RT/DL case, instead of hrtimer_start_range_ns. Reported-by: Bruno Goncalves <bgoncalv@redhat.com> Signed-off-by: Juri Lelli <juri.lelli@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Daniel Bristot de Oliveira <bristot@kernel.org> Reviewed-by: Valentin Schneider <vschneid@redhat.com> Link: https://lore.kernel.org/r/20220627095051.42470-1-juri.lelli@redhat.com
2022-07-28Merge tag 'timers-v5.20-rc1' of ↵Thomas Gleixner
https://git.linaro.org/people/daniel.lezcano/linux into timers/core Pull clockevent/source updates from Daniel Lezcano: - Add the missing DT bindings for the MTU nomadik timer (Linus Walleij) - Fix grammar typo in the ARM global timer Kconfig option (Randy Dunlap) - Add the tegra186 timer and use it on the tegra234 board (Thierry Reding) - Add the 'CPUXGPT' CPU timer for Mediatek MT6795 and implement a workaround to overcome an ATF bug where the timer is not correctly initialized (AngeloGioacchino Del Regno) - Rework the suspend/resume approach to enable the feature on the timer even it is not an active clock and fix a compilation warning (Claudiu Beznea) - Add the Add R-Car Gen4 timer support along with the DT bindings (Wolfram Sang) - Add compatible for ti,am654-timer to support AM6 SoC (Tony Lindgren) - Fix Kconfig option to put it back to 'bool' instead of 'tristate' for the tegra186 (Daniel Lezcano) - Sort 'family,type' DT bindings for the Renesas timers (Geert Uytterhoeven) - Add compatible 'allwinner,sun20i-d1-timer' for Allwinner D1 (Samuel Holland) - Remove unnecessary (void*) conversions for sun4i (XU pengfei) - Remove unnecessary (void*) conversions for sun5i (Li zeming) Link: https://lore.kernel.org/all/7472984e-f502-5f27-82bf-070127dd85a5@linaro.org
2022-07-28Merge branch '100GbE' of ↵Paolo Abeni
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue Tony Nguyen says: ==================== ice: PPPoE offload support Marcin Szycik says: Add support for dissecting PPPoE and PPP-specific fields in flow dissector: PPPoE session id and PPP protocol type. Add support for those fields in tc-flower and support offloading PPPoE. Finally, add support for hardware offload of PPPoE packets in switchdev mode in ice driver. Example filter: tc filter add dev $PF1 ingress protocol ppp_ses prio 1 flower pppoe_sid \ 1234 ppp_proto ip skip_sw action mirred egress redirect dev $VF1_PR Changes in iproute2 are required to use the new fields (will be submitted soon). ICE COMMS DDP package is required to create a filter in ice. * '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue: ice: Add support for PPPoE hardware offload flow_offload: Introduce flow_match_pppoe net/sched: flower: Add PPPoE filter flow_dissector: Add PPPoE dissectors ==================== Link: https://lore.kernel.org/r/20220726203133.2171332-1-anthony.l.nguyen@intel.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-07-28can: dev: add generic function can_eth_ioctl_hwts()Vincent Mailhol
Tools based on libpcap (such as tcpdump) expect the SIOCSHWTSTAMP ioctl call to be supported. This is also specified in the kernel doc [1]. The purpose of this ioctl is to toggle the hardware timestamps. Currently, CAN devices which support hardware timestamping have those always activated. can_eth_ioctl_hwts() is a dumb function that will always succeed when requested to set tx_type to HWTSTAMP_TX_ON or rx_filter to HWTSTAMP_FILTER_ALL. [1] Kernel doc: Timestamping, section 3.1 "Hardware Timestamping Implementation: Device Drivers" Link: https://docs.kernel.org/networking/timestamping.html#hardware-timestamping-implementation-device-drivers Signed-off-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr> Link: https://lore.kernel.org/all/20220727101641.198847-9-mailhol.vincent@wanadoo.fr Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2022-07-28can: dev: add generic function can_ethtool_op_get_ts_info_hwts()Vincent Mailhol
Add function can_ethtool_op_get_ts_info_hwts(). This function will be used by CAN devices with hardware TX/RX timestamping support to implement ethtool_ops::get_ts_info. This function does not offer support to activate/deactivate hardware timestamps at device level nor support the filter options (which is currently the case for all CAN devices with hardware timestamping support). The fact that hardware timestamp can not be deactivated at hardware level does not impact the userland. As long as the user do not set SO_TIMESTAMPING using a setsockopt() or ioctl(), the kernel will not emit TX timestamps (RX timestamps will still be reproted as it is the case currently). Drivers which need more fine grained control remains free to implement their own function, but we foresee that the generic function introduced here will be sufficient for the majority. Signed-off-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr> Link: https://lore.kernel.org/all/20220727101641.198847-8-mailhol.vincent@wanadoo.fr Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>