summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2021-10-20Merge tag 'dma-mapping-5.15-2' of git://git.infradead.org/users/hch/dma-mappingLinus Torvalds
Pull dma-mapping fixes from Christoph Hellwig: - fix more dma-debug fallout (Gerald Schaefer, Hamza Mahfooz) - fix a kerneldoc warning (Logan Gunthorpe) * tag 'dma-mapping-5.15-2' of git://git.infradead.org/users/hch/dma-mapping: dma-debug: teach add_dma_entry() about DMA_ATTR_SKIP_CPU_SYNC dma-debug: fix sg checks in debug_dma_map_sg() dma-mapping: fix the kerneldoc for dma_map_sgtable()
2021-10-20drm/amdgpu: support B0&B1 external revision id for yellow carpAaron Liu
B0 internal rev_id is 0x01, B1 internal rev_id is 0x02. The external rev_id for B0 and B1 is 0x20. The original expression is not suitable for B1. v2: squash in fix for display code (Alex) Signed-off-by: Aaron Liu <aaron.liu@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2021-10-20drm/amd/display: Moved dccg init to after bios golden initJake Wang
[Why] bios_golden_init will override dccg_init during init_hw. [How] Move dccg_init to after bios_golden_init. Reviewed-by: Aric Cyr <Aric.Cyr@amd.com> Reviewed-by: Eric Yang <eric.yang2@amd.com> Acked-by: Agustin Gutierrez Sanchez <agustin.gutierrez@amd.com> Signed-off-by: Jake Wang <haonan.wang2@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2021-10-20drm/amd/display: Increase watermark latencies for DCN3.1Nikola Cornij
[why] The original latencies were causing underflow in some modes [how] Replace with the up-to-date watermark values based on new measurments Reviewed-by: Ahmad Othman <ahmad.othman@amd.com> Acked-by: Agustin Gutierrez Sanchez <agustin.gutierrez@amd.com> Signed-off-by: Nikola Cornij <nikola.cornij@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2021-10-20drm/amd/display: increase Z9 latency to workaround underflow in Z9Eric Yang
[Why] Z9 latency is higher than when we originally tuned the watermark parameters, causing underflow. Increasing the value until the latency issues is resolved. Reviewed-by: Nicholas Kazlauskas <Nicholas.Kazlauskas@amd.com> Acked-by: Agustin Gutierrez Sanchez <agustin.gutierrez@amd.com> Signed-off-by: Eric Yang <Eric.Yang2@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2021-10-20drm/amd/display: Require immediate flip support for DCN3.1 planesNicholas Kazlauskas
[Why] Immediate flip can be enabled dynamically and has higher BW requirements when validating which voltage mode to use. If we validate when it's not set then potentially DCFCLK will be too low and we will underflow. [How] DM always requires support so always require it as part of DML input parameters. This can't be enabled unconditionally on older ASIC because it blocks some expected modes so only target DCN3.1 for now. Reviewed-by: Dmytro Laktyushkin <Dmytro.Laktyushkin@amd.com> Acked-by: Agustin Gutierrez Sanchez <agustin.gutierrez@amd.com> Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2021-10-20drm/amd/display: Fix prefetch bandwidth calculation for DCN3.1Nicholas Kazlauskas
[Why] Prefetch BW calculated is lower than the DML reference because of a porting error that's excluding cursor and row bandwidth from the pixel data bandwidth. [How] Change the dml_max4 to dml_max3 and include cursor and row bandwidth in the same calculation as the rest of the pixel data during vactive. Reviewed-by: Dmytro Laktyushkin <Dmytro.Laktyushkin@amd.com> Acked-by: Agustin Gutierrez Sanchez <agustin.gutierrez@amd.com> Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2021-10-20drm/amd/display: Limit display scaling to up to true 4k for DCN 3.1Nikola Cornij
[why] The requirement is that image width up to 4096 shall be supported Reviewed-by: Aric Cyr <Aric.Cyr@amd.com> Acked-by: Agustin Gutierrez Sanchez <agustin.gutierrez@amd.com> Signed-off-by: Nikola Cornij <nikola.cornij@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2021-10-20drm/amdgpu: fix out of bounds writeThelford Williams
Size can be any value and is user controlled resulting in overwriting the 40 byte array wr_buf with an arbitrary length of data from buf. Signed-off-by: Thelford Williams <tdwilliamsiv@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2021-10-20irqchip: Fix kernel-doc parameter typo for IRQCHIP_DECLAREFlorian Fainelli
The documentation refers to "compstr" when we have the parameter named "compat", fix the typo. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20211020184859.2705451-14-f.fainelli@gmail.com
2021-10-20ARM: bcm: Removed forced select of interrupt controllersFlorian Fainelli
Now that the various second level interrupt controllers have been moved to IRQCHIP_PLATFORM_DRIVER and they do default to ARCH_BRCMSTB and ARCH_BCM2835 where relevant, remove their forced selection from the machine entry to allow an user to build them as modules. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20211020184859.2705451-13-f.fainelli@gmail.com
2021-10-20arm64: broadcom: Removed forced select of interrupt controllersFlorian Fainelli
Now that the various second level interrupt controllers have been moved to IRQCHIP_PLATFORM_DRIVER and they do default to ARCH_BRCMSTB and ARCH_BCM2835 where relevant, remove their forced selection from the machine entry to allow an user to build them as modules. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20211020184859.2705451-12-f.fainelli@gmail.com
2021-10-20irqchip/irq-bcm7120-l2: Switch to IRQCHIP_PLATFORM_DRIVERFlorian Fainelli
Allow the user selection and building of this interrupt controller driver as a module since it is used on ARM/ARM64 based systems as a second level interrupt controller hanging off the ARM GIC and is therefore loadable during boot. To avoid using of_irq_count() which is not exported towards module, switch the driver to use the platform_device provided by the irqchip platform driver code and resolve the number of interrupts using platform_irq_count(). Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20211020184859.2705451-11-f.fainelli@gmail.com
2021-10-20genirq: Export irq_gc_noop()Florian Fainelli
In order to build drivers/irqchip/irq-bcm7120-l2.c as a module which references irq_gc_noop(), we need to export it towards modules. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20211020184859.2705451-10-f.fainelli@gmail.com
2021-10-20irqchip/irq-brcmstb-l2: Switch to IRQCHIP_PLATFORM_DRIVERFlorian Fainelli
Allow the user selection and building of this interrupt controller driver as a module since it is used on ARM/ARM64 based systems as a second level interrupt controller hanging off the ARM GIC and is therefore loadable during boot. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20211020184859.2705451-9-f.fainelli@gmail.com
2021-10-20genirq: Export irq_gc_{unmask_enable,mask_disable}_regFlorian Fainelli
In order to allow drivers/irqchip/irq-brcmstb-l2.c to be built as a module we need to export: irq_gc_unmask_enable_reg() and irq_gc_mask_disable_reg(). Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20211020184859.2705451-8-f.fainelli@gmail.com
2021-10-20irqchip/irq-bcm7038-l1: Switch to IRQCHIP_PLATFORM_DRIVERFlorian Fainelli
Allow the user selection and building of this interrupt controller driver as a module since it is used on ARM/ARM64 based systems as a second level interrupt controller hanging off the ARM GIC and is therefore loadable during boot. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20211020184859.2705451-7-f.fainelli@gmail.com
2021-10-20irqchip/irq-bcm7038-l1: Restrict affinity setting to MIPSFlorian Fainelli
Only MIPS based platforms using this interrupt controller as first level interrupt controller can actually change the affinity of interrupts by re-programming the affinity mask of the interrupt controller and use another word group to have another CPU process the interrupt. When this interrupt is used as a second level interrupt controller on ARM/ARM64 there is no way to change the interrupt affinity. This fixes a NULL pointer de-reference while trying to change the affinity since there is only a single word group in that case, and we would have been overruning the intc->cpus[] array. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20211020184859.2705451-6-f.fainelli@gmail.com
2021-10-20irqchip/irq-bcm7038-l1: Gate use of CPU logical map to MIPSFlorian Fainelli
The use of the cpu_logical_map[] array is only relevant for MIPS based platform where this driver is used as a first level interrupt controller and contains multiple register groups to map with an associated CPU. On ARM/ARM64 based systems this interrupt controller is present and used as a second level interrupt controller hanging off the ARM GIC. That copy of the interrupt controller contains a single group, resulting in the intc->cpus[] array to be of size 1. Things happened to work in that case because we install that interrupt controller as a chained handler which does not allow it to be affine to any CPU but the boot CPU which happens to be 0, therefore we never de-reference past intc->cpus[] but with the current code in place, we do leave a chance of de-referencing the array past its bounds. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20211020184859.2705451-5-f.fainelli@gmail.com
2021-10-20MIPS: BMIPS: Remove use of irq_cpu_offlineFlorian Fainelli
irq_cpu_offline() is only used by MIPS and we should instead use irq_migrate_all_off_this_cpu(). This will be helpful in order to remove drivers/irqchip/irq-bcm7038-l1.c irq_cpu_offline callback which would have got in the way of making this driver modular. Suggested-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20211020184859.2705451-2-f.fainelli@gmail.com
2021-10-20irqchip/irq-bcm7038-l1: Use irq_get_irq_data()Florian Fainelli
Using irq_desc_get_irq_data(irq_to_desc()) to retrieve the irq_data structure from a virtual interrupt number is going to be problematic to make irq-bcm7038-l1 a module because irq_to_desc() is not exported, and there is no intent to export it to modules, see 64a1b95bb9fe ("genirq: Restrict export of irq_to_desc()"). Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20211020184859.2705451-4-f.fainelli@gmail.com
2021-10-20irqchip/irq-bcm7038-l1: Remove .irq_cpu_offline()Florian Fainelli
With arch/mips/kernel/smp-bmips.c having been migrated away from irq_cpu_offline() and use irq_migrate_all_off_this_cpu() instead, we no longer need to implement an .irq_cpu_offline() callback. This is a necessary change to facilitate the building of this driver as a module. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20211020184859.2705451-3-f.fainelli@gmail.com
2021-10-20Merge branch 'btf_dump fixes for s390'Andrii Nakryiko
Ilya Leoshkevich says: ==================== This series along with [1] and [2] fixes all the failures in the btf_dump testsuite currently present on s390, in particular: * [1] fixes intermittent build bug causing "failed to encode tag ..." * error messages. * [2] fixes missing VAR entries on s390. * Patch 1 disables Intel-specific code in a testcase. * Patch 2 fixes an endianness-related bug. * Patch 3 fixes an alignment-related bug. * Patch 4 improves overly pessimistic alignment handling. [1] https://lore.kernel.org/bpf/20211012022521.399302-1-iii@linux.ibm.com/ [2] https://lore.kernel.org/bpf/20211012022637.399365-1-iii@linux.ibm.com/ v1: https://lore.kernel.org/bpf/20211012023218.399568-1-iii@linux.ibm.com/ v1 -> v2: - Remove redundant local variables, use t->size directly instead. Best regards, Ilya ==================== Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2021-10-20irqchip/mchp-eic: Add support for the Microchip EICClaudiu Beznea
Add support for Microchip External Interrupt Controller. The controller supports 2 external interrupt lines. For every external input there is a connection to GIC. The interrupt controllers contains only 4 registers: - EIC_GFCS (read only): which indicates that glitch filter configuration is ready (not addressed in this implementation) - EIC_SCFG0R, EIC_SCFG1R (read, write): allows per interrupt specific settings: enable, polarity/edge settings, glitch filter settings - EIC_WPMR, EIC_WPSR: enables write protection mode specific settings (which are architecture specific) for the controller and are not addressed in this implementation Signed-off-by: Claudiu Beznea <claudiu.beznea@microchip.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210927063657.2157676-3-claudiu.beznea@microchip.com
2021-10-20libbpf: Fix dumping non-aligned __int128Ilya Leoshkevich
Non-aligned integers are dumped as bitfields, which is supported for at most 64-bit integers. Fix by using the same trick as btf_dump_float_data(): copy non-aligned values to the local buffer. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20211013160902.428340-4-iii@linux.ibm.com
2021-10-20dt-bindings: microchip,eic: Add bindings for the Microchip EICClaudiu Beznea
Add DT bindings for Microchip External Interrupt Controller. Signed-off-by: Claudiu Beznea <claudiu.beznea@microchip.com> Reviewed-by: Rob Herring <robh@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210927063657.2157676-2-claudiu.beznea@microchip.com
2021-10-20libbpf: Fix dumping big-endian bitfieldsIlya Leoshkevich
On big-endian arches not only bytes, but also bits are numbered in reverse order (see e.g. S/390 ELF ABI Supplement, but this is also true for other big-endian arches as well). Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20211013160902.428340-3-iii@linux.ibm.com
2021-10-20selftests/bpf: Use cpu_number only on arches that have itIlya Leoshkevich
cpu_number exists only on Intel and aarch64, so skip the test involing it on other arches. An alternative would be to replace it with an exported non-ifdefed primitive-typed percpu variable from the common code, but there appears to be none. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20211013160902.428340-2-iii@linux.ibm.com
2021-10-20arm64: meson: remove MESON_IRQ_GPIO selectionNeil Armstrong
Selecting MESON_IRQ_GPIO forces it as built-in, but we may need to build it as a module, thus remove it here and let the "default ARCH_MESON" build as built-in by default with the option to switch it to module. Signed-off-by: Neil Armstrong <narmstrong@baylibre.com> Reviewed-by: Kevin Hilman <khilman@baylibre.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210902134914.176986-3-narmstrong@baylibre.com
2021-10-20irqchip/meson-gpio: Make it possible to build as a moduleNeil Armstrong
In order to reduce the kernel Image size on multi-platform distributions, make it possible to build the Amlogic GPIO IRQ controller as a module by switching it to a platform driver. Signed-off-by: Neil Armstrong <narmstrong@baylibre.com> Acked-by: Saravana Kannan <saravanak@google.com> Reported-by: kernel test robot <lkp@intel.com> Reviewed-by: Lee Jones <lee.jones@linaro.org> Reviewed-by: Kevin Hilman <khilman@baylibre.com> Tested-by: Kevin Hilman <khilman@baylibre.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210902134914.176986-2-narmstrong@baylibre.com
2021-10-20Merge branch irq/devm-churn into irq/irqchip-nextMarc Zyngier
* irq/devm-churn: : . : Rework a number of drivers to use devm_platform_ioremap_resource() : instead of platform_get_resource() + devm_ioremap_resource(). : . irqchip/ti-sci-inta: Make use of the helper function devm_platform_ioremap_resource() irqchip/stm32: Make use of the helper function devm_platform_ioremap_resource() irqchip/irq-ts4800: Make use of the helper function devm_platform_ioremap_resource() irqchip/irq-mvebu-pic: Make use of the helper function devm_platform_ioremap_resource() irqchip/irq-mvebu-icu: Make use of the helper function devm_platform_ioremap_resource() Signed-off-by: Marc Zyngier <maz@kernel.org>
2021-10-20irqchip: Provide stronger type checking for IRQCHIP_MATCH/IRQCHIP_DECLAREMarc Zyngier
Both IRQCHIP_DECLARE() and IRQCHIP_MATCH() use an underlying of_device_id() structure to encode the matching property and the init callback. However, this callback is stored in as a void * pointer, which obviously defeat any attempt at stronger type checking. Work around this by providing a new macro that builds on top of the __typecheck() primitive, and that can be used to warn when there is a discrepency between the drivers and core code. Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20211020104527.3066268-1-maz@kernel.org
2021-10-20samples/bpf: Fix application of sizeof to pointerDavid Yang
The coccinelle check report: "./samples/bpf/xdp_redirect_cpu_user.c:397:32-38: ERROR: application of sizeof to pointer" Using the "strlen" to fix it. Reported-by: Zeal Robot <zealci@zte.com.cn> Signed-off-by: David Yang <davidcomponentone@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20211012111649.983253-1-davidcomponentone@gmail.com
2021-10-20bpftool: Remove useless #include to <perf-sys.h> from map_perf_ring.cQuentin Monnet
The header is no longer needed since the event_pipe implementation was updated to rely on libbpf's perf_buffer. This makes bpftool free of dependencies to perf files, and we can update the Makefile accordingly. Fixes: 9b190f185d2f ("tools/bpftool: switch map event_pipe to libbpf's perf_buffer") Signed-off-by: Quentin Monnet <quentin@isovalent.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20211020094826.16046-1-quentin@isovalent.com
2021-10-20selftests/bpf: Remove duplicated include in cgroup_helpersWan Jiabing
Fix following checkincludes.pl warning: ./scripts/checkincludes.pl tools/testing/selftests/bpf/cgroup_helpers.c tools/testing/selftests/bpf/cgroup_helpers.c: unistd.h is included more than once. Signed-off-by: Wan Jiabing <wanjiabing@vivo.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20211012023231.19911-1-wanjiabing@vivo.com
2021-10-20net/mlx5e: IPsec: Fix work queue entry ethernet segment checksum flagsEmeel Hakim
Current Work Queue Entry (WQE) checksum (csum) flags in the ethernet segment (eseg) in case of IPsec crypto offload datapath are not aligned with PRM/HW expectations. Currently the driver always sets the l3_inner_csum flag in case of IPsec because of the wrong usage of skb->encapsulation as indicator for inner IPsec header since skb->encapsulation is always ON for IPsec packets since IPsec itself is an encapsulation protocol. The above forced a failing attempts of calculating csum of non-existing segments (like in the IP|ESP|TCP packet case which does not have an l3_inner) which led to lots of packet drops hence the low throughput. Fix by using xo->inner_ipproto as indicator for inner IPsec header instead of skb->encapsulation in addition to setting the csum flags as following: * Tunnel Mode: * Pkt: MAC IP ESP IP L4 * CSUM: l3_cs | l3_inner_cs | l4_inner_cs * * Transport Mode: * Pkt: MAC IP ESP L4 * CSUM: l3_cs [ | l4_cs (checksum partial case)] * * Tunnel(VXLAN TCP/UDP) over Transport Mode * Pkt: MAC IP ESP UDP VXLAN IP L4 * CSUM: l3_cs | l3_inner_cs | l4_inner_cs Fixes: f1267798c980 ("net/mlx5: Fix checksum issue of VXLAN and IPsec crypto offload") Signed-off-by: Emeel Hakim <ehakim@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-10-20net/mlx5e: IPsec: Fix a misuse of the software parser's fieldsEmeel Hakim
IPsec crypto offload current Software Parser (SWP) fields settings in the ethernet segment (eseg) are not aligned with PRM/HW expectations. Among others in case of IP|ESP|TCP packet, current driver sets the offsets for inner_l3 and inner_l4 although there is no inner l3/l4 headers relative to ESP header in such packets. SWP provides the offsets for HW ,so it can be used to find csum fields to offload the checksum, however these are not necessarily used by HW and are used as fallback in case HW fails to parse the packet, e.g when performing IPSec Transport Aware (IP | ESP | TCP) there is no need to add SW parse on inner packet. So in some cases packets csum was calculated correctly , whereas in other cases it failed. The later faced csum errors (caused by wrong packet length calculations) which led to lots of packet drops hence the low throughput. Fix by setting the SWP fields as expected in a IP|ESP|TCP packet. the following describe the expected SWP offsets: * Tunnel Mode: * SWP: OutL3 InL3 InL4 * Pkt: MAC IP ESP IP L4 * * Transport Mode: * SWP: OutL3 OutL4 * Pkt: MAC IP ESP L4 * * Tunnel(VXLAN TCP/UDP) over Transport Mode * SWP: OutL3 InL3 InL4 * Pkt: MAC IP ESP UDP VXLAN IP L4 Fixes: f1267798c980 ("net/mlx5: Fix checksum issue of VXLAN and IPsec crypto offload") Signed-off-by: Emeel Hakim <ehakim@nvidia.com> Reviewed-by: Raed Salem <raeds@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-10-20net/mlx5e: Fix vlan data lost during suspend flowMoshe Shemesh
During suspend flow the driver calls mlx5e_destroy_vlan_table() which does not only delete the vlans steering flow rules, but also frees the data on currently active vlans, thus it is not restored during resume flow. This fix keeps the vlan data on suspend flow and frees it only on driver remove flow. Fixes: 6783f0a21a3c ("net/mlx5e: Dynamic alloc vlan table for netdev when needed") Signed-off-by: Moshe Shemesh <moshe@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-10-20net/mlx5: E-switch, Return correct error code on group creation failureDmytro Linkin
Dan Carpenter report: The patch f47e04eb96e0: "net/mlx5: E-switch, Allow setting share/max tx rate limits of rate groups" from May 31, 2021, leads to the following Smatch static checker warning: drivers/net/ethernet/mellanox/mlx5/core/esw/qos.c:483 esw_qos_create_rate_group() warn: passing zero to 'ERR_PTR' If min rate normalization failed then error code may be overwritten to 0 if scheduling element destruction succeed. Ignore this value and always return initial one. Fixes: f47e04eb96e0 ("net/mlx5: E-switch, Allow setting share/max tx rate limits of rate groups") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Dmytro Linkin <dlinkin@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-10-20net/mlx5: Lag, change multipath and bonding to be mutually exclusiveMaor Dickman
Both multipath and bonding events are changing the HW LAG state independently. Handling one of the features events while the other is already enabled can cause unwanted behavior, for example handling bonding event while multipath enabled will disable the lag and cause multipath to stop working. Fix it by ignoring bonding event while in multipath and ignoring FIB events while in bonding mode. Fixes: 544fe7c2e654 ("net/mlx5e: Activate HW multipath and handle port affinity based on FIB events") Signed-off-by: Maor Dickman <maord@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-10-20bpf/preload: Clean up .gitignore and "clean-files" targetQuentin Monnet
kernel/bpf/preload/Makefile was recently updated to have it install libbpf's headers locally instead of pulling them from tools/lib/bpf. But two items still need to be addressed. First, the local .gitignore file was not adjusted to ignore the files generated in the new kernel/bpf/preload/libbpf output directory. Second, the "clean-files" target is now incorrect. The old artefacts names were not removed from the target, while the new ones were added incorrectly. This is because "clean-files" expects names relative to $(obj), but we passed the absolute path instead. This results in the output and header-destination directories for libbpf (and their contents) not being removed from kernel/bpf/preload on "make clean" from the root of the repository. This commit fixes both issues. Note that $(userprogs) needs not be added to "clean-files", because the cleaning infrastructure already accounts for it. Cleaning the files properly also prevents make from printing the following message, for builds coming after a "make clean": "make[4]: Nothing to be done for 'install_headers'." v2: Simplify the "clean-files" target. Fixes: bf60791741d4 ("bpf: preload: Install libbpf headers when building") Signed-off-by: Quentin Monnet <quentin@isovalent.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20211020094647.15564-1-quentin@isovalent.com
2021-10-20libbpf: Migrate internal use of bpf_program__get_prog_info_linearDave Marchevsky
In preparation for bpf_program__get_prog_info_linear deprecation, move the single use in libbpf to call bpf_obj_get_info_by_fd directly. Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20211011082031.4148337-2-davemarchevsky@fb.com
2021-10-20gfs2: Eliminate ip->i_ghAndreas Gruenbacher
Now that gfs2_file_buffered_write is the only remaining user of ip->i_gh, we can move the glock holder to the stack (or rather, use the one we already have on the stack); there is no need for keeping the holder in the inode anymore. This is slightly complicated by the fact that we're using ip->i_gh for the statfs inode in gfs2_file_buffered_write as well. Writing to the statfs inode isn't very common, so allocate the statfs holder dynamically when needed. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2021-10-20gfs2: Move the inode glock locking to gfs2_file_buffered_writeAndreas Gruenbacher
So far, for buffered writes, we were taking the inode glock in gfs2_iomap_begin and dropping it in gfs2_iomap_end with the intention of not holding the inode glock while iomap_write_actor faults in user pages. It turns out that iomap_write_actor is called inside iomap_begin ... iomap_end, so the user pages were still faulted in while holding the inode glock and the locking code in iomap_begin / iomap_end was completely pointless. Move the locking into gfs2_file_buffered_write instead. We'll take care of the potential deadlocks due to faulting in user pages while holding a glock in a subsequent patch. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2021-10-20gfs2: Introduce flag for glock holder auto-demotionBob Peterson
This patch introduces a new HIF_MAY_DEMOTE flag and infrastructure that will allow glocks to be demoted automatically on locking conflicts. When a locking request comes in that isn't compatible with the locking state of an active holder and that holder has the HIF_MAY_DEMOTE flag set, the holder will be demoted before the incoming locking request is granted. Note that this mechanism demotes active holders (with the HIF_HOLDER flag set), while before we were only demoting glocks without any active holders. This allows processes to keep hold of locks that may form a cyclic locking dependency; the core glock logic will then break those dependencies in case a conflicting locking request occurs. We'll use this to avoid giving up the inode glock proactively before faulting in pages. Processes that allow a glock holder to be taken away indicate this by calling gfs2_holder_allow_demote(), which sets the HIF_MAY_DEMOTE flag. Later, they call gfs2_holder_disallow_demote() to clear the flag again, and then they check if their holder is still queued: if it is, they are still holding the glock; if it isn't, they can re-acquire the glock (or abort). Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2021-10-20gfs2: Clean up function may_grantAndreas Gruenbacher
Pass the first current glock holder into function may_grant and deobfuscate the logic there. While at it, switch from BUG_ON to GLOCK_BUG_ON in may_grant. To make that build cleanly, de-constify the may_grant arguments. We're now using function find_first_holder in do_promote, so move the function's definition above do_promote. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2021-10-20gfs2: Add wrapper for iomap_file_buffered_writeAndreas Gruenbacher
Add a wrapper around iomap_file_buffered_write. We'll add code for when the operation needs to be retried here later. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2021-10-20iov_iter: Introduce fault_in_iov_iter_writeableAndreas Gruenbacher
Introduce a new fault_in_iov_iter_writeable helper for safely faulting in an iterator for writing. Uses get_user_pages() to fault in the pages without actually writing to them, which would be destructive. We'll use fault_in_iov_iter_writeable in gfs2 once we've determined that the iterator passed to .read_iter isn't in memory. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2021-10-20nvmet: use struct_size over open coded arithmeticLen Baker
As noted in the "Deprecated Interfaces, Language Features, Attributes, and Conventions" documentation [1], size calculations (especially multiplication) should not be performed in memory allocator (or similar) function arguments due to the risk of them overflowing. This could lead to values wrapping around and a smaller allocation being made than the caller was expecting. Using those allocations could lead to linear overflows of heap memory and other misbehaviors. In this case this is not actually dynamic size: all the operands involved in the calculation are constant values. However it is better to refactor this anyway, just to keep the open-coded math idiom out of code. So, use the struct_size() helper to do the arithmetic instead of the argument "size + count * size" in the kmalloc() function. This code was detected with the help of Coccinelle and audited and fixed manually. [1] https://www.kernel.org/doc/html/latest/process/deprecated.html#open-coded-arithmetic-in-allocator-arguments Signed-off-by: Len Baker <len.baker@gmx.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2021-10-20nvme: drop scan_lock and always kick requeue list when removing namespacesHannes Reinecke
When reading the partition table on initial scan hits an I/O error the I/O will hang with the scan_mutex held: [<0>] do_read_cache_page+0x49b/0x790 [<0>] read_part_sector+0x39/0xe0 [<0>] read_lba+0xf9/0x1d0 [<0>] efi_partition+0xf1/0x7f0 [<0>] bdev_disk_changed+0x1ee/0x550 [<0>] blkdev_get_whole+0x81/0x90 [<0>] blkdev_get_by_dev+0x128/0x2e0 [<0>] device_add_disk+0x377/0x3c0 [<0>] nvme_mpath_set_live+0x130/0x1b0 [nvme_core] [<0>] nvme_mpath_add_disk+0x150/0x160 [nvme_core] [<0>] nvme_alloc_ns+0x417/0x950 [nvme_core] [<0>] nvme_validate_or_alloc_ns+0xe9/0x1e0 [nvme_core] [<0>] nvme_scan_work+0x168/0x310 [nvme_core] [<0>] process_one_work+0x231/0x420 and trying to delete the controller will deadlock as it tries to grab the scan mutex: [<0>] nvme_mpath_clear_ctrl_paths+0x25/0x80 [nvme_core] [<0>] nvme_remove_namespaces+0x31/0xf0 [nvme_core] [<0>] nvme_do_delete_ctrl+0x4b/0x80 [nvme_core] As we're now properly ordering the namespace list there is no need to hold the scan_mutex in nvme_mpath_clear_ctrl_paths() anymore. And we always need to kick the requeue list as the path will be marked as unusable and I/O will be requeued _without_ a current path. Signed-off-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>