summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2021-09-15Merge branch 'ibmvnic-next'David S. Miller
Sukadev Bhattiprolu says: ==================== ibmvnic: Reuse ltb, rx, tx pools It can take a long time to free and reallocate rx and tx pools and long term buffer (LTB) during each reset of the VNIC. This is specially true when the partition (LPAR) is heavily loaded and going through a Logical Partition Migration (LPM). The long drawn reset causes the LPAR to lose connectivity for extended periods of time and results in "RMC connection" errors and the LPM failing. What is worse is that during the LPM we could get a failover because of the lost connectivity. At that point, the vnic driver releases even the resources it has already allocated and starts over. As long as the resources we have already allocated are valid/applicable, we might as well hold on to them while trying to allocate the remaining resources. This patch set attempts to reuse the resources previously allocated as long as they are valid. It seems to vastly improve the time taken for the vnic reset and signficantly reduces the chances of getting the RMC connection errors. We do get still them occasionally, but appears to be for reasons other than memory allocation delays and those are still being investigated. If the backing devices for a vnic adapter are not "matched" (see "pool parameters" in patches 8 and 9) it is possible that we will still free all the resources and allocate them. If that becomes a common problem, we have to address it separately. Thanks to input and extensive testing from Brian King, Cris Forno, Dany Madden, Rick Lindsley. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-15ibmvnic: Reuse tx pools when possibleSukadev Bhattiprolu
Rather than releasing the tx pools on every close and reallocating them on open, reuse the tx pools unless the pool parameters (number of pools, size of each pool or size of each buffer in a pool) have changed. If the pool parameters changed, then release the old pools (if any) and allocate new ones. Specifically release tx pools, if: - adapter is removed, - pool parameters change during reset, - we encounter an error when opening the adapter in response to a user request (in ibmvnic_open()). and don't release them: - in __ibmvnic_close() or - on errors in __ibmvnic_open() in the hope that we can reuse them during this or next reset. With these changes reset_tx_pools() can be dropped because its optimization is now included in init_tx_pools() itself. cleanup_tx_pools() releases all the skbs associated with the pool and is called from ibmvnic_cleanup(), which is called on every reset. Since we want to reuse skbs across resets, move cleanup_tx_pools() out of ibmvnic_cleanup() and call it only when user closes the adapter. Add two new adapter fields, ->prev_mtu, ->prev_tx_pool_size to track the previous values and use them to decide whether to reuse or realloc the pools. Reviewed-by: Rick Lindsley <ricklind@linux.vnet.ibm.com> Reviewed-by: Dany Madden <drt@linux.ibm.com> Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-15ibmvnic: Reuse rx pools when possibleSukadev Bhattiprolu
Rather than releasing the rx pools on and reallocating them on every reset, reuse the rx pools unless the pool parameters (number of pools, size of each pool or size of each buffer in a pool) have changed. If the pool parameters changed, then release the old pools (if any) and allocate new ones. Specifically release rx pools, if: - adapter is removed, - pool parameters change during reset, - we encounter an error when opening the adapter in response to a user request (in ibmvnic_open()). and don't release them: - in __ibmvnic_close() or - on errors in __ibmvnic_open() in the hope that we can reuse them on the next reset. With these, reset_rx_pools() can be dropped because its optimzation is now included in init_rx_pools() itself. cleanup_rx_pools() releases all the skbs associated with the pool and is called from ibmvnic_cleanup(), which is called on every reset. Since we want to reuse skbs across resets, move cleanup_rx_pools() out of ibmvnic_cleanup() and call it only when user closes the adapter. Add two new adapter fields, ->prev_rx_buf_sz, ->prev_rx_pool_size to keep track of the previous values and use them to decide whether to reuse or realloc the pools. Reviewed-by: Rick Lindsley <ricklind@linux.vnet.ibm.com> Reviewed-by: Dany Madden <drt@linux.ibm.com> Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-15ibmvnic: Reuse LTB when possibleSukadev Bhattiprolu
Reuse the long term buffer during a reset as long as its size has not changed. If the size has changed, free it and allocate a new one of the appropriate size. When we do this, alloc_long_term_buff() and reset_long_term_buff() become identical. Drop reset_long_term_buff(). Reviewed-by: Rick Lindsley <ricklind@linux.vnet.ibm.com> Reviewed-by: Dany Madden <drt@linux.ibm.com> Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-15ibmvnic: Use bitmap for LTB map_idsSukadev Bhattiprolu
In a follow-on patch, we will reuse long term buffers when possible. When doing so we have to be careful to properly assign map ids. We can no longer assign them sequentially because a lower map id may be available and we could wrap at 255 and collide with an in-use map id. Instead, use a bitmap to track active map ids and to find a free map id. Don't need to take locks here since the map_id only changes during reset and at that time only the reset worker thread should be using the adapter. Noticed this when analyzing an error Dany Madden ran into with the patch set. Reported-by: Dany Madden <drt@linux.ibm.com> Reviewed-by: Rick Lindsley <ricklind@linux.vnet.ibm.com> Reviewed-by: Dany Madden <drt@linux.ibm.com> Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-15ibmvnic: init_tx_pools move loop-invariant codeSukadev Bhattiprolu
In init_tx_pools() move some loop-invariant code out of the loop. Reviewed-by: Rick Lindsley <ricklind@linux.vnet.ibm.com> Reviewed-by: Dany Madden <drt@linux.ibm.com> Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-15ibmvnic: Use/rename local vars in init_tx_poolsSukadev Bhattiprolu
Use/rename local variables in init_tx_pools() for consistency with init_rx_pools() and for readability. Also add some comments Reviewed-by: Rick Lindsley <ricklind@linux.vnet.ibm.com> Reviewed-by: Dany Madden <drt@linux.ibm.com> Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-15ibmvnic: Use/rename local vars in init_rx_poolsSukadev Bhattiprolu
To make the code more readable, use/rename some local variables. Basically we have a set of pools, num_pools. Each pool has a set of buffers, pool_size and each buffer is of size buff_size. pool_size is a bit ambiguous (whether size in bytes or buffers). Add a comment in the header file to make it explicit. Reviewed-by: Rick Lindsley <ricklind@linux.vnet.ibm.com> Reviewed-by: Dany Madden <drt@linux.ibm.com> Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-15ibmvnic: Fix up some comments and messagesSukadev Bhattiprolu
Add/update some comments/function headers and fix up some messages. Reviewed-by: Rick Lindsley <ricklind@linux.vnet.ibm.com> Reviewed-by: Dany Madden <drt@linux.ibm.com> Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-15ibmvnic: Consolidate code in replenish_rx_pool()Sukadev Bhattiprolu
For better readability, consolidate related code in replenish_rx_pool() and add some comments. Reviewed-by: Rick Lindsley <ricklind@linux.vnet.ibm.com> Reviewed-by: Dany Madden <drt@linux.ibm.com> Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-15Merge branch 'ptp-ocp-timecard-v13-fw'David S. Miller
Jonathan Lemon says: ==================== timecard updates for v13 firmware This update mainly deals with features for the TimeCard v13 firmware. The signals provided from the external SMA connectors can be steered to different locations, and the generated SMA signals can be chosen. Future timecard revisions will allow selectable I/O on any of the SMA connectors, so name the attributes appropriately, and set up the ABI in preparation for the new features. The update also adds support for IRIG-B and DCF formats, as well as NMEA output. A ts_window_adjust tunable is also provided to fine-tune the PHC:SYS time mapping. -- v1: Earlier reviewed series was for v10 firmware, this is expanded to include the v13 features. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-15docs: ABI: Add sysfs documentation for timecardJonathan Lemon
This patch describes the sysfs interface implemented by the ptp_ocp driver, under /sys/class/timecard. Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-15ptp: ocp: Add timestamp window adjustmentJonathan Lemon
The following process is used to read the PHC clock and correlate the reading with the "correct" system time. - get starting timestamp - issue PCI write command - issue PCI read command - get ending timestamp - read latched sec/nsec registers The write command is posted to PCI bus and returns. When the write arrives at the FPGA, the PHC time is latched into the sec/nsec registers, and a flag is set indicating the registers are valid. The read command returns this flag, and the time retrieval proceeds. Below is a non-scaled picture of the timing diagram involved. The PHC time corresponds to some SYS time between [start, end]. Userspace usually uses the midpoint between [start, end] to estimate the PCI delay and match this with the PHC time. [start] | | write |-------+ | | \ | read |----+ +----->| | \ * PHC time latched into register | \ | midpoint | +------->| | | | | | +----| | / | |<--------+ | [end] | | As the diagram indicates, the PHC time is latched before the midpoint, so the system clock time is slightly off the real PHC time. This shows up as a phase error with an oscilliscope. The workaround here is to provide a tunable which reduces (shrinks) the end time in the above diagram. This in turn moves the calculated midpoint so the system time and PHC time are in agreemment. Currently, the adjustment reduces the end time by 3/16th of the entire window. E.g.: [start, end] ==> [start, (end - (3/16 * end)], which produces reasonably good results. Also reduce delays by just writing to the clock control register instead of performing a read/modify/write sequence, as the contents of the control register are known. Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-15ptp: ocp: Have FPGA fold in ns adjustment for adjtime.Jonathan Lemon
The current implementation of adjtime uses gettime/settime to perform nanosecond adjustments. This introduces addtional phase errors due to delays. Instead, use the FPGA's ability to just apply the nanosecond adjustment to the clock directly. Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-15ptp: ocp: Enable 4th timestamper / PPS generatorJonathan Lemon
A 4th timestamper is added which timestamps the output of the PHC. The clock nanosecond offset is not always zero, so when compared to other timestampers, this provides precise measurements. Also, the timestamper interrupt from the PHC can be used to generate a PPS signal for /dev/pps. Also allow PTP_CLK_REQ_PEROUT requests for a 1PPS output, but do not actually configure any output pins, this is done via sysfs. Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-15ptp: ocp: Add second GNSS deviceJonathan Lemon
Upcoming boards may have a second GNSS receiver, getting information from a different constellation than the first receiver, which provides some measure of anti-spoofing. Expose the sysfs attribute for this device, if detected. Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-15ptp: ocp: Add NMEA outputJonathan Lemon
The timecard can provide a NMEA-1083 ZDA (time and date) output string on a serial port, which can be used to drive other devices. Add the NMEA resources, and the serial port as a sysfs attribute. Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-15ptp: ocp: Add debugfs entry for timecardJonathan Lemon
Provide a view into the timecard internals for debugging. Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-15ptp: ocp: Separate the init and info logicJonathan Lemon
On startup, parts of the FPGA need to be initialized - break these out into their own functions, separate from the purely informational blocks. On startup, distrbute the UTC:TAI offset from the NMEA GNSS parser, if it is available. Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-15ptp: ocp: Add sysfs attribute utc_tai_offsetJonathan Lemon
IRIG and DCF output time in UTC, but the timecard operates on TAI internally. Add an attribute node which allows adding an offset to these modes before output. Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-15ptp: ocp: Add IRIG-B output mode controlJonathan Lemon
IRIG-B has several different output formats, the timecard defaults to using B007. Add a control which selects different output modes. Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-15ptp: ocp: Add IRIG-B and DCF blocksJonathan Lemon
IRIG (Inter-range Instrumentation Group) timecode format on one of the SMA output channels is provided by the IRIG master FPGA block. Enable the master when the IRIG output format is selected on either one of the output channels. By default, the output is in B007 format. DCF output format is provided by the DCF master block. Also enable the IRIG and DCF slaves, which parse an incoming signal from the external SMA connectors, and may be used to adjust the PHC. Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-15ptp: ocp: Add SMA selector and controlsJonathan Lemon
The latest firmware for the TimeCard adds selectable signals for the SMA input/outputs. Add support for SMA selectors, and the GPIO controls needed for steering signals. Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-15ptp: ocp: Add third timestamperJonathan Lemon
The firmware may provide a third signal timestamper, so make it available for use. Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-15ptp: ocp: Report error if resource registration fails.Jonathan Lemon
If a resource could not be registered, report the name of the resource and the error code. Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-15ptp: ocp: Skip resources with out of range irqsJonathan Lemon
The TimeCard exposes different resources, which may have their own irqs. Space for the irqs is allocated through a MSI or MSI-X interrupt vector. On some platforms, the interrupt allocation fails. Rather than making this fatal, just skip exposing those resources. The main timecard functionality (that of a PTP clock) will work without the additional resources. Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-15ptp: ocp: Skip I2C flash read when there is no controller.Jonathan Lemon
If an I2C controller isn't present, don't try and read the I2C flash. Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-15ptp: ocp: Parameterize the TOD information display.Jonathan Lemon
Only display the TOD information if there is a corresponding TOD resource. Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-15ptp: ocp: parameterize the i2c driver usedJonathan Lemon
Move the xilinx i2c driver parameters to the resource block instead of hardcoding things in the registration functions. Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-15dt-bindings: net: lantiq: Add the burst length propertiesAleksander Jan Bajkowski
The new added properties are used for configuring burst length. Signed-off-by: Aleksander Jan Bajkowski <olek2@wp.pl> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-15dt-bindings: net: lantiq,etop-xway: Document Lantiq Xway ETOP bindingsAleksander Jan Bajkowski
Document the Lantiq Xway SoC series External Bus Unit (ETOP) bindings. Signed-off-by: Aleksander Jan Bajkowski <olek2@wp.pl> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-15dt-bindings: net: lantiq-xrx200-net: convert to the json-schemaAleksander Jan Bajkowski
Convert the Lantiq PMAC Device Tree binding documentation to json-schema. Signed-off-by: Aleksander Jan Bajkowski <olek2@wp.pl> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-15net: lantiq: configure the burst length in ethernet driversAleksander Jan Bajkowski
Configure the burst length in Ethernet drivers. This improves Ethernet performance by 58%. According to the vendor BSP, 8W burst length is supported by ar9 and newer SoCs. The NAT benchmark results on xRX200 (Down/Up): * 2W: 330 Mb/s * 4W: 432 Mb/s 372 Mb/s * 8W: 520 Mb/s 389 Mb/s Tested on xRX200 and xRX330. Signed-off-by: Aleksander Jan Bajkowski <olek2@wp.pl> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-15MIPS: lantiq: dma: make the burst length configurable by the driversAleksander Jan Bajkowski
Make the burst length configurable by the drivers. Signed-off-by: Aleksander Jan Bajkowski <olek2@wp.pl> Acked-by: Hauke Mehrtens <hauke@hauke-m.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-15MIPS: lantiq: dma: fix burst length for DEUAleksander Jan Bajkowski
The current definition of 2W burst length is invalid. This patch fixes it. Current downstream DEU driver doesn't use DMA. An incorrect burst length value doesn't cause any errors. This patch also adds other burst length values. Fixes: dfec1a827d2b ("MIPS: Lantiq: Add DMA support") Signed-off-by: Aleksander Jan Bajkowski <olek2@wp.pl> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-15MIPS: lantiq: dma: reset correct number of channelAleksander Jan Bajkowski
Different SoCs have a different number of channels, e.g .: * amazon-se has 10 channels, * danube+ar9 have 20 channels, * vr9 has 28 channels, * ar10 has 24 channels. We can read the ID register and, depending on the reported number of channels, reset the appropriate number of channels. Signed-off-by: Aleksander Jan Bajkowski <olek2@wp.pl> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-15MIPS: lantiq: dma: add small delay after resetAleksander Jan Bajkowski
Reading the DMA registers immediately after the reset causes Data Bus Error. Adding a small delay fixes this issue. Signed-off-by: Aleksander Jan Bajkowski <olek2@wp.pl> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-14ptp: dp83640: don't define PAGE0Randy Dunlap
Building dp83640.c on arch/parisc/ produces a build warning for PAGE0 being redefined. Since the macro is not used in the dp83640 driver, just make it a comment for documentation purposes. In file included from ../drivers/net/phy/dp83640.c:23: ../drivers/net/phy/dp83640_reg.h:8: warning: "PAGE0" redefined 8 | #define PAGE0 0x0000 from ../drivers/net/phy/dp83640.c:11: ../arch/parisc/include/asm/page.h:187: note: this is the location of the previous definition 187 | #define PAGE0 ((struct zeropage *)__PAGE_OFFSET) Fixes: cb646e2b02b2 ("ptp: Added a clock driver for the National Semiconductor PHYTER.") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Reported-by: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Richard Cochran <richard.cochran@omicron.at> Cc: John Stultz <john.stultz@linaro.org> Cc: Heiner Kallweit <hkallweit1@gmail.com> Cc: Russell King <linux@armlinux.org.uk> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://lore.kernel.org/r/20210913220605.19682-1-rdunlap@infradead.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-09-14net: dsa: tag_rtl4_a: Drop bit 9 from egress framesLinus Walleij
This drops the code setting bit 9 on egress frames on the Realtek "type A" (RTL8366RB) frames. This bit was set on ingress frames for unknown reason, and was set on egress frames as the format of ingress and egress frames was believed to be the same. As that assumption turned out to be false, and since this bit seems to have zero effect on the behaviour of the switch let's drop this bit entirely. Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Link: https://lore.kernel.org/r/20210913143156.1264570-1-linus.walleij@linaro.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-09-14bnx2x: Fix enabling network interfaces without VFsAdrian Bunk
This function is called to enable SR-IOV when available, not enabling interfaces without VFs was a regression. Fixes: 65161c35554f ("bnx2x: Fix missing error code in bnx2x_iov_init_one()") Signed-off-by: Adrian Bunk <bunk@kernel.org> Reported-by: YunQiang Su <wzssyqa@gmail.com> Tested-by: YunQiang Su <wzssyqa@gmail.com> Cc: stable@vger.kernel.org Acked-by: Shai Malin <smalin@marvell.com> Link: https://lore.kernel.org/r/20210912190523.27991-1-bunk@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-09-14Merge branch 'bpf: add support for new btf kind BTF_KIND_TAG'Alexei Starovoitov
Yonghong Song says: ==================== LLVM14 added support for a new C attribute ([1]) __attribute__((btf_tag("arbitrary_str"))) This attribute will be emitted to dwarf ([2]) and pahole will convert it to BTF. Or for bpf target, this attribute will be emitted to BTF directly ([3], [4]). The attribute is intended to provide additional information for - struct/union type or struct/union member - static/global variables - static/global function or function parameter. This new attribute can be used to add attributes to kernel codes, e.g., pre- or post- conditions, allow/deny info, or any other info in which only the kernel is interested. Such attributes will be processed by clang frontend and emitted to dwarf, converting to BTF by pahole. Ultimiately the verifier can use these information for verification purpose. The new attribute can also be used for bpf programs, e.g., tagging with __user attributes for function parameters, specifying global function preconditions, etc. Such information may help verifier to detect user program bugs. After this series, pahole dwarf->btf converter will be enhanced to support new llvm tag for btf_tag attribute. With pahole support, we will then try to add a few real use case, e.g., __user/__rcu tagging, allow/deny list, some kernel function precondition, etc, in the kernel. In the rest of the series, Patches 1-2 had kernel support. Patches 3-4 added libbpf support. Patch 5 added bpftool support. Patches 6-10 added various selftests. Patch 11 added documentation for the new kind. [1] https://reviews.llvm.org/D106614 [2] https://reviews.llvm.org/D106621 [3] https://reviews.llvm.org/D106622 [4] https://reviews.llvm.org/D109560 Changelog: v2 -> v3: - put NR_BTF_KINDS and BTF_KIND_MAX into enum as well - check component_idx earlier (check_meta stage) in kernel - add more tests - fix misc nits v1 -> v2: - BTF ELF format changed in llvm ([4] above), so cross-board change to use the new format. - Clarified in commit message that BTF_KIND_TAG is not emitted by bpftool btf dump format c. - Fix various comments from Andrii. ==================== Acked-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2021-09-14docs/bpf: Add documentation for BTF_KIND_TAGYonghong Song
Add BTF_KIND_TAG documentation in btf.rst. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210914223103.249100-1-yhs@fb.com
2021-09-14selftests/bpf: Add a test with a bpf program with btf_tag attributesYonghong Song
Add a bpf program with btf_tag attributes. The program is loaded successfully with the kernel. With the command bpftool btf dump file ./tag.o the following dump shows that tags are properly encoded: [8] STRUCT 'key_t' size=12 vlen=3 'a' type_id=2 bits_offset=0 'b' type_id=2 bits_offset=32 'c' type_id=2 bits_offset=64 [9] TAG 'tag1' type_id=8 component_id=-1 [10] TAG 'tag2' type_id=8 component_id=-1 [11] TAG 'tag1' type_id=8 component_id=1 [12] TAG 'tag2' type_id=8 component_id=1 ... [21] FUNC_PROTO '(anon)' ret_type_id=2 vlen=1 'x' type_id=2 [22] FUNC 'foo' type_id=21 linkage=static [23] TAG 'tag1' type_id=22 component_id=0 [24] TAG 'tag2' type_id=22 component_id=0 [25] TAG 'tag1' type_id=22 component_id=-1 [26] TAG 'tag2' type_id=22 component_id=-1 ... [29] VAR 'total' type_id=27, linkage=global [30] TAG 'tag1' type_id=29 component_id=-1 [31] TAG 'tag2' type_id=29 component_id=-1 If an old clang compiler, which does not support btf_tag attribute, is used, these btf_tag attributes will be silently ignored. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210914223058.248949-1-yhs@fb.com
2021-09-14selftests/bpf: Test BTF_KIND_TAG for deduplicationYonghong Song
Add unit tests for BTF_KIND_TAG deduplication for - struct and struct member - variable - func and func argument Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210914223052.248535-1-yhs@fb.com
2021-09-14selftests/bpf: Add BTF_KIND_TAG unit testsYonghong Song
Test good and bad variants of BTF_KIND_TAG encoding. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210914223047.248223-1-yhs@fb.com
2021-09-14selftests/bpf: Change NAME_NTH/IS_NAME_NTH for BTF_KIND_TAG formatYonghong Song
BTF_KIND_TAG ELF format has a component_idx which might have value -1. test_btf may confuse it with common_type.name as NAME_NTH checkes high 16bit to be 0xffff. Change NAME_NTH high 16bit check to be 0xfffe so it won't confuse with component_idx. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210914223041.248009-1-yhs@fb.com
2021-09-14selftests/bpf: Test libbpf API function btf__add_tag()Yonghong Song
Add btf_write tests with btf__add_tag() function. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210914223036.247560-1-yhs@fb.com
2021-09-14bpftool: Add support for BTF_KIND_TAGYonghong Song
Added bpftool support to dump BTF_KIND_TAG information. The new bpftool will be used in later patches to dump btf in the test bpf program object file. Currently, the tags are not emitted with bpftool btf dump file <path> format c and they are silently ignored. The tag information is mostly used in the kernel for verification purpose and the kernel uses its own btf to check. With adding these tags to vmlinux.h, tags will be encoded in program's btf but they will not be used by the kernel, at least for now. So let us delay adding these tags to format C header files until there is a real need. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210914223031.246951-1-yhs@fb.com
2021-09-14libbpf: Add support for BTF_KIND_TAGYonghong Song
Add BTF_KIND_TAG support for parsing and dedup. Also added sanitization for BTF_KIND_TAG. If BTF_KIND_TAG is not supported in the kernel, sanitize it to INTs. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210914223025.246687-1-yhs@fb.com
2021-09-14libbpf: Rename btf_{hash,equal}_int to btf_{hash,equal}_int_tagYonghong Song
This patch renames functions btf_{hash,equal}_int() to btf_{hash,equal}_int_tag() so they can be reused for BTF_KIND_TAG support. There is no functionality change for this patch. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210914223020.245829-1-yhs@fb.com