From fa96b24204af42274ec13dfb2f2e6990d7510e55 Mon Sep 17 00:00:00 2001 From: Benjamin Tissoires Date: Fri, 5 Aug 2022 14:48:14 -0700 Subject: btf: Add a new kfunc flag which allows to mark a function to be sleepable This allows to declare a kfunc as sleepable and prevents its use in a non sleepable program. Signed-off-by: Benjamin Tissoires Co-developed-by: Yosry Ahmed Signed-off-by: Yosry Ahmed Signed-off-by: Hao Luo Link: https://lore.kernel.org/r/20220805214821.1058337-2-haoluo@google.com Signed-off-by: Alexei Starovoitov --- Documentation/bpf/kfuncs.rst | 6 ++++++ 1 file changed, 6 insertions(+) (limited to 'Documentation/bpf') diff --git a/Documentation/bpf/kfuncs.rst b/Documentation/bpf/kfuncs.rst index c0b7dae6dbf5..c8b21de1c772 100644 --- a/Documentation/bpf/kfuncs.rst +++ b/Documentation/bpf/kfuncs.rst @@ -146,6 +146,12 @@ that operate (change some property, perform some operation) on an object that was obtained using an acquire kfunc. Such kfuncs need an unchanged pointer to ensure the integrity of the operation being performed on the expected object. +2.4.6 KF_SLEEPABLE flag +----------------------- + +The KF_SLEEPABLE flag is used for kfuncs that may sleep. Such kfuncs can only +be called by sleepable BPF programs (BPF_F_SLEEPABLE). + 2.5 Registering the kfuncs -------------------------- -- cgit From 4dd48c6f1f83290d4bc61b43e61d86f8bc6c310e Mon Sep 17 00:00:00 2001 From: Artem Savkov Date: Wed, 10 Aug 2022 08:59:03 +0200 Subject: bpf: add destructive kfunc flag Add KF_DESTRUCTIVE flag for destructive functions. Functions with this flag set will require CAP_SYS_BOOT capabilities. Signed-off-by: Artem Savkov Link: https://lore.kernel.org/r/20220810065905.475418-2-asavkov@redhat.com Signed-off-by: Alexei Starovoitov --- Documentation/bpf/kfuncs.rst | 9 +++++++++ 1 file changed, 9 insertions(+) (limited to 'Documentation/bpf') diff --git a/Documentation/bpf/kfuncs.rst b/Documentation/bpf/kfuncs.rst index c8b21de1c772..781731749e55 100644 --- a/Documentation/bpf/kfuncs.rst +++ b/Documentation/bpf/kfuncs.rst @@ -152,6 +152,15 @@ ensure the integrity of the operation being performed on the expected object. The KF_SLEEPABLE flag is used for kfuncs that may sleep. Such kfuncs can only be called by sleepable BPF programs (BPF_F_SLEEPABLE). +2.4.7 KF_DESTRUCTIVE flag +-------------------------- + +The KF_DESTRUCTIVE flag is used to indicate functions calling which is +destructive to the system. For example such a call can result in system +rebooting or panicking. Due to this additional restrictions apply to these +calls. At the moment they only require CAP_SYS_BOOT capability, but more can be +added later. + 2.5 Registering the kfuncs -------------------------- -- cgit From eed807f626101f6a4227bd53942892c5983b95a7 Mon Sep 17 00:00:00 2001 From: Kumar Kartikeya Dwivedi Date: Wed, 21 Sep 2022 18:48:25 +0200 Subject: bpf: Tweak definition of KF_TRUSTED_ARGS Instead of forcing all arguments to be referenced pointers with non-zero reg->ref_obj_id, tweak the definition of KF_TRUSTED_ARGS to mean that only PTR_TO_BTF_ID (and socket types translated to PTR_TO_BTF_ID) have that constraint, and require their offset to be set to 0. The rest of pointer types are also accomodated in this definition of trusted pointers, but with more relaxed rules regarding offsets. The inherent meaning of setting this flag is that all kfunc pointer arguments have a guranteed lifetime, and kernel object pointers (PTR_TO_BTF_ID, PTR_TO_CTX) are passed in their unmodified form (with offset 0). In general, this is not true for PTR_TO_BTF_ID as it can be obtained using pointer walks. Signed-off-by: Kumar Kartikeya Dwivedi Signed-off-by: Lorenzo Bianconi Link: https://lore.kernel.org/r/cdede0043c47ed7a357f0a915d16f9ce06a1d589.1663778601.git.lorenzo@kernel.org Signed-off-by: Alexei Starovoitov --- Documentation/bpf/kfuncs.rst | 24 ++++++++++++++++-------- 1 file changed, 16 insertions(+), 8 deletions(-) (limited to 'Documentation/bpf') diff --git a/Documentation/bpf/kfuncs.rst b/Documentation/bpf/kfuncs.rst index 781731749e55..0f858156371d 100644 --- a/Documentation/bpf/kfuncs.rst +++ b/Documentation/bpf/kfuncs.rst @@ -137,14 +137,22 @@ KF_ACQUIRE and KF_RET_NULL flags. -------------------------- The KF_TRUSTED_ARGS flag is used for kfuncs taking pointer arguments. It -indicates that the all pointer arguments will always be refcounted, and have -their offset set to 0. It can be used to enforce that a pointer to a refcounted -object acquired from a kfunc or BPF helper is passed as an argument to this -kfunc without any modifications (e.g. pointer arithmetic) such that it is -trusted and points to the original object. This flag is often used for kfuncs -that operate (change some property, perform some operation) on an object that -was obtained using an acquire kfunc. Such kfuncs need an unchanged pointer to -ensure the integrity of the operation being performed on the expected object. +indicates that the all pointer arguments will always have a guaranteed lifetime, +and pointers to kernel objects are always passed to helpers in their unmodified +form (as obtained from acquire kfuncs). + +It can be used to enforce that a pointer to a refcounted object acquired from a +kfunc or BPF helper is passed as an argument to this kfunc without any +modifications (e.g. pointer arithmetic) such that it is trusted and points to +the original object. + +Meanwhile, it is also allowed pass pointers to normal memory to such kfuncs, +but those can have a non-zero offset. + +This flag is often used for kfuncs that operate (change some property, perform +some operation) on an object that was obtained using an acquire kfunc. Such +kfuncs need an unchanged pointer to ensure the integrity of the operation being +performed on the expected object. 2.4.6 KF_SLEEPABLE flag ----------------------- -- cgit From 6166da0a02cde26c065692d0c05eb685178fee75 Mon Sep 17 00:00:00 2001 From: Dave Thaler Date: Tue, 27 Sep 2022 18:59:44 +0000 Subject: bpf, docs: Move legacy packet instructions to a separate file Move legacy packet instructions to a separate file. Signed-off-by: Dave Thaler Link: https://lore.kernel.org/r/20220927185958.14995-1-dthaler1968@googlemail.com Signed-off-by: Alexei Starovoitov --- Documentation/bpf/instruction-set.rst | 38 ++------------------ Documentation/bpf/linux-notes.rst | 65 +++++++++++++++++++++++++++++++++++ 2 files changed, 68 insertions(+), 35 deletions(-) create mode 100644 Documentation/bpf/linux-notes.rst (limited to 'Documentation/bpf') diff --git a/Documentation/bpf/instruction-set.rst b/Documentation/bpf/instruction-set.rst index 1b0e6711dec9..352f25a1eb17 100644 --- a/Documentation/bpf/instruction-set.rst +++ b/Documentation/bpf/instruction-set.rst @@ -282,8 +282,6 @@ arithmetic operations in the imm field to encode the atomic operation: *(u64 *)(dst_reg + off16) += src_reg -``BPF_XADD`` is a deprecated name for ``BPF_ATOMIC | BPF_ADD``. - In addition to the simple atomic operations, there also is a modifier and two complex atomic operations: @@ -331,36 +329,6 @@ There is currently only one such instruction. Legacy BPF Packet access instructions ------------------------------------- -eBPF has special instructions for access to packet data that have been -carried over from classic BPF to retain the performance of legacy socket -filters running in the eBPF interpreter. - -The instructions come in two forms: ``BPF_ABS | | BPF_LD`` and -``BPF_IND | | BPF_LD``. - -These instructions are used to access packet data and can only be used when -the program context is a pointer to networking packet. ``BPF_ABS`` -accesses packet data at an absolute offset specified by the immediate data -and ``BPF_IND`` access packet data at an offset that includes the value of -a register in addition to the immediate data. - -These instructions have seven implicit operands: - - * Register R6 is an implicit input that must contain pointer to a - struct sk_buff. - * Register R0 is an implicit output which contains the data fetched from - the packet. - * Registers R1-R5 are scratch registers that are clobbered after a call to - ``BPF_ABS | BPF_LD`` or ``BPF_IND | BPF_LD`` instructions. - -These instructions have an implicit program exit condition as well. When an -eBPF program is trying to access the data beyond the packet boundary, the -program execution will be aborted. - -``BPF_ABS | BPF_W | BPF_LD`` means:: - - R0 = ntohl(*(u32 *) (((struct sk_buff *) R6)->data + imm32)) - -``BPF_IND | BPF_W | BPF_LD`` means:: - - R0 = ntohl(*(u32 *) (((struct sk_buff *) R6)->data + src_reg + imm32)) +eBPF previously introduced special instructions for access to packet data that were +carried over from classic BPF. However, these instructions are +deprecated and should no longer be used. diff --git a/Documentation/bpf/linux-notes.rst b/Documentation/bpf/linux-notes.rst new file mode 100644 index 000000000000..93c01386d92c --- /dev/null +++ b/Documentation/bpf/linux-notes.rst @@ -0,0 +1,65 @@ +.. contents:: +.. sectnum:: + +========================== +Linux implementation notes +========================== + +This document provides more details specific to the Linux kernel implementation of the eBPF instruction set. + +Legacy BPF Packet access instructions +===================================== + +As mentioned in the `ISA standard documentation `_, +Linux has special eBPF instructions for access to packet data that have been +carried over from classic BPF to retain the performance of legacy socket +filters running in the eBPF interpreter. + +The instructions come in two forms: ``BPF_ABS | | BPF_LD`` and +``BPF_IND | | BPF_LD``. + +These instructions are used to access packet data and can only be used when +the program context is a pointer to a networking packet. ``BPF_ABS`` +accesses packet data at an absolute offset specified by the immediate data +and ``BPF_IND`` access packet data at an offset that includes the value of +a register in addition to the immediate data. + +These instructions have seven implicit operands: + +* Register R6 is an implicit input that must contain a pointer to a + struct sk_buff. +* Register R0 is an implicit output which contains the data fetched from + the packet. +* Registers R1-R5 are scratch registers that are clobbered by the + instruction. + +These instructions have an implicit program exit condition as well. If an +eBPF program attempts access data beyond the packet boundary, the +program execution will be aborted. + +``BPF_ABS | BPF_W | BPF_LD`` (0x20) means:: + + R0 = ntohl(*(u32 *) ((struct sk_buff *) R6->data + imm)) + +where ``ntohl()`` converts a 32-bit value from network byte order to host byte order. + +``BPF_IND | BPF_W | BPF_LD`` (0x40) means:: + + R0 = ntohl(*(u32 *) ((struct sk_buff *) R6->data + src + imm)) + +Appendix +======== + +For reference, the following table lists legacy Linux-specific opcodes in order by value. + +====== ==== =================================================== ============= +opcode imm description reference +====== ==== =================================================== ============= +0x20 any dst = ntohl(\*(uint32_t \*)(R6->data + imm)) `Legacy BPF Packet access instructions`_ +0x28 any dst = ntohs(\*(uint16_t \*)(R6->data + imm)) `Legacy BPF Packet access instructions`_ +0x30 any dst = (\*(uint8_t \*)(R6->data + imm)) `Legacy BPF Packet access instructions`_ +0x38 any dst = ntohll(\*(uint64_t \*)(R6->data + imm)) `Legacy BPF Packet access instructions`_ +0x40 any dst = ntohl(\*(uint32_t \*)(R6->data + src + imm)) `Legacy BPF Packet access instructions`_ +0x48 any dst = ntohs(\*(uint16_t \*)(R6->data + src + imm)) `Legacy BPF Packet access instructions`_ +0x50 any dst = \*(uint8_t \*)(R6->data + src + imm)) `Legacy BPF Packet access instructions`_ +0x58 any dst = ntohll(\*(uint64_t \*)(R6->data + src + imm)) `Legacy BPF Packet access instructions`_ -- cgit From 9a0bf21337c667375d918adc41239ce54304a12c Mon Sep 17 00:00:00 2001 From: Dave Thaler Date: Tue, 27 Sep 2022 18:59:45 +0000 Subject: bpf, docs: Linux byteswap note Add Linux byteswap note. Signed-off-by: Dave Thaler Link: https://lore.kernel.org/r/20220927185958.14995-2-dthaler1968@googlemail.com Signed-off-by: Alexei Starovoitov --- Documentation/bpf/instruction-set.rst | 4 ---- Documentation/bpf/linux-notes.rst | 5 +++++ 2 files changed, 5 insertions(+), 4 deletions(-) (limited to 'Documentation/bpf') diff --git a/Documentation/bpf/instruction-set.rst b/Documentation/bpf/instruction-set.rst index 352f25a1eb17..1735b91ec4c7 100644 --- a/Documentation/bpf/instruction-set.rst +++ b/Documentation/bpf/instruction-set.rst @@ -156,10 +156,6 @@ Examples: dst_reg = htobe64(dst_reg) -``BPF_FROM_LE`` and ``BPF_FROM_BE`` exist as aliases for ``BPF_TO_LE`` and -``BPF_TO_BE`` respectively. - - Jump instructions ----------------- diff --git a/Documentation/bpf/linux-notes.rst b/Documentation/bpf/linux-notes.rst index 93c01386d92c..1c31379b469f 100644 --- a/Documentation/bpf/linux-notes.rst +++ b/Documentation/bpf/linux-notes.rst @@ -7,6 +7,11 @@ Linux implementation notes This document provides more details specific to the Linux kernel implementation of the eBPF instruction set. +Byte swap instructions +====================== + +``BPF_FROM_LE`` and ``BPF_FROM_BE`` exist as aliases for ``BPF_TO_LE`` and ``BPF_TO_BE`` respectively. + Legacy BPF Packet access instructions ===================================== -- cgit From 6c7aaffb24efbd5d1ae067b2b629b3ffcc37e18e Mon Sep 17 00:00:00 2001 From: Dave Thaler Date: Tue, 27 Sep 2022 18:59:46 +0000 Subject: bpf, docs: Move Clang notes to a separate file Move Clang notes to a separate file. Signed-off-by: Dave Thaler Link: https://lore.kernel.org/r/20220927185958.14995-3-dthaler1968@googlemail.com Signed-off-by: Alexei Starovoitov --- Documentation/bpf/clang-notes.rst | 24 ++++++++++++++++++++++++ Documentation/bpf/instruction-set.rst | 6 ------ 2 files changed, 24 insertions(+), 6 deletions(-) create mode 100644 Documentation/bpf/clang-notes.rst (limited to 'Documentation/bpf') diff --git a/Documentation/bpf/clang-notes.rst b/Documentation/bpf/clang-notes.rst new file mode 100644 index 000000000000..b15179cb5117 --- /dev/null +++ b/Documentation/bpf/clang-notes.rst @@ -0,0 +1,24 @@ +.. contents:: +.. sectnum:: + +========================== +Clang implementation notes +========================== + +This document provides more details specific to the Clang/LLVM implementation of the eBPF instruction set. + +Versions +======== + +Clang defined "CPU" versions, where a CPU version of 3 corresponds to the current eBPF ISA. + +Clang can select the eBPF ISA version using ``-mcpu=v3`` for example to select version 3. + +Atomic operations +================= + +Clang can generate atomic instructions by default when ``-mcpu=v3`` is +enabled. If a lower version for ``-mcpu`` is set, the only atomic instruction +Clang can generate is ``BPF_ADD`` *without* ``BPF_FETCH``. If you need to enable +the atomics features, while keeping a lower ``-mcpu`` version, you can use +``-Xclang -target-feature -Xclang +alu32``. diff --git a/Documentation/bpf/instruction-set.rst b/Documentation/bpf/instruction-set.rst index 1735b91ec4c7..541483118f65 100644 --- a/Documentation/bpf/instruction-set.rst +++ b/Documentation/bpf/instruction-set.rst @@ -303,12 +303,6 @@ The ``BPF_CMPXCHG`` operation atomically compares the value addressed by value that was at ``dst_reg + off`` before the operation is zero-extended and loaded back to ``R0``. -Clang can generate atomic instructions by default when ``-mcpu=v3`` is -enabled. If a lower version for ``-mcpu`` is set, the only atomic instruction -Clang can generate is ``BPF_ADD`` *without* ``BPF_FETCH``. If you need to enable -the atomics features, while keeping a lower ``-mcpu`` version, you can use -``-Xclang -target-feature -Xclang +alu32``. - 64-bit immediate instructions ----------------------------- -- cgit From ee159bdbdbce293e66d7b9249208f367faff5d81 Mon Sep 17 00:00:00 2001 From: Dave Thaler Date: Tue, 27 Sep 2022 18:59:47 +0000 Subject: bpf, docs: Add Clang note about BPF_ALU Add Clang note about BPF_ALU. Signed-off-by: Dave Thaler Link: https://lore.kernel.org/r/20220927185958.14995-4-dthaler1968@googlemail.com Signed-off-by: Alexei Starovoitov --- Documentation/bpf/clang-notes.rst | 6 ++++++ 1 file changed, 6 insertions(+) (limited to 'Documentation/bpf') diff --git a/Documentation/bpf/clang-notes.rst b/Documentation/bpf/clang-notes.rst index b15179cb5117..528feddf2db9 100644 --- a/Documentation/bpf/clang-notes.rst +++ b/Documentation/bpf/clang-notes.rst @@ -14,6 +14,12 @@ Clang defined "CPU" versions, where a CPU version of 3 corresponds to the curren Clang can select the eBPF ISA version using ``-mcpu=v3`` for example to select version 3. +Arithmetic instructions +======================= + +For CPU versions prior to 3, Clang v7.0 and later can enable ``BPF_ALU`` support with +``-Xclang -target-feature -Xclang +alu32``. In CPU version 3, support is automatically included. + Atomic operations ================= -- cgit From 5a8921ba96ceaec0c00c8855e48940d2739c5c3b Mon Sep 17 00:00:00 2001 From: Dave Thaler Date: Tue, 27 Sep 2022 18:59:48 +0000 Subject: bpf, docs: Add TOC and fix formatting. Add TOC and fix formatting. Signed-off-by: Dave Thaler Link: https://lore.kernel.org/r/20220927185958.14995-5-dthaler1968@googlemail.com Signed-off-by: Alexei Starovoitov --- Documentation/bpf/instruction-set.rst | 268 +++++++++++++++++----------------- 1 file changed, 136 insertions(+), 132 deletions(-) (limited to 'Documentation/bpf') diff --git a/Documentation/bpf/instruction-set.rst b/Documentation/bpf/instruction-set.rst index 541483118f65..4997d2088fef 100644 --- a/Documentation/bpf/instruction-set.rst +++ b/Documentation/bpf/instruction-set.rst @@ -1,7 +1,12 @@ +.. contents:: +.. sectnum:: + +======================================== +eBPF Instruction Set Specification, v1.0 +======================================== + +This document specifies version 1.0 of the eBPF instruction set. -==================== -eBPF Instruction Set -==================== Registers and calling convention ================================ @@ -11,10 +16,10 @@ all of which are 64-bits wide. The eBPF calling convention is defined as: - * R0: return value from function calls, and exit value for eBPF programs - * R1 - R5: arguments for function calls - * R6 - R9: callee saved registers that function calls will preserve - * R10: read-only frame pointer to access stack +* R0: return value from function calls, and exit value for eBPF programs +* R1 - R5: arguments for function calls +* R6 - R9: callee saved registers that function calls will preserve +* R10: read-only frame pointer to access stack R0 - R5 are scratch registers and eBPF programs needs to spill/fill them if necessary across calls. @@ -24,17 +29,17 @@ Instruction encoding eBPF has two instruction encodings: - * the basic instruction encoding, which uses 64 bits to encode an instruction - * the wide instruction encoding, which appends a second 64-bit immediate value - (imm64) after the basic instruction for a total of 128 bits. +* the basic instruction encoding, which uses 64 bits to encode an instruction +* the wide instruction encoding, which appends a second 64-bit immediate value + (imm64) after the basic instruction for a total of 128 bits. The basic instruction encoding looks as follows: - ============= ======= =============== ==================== ============ - 32 bits (MSB) 16 bits 4 bits 4 bits 8 bits (LSB) - ============= ======= =============== ==================== ============ - immediate offset source register destination register opcode - ============= ======= =============== ==================== ============ +============= ======= =============== ==================== ============ +32 bits (MSB) 16 bits 4 bits 4 bits 8 bits (LSB) +============= ======= =============== ==================== ============ +immediate offset source register destination register opcode +============= ======= =============== ==================== ============ Note that most instructions do not use all of the fields. Unused fields shall be cleared to zero. @@ -44,30 +49,30 @@ Instruction classes The three LSB bits of the 'opcode' field store the instruction class: - ========= ===== =============================== - class value description - ========= ===== =============================== - BPF_LD 0x00 non-standard load operations - BPF_LDX 0x01 load into register operations - BPF_ST 0x02 store from immediate operations - BPF_STX 0x03 store from register operations - BPF_ALU 0x04 32-bit arithmetic operations - BPF_JMP 0x05 64-bit jump operations - BPF_JMP32 0x06 32-bit jump operations - BPF_ALU64 0x07 64-bit arithmetic operations - ========= ===== =============================== +========= ===== =============================== =================================== +class value description reference +========= ===== =============================== =================================== +BPF_LD 0x00 non-standard load operations `Load and store instructions`_ +BPF_LDX 0x01 load into register operations `Load and store instructions`_ +BPF_ST 0x02 store from immediate operations `Load and store instructions`_ +BPF_STX 0x03 store from register operations `Load and store instructions`_ +BPF_ALU 0x04 32-bit arithmetic operations `Arithmetic and jump instructions`_ +BPF_JMP 0x05 64-bit jump operations `Arithmetic and jump instructions`_ +BPF_JMP32 0x06 32-bit jump operations `Arithmetic and jump instructions`_ +BPF_ALU64 0x07 64-bit arithmetic operations `Arithmetic and jump instructions`_ +========= ===== =============================== =================================== Arithmetic and jump instructions ================================ -For arithmetic and jump instructions (BPF_ALU, BPF_ALU64, BPF_JMP and -BPF_JMP32), the 8-bit 'opcode' field is divided into three parts: +For arithmetic and jump instructions (``BPF_ALU``, ``BPF_ALU64``, ``BPF_JMP`` and +``BPF_JMP32``), the 8-bit 'opcode' field is divided into three parts: - ============== ====== ================= - 4 bits (MSB) 1 bit 3 bits (LSB) - ============== ====== ================= - operation code source instruction class - ============== ====== ================= +============== ====== ================= +4 bits (MSB) 1 bit 3 bits (LSB) +============== ====== ================= +operation code source instruction class +============== ====== ================= The 4th bit encodes the source operand: @@ -84,51 +89,51 @@ The four MSB bits store the operation code. Arithmetic instructions ----------------------- -BPF_ALU uses 32-bit wide operands while BPF_ALU64 uses 64-bit wide operands for +``BPF_ALU`` uses 32-bit wide operands while ``BPF_ALU64`` uses 64-bit wide operands for otherwise identical operations. -The code field encodes the operation as below: - - ======== ===== ================================================= - code value description - ======== ===== ================================================= - BPF_ADD 0x00 dst += src - BPF_SUB 0x10 dst -= src - BPF_MUL 0x20 dst \*= src - BPF_DIV 0x30 dst /= src - BPF_OR 0x40 dst \|= src - BPF_AND 0x50 dst &= src - BPF_LSH 0x60 dst <<= src - BPF_RSH 0x70 dst >>= src - BPF_NEG 0x80 dst = ~src - BPF_MOD 0x90 dst %= src - BPF_XOR 0xa0 dst ^= src - BPF_MOV 0xb0 dst = src - BPF_ARSH 0xc0 sign extending shift right - BPF_END 0xd0 byte swap operations (see separate section below) - ======== ===== ================================================= - -BPF_ADD | BPF_X | BPF_ALU means:: +The 'code' field encodes the operation as below: + +======== ===== ========================================================== +code value description +======== ===== ========================================================== +BPF_ADD 0x00 dst += src +BPF_SUB 0x10 dst -= src +BPF_MUL 0x20 dst \*= src +BPF_DIV 0x30 dst /= src +BPF_OR 0x40 dst \|= src +BPF_AND 0x50 dst &= src +BPF_LSH 0x60 dst <<= src +BPF_RSH 0x70 dst >>= src +BPF_NEG 0x80 dst = ~src +BPF_MOD 0x90 dst %= src +BPF_XOR 0xa0 dst ^= src +BPF_MOV 0xb0 dst = src +BPF_ARSH 0xc0 sign extending shift right +BPF_END 0xd0 byte swap operations (see `Byte swap instructions`_ below) +======== ===== ========================================================== + +``BPF_ADD | BPF_X | BPF_ALU`` means:: dst_reg = (u32) dst_reg + (u32) src_reg; -BPF_ADD | BPF_X | BPF_ALU64 means:: +``BPF_ADD | BPF_X | BPF_ALU64`` means:: dst_reg = dst_reg + src_reg -BPF_XOR | BPF_K | BPF_ALU means:: +``BPF_XOR | BPF_K | BPF_ALU`` means:: src_reg = (u32) src_reg ^ (u32) imm32 -BPF_XOR | BPF_K | BPF_ALU64 means:: +``BPF_XOR | BPF_K | BPF_ALU64`` means:: src_reg = src_reg ^ imm32 Byte swap instructions ----------------------- +~~~~~~~~~~~~~~~~~~~~~~ The byte swap instructions use an instruction class of ``BPF_ALU`` and a 4-bit -code field of ``BPF_END``. +'code' field of ``BPF_END``. The byte swap instructions operate on the destination register only and do not use a separate source register or immediate value. @@ -136,14 +141,14 @@ only and do not use a separate source register or immediate value. The 1-bit source operand field in the opcode is used to to select what byte order the operation convert from or to: - ========= ===== ================================================= - source value description - ========= ===== ================================================= - BPF_TO_LE 0x00 convert between host byte order and little endian - BPF_TO_BE 0x08 convert between host byte order and big endian - ========= ===== ================================================= +========= ===== ================================================= +source value description +========= ===== ================================================= +BPF_TO_LE 0x00 convert between host byte order and little endian +BPF_TO_BE 0x08 convert between host byte order and big endian +========= ===== ================================================= -The imm field encodes the width of the swap operations. The following widths +The 'imm' field encodes the width of the swap operations. The following widths are supported: 16, 32 and 64. Examples: @@ -159,28 +164,28 @@ Examples: Jump instructions ----------------- -BPF_JMP32 uses 32-bit wide operands while BPF_JMP uses 64-bit wide operands for +``BPF_JMP32`` uses 32-bit wide operands while ``BPF_JMP`` uses 64-bit wide operands for otherwise identical operations. -The code field encodes the operation as below: - - ======== ===== ========================= ============ - code value description notes - ======== ===== ========================= ============ - BPF_JA 0x00 PC += off BPF_JMP only - BPF_JEQ 0x10 PC += off if dst == src - BPF_JGT 0x20 PC += off if dst > src unsigned - BPF_JGE 0x30 PC += off if dst >= src unsigned - BPF_JSET 0x40 PC += off if dst & src - BPF_JNE 0x50 PC += off if dst != src - BPF_JSGT 0x60 PC += off if dst > src signed - BPF_JSGE 0x70 PC += off if dst >= src signed - BPF_CALL 0x80 function call - BPF_EXIT 0x90 function / program return BPF_JMP only - BPF_JLT 0xa0 PC += off if dst < src unsigned - BPF_JLE 0xb0 PC += off if dst <= src unsigned - BPF_JSLT 0xc0 PC += off if dst < src signed - BPF_JSLE 0xd0 PC += off if dst <= src signed - ======== ===== ========================= ============ +The 'code' field encodes the operation as below: + +======== ===== ========================= ============ +code value description notes +======== ===== ========================= ============ +BPF_JA 0x00 PC += off BPF_JMP only +BPF_JEQ 0x10 PC += off if dst == src +BPF_JGT 0x20 PC += off if dst > src unsigned +BPF_JGE 0x30 PC += off if dst >= src unsigned +BPF_JSET 0x40 PC += off if dst & src +BPF_JNE 0x50 PC += off if dst != src +BPF_JSGT 0x60 PC += off if dst > src signed +BPF_JSGE 0x70 PC += off if dst >= src signed +BPF_CALL 0x80 function call +BPF_EXIT 0x90 function / program return BPF_JMP only +BPF_JLT 0xa0 PC += off if dst < src unsigned +BPF_JLE 0xb0 PC += off if dst <= src unsigned +BPF_JSLT 0xc0 PC += off if dst < src signed +BPF_JSLE 0xd0 PC += off if dst <= src signed +======== ===== ========================= ============ The eBPF program needs to store the return value into register R0 before doing a BPF_EXIT. @@ -189,14 +194,26 @@ BPF_EXIT. Load and store instructions =========================== -For load and store instructions (BPF_LD, BPF_LDX, BPF_ST and BPF_STX), the +For load and store instructions (``BPF_LD``, ``BPF_LDX``, ``BPF_ST``, and ``BPF_STX``), the 8-bit 'opcode' field is divided as: - ============ ====== ================= - 3 bits (MSB) 2 bits 3 bits (LSB) - ============ ====== ================= - mode size instruction class - ============ ====== ================= +============ ====== ================= +3 bits (MSB) 2 bits 3 bits (LSB) +============ ====== ================= +mode size instruction class +============ ====== ================= + +The mode modifier is one of: + + ============= ===== ==================================== ============= + mode modifier value description reference + ============= ===== ==================================== ============= + BPF_IMM 0x00 64-bit immediate instructions `64-bit immediate instructions`_ + BPF_ABS 0x20 legacy BPF packet access (absolute) `Legacy BPF Packet access instructions`_ + BPF_IND 0x40 legacy BPF packet access (indirect) `Legacy BPF Packet access instructions`_ + BPF_MEM 0x60 regular load and store operations `Regular load and store operations`_ + BPF_ATOMIC 0xc0 atomic operations `Atomic operations`_ + ============= ===== ==================================== ============= The size modifier is one of: @@ -209,19 +226,6 @@ The size modifier is one of: BPF_DW 0x18 double word (8 bytes) ============= ===== ===================== -The mode modifier is one of: - - ============= ===== ==================================== - mode modifier value description - ============= ===== ==================================== - BPF_IMM 0x00 64-bit immediate instructions - BPF_ABS 0x20 legacy BPF packet access (absolute) - BPF_IND 0x40 legacy BPF packet access (indirect) - BPF_MEM 0x60 regular load and store operations - BPF_ATOMIC 0xc0 atomic operations - ============= ===== ==================================== - - Regular load and store operations --------------------------------- @@ -252,42 +256,42 @@ by other eBPF programs or means outside of this specification. All atomic operations supported by eBPF are encoded as store operations that use the ``BPF_ATOMIC`` mode modifier as follows: - * ``BPF_ATOMIC | BPF_W | BPF_STX`` for 32-bit operations - * ``BPF_ATOMIC | BPF_DW | BPF_STX`` for 64-bit operations - * 8-bit and 16-bit wide atomic operations are not supported. +* ``BPF_ATOMIC | BPF_W | BPF_STX`` for 32-bit operations +* ``BPF_ATOMIC | BPF_DW | BPF_STX`` for 64-bit operations +* 8-bit and 16-bit wide atomic operations are not supported. -The imm field is used to encode the actual atomic operation. +The 'imm' field is used to encode the actual atomic operation. Simple atomic operation use a subset of the values defined to encode -arithmetic operations in the imm field to encode the atomic operation: +arithmetic operations in the 'imm' field to encode the atomic operation: - ======== ===== =========== - imm value description - ======== ===== =========== - BPF_ADD 0x00 atomic add - BPF_OR 0x40 atomic or - BPF_AND 0x50 atomic and - BPF_XOR 0xa0 atomic xor - ======== ===== =========== +======== ===== =========== +imm value description +======== ===== =========== +BPF_ADD 0x00 atomic add +BPF_OR 0x40 atomic or +BPF_AND 0x50 atomic and +BPF_XOR 0xa0 atomic xor +======== ===== =========== -``BPF_ATOMIC | BPF_W | BPF_STX`` with imm = BPF_ADD means:: +``BPF_ATOMIC | BPF_W | BPF_STX`` with 'imm' = BPF_ADD means:: *(u32 *)(dst_reg + off16) += src_reg -``BPF_ATOMIC | BPF_DW | BPF_STX`` with imm = BPF ADD means:: +``BPF_ATOMIC | BPF_DW | BPF_STX`` with 'imm' = BPF ADD means:: *(u64 *)(dst_reg + off16) += src_reg In addition to the simple atomic operations, there also is a modifier and two complex atomic operations: - =========== ================ =========================== - imm value description - =========== ================ =========================== - BPF_FETCH 0x01 modifier: return old value - BPF_XCHG 0xe0 | BPF_FETCH atomic exchange - BPF_CMPXCHG 0xf0 | BPF_FETCH atomic compare and exchange - =========== ================ =========================== +=========== ================ =========================== +imm value description +=========== ================ =========================== +BPF_FETCH 0x01 modifier: return old value +BPF_XCHG 0xe0 | BPF_FETCH atomic exchange +BPF_CMPXCHG 0xf0 | BPF_FETCH atomic compare and exchange +=========== ================ =========================== The ``BPF_FETCH`` modifier is optional for simple atomic operations, and always set for the complex atomic operations. If the ``BPF_FETCH`` flag @@ -306,7 +310,7 @@ and loaded back to ``R0``. 64-bit immediate instructions ----------------------------- -Instructions with the ``BPF_IMM`` mode modifier use the wide instruction +Instructions with the ``BPF_IMM`` 'mode' modifier use the wide instruction encoding for an extra imm64 value. There is currently only one such instruction. -- cgit From b502a6fb46d275aa978c1e0655bada2cafc81fea Mon Sep 17 00:00:00 2001 From: Alexei Starovoitov Date: Sat, 1 Oct 2022 08:49:45 -0700 Subject: bpf, docs: Delete misformatted table. Delete misformatted table. Fixes: 6166da0a02cd ("bpf, docs: Move legacy packet instructions to a separate file") Signed-off-by: Alexei Starovoitov --- Documentation/bpf/linux-notes.rst | 17 ----------------- 1 file changed, 17 deletions(-) (limited to 'Documentation/bpf') diff --git a/Documentation/bpf/linux-notes.rst b/Documentation/bpf/linux-notes.rst index 1c31379b469f..956b0c86699d 100644 --- a/Documentation/bpf/linux-notes.rst +++ b/Documentation/bpf/linux-notes.rst @@ -51,20 +51,3 @@ where ``ntohl()`` converts a 32-bit value from network byte order to host byte o ``BPF_IND | BPF_W | BPF_LD`` (0x40) means:: R0 = ntohl(*(u32 *) ((struct sk_buff *) R6->data + src + imm)) - -Appendix -======== - -For reference, the following table lists legacy Linux-specific opcodes in order by value. - -====== ==== =================================================== ============= -opcode imm description reference -====== ==== =================================================== ============= -0x20 any dst = ntohl(\*(uint32_t \*)(R6->data + imm)) `Legacy BPF Packet access instructions`_ -0x28 any dst = ntohs(\*(uint16_t \*)(R6->data + imm)) `Legacy BPF Packet access instructions`_ -0x30 any dst = (\*(uint8_t \*)(R6->data + imm)) `Legacy BPF Packet access instructions`_ -0x38 any dst = ntohll(\*(uint64_t \*)(R6->data + imm)) `Legacy BPF Packet access instructions`_ -0x40 any dst = ntohl(\*(uint32_t \*)(R6->data + src + imm)) `Legacy BPF Packet access instructions`_ -0x48 any dst = ntohs(\*(uint16_t \*)(R6->data + src + imm)) `Legacy BPF Packet access instructions`_ -0x50 any dst = \*(uint8_t \*)(R6->data + src + imm)) `Legacy BPF Packet access instructions`_ -0x58 any dst = ntohll(\*(uint64_t \*)(R6->data + src + imm)) `Legacy BPF Packet access instructions`_ -- cgit From 736baae643cb1ccaeaf989ef8eb09ce085020479 Mon Sep 17 00:00:00 2001 From: Bagas Sanjaya Date: Sun, 2 Oct 2022 10:20:23 +0700 Subject: Documentation: bpf: Add implementation notes documentations to table of contents Sphinx reported warnings on missing implementation notes documentations in the table of contents: Documentation/bpf/clang-notes.rst: WARNING: document isn't included in any toctree Documentation/bpf/linux-notes.rst: WARNING: document isn't included in any toctree Add these documentations to the table of contents (index.rst) of BPF documentation to fix the warnings. Link: https://lore.kernel.org/linux-doc/202210020749.yfgDZbRL-lkp@intel.com/ Fixes: 6c7aaffb24efbd ("bpf, docs: Move Clang notes to a separate file") Fixes: 6166da0a02cde2 ("bpf, docs: Move legacy packet instructions to a separate file") Reported-by: kernel test robot Signed-off-by: Bagas Sanjaya Link: https://lore.kernel.org/r/20221002032022.24693-1-bagasdotme@gmail.com Signed-off-by: Alexei Starovoitov --- Documentation/bpf/index.rst | 2 ++ 1 file changed, 2 insertions(+) (limited to 'Documentation/bpf') diff --git a/Documentation/bpf/index.rst b/Documentation/bpf/index.rst index 1bc2c5c58bdb..1b50de1983ee 100644 --- a/Documentation/bpf/index.rst +++ b/Documentation/bpf/index.rst @@ -26,6 +26,8 @@ that goes into great technical depth about the BPF Architecture. classic_vs_extended.rst bpf_licensing test_debug + clang-notes + linux-notes other .. only:: subproject and html -- cgit