Merge branch 'bpf-token'

Andrii Nakryiko says: ==================== BPF token This patch set is a combination of three BPF token-related patch sets ([0], [1], [2]) with fixes ([3]) to kernel-side token_fd passing APIs incorporated into relevant patches, bpf_token_capable() changes requested by Christian Brauner, and necessary libbpf and BPF selftests side adjustments. This patch set introduces an ability to delegate a subset of BPF subsystem functionality from privileged system-wide daemon (e.g., systemd or any other container manager) through special mount options for userns-bound BPF FS to a *trusted* unprivileged application. Trust is the key here. This functionality is not about allowing unconditional unprivileged BPF usage. Establishing trust, though, is completely up to the discretion of respective privileged application that would create and mount a BPF FS instance with delegation enabled, as different production setups can and do achieve it through a combination of different means (signing, LSM, code reviews, etc), and it's undesirable and infeasible for kernel to enforce any particular way of validating trustworthiness of particular process. The main motivation for this work is a desire to enable containerized BPF applications to be used together with user namespaces. This is currently impossible, as CAP_BPF, required for BPF subsystem usage, cannot be namespaced or sandboxed, as a general rule. E.g., tracing BPF programs, thanks to BPF helpers like bpf_probe_read_kernel() and bpf_probe_read_user() can safely read arbitrary memory, and it's impossible to ensure that they only read memory of processes belonging to any given namespace. This means that it's impossible to have a mechanically verifiable namespace-aware CAP_BPF capability, and as such another mechanism to allow safe usage of BPF functionality is necessary. BPF FS delegation mount options and BPF token derived from such BPF FS instance is such a mechanism. Kernel makes no assumption about what "trusted" constitutes in any particular case, and it's up to specific privileged applications and their surrounding infrastructure to decide that. What kernel provides is a set of APIs to setup and mount special BPF FS instance and derive BPF tokens from it. BPF FS and BPF token are both bound to its owning userns and in such a way are constrained inside intended container. Users can then pass BPF token FD to privileged bpf() syscall commands, like BPF map creation and BPF program loading, to perform such operations without having init userns privileges. This version incorporates feedback and suggestions ([4]) received on earlier iterations of BPF token approach, and instead of allowing to create BPF tokens directly assuming capable(CAP_SYS_ADMIN), we instead enhance BPF FS to accept a few new delegation mount options. If these options are used and BPF FS itself is properly created, set up, and mounted inside the user namespaced container, user application is able to derive a BPF token object from BPF FS instance, and pass that token to bpf() syscall. As explained in patch #3, BPF token itself doesn't grant access to BPF functionality, but instead allows kernel to do namespaced capabilities checks (ns_capable() vs capable()) for CAP_BPF, CAP_PERFMON, CAP_NET_ADMIN, and CAP_SYS_ADMIN, as applicable. So it forms one half of a puzzle and allows container managers and sys admins to have safe and flexible configuration options: determining which containers get delegation of BPF functionality through BPF FS, and then which applications within such containers are allowed to perform bpf() commands, based on namespaces capabilities. Previous attempt at addressing this very same problem ([5]) attempted to utilize authoritative LSM approach, but was conclusively rejected by upstream LSM maintainers. BPF token concept is not changing anything about LSM approach, but can be combined with LSM hooks for very fine-grained security policy. Some ideas about making BPF token more convenient to use with LSM (in particular custom BPF LSM programs) was briefly described in recent LSF/MM/BPF 2023 presentation ([6]). E.g., an ability to specify user-provided data (context), which in combination with BPF LSM would allow implementing a very dynamic and fine-granular custom security policies on top of BPF token. In the interest of minimizing API surface area and discussions this was relegated to follow up patches, as it's not essential to the fundamental concept of delegatable BPF token. It should be noted that BPF token is conceptually quite similar to the idea of /dev/bpf device file, proposed by Song a while ago ([7]). The biggest difference is the idea of using virtual anon_inode file to hold BPF token and allowing multiple independent instances of them, each (potentially) with its own set of restrictions. And also, crucially, BPF token approach is not using any special stateful task-scoped flags. Instead, bpf() syscall accepts token_fd parameters explicitly for each relevant BPF command. This addresses main concerns brought up during the /dev/bpf discussion, and fits better with overall BPF subsystem design. Second part of this patch set adds full support for BPF token in libbpf's BPF object high-level API. Good chunk of the changes rework libbpf feature detection internals, which are the most affected by BPF token presence. Besides internal refactorings, libbpf allows to pass location of BPF FS from which BPF token should be created by libbpf. This can be done explicitly though a new bpf_object_open_opts.bpf_token_path field. But we also add implicit BPF token creation logic to BPF object load step, even without any explicit involvement of the user. If the environment is setup properly, BPF token will be created transparently and used implicitly. This allows for all existing application to gain BPF token support by just linking with latest version of libbpf library. No source code modifications are required. All that under assumption that privileged container management agent properly set up default BPF FS instance at /sys/bpf/fs to allow BPF token creation. libbpf adds support to override default BPF FS location for BPF token creation through LIBBPF_BPF_TOKEN_PATH envvar knowledge. This allows admins or container managers to mount BPF token-enabled BPF FS at non-standard location without the need to coordinate with applications. LIBBPF_BPF_TOKEN_PATH can also be used to disable BPF token implicit creation by setting it to an empty value. [0] https://patchwork.kernel.org/project/netdevbpf/list/?series=805707&state=* [1] https://patchwork.kernel.org/project/netdevbpf/list/?series=810260&state=* [2] https://patchwork.kernel.org/project/netdevbpf/list/?series=809800&state=* [3] https://patchwork.kernel.org/project/netdevbpf/patch/20231219053150.336991-1-andrii@kernel.org/ [4] https://lore.kernel.org/bpf/20230704-hochverdient-lehne-eeb9eeef785e@brauner/ [5] https://lore.kernel.org/bpf/20230412043300.360803-1-andrii@kernel.org/ [6] http://vger.kernel.org/bpfconf2023_material/Trusted_unprivileged_BPF_LSFMM2023.pdf [7] https://lore.kernel.org/bpf/20190627201923.2589391-2-songliubraving@fb.com/ v1->v2: - disable BPF token creation in init userns, and simplify bpf_token_capable() logic (Christian); - use kzalloc/kfree instead of kvzalloc/kvfree (Linus); - few more selftest cases to validate LSM and BPF token interations. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> ==================== Link: https://lore.kernel.org/r/20240124022127.2379740-1-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
author: Andrii Nakryiko <andrii@kernel.org> 2024-01-24 11:42:58 -0800
committer: Alexei Starovoitov <ast@kernel.org> 2024-01-24 16:21:03 -0800
commit: c8632acf193beac64bbdaebef013368c480bf74f (patch)
tree: d9b85c5c1cc0518b3c7d98fbd814a4aa51b636d5 /security/security.c
parent: c9f115564561af63db662791e9a35fcf1dfefd2a (diff)
parent: 906ee42cb1be1152ef24465704cc89edc3f571c1 (diff)
1 files changed, 85 insertions, 16 deletions
diff --git a/security/security.c b/security/security.c
index 0144a98d3712..73e009e3d937 100644
--- a/security/security.c
+++ b/security/security.c
@@ -5410,29 +5410,87 @@ int security_bpf_prog(struct bpf_prog *prog)
 }
 
 /**
- * security_bpf_map_alloc() - Allocate a bpf map LSM blob
- * @map: bpf map
+ * security_bpf_map_create() - Check if BPF map creation is allowed
+ * @map: BPF map object
+ * @attr: BPF syscall attributes used to create BPF map
+ * @token: BPF token used to grant user access
+ *
+ * Do a check when the kernel creates a new BPF map. This is also the
+ * point where LSM blob is allocated for LSMs that need them.
+ *
+ * Return: Returns 0 on success, error on failure.
+ */
+int security_bpf_map_create(struct bpf_map *map, union bpf_attr *attr,
+			    struct bpf_token *token)
+{
+	return call_int_hook(bpf_map_create, 0, map, attr, token);
+}
+
+/**
+ * security_bpf_prog_load() - Check if loading of BPF program is allowed
+ * @prog: BPF program object
+ * @attr: BPF syscall attributes used to create BPF program
+ * @token: BPF token used to grant user access to BPF subsystem
  *
- * Initialize the security field inside bpf map.
+ * Perform an access control check when the kernel loads a BPF program and
+ * allocates associated BPF program object. This hook is also responsible for
+ * allocating any required LSM state for the BPF program.
  *
  * Return: Returns 0 on success, error on failure.
  */
-int security_bpf_map_alloc(struct bpf_map *map)
+int security_bpf_prog_load(struct bpf_prog *prog, union bpf_attr *attr,
+			   struct bpf_token *token)
 {
-	return call_int_hook(bpf_map_alloc_security, 0, map);
+	return call_int_hook(bpf_prog_load, 0, prog, attr, token);
 }
 
 /**
- * security_bpf_prog_alloc() - Allocate a bpf program LSM blob
- * @aux: bpf program aux info struct
+ * security_bpf_token_create() - Check if creating of BPF token is allowed
+ * @token: BPF token object
+ * @attr: BPF syscall attributes used to create BPF token
+ * @path: path pointing to BPF FS mount point from which BPF token is created
  *
- * Initialize the security field inside bpf program.
+ * Do a check when the kernel instantiates a new BPF token object from BPF FS
+ * instance. This is also the point where LSM blob can be allocated for LSMs.
  *
  * Return: Returns 0 on success, error on failure.
  */
-int security_bpf_prog_alloc(struct bpf_prog_aux *aux)
+int security_bpf_token_create(struct bpf_token *token, union bpf_attr *attr,
+			      struct path *path)
 {
-	return call_int_hook(bpf_prog_alloc_security, 0, aux);
+	return call_int_hook(bpf_token_create, 0, token, attr, path);
+}
+
+/**
+ * security_bpf_token_cmd() - Check if BPF token is allowed to delegate
+ * requested BPF syscall command
+ * @token: BPF token object
+ * @cmd: BPF syscall command requested to be delegated by BPF token
+ *
+ * Do a check when the kernel decides whether provided BPF token should allow
+ * delegation of requested BPF syscall command.
+ *
+ * Return: Returns 0 on success, error on failure.
+ */
+int security_bpf_token_cmd(const struct bpf_token *token, enum bpf_cmd cmd)
+{
+	return call_int_hook(bpf_token_cmd, 0, token, cmd);
+}
+
+/**
+ * security_bpf_token_capable() - Check if BPF token is allowed to delegate
+ * requested BPF-related capability
+ * @token: BPF token object
+ * @cap: capabilities requested to be delegated by BPF token
+ *
+ * Do a check when the kernel decides whether provided BPF token should allow
+ * delegation of requested BPF-related capabilities.
+ *
+ * Return: Returns 0 on success, error on failure.
+ */
+int security_bpf_token_capable(const struct bpf_token *token, int cap)
+{
+	return call_int_hook(bpf_token_capable, 0, token, cap);
 }
 
 /**
@@ -5443,18 +5501,29 @@ int security_bpf_prog_alloc(struct bpf_prog_aux *aux)
  */
 void security_bpf_map_free(struct bpf_map *map)
 {
-	call_void_hook(bpf_map_free_security, map);
+	call_void_hook(bpf_map_free, map);
+}
+
+/**
+ * security_bpf_prog_free() - Free a BPF program's LSM blob
+ * @prog: BPF program struct
+ *
+ * Clean up the security information stored inside BPF program.
+ */
+void security_bpf_prog_free(struct bpf_prog *prog)
+{
+	call_void_hook(bpf_prog_free, prog);
 }
 
 /**
- * security_bpf_prog_free() - Free a bpf program's LSM blob
- * @aux: bpf program aux info struct
+ * security_bpf_token_free() - Free a BPF token's LSM blob
+ * @token: BPF token struct
  *
- * Clean up the security information stored inside bpf prog.
+ * Clean up the security information stored inside BPF token.
  */
-void security_bpf_prog_free(struct bpf_prog_aux *aux)
+void security_bpf_token_free(struct bpf_token *token)
 {
-	call_void_hook(bpf_prog_free_security, aux);
+	call_void_hook(bpf_token_free, token);
 }
 #endif /* CONFIG_BPF_SYSCALL */
author	Andrii Nakryiko <andrii@kernel.org>	2024-01-24 11:42:58 -0800
committer	Alexei Starovoitov <ast@kernel.org>	2024-01-24 16:21:03 -0800
commit	c8632acf193beac64bbdaebef013368c480bf74f (patch)
tree	d9b85c5c1cc0518b3c7d98fbd814a4aa51b636d5 /security/security.c
parent	c9f115564561af63db662791e9a35fcf1dfefd2a (diff)
parent	906ee42cb1be1152ef24465704cc89edc3f571c1 (diff)