diff options
Diffstat (limited to 'Documentation/dev-tools/propeller.rst')
-rw-r--r-- | Documentation/dev-tools/propeller.rst | 162 |
1 files changed, 162 insertions, 0 deletions
diff --git a/Documentation/dev-tools/propeller.rst b/Documentation/dev-tools/propeller.rst new file mode 100644 index 000000000000..92195958e3db --- /dev/null +++ b/Documentation/dev-tools/propeller.rst @@ -0,0 +1,162 @@ +.. SPDX-License-Identifier: GPL-2.0 + +===================================== +Using Propeller with the Linux kernel +===================================== + +This enables Propeller build support for the kernel when using Clang +compiler. Propeller is a profile-guided optimization (PGO) method used +to optimize binary executables. Like AutoFDO, it utilizes hardware +sampling to gather information about the frequency of execution of +different code paths within a binary. Unlike AutoFDO, this information +is then used right before linking phase to optimize (among others) +block layout within and across functions. + +A few important notes about adopting Propeller optimization: + +#. Although it can be used as a standalone optimization step, it is + strongly recommended to apply Propeller on top of AutoFDO, + AutoFDO+ThinLTO or Instrument FDO. The rest of this document + assumes this paradigm. + +#. Propeller uses another round of profiling on top of + AutoFDO/AutoFDO+ThinLTO/iFDO. The whole build process involves + "build-afdo - train-afdo - build-propeller - train-propeller - + build-optimized". + +#. Propeller requires LLVM 19 release or later for Clang/Clang++ + and the linker(ld.lld). + +#. In addition to LLVM toolchain, Propeller requires a profiling + conversion tool: https://github.com/google/autofdo with a release + after v0.30.1: https://github.com/google/autofdo/releases/tag/v0.30.1. + +The Propeller optimization process involves the following steps: + +#. Initial building: Build the AutoFDO or AutoFDO+ThinLTO binary as + you would normally do, but with a set of compile-time / link-time + flags, so that a special metadata section is created within the + kernel binary. The special section is only intend to be used by the + profiling tool, it is not part of the runtime image, nor does it + change kernel run time text sections. + +#. Profiling: The above kernel is then run with a representative + workload to gather execution frequency data. This data is collected + using hardware sampling, via perf. Propeller is most effective on + platforms supporting advanced PMU features like LBR on Intel + machines. This step is the same as profiling the kernel for AutoFDO + (the exact perf parameters can be different). + +#. Propeller profile generation: Perf output file is converted to a + pair of Propeller profiles via an offline tool. + +#. Optimized build: Build the AutoFDO or AutoFDO+ThinLTO optimized + binary as you would normally do, but with a compile-time / + link-time flag to pick up the Propeller compile time and link time + profiles. This build step uses 3 profiles - the AutoFDO profile, + the Propeller compile-time profile and the Propeller link-time + profile. + +#. Deployment: The optimized kernel binary is deployed and used + in production environments, providing improved performance + and reduced latency. + +Preparation +=========== + +Configure the kernel with:: + + CONFIG_AUTOFDO_CLANG=y + CONFIG_PROPELLER_CLANG=y + +Customization +============= + +The default CONFIG_PROPELLER_CLANG setting covers kernel space objects +for Propeller builds. One can, however, enable or disable Propeller build +for individual files and directories by adding a line similar to the +following to the respective kernel Makefile: + +- For enabling a single file (e.g. foo.o):: + + PROPELLER_PROFILE_foo.o := y + +- For enabling all files in one directory:: + + PROPELLER_PROFILE := y + +- For disabling one file:: + + PROPELLER_PROFILE_foo.o := n + +- For disabling all files in one directory:: + + PROPELLER__PROFILE := n + + +Workflow +======== + +Here is an example workflow for building an AutoFDO+Propeller kernel: + +1) Assuming an AutoFDO profile is already collected following + instructions in the AutoFDO document, build the kernel on the host + machine, with AutoFDO and Propeller build configs :: + + CONFIG_AUTOFDO_CLANG=y + CONFIG_PROPELLER_CLANG=y + + and :: + + $ make LLVM=1 CLANG_AUTOFDO_PROFILE=<autofdo-profile-name> + +2) Install the kernel on the test machine. + +3) Run the load tests. The '-c' option in perf specifies the sample + event period. We suggest using a suitable prime number, like 500009, + for this purpose. + + - For Intel platforms:: + + $ perf record -e BR_INST_RETIRED.NEAR_TAKEN:k -a -N -b -c <count> -o <perf_file> -- <loadtest> + + - For AMD platforms:: + + $ perf record --pfm-event RETIRED_TAKEN_BRANCH_INSTRUCTIONS:k -a -N -b -c <count> -o <perf_file> -- <loadtest> + + Note you can repeat the above steps to collect multiple <perf_file>s. + +4) (Optional) Download the raw perf file(s) to the host machine. + +5) Use the create_llvm_prof tool (https://github.com/google/autofdo) to + generate Propeller profile. :: + + $ create_llvm_prof --binary=<vmlinux> --profile=<perf_file> + --format=propeller --propeller_output_module_name + --out=<propeller_profile_prefix>_cc_profile.txt + --propeller_symorder=<propeller_profile_prefix>_ld_profile.txt + + "<propeller_profile_prefix>" can be something like "/home/user/dir/any_string". + + This command generates a pair of Propeller profiles: + "<propeller_profile_prefix>_cc_profile.txt" and + "<propeller_profile_prefix>_ld_profile.txt". + + If there are more than 1 perf_file collected in the previous step, + you can create a temp list file "<perf_file_list>" with each line + containing one perf file name and run:: + + $ create_llvm_prof --binary=<vmlinux> --profile=@<perf_file_list> + --format=propeller --propeller_output_module_name + --out=<propeller_profile_prefix>_cc_profile.txt + --propeller_symorder=<propeller_profile_prefix>_ld_profile.txt + +6) Rebuild the kernel using the AutoFDO and Propeller + profiles. :: + + CONFIG_AUTOFDO_CLANG=y + CONFIG_PROPELLER_CLANG=y + + and :: + + $ make LLVM=1 CLANG_AUTOFDO_PROFILE=<profile_file> CLANG_PROPELLER_PROFILE_PREFIX=<propeller_profile_prefix> |