From a5967db9af51a84f5e181600954714a9e4c69f1f Mon Sep 17 00:00:00 2001 From: Stephen Rothwell Date: Wed, 24 Aug 2016 22:29:19 +1000 Subject: kbuild: allow architectures to use thin archives instead of ld -r ld -r is an incremental link used to create built-in.o files in build subdirectories. It produces relocatable object files containing all its input files, and these are are then pulled together and relocated in the final link. Aside from the bloat, this constrains the final link relocations, which has bitten large powerpc builds with unresolvable relocations in the final link. Alan Modra has recommended the kernel use thin archives for linking. This is an alternative and means that the linker has more information available to it when it links the kernel. This patch enables a config option architectures can select, which causes all built-in.o files to be built as thin archives. built-in.o files in subdirectories do not get symbol table or index attached, which improves speed and size. The final link pass creates a built-in.o archive in the root output directory which includes the symbol table and index. The linker then uses takes this file to link. The --whole-archive linker option is required, because the linker now has visibility to every individual object file, and it will otherwise just completely avoid including those without external references (consider a file with EXPORT_SYMBOL or initcall or hardware exceptions as its only entry points). The traditional built works "by luck" as built-in.o files are large enough that they're going to get external references. However this optimisation is unpredictable for the kernel (due to above external references), ineffective at culling unused, and costly because the .o files have to be searched for references. Superior alternatives for link-time culling should be used instead. Build characteristics for inclink vs thinarc, on a small powerpc64le pseries VM with a modest .config: inclink thinarc sizes vmlinux 15 618 680 15 625 028 sum of all built-in.o 56 091 808 1 054 334 sum excluding root built-in.o 151 430 find -name built-in.o | xargs rm ; time make vmlinux real 22.772s 21.143s user 13.280s 13.430s sys 4.310s 2.750s - Final kernel pulled in only about 6K more, which shows how ineffective the object file culling is. - Build performance looks improved due to less pagecache activity. On IO constrained systems it could be a bigger win. - Build size saving is significant. Side note, the toochain understands archives, so there's some tricks, $ ar t built-in.o # list all files you linked with $ size built-in.o # and their sizes $ objdump -d built-in.o # disassembly (unrelocated) with filenames Implementation by sfr, minor tweaks by npiggin. Signed-off-by: Stephen Rothwell Signed-off-by: Nicholas Piggin Signed-off-by: Michal Marek --- arch/Kconfig | 6 ++++++ 1 file changed, 6 insertions(+) (limited to 'arch/Kconfig') diff --git a/arch/Kconfig b/arch/Kconfig index bd8056b5b246..6842154813e5 100644 --- a/arch/Kconfig +++ b/arch/Kconfig @@ -461,6 +461,12 @@ config CC_STACKPROTECTOR_STRONG endchoice +config THIN_ARCHIVES + bool + help + Select this if the architecture wants to use thin archives + instead of ld -r to create the built-in.o files. + config HAVE_CONTEXT_TRACKING bool help -- cgit From b67067f1176df6ee727450546b58704e4b588563 Mon Sep 17 00:00:00 2001 From: Nicholas Piggin Date: Wed, 24 Aug 2016 22:29:20 +1000 Subject: kbuild: allow archs to select link dead code/data elimination Introduce LD_DEAD_CODE_DATA_ELIMINATION option for architectures to select to build with -ffunction-sections, -fdata-sections, and link with --gc-sections. It requires some work (documented) to ensure all unreferenced entrypoints are live, and requires toolchain and build verification, so it is made a per-arch option for now. On a random powerpc64le build, this yelds a significant size saving, it boots and runs fine, but there is a lot I haven't tested as yet, so these savings may be reduced if there are bugs in the link. text data bss dec filename 11169741 1180744 1923176 14273661 vmlinux 10445269 1004127 1919707 13369103 vmlinux.dce ~700K text, ~170K data, 6% removed from kernel image size. Signed-off-by: Nicholas Piggin Signed-off-by: Michal Marek --- arch/Kconfig | 13 +++++++++++++ 1 file changed, 13 insertions(+) (limited to 'arch/Kconfig') diff --git a/arch/Kconfig b/arch/Kconfig index 6842154813e5..3f948c422d9d 100644 --- a/arch/Kconfig +++ b/arch/Kconfig @@ -467,6 +467,19 @@ config THIN_ARCHIVES Select this if the architecture wants to use thin archives instead of ld -r to create the built-in.o files. +config LD_DEAD_CODE_DATA_ELIMINATION + bool + help + Select this if the architecture wants to do dead code and + data elimination with the linker by compiling with + -ffunction-sections -fdata-sections and linking with + --gc-sections. + + This requires that the arch annotates or otherwise protects + its external entry points from being discarded. Linker scripts + must also merge .text.*, .data.*, and .bss.* correctly into + output sections. + config HAVE_CONTEXT_TRACKING bool help -- cgit From 0f4c4af06eec5717041d6fe568b80ee4cf3c1475 Mon Sep 17 00:00:00 2001 From: Nicholas Piggin Date: Wed, 14 Sep 2016 12:24:03 +1000 Subject: kbuild: -ffunction-sections fix for archs with conflicting sections Enabling -ffunction-sections modified the generic linker script to pull .text.* sections into regular TEXT_TEXT section, conflicting with some architectures. Revert that change and require archs that enable the option to ensure they have no conflicting section names, and do the appropriate merging. Reported-by: Guenter Roeck Tested-by: Guenter Roeck Fixes: b67067f1176d ("kbuild: allow archs to select link dead code/data elimination") Signed-off-by: Nicholas Piggin Signed-off-by: Michal Marek --- arch/Kconfig | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) (limited to 'arch/Kconfig') diff --git a/arch/Kconfig b/arch/Kconfig index 3f948c422d9d..48d1e76a1ee3 100644 --- a/arch/Kconfig +++ b/arch/Kconfig @@ -478,7 +478,9 @@ config LD_DEAD_CODE_DATA_ELIMINATION This requires that the arch annotates or otherwise protects its external entry points from being discarded. Linker scripts must also merge .text.*, .data.*, and .bss.* correctly into - output sections. + output sections. Care must be taken not to pull in unrelated + sections (e.g., '.text.init'). Typically '.' in section names + is used to distinguish them from label names / C identifiers. config HAVE_CONTEXT_TRACKING bool -- cgit