A longstanding issue with genksyms is that it has hidden syntax errors.
For example, genksyms fails to parse the following valid code:
int x, __attribute__((__section__(".init.data")))y;
Here, only 'y' is annotated by the attribute, although I am not aware
of actual uses of this pattern in the kernel tree.
When a syntax error occurs, yyerror() is called. However,
error_with_pos() is a no-op unless the -w option is provided.
You can observe syntax errors by manually passing the -w option.
$ echo 'int x, __attribute__((__section__(".init.data")))y;' | scripts/genksyms/genksyms -w
<stdin>:1: syntax error
This commit allows attributes to be placed between a comma and
init_declarator.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Acked-by: Nicolas Schier <n.schier@avm.de>
|
|
A longstanding issue with genksyms is that it has hidden syntax errors.
When a syntax error occurs, yyerror() is called. However,
error_with_pos() is a no-op unless the -w option is provided.
You can observe syntax errors by manually passing the -w option.
For example, genksyms fails to parse the following code in
arch/arm64/lib/xor-neon.c:
static inline uint64x2_t eor3(uint64x2_t p, uint64x2_t q, uint64x2_t r)
{
[ snip ]
}
The syntax error occurs because genksyms does not recognize the
uint64x2_t keyword.
This commit adds support for builtin types described in Arm Neon
Intrinsics Reference.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Acked-by: Nicolas Schier <n.schier@avm.de>
|
|
A longstanding issue with genksyms is that it has hidden syntax errors.
When a syntax error occurs, yyerror() is called. However,
error_with_pos() is a no-op unless the -w option is provided.
You can observe syntax errors by manually passing the -w option.
For example, with CONFIG_MODVERSIONS=y on v6.13-rc1:
$ make -s KCFLAGS=-D__GENKSYMS__ fs/lockd/svc.i
$ cat fs/lockd/svc.i | scripts/genksyms/genksyms -w
[ snip ]
./include/net/addrconf.h:35: syntax error
The syntax error occurs in the following code in include/net/addrconf.h:
union __packed {
[ snip ]
};
The issue arises from __packed, which is defined as
__attribute__((__packed__)), immediately after the 'union' keyword.
This commit allows the 'union' keyword to be followed by attributes.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Acked-by: Nicolas Schier <n.schier@avm.de>
|
|
A longstanding issue with genksyms is that it has hidden syntax errors.
When a syntax error occurs, yyerror() is called. However,
error_with_pos() is a no-op unless the -w option is provided.
You can observe syntax errors by manually passing the -w option.
For example, with CONFIG_MODVERSIONS=y on v6.13-rc1:
$ make -s KCFLAGS=-D__GENKSYMS__ arch/x86/kernel/cpu/mshyperv.i
$ cat arch/x86/kernel/cpu/mshyperv.i | scripts/genksyms/genksyms -w
[ snip ]
./arch/x86/include/asm/svm.h:122: syntax error
The syntax error occurs in the following code in arch/x86/include/asm/svm.h:
struct __attribute__ ((__packed__)) vmcb_control_area {
[ snip ]
};
The issue arises from __attribute__ immediately after the 'struct'
keyword.
This commit allows the 'struct' keyword to be followed by attributes.
The lexer must be adjusted because dont_want_brace_phase should not be
decremented while processing attributes.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Acked-by: Nicolas Schier <n.schier@avm.de>
|
|
A longstanding issue with genksyms is that it has hidden syntax errors.
When a syntax error occurs, yyerror() is called. However,
error_with_pos() is a no-op unless the -w option is provided.
You can observe syntax errors by manually passing the -w option.
For example, with CONFIG_MODVERSIONS=y on v6.13-rc1:
$ make -s KCFLAGS=-D__GENKSYMS__ kernel/module/main.i
$ cat kernel/module/main.i | scripts/genksyms/genksyms -w
[ snip ]
kernel/module/main.c:97: syntax error
The syntax error occurs in the following code in kernel/module/main.c:
static void __mod_update_bounds(enum mod_mem_type type __maybe_unused, void *base,
unsigned int size, struct mod_tree_root *tree)
{
[ snip ]
}
The issue arises from __maybe_unused, which is defined as
__attribute__((__unused__)).
This commit allows direct_abstract_declarator to be followed by
attributes.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Acked-by: Nicolas Schier <n.schier@avm.de>
|
|
A longstanding issue with genksyms is that it has hidden syntax errors.
When a syntax error occurs, yyerror() is called. However,
error_with_pos() is a no-op unless the -w option is provided.
You can observe syntax errors by manually passing the -w option.
For example, with CONFIG_MODVERSIONS=y on v6.13-rc1:
$ make -s KCFLAGS=-D__GENKSYMS__ drivers/acpi/prmt.i
$ cat drivers/acpi/prmt.i | scripts/genksyms/genksyms -w
[ snip ]
drivers/acpi/prmt.c:56: syntax error
The syntax error occurs in the following code in drivers/acpi/prmt.c:
struct prm_handler_info {
[ snip ]
efi_status_t (__efiapi *handler_addr)(u64, void *);
[ snip ]
};
The issue arises from __efiapi, which is defined as either
__attribute__((ms_abi)) or __attribute__((regparm(0))).
This commit allows nested_declarator to be prefixed with attributes.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Acked-by: Nicolas Schier <n.schier@avm.de>
|
|
A longstanding issue with genksyms is that it has hidden syntax errors.
When a syntax error occurs, yyerror() is called. However,
error_with_pos() is a no-op unless the -w option is provided.
You can observe syntax errors by manually passing the -w option.
For example, with CONFIG_MODVERSIONS=y on v6.13-rc1:
$ make -s KCFLAGS=-D__GENKSYMS__ init/main.i
$ cat init/main.i | scripts/genksyms/genksyms -w
[ snip ]
./include/linux/efi.h:1225: syntax error
The syntax error occurs in the following code in include/linux/efi.h:
efi_status_t
efi_call_acpi_prm_handler(efi_status_t (__efiapi *handler_addr)(u64, void *),
u64 param_buffer_addr, void *context);
The issue arises from __efiapi, which is defined as either
__attribute__((ms_abi)) or __attribute__((regparm(0))).
This commit allows abstract_declarator to be prefixed with attributes.
To avoid conflicts, I tweaked the rule for decl_specifier_seq. Due to
this change, a standalone attribute cannot become decl_specifier_seq.
Otherwise, I do not know how to resolve the conflicts.
The following code, which was previously accepted by genksyms, will now
result in a syntax error:
void my_func(__attribute__((unused))x);
I do not think it is a big deal because GCC also fails to parse it.
$ echo 'void my_func(__attribute__((unused))x);' | gcc -c -x c -
<stdin>:1:37: error: unknown type name 'x'
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Acked-by: Nicolas Schier <n.schier@avm.de>
|
|
The __attribute__ keyword can appear in more contexts than 'const' or
'volatile'.
To avoid grammatical conflicts with future changes, ATTRIBUTE_PHRASE
should not be reduced into type_qualifier.
No functional changes are intended.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Acked-by: Nicolas Schier <n.schier@avm.de>
|
|
I believe the missing action here is a bug.
For rules with no explicit action, the following default is used:
{ $$ = $1; }
However, in this case, $1 is the value of attribute_opt itself. As a
result, the value of attribute_opt is always NULL.
The following test code demonstrates inconsistent behavior.
int x __attribute__((__aligned__(4)));
int y __attribute__((__aligned__(4))) = 0;
The attribute is recorded only when followed by an initializer.
This commit adds the correct action to propagate the value of the
ATTRIBUTE_PHRASE token.
With this change, the attribute in the example above is consistently
recorded for both 'x' and 'y'.
[Before]
$ cat <<EOF | scripts/genksyms/genksyms -d
int x __attribute__((__aligned__(4)));
int y __attribute__((__aligned__(4))) = 0;
EOF
Defn for type0 x == <int x >
Defn for type0 y == <int y __attribute__ ( ( __aligned__ ( 4 ) ) ) >
Hash table occupancy 2/4096 = 0.000488281
[After]
$ cat <<EOF | scripts/genksyms/genksyms -d
int x __attribute__((__aligned__(4)));
int y __attribute__((__aligned__(4))) = 0;
EOF
Defn for type0 x == <int x __attribute__ ( ( __aligned__ ( 4 ) ) ) >
Defn for type0 y == <int y __attribute__ ( ( __aligned__ ( 4 ) ) ) >
Hash table occupancy 2/4096 = 0.000488281
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Acked-by: Nicolas Schier <n.schier@avm.de>
|
|
Similar to the previous commit, this change makes the parser logic a
little more accurate.
Currently, genksyms accepts the following invalid code:
struct foo {
int (*callback)(int)(int)(int);
};
A direct-declarator should not recursively absorb multiple
( parameter-type-list ) constructs.
In the example above, (*callback) should be followed by at most one
(int).
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Acked-by: Nicolas Schier <n.schier@avm.de>
|
|
While there is no more grammatical ambiguity in genksyms, the parser
logic is still inaccurate.
For example, genksyms accepts the following invalid C code:
void my_func(int ()(int));
This should result in a syntax error because () cannot be reduced to
<direct-abstract-declarator>.
( <abstract-declarator> ) can be reduced, but <abstract-declarator>
must not be empty in the following grammar from K&R [1]:
<direct-abstract-declarator> ::= ( <abstract-declarator> )
| {<direct-abstract-declarator>}? [ {<constant-expression>}? ]
| {<direct-abstract-declarator>}? ( {<parameter-type-list>}? )
Furthermore, genksyms accepts the following weird code:
void my_func(int (*callback)(int)(int)(int));
The parser allows <direct-abstract-declarator> to recursively absorb
multiple ( {<parameter-type-list>}? ), but this behavior is incorrect.
In the example above, (*callback) should be followed by at most one
(int).
[1]: https://cs.wmich.edu/~gupta/teaching/cs4850/sumII06/The%20syntax%20of%20C%20in%20Backus-Naur%20form.htm
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Acked-by: Nicolas Schier <n.schier@avm.de>
|
|
This workaround was introduced for suppressing the reduce/reduce conflict
warnings because the %expect-rr directive, which is applicable only to GLR
parsers, cannot be used for genksyms.
Since there are no longer any conflicts, this Makefile hack is now
unnecessary.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Acked-by: Nicolas Schier <n.schier@avm.de>
|
|
The genksyms parser has ambiguities in its grammar, which are currently
suppressed by a workaround in scripts/genksyms/Makefile.
Building genksyms with W=1 generates the following warnings:
YACC scripts/genksyms/parse.tab.[ch]
scripts/genksyms/parse.y: warning: 3 shift/reduce conflicts [-Wconflicts-sr]
scripts/genksyms/parse.y: note: rerun with option '-Wcounterexamples' to generate conflict counterexamples
The ambiguity arises when decl_specifier_seq is followed by '(' because
the following two interpretations are possible:
- decl_specifier_seq direct_abstract_declarator '(' parameter_declaration_clause ')'
- decl_specifier_seq '(' abstract_declarator ')'
This issue occurs because the current parser allows an empty string to
be reduced to direct_abstract_declarator, which is incorrect.
K&R [1] explains the correct grammar:
<parameter-declaration> ::= {<declaration-specifier>}+ <declarator>
| {<declaration-specifier>}+ <abstract-declarator>
| {<declaration-specifier>}+
<abstract-declarator> ::= <pointer>
| <pointer> <direct-abstract-declarator>
| <direct-abstract-declarator>
<direct-abstract-declarator> ::= ( <abstract-declarator> )
| {<direct-abstract-declarator>}? [ {<constant-expression>}? ]
| {<direct-abstract-declarator>}? ( {<parameter-type-list>}? )
This commit resolves all remaining conflicts.
We need to consider the difference between the following two examples:
[Example 1] ( <abstract-declarator> ) can become <direct-abstract-declarator>
void my_func(int (foo));
... is equivalent to:
void my_func(int foo);
[Example 2] ( <parameter-type-list> ) can become <direct-abstract-declarator>
typedef int foo;
void my_func(int (foo));
... is equivalent to:
void my_func(int (*callback)(int));
Please note that the function declaration is identical in both examples,
but the preceding typedef creates the distinction. I introduced a new
term, open_paren, to enable the type lookup immediately after the '('
token. Without this, we cannot distinguish between [Example 1] and
[Example 2].
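The distinction can be reproduced with a C compiler. The following is a
minimal sketch, assuming a C11 compiler for _Generic; the function names
takes_int and takes_callback are hypothetical and are not part of genksyms:

```c
/* 'bar' is not a typedef name here, so (bar) is a parenthesized declarator. */
void takes_int(int (bar));

typedef int foo;

/* 'foo' is now a typedef name, so (foo) is a parameter type list. */
void takes_callback(int (foo));

/*
 * These definitions compile only if they are compatible with the
 * declarations above, which confirms both equivalences.
 */
void takes_int(int bar) { (void)bar; }
void takes_callback(int (*callback)(int)) { (void)callback; }

/* Compile-time type checks; each returns 1 when the type matches. */
static int takes_int_has_int_param(void)
{
	return _Generic(&takes_int, void (*)(int): 1, default: 0);
}

static int takes_callback_has_function_pointer_param(void)
{
	return _Generic(&takes_callback,
			void (*)(int (*)(int)): 1, default: 0);
}
```

The two declarations differ only in whether the name inside the
parentheses is a typedef at that point, which is the information the
parser needs immediately after the '(' token.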
[1]: https://cs.wmich.edu/~gupta/teaching/cs4850/sumII06/The%20syntax%20of%20C%20in%20Backus-Naur%20form.htm
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Acked-by: Nicolas Schier <n.schier@avm.de>
|
|
The genksyms parser has ambiguities in its grammar, which are currently
suppressed by a workaround in scripts/genksyms/Makefile.
Building genksyms with W=1 generates the following warnings:
YACC scripts/genksyms/parse.tab.[ch]
scripts/genksyms/parse.y: warning: 9 shift/reduce conflicts [-Wconflicts-sr]
scripts/genksyms/parse.y: warning: 5 reduce/reduce conflicts [-Wconflicts-rr]
scripts/genksyms/parse.y: note: rerun with option '-Wcounterexamples' to generate conflict counterexamples
The comment in the parser describes the current problem:
/* This wasn't really a typedef name but an identifier that
shadows one. */
Consider the following simple C code:
typedef int foo;
void my_func(foo foo) {}
In the function parameter list (foo foo), the first 'foo' is a type
specifier (typedef'ed as 'int'), while the second 'foo' is an identifier.
However, the lexer cannot distinguish between the two. Since 'foo' is
already typedef'ed, the lexer returns TYPE for both instances, instead
of returning IDENT for the second one.
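For reference, a compiler accepts this kind of shadowing. A minimal
sketch, where the function name twice is hypothetical:

```c
typedef int foo;

/*
 * The first 'foo' in the parameter list names the type. The second one
 * declares a parameter that shadows the typedef inside the function, so
 * 'foo' in the body refers to the parameter.
 */
static foo twice(foo foo)
{
	return foo * 2;
}
```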
To support shadowed identifiers, TYPE can be reduced to either a
simple_type_specifier or a direct_abstract_declarator, which creates
a grammatical ambiguity.
Without analyzing the grammar context, it is very difficult to resolve
this correctly.
This commit introduces a flag, dont_want_type_specifier, which allows
the parser to inform the lexer whether an identifier is expected. When
dont_want_type_specifier is true, the type lookup is suppressed, and
the lexer returns IDENT regardless of any preceding typedef.
After this commit, only 3 shift/reduce conflicts will remain.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Acked-by: Nicolas Schier <n.schier@avm.de>
|
|
A type_qualifier (const, volatile, etc.) is not a type_specifier.
According to K&R [1], a type-qualifier should be directly reduced to
a declaration-specifier.
<declaration-specifier> ::= <storage-class-specifier>
| <type-specifier>
| <type-qualifier>
[1]: https://cs.wmich.edu/~gupta/teaching/cs4850/sumII06/The%20syntax%20of%20C%20in%20Backus-Naur%20form.htm
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Acked-by: Nicolas Schier <n.schier@avm.de>
|
|
I believe "cvar" stands for "Const, Volatile, Attribute, or Restrict".
This is called "type-qualifier" in K&R. [1]
Adopt this more generic naming.
No functional changes are intended.
[1] https://cs.wmich.edu/~gupta/teaching/cs4850/sumII06/The%20syntax%20of%20C%20in%20Backus-Naur%20form.htm
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Acked-by: Nicolas Schier <n.schier@avm.de>
|
|
This is called "abstract-declarator" in K&R. [1]
I am not sure what "m_" stands for, but the name is clear enough
without it.
No functional changes are intended.
[1] https://cs.wmich.edu/~gupta/teaching/cs4850/sumII06/The%20syntax%20of%20C%20in%20Backus-Naur%20form.htm
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Acked-by: Nicolas Schier <n.schier@avm.de>
|
|
When the sign script runs, the working directory is the source directory
of the external module. This causes failures when the kernel uses relative
paths, for example:
make[5]: Entering directory '/build/client/devel/kernel/work/linux-2.6'
make[6]: Entering directory '/build/client/devel/addmodules/vtx/work/vtx'
INSTALL /build/client/devel/addmodules/vtx/_/lib/modules/6.13.0-devel+/extra/vtx.ko
SIGN /build/client/devel/addmodules/vtx/_/lib/modules/6.13.0-devel+/extra/vtx.ko
/bin/sh: 1: scripts/sign-file: not found
DEPMOD /build/client/devel/addmodules/vtx/_/lib/modules/6.13.0-devel+
Work around it by using absolute paths here.
Fixes: 13b25489b6f8 ("kbuild: change working directory to external module directory with M=")
Signed-off-by: Torsten Hilbrich <torsten.hilbrich@secunet.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
|
|
Commit 654102df2ac2 ("kbuild: add generic support for built-in boot
DTBs") introduced generic support for built-in DTBs.
Select GENERIC_BUILTIN_DTB to use the generic rule.
To keep consistency across architectures, this commit also renames
CONFIG_ARC_BUILTIN_DTB_NAME to CONFIG_BUILTIN_DTB_NAME.
Now, "nsim_700" is the default value for CONFIG_BUILTIN_DTB_NAME, rather
than a fallback in case it is empty.
Acked-by: Vineet Gupta <vgupta@kernel.org>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
|
|
Previously, two things stopped Rust from using MODVERSIONS:
1. Rust symbols are occasionally too long to be represented in the
original versions table
2. Rust types cannot be properly hashed by the existing genksyms
approach because:
* Looking up type definitions in Rust is more complex than C
* Type layout is potentially dependent on the compiler in Rust,
not just the source type declaration.
CONFIG_EXTENDED_MODVERSIONS addresses the first point, and
CONFIG_GENDWARFKSYMS the second. If Rust wants to use MODVERSIONS, allow
it to do so by selecting both features.
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Co-developed-by: Matthew Maurer <mmaurer@google.com>
Signed-off-by: Matthew Maurer <mmaurer@google.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
|
|
Document where exported and imported symbols are kept, format options,
and limitations.
Signed-off-by: Matthew Maurer <mmaurer@google.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
|
|
If you know that your kernel modules will only ever be loaded by a newer
kernel, you can disable BASIC_MODVERSIONS to save space. This also
allows easy creation of test modules to see how tooling will respond to
modules that only have the new format.
Signed-off-by: Matthew Maurer <mmaurer@google.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
|
|
Generate both the existing modversions format and the new extended one
when running modpost. Presence of this metadata in the final .ko is
guarded by CONFIG_EXTENDED_MODVERSIONS.
We no longer generate an error on long symbols in modpost if
CONFIG_EXTENDED_MODVERSIONS is set, as they can now be appropriately
encoded in the extended section. These symbols will be skipped in the
previous encoding. An error will still be generated if
CONFIG_EXTENDED_MODVERSIONS is not set.
Reviewed-by: Sami Tolvanen <samitolvanen@google.com>
Signed-off-by: Matthew Maurer <mmaurer@google.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
|
|
Adds a new format for MODVERSIONS which stores each field in a separate
ELF section. This initially adds support for variable length names, but
could later be used to add additional fields to MODVERSIONS in a
backwards compatible way if needed. Any new fields will be ignored by
old user tooling, unlike the current format where user tooling cannot
tolerate adjustments to the format (for example making the name field
longer).
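To illustrate why separate per-field storage removes the name length
limit, here is a rough sketch. It assumes one array of CRCs and one blob
of NUL-terminated names in the same order; the function name find_crc and
this exact layout are illustrative assumptions, not the kernel's actual
section format:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Look up the CRC for 'wanted'. 'crcs' holds one entry per symbol and
 * 'names' holds the matching NUL-terminated names back to back, in the
 * same order. Returns 1 and stores the CRC on success, 0 otherwise.
 */
static int find_crc(const uint32_t *crcs, size_t count,
		    const char *names, size_t names_len,
		    const char *wanted, uint32_t *crc)
{
	size_t pos = 0;

	for (size_t i = 0; i < count && pos < names_len; i++) {
		const char *name = names + pos;
		const char *end = memchr(name, '\0', names_len - pos);

		if (!end)
			return 0;	/* unterminated name: malformed input */
		if (strcmp(name, wanted) == 0) {
			*crc = crcs[i];
			return 1;
		}
		pos = (size_t)(end - names) + 1;
	}
	return 0;
}
```

Because each name is only as long as it needs to be, a long symbol does
not require changing the size of any fixed record.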
Since PPC munges its version records to strip leading dots, we reproduce
the munging for the new format. Other architectures do not appear to
have architecture-specific usage of this information.
Reviewed-by: Sami Tolvanen <samitolvanen@google.com>
Signed-off-by: Matthew Maurer <mmaurer@google.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
|
|
Add documentation for gendwarfksyms changes, and the kABI stability
features that can be useful for distributions even though they're not
used in mainline kernels.
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
|
|
When MODVERSIONS is enabled, allow selecting gendwarfksyms as the
implementation, but default to genksyms.
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
|
|
With gendwarfksyms, we need each TU where the EXPORT_SYMBOL() macro
is used to also contain DWARF type information for the symbols it
exports. However, as a TU can also export external symbols and
compilers may choose not to emit debugging information for symbols not
defined in the current TU, the missing types will result in missing
symbol versions. Stand-alone assembly code also doesn't contain type
information for exported symbols, so we need to compile a temporary
object file with asm-prototypes.h instead, and similarly need to
ensure the DWARF in the temporary object file contains the necessary
types.
To always emit type information for external exports, add explicit
__gendwarfksyms_ptr_<symbol> references to them in EXPORT_SYMBOL().
gendwarfksyms will use the type information for __gendwarfksyms_ptr_*
if needed. Discard the pointers from the final binary to avoid further
bloat.
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
|
|
The compiler may choose not to emit type information in DWARF for
external symbols. Clang, for example, does this for symbols not
defined in the current TU.
To provide a way to work around this issue, add support for
__gendwarfksyms_ptr_<symbol> pointers that force the compiler to emit
the necessary type information in DWARF also for the missing symbols.
Example usage:
#define GENDWARFKSYMS_PTR(sym) \
static typeof(sym) *__gendwarfksyms_ptr_##sym __used \
__section(".discard.gendwarfksyms") = &sym;
extern int external_symbol(void);
GENDWARFKSYMS_PTR(external_symbol);
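The same pattern can be tried outside the kernel tree. This is a sketch
that assumes GCC or Clang; it spells out the attributes that the kernel's
__used and __section() macros stand for, uses __typeof__ so that it also
builds in strict ISO modes, and defines external_symbol locally only so
that the example links:

```c
#define GENDWARFKSYMS_PTR(sym) \
	static __typeof__(sym) *__gendwarfksyms_ptr_##sym \
	__attribute__((__used__)) \
	__attribute__((__section__(".discard.gendwarfksyms"))) = &sym

/* In the kernel this would be an extern declaration from another TU. */
int external_symbol(void)
{
	return 42;
}

/* Creates a pointer whose type carries the full type of the symbol. */
GENDWARFKSYMS_PTR(external_symbol);
```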
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Reviewed-by: Petr Pavlu <petr.pavlu@suse.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
|
|
Distributions that want to maintain a stable kABI need the ability
to make ABI compatible changes to kernel data structures without
affecting symbol versions, either because of LTS updates or backports.
With genksyms, developers would typically hide these changes from
version calculation with #ifndef __GENKSYMS__, which would result
in the symbol version not changing even though the actual type has
changed. When we process precompiled object files, this isn't an
option.
Change union processing to recognize field name prefixes that allow
the user to ignore the union completely during symbol versioning with
a __kabi_ignored prefix in a field name, or to replace the type of a
placeholder field using a __kabi_reserved field name prefix.
For example, assume we want to add a new field to an existing
alignment hole in a data structure, and ignore the new field when
calculating symbol versions:
struct struct1 {
int a;
/* a 4-byte alignment hole */
unsigned long b;
};
To add `int n` to the alignment hole, we can add a union that includes
a __kabi_ignored field that causes gendwarfksyms to ignore the entire
union:
struct struct1 {
int a;
union {
char __kabi_ignored_0;
int n;
};
unsigned long b;
};
With --stable, both structs produce the same symbol version.
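The layout claim can be checked with a small sketch. It assumes an LP64
target (4-byte int, 8-byte unsigned long) and a C11 compiler for the
anonymous union; the _before and _after suffixes are hypothetical names
used only to hold both versions in one file:

```c
#include <stddef.h>

struct struct1_before {
	int a;
	/* a 4-byte alignment hole on LP64 targets */
	unsigned long b;
};

struct struct1_after {
	int a;
	union {
		char __kabi_ignored_0;
		int n;
	};
	unsigned long b;
};

/* Returns 1 when the union left the size and the offset of 'b' unchanged. */
static int struct1_layout_unchanged(void)
{
	return sizeof(struct struct1_before) == sizeof(struct struct1_after) &&
	       offsetof(struct struct1_before, b) ==
	       offsetof(struct struct1_after, b);
}
```

On a 32-bit target there is no alignment hole after 'a', so the check is
expected to fail there.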
Alternatively, when a distribution expects future modification to a
data structure, they can explicitly add reserved fields:
struct struct2 {
long a;
long __kabi_reserved_0; /* reserved for future use */
};
To take the field into use, we can again replace it with a union, with
one of the fields keeping the __kabi_reserved name prefix to indicate
the original type:
struct struct2 {
long a;
union {
long __kabi_reserved_0;
struct {
int b;
int v;
};
};
};
Here gendwarfksyms --stable replaces the union with the type of the
placeholder field when calculating versions.
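A similar sketch for the reserved field, again assuming an LP64 target
and a C11 compiler for the anonymous union and struct; the _before and
_after suffixes are hypothetical:

```c
#include <stddef.h>

struct struct2_before {
	long a;
	long __kabi_reserved_0;	/* reserved for future use */
};

struct struct2_after {
	long a;
	union {
		long __kabi_reserved_0;
		struct {
			int b;
			int v;
		};
	};
};

/* Returns 1 when the new fields fit exactly into the reserved slot. */
static int struct2_layout_unchanged(void)
{
	return sizeof(struct struct2_before) == sizeof(struct struct2_after) &&
	       offsetof(struct struct2_before, __kabi_reserved_0) ==
	       offsetof(struct struct2_after, b);
}
```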
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Reviewed-by: Petr Pavlu <petr.pavlu@suse.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
|
|
Distributions that want to maintain a stable kABI need the ability
to make ABI compatible changes to kernel without affecting symbol
versions, either because of LTS updates or backports.
With genksyms, developers would typically hide these changes from
version calculation with #ifndef __GENKSYMS__, which would result
in the symbol version not changing even though the actual type has
changed. When we process precompiled object files, this isn't an
option.
To support this use case, add a --stable command line flag that
gates kABI stability features that are not needed in mainline
kernels, but can be useful for distributions, and add support for
kABI rules, which can be used to restrict gendwarfksyms output.
The rules are specified as a set of null-terminated strings stored
in the .discard.gendwarfksyms.kabi_rules section. Each rule consists
of four strings as follows:
"version\0type\0target\0value"
The version string ensures the structure can be changed in a
backwards compatible way. The type string indicates the type of the
rule, and target and value strings contain rule-specific data.
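As an illustration of the format, one rule can be split into its four
strings as follows. This is a sketch only; struct kabi_rule_fields and
parse_rule are hypothetical names and do not reflect the gendwarfksyms
implementation:

```c
#include <stddef.h>
#include <string.h>

struct kabi_rule_fields {
	const char *version;
	const char *type;
	const char *target;
	const char *value;
};

/*
 * Split one rule starting at 'buf'. Returns the number of bytes
 * consumed, or 0 if 'len' bytes do not hold four terminated strings.
 */
static size_t parse_rule(const char *buf, size_t len,
			 struct kabi_rule_fields *out)
{
	const char *fields[4];
	size_t pos = 0;

	for (int i = 0; i < 4; i++) {
		const char *end = memchr(buf + pos, '\0', len - pos);

		if (!end)
			return 0;	/* ran out of data mid-rule */
		fields[i] = buf + pos;
		pos = (size_t)(end - buf) + 1;
	}

	out->version = fields[0];
	out->type = fields[1];
	out->target = fields[2];
	out->value = fields[3];
	return pos;
}
```

The return value lets a caller advance through a section that holds
several rules back to back.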
Initially support three simple rules:
1. Declaration-only types
A type declaration can change into a full definition when
additional includes are pulled in to the TU, which changes the
versions of any symbol that references the type. Add support
for defining declaration-only types whose definition is not
expanded during versioning.
2. Ignored enumerators
It's possible to add new enum fields without changing the ABI,
but as the fields are included in symbol versioning, this would
change the versions. Add support for ignoring specific fields.
3. Overridden enumerator values
Add support for overriding enumerator values when calculating
versions. This may be needed when the last field of the enum
is used as a sentinel and new fields must be added before it.
Add examples for using the rules under the examples/ directory.
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
|
|
Calculate symbol versions from the fully expanded type strings in
type_map, and output the versions in a genksyms-compatible format.
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Reviewed-by: Petr Pavlu <petr.pavlu@suse.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
|
|
Add support for producing genksyms-style symtypes files. Process
die_map to find the longest expansions for each type, and use symtypes
references in type definitions. The basic file format is similar to
genksyms, with two notable exceptions:
1. Type names with spaces (common with Rust) in references are
wrapped in single quotes. E.g.:
s#'core::result::Result<u8, core::num::error::ParseIntError>'
2. The actual type definition is the simple parsed DWARF format we
output with --dump-dies, not the preprocessed C-style format
genksyms produces.
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Reviewed-by: Petr Pavlu <petr.pavlu@suse.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
|
|
Debugging the DWARF processing can be somewhat challenging, so add
more detailed debugging output for die_map operations. Add the
--dump-die-map flag, which adds color coded tags to the output for
die_map changes.
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Reviewed-by: Petr Pavlu <petr.pavlu@suse.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
|
|
Expand each structure type only once per exported symbol. This
is necessary to support self-referential structures, which would
otherwise result in infinite recursion, and it's sufficient for
catching ABI changes.
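The idea can be sketched with a toy type graph. Everything here
(struct toy_type, toy_expand, the output syntax) is hypothetical and much
simpler than the real DWARF processing:

```c
#include <stdio.h>
#include <string.h>

#define TOY_MAX_MEMBERS 2

struct toy_type {
	const char *name;
	struct toy_type *members[TOY_MAX_MEMBERS];	/* unused slots are NULL */
	int expanded;					/* set on first expansion */
};

/* Append the expansion of 't' to the NUL-terminated string in 'out'. */
static void toy_expand(struct toy_type *t, char *out, size_t size)
{
	size_t used = strlen(out);

	if (t->expanded) {
		/* Already expanded for this symbol: emit a reference only. */
		snprintf(out + used, size - used, "%s;", t->name);
		return;
	}
	t->expanded = 1;

	snprintf(out + used, size - used, "%s{", t->name);
	for (int i = 0; i < TOY_MAX_MEMBERS; i++)
		if (t->members[i])
			toy_expand(t->members[i], out, size);

	used = strlen(out);
	snprintf(out + used, size - used, "}");
}
```

A type that points back to itself is expanded at the outer level and
reduced to a reference at the inner level, so the recursion terminates.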
Types defined in .c files are opaque to external users and thus
cannot affect the ABI. Consider type definitions in .c files to
be declarations to prevent opaque types from changing symbol
versions.
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Reviewed-by: Petr Pavlu <petr.pavlu@suse.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
|
|
Recursively expand DWARF structure types, i.e. structs, unions, and
enums. Also include relevant DWARF attributes in type strings to
encode structure layout, for example.
Example output with --dump-dies:
subprogram (
formal_parameter structure_type &str {
member pointer_type {
base_type u8 byte_size(1) encoding(7)
} data_ptr data_member_location(0) ,
member base_type usize byte_size(8) encoding(7) length data_member_location(8)
} byte_size(16) alignment(8) msg
)
-> base_type void
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Reviewed-by: Petr Pavlu <petr.pavlu@suse.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
|
|
Add support for expanding DW_TAG_array_type, and the subrange type
indicating array size.
Example source code:
const char *s[34];
Output with --dump-dies:
variable array_type[34] {
pointer_type {
const_type {
base_type char byte_size(1) encoding(6)
}
} byte_size(8)
}
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Reviewed-by: Petr Pavlu <petr.pavlu@suse.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
|
|
Add support for expanding DW_TAG_subroutine_type and the parameters
in DW_TAG_formal_parameter. Use this to also expand subprograms.
Example output with --dump-dies:
subprogram (
formal_parameter pointer_type {
const_type {
base_type char byte_size(1) encoding(6)
}
}
)
-> base_type unsigned long byte_size(8) encoding(7)
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Reviewed-by: Petr Pavlu <petr.pavlu@suse.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
|
|
Add support for expanding DWARF type modifiers, such as pointers,
const values etc., and typedefs. These types all have DW_AT_type
attribute pointing to the underlying type, and thus produce similar
output.
Also add linebreaks and indentation to debugging output to make it
more readable.
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Reviewed-by: Petr Pavlu <petr.pavlu@suse.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
|
|
Basic types in DWARF repeat frequently and traversing the DIEs using
libdw is relatively slow. Add a simple hashtable based cache for the
processed DIEs.
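A cache of this kind could look roughly like the sketch below. It is a
generic chained hash table keyed by an address; the toy_cache names and
the bucket function are assumptions and not the data structure used by
gendwarfksyms:

```c
#include <stdint.h>
#include <stdlib.h>

#define TOY_CACHE_BUCKETS 64

struct toy_cache_entry {
	uintptr_t key;			/* e.g. the address of a DIE */
	const char *value;		/* previously computed result */
	struct toy_cache_entry *next;
};

struct toy_cache {
	struct toy_cache_entry *buckets[TOY_CACHE_BUCKETS];
};

static size_t toy_bucket(uintptr_t key)
{
	return (size_t)((key >> 4) % TOY_CACHE_BUCKETS);
}

/* Returns the cached value, or NULL on a miss. */
static const char *toy_cache_get(const struct toy_cache *c, uintptr_t key)
{
	const struct toy_cache_entry *e;

	for (e = c->buckets[toy_bucket(key)]; e; e = e->next)
		if (e->key == key)
			return e->value;
	return NULL;
}

/* Returns 0 on success, -1 if memory allocation fails. */
static int toy_cache_put(struct toy_cache *c, uintptr_t key, const char *value)
{
	struct toy_cache_entry *e = malloc(sizeof(*e));

	if (!e)
		return -1;
	e->key = key;
	e->value = value;
	e->next = c->buckets[toy_bucket(key)];
	c->buckets[toy_bucket(key)] = e;
	return 0;
}

/* Frees all entries; the cached values are not owned by the cache. */
static void toy_cache_clear(struct toy_cache *c)
{
	for (size_t i = 0; i < TOY_CACHE_BUCKETS; i++) {
		while (c->buckets[i]) {
			struct toy_cache_entry *e = c->buckets[i];

			c->buckets[i] = e->next;
			free(e);
		}
	}
}
```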
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Reviewed-by: Petr Pavlu <petr.pavlu@suse.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
|
|
Start making gendwarfksyms more useful by adding support for
expanding DW_TAG_base_type types and basic DWARF attributes.
Example:
$ echo loops_per_jiffy | \
scripts/gendwarfksyms/gendwarfksyms \
--debug --dump-dies vmlinux.o
...
gendwarfksyms: process_symbol: loops_per_jiffy
variable base_type unsigned long byte_size(8) encoding(7)
...
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Reviewed-by: Petr Pavlu <petr.pavlu@suse.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
|
|
The compiler may choose not to emit type information in DWARF for all
aliases, but it's possible for each alias to be exported separately.
To ensure we find type information for the aliases as well, read
{section, address} tuples from the symbol table and match symbols also
by address.
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Reviewed-by: Petr Pavlu <petr.pavlu@suse.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
|
|
Add a basic DWARF parser, which uses libdw to traverse the debugging
information in an object file and looks for functions and variables.
In follow-up patches, this will be expanded to produce symbol versions
for CONFIG_MODVERSIONS from DWARF.
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Reviewed-by: Petr Pavlu <petr.pavlu@suse.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
|
|
Currently, 'unsigned long' is used for intermediate variables when
calculating CRCs.
The size of 'long' differs depending on the architecture: it is 32 bits
on 32-bit architectures and 64 bits on 64-bit architectures.
The CRC values generated by genksyms represent the compatibility of
exported symbols. Therefore, reproducibility is important. In other
words, we need to ensure that the output is the same when the kernel
source is identical, regardless of whether genksyms is running on a
32-bit or 64-bit build machine.
Fortunately, the output from genksyms is not affected by the build
machine's architecture because only the lower 32 bits of the
'unsigned long' variables are used.
To make it even clearer that the CRC calculation is independent of
the build machine's architecture, this commit explicitly uses the
fixed-width type, uint32_t.
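As an illustration of why the width of the intermediate variable does not change the result, here is a minimal sketch. It is not the genksyms code: it assumes a bitwise CRC-32 with the common reflected polynomial 0xEDB88320, and the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* CRC-32 with a fixed-width 32-bit intermediate. */
static uint32_t demo_crc32_fixed(const unsigned char *buf, size_t len)
{
	uint32_t crc = 0xffffffff;

	assert(buf != NULL || len == 0);
	for (size_t i = 0; i < len; i++) {
		crc ^= buf[i];
		for (int bit = 0; bit < 8; bit++)
			crc = (crc >> 1) ^ (0xedb88320u & (0u - (crc & 1u)));
	}
	return crc ^ 0xffffffff;
}

/*
 * Same algorithm with an 'unsigned long' intermediate. Where long is
 * 64 bits wide, the upper 32 bits stay zero because the value is only
 * ever shifted right and XORed with 32-bit constants.
 */
static unsigned long demo_crc32_long(const unsigned char *buf, size_t len)
{
	unsigned long crc = 0xffffffff;

	assert(buf != NULL || len == 0);
	for (size_t i = 0; i < len; i++) {
		crc ^= buf[i];
		for (int bit = 0; bit < 8; bit++)
			crc = (crc >> 1) ^ (0xedb88320ul & (0ul - (crc & 1ul)));
	}
	return crc ^ 0xffffffff;
}
```

Both variants agree in the lower 32 bits on any host. The fixed-width version states that property in the type rather than leaving it to be inferred.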
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
|
|
Use the macros provided by hashtable.h.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
|
|
free_list() must be called before returning from this for-loop.
Swap 'break' and the combination of free_list() and 'return'.
This reduces the code and minimizes the risk of introducing memory
leaks in future changes.
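The general pattern of leaving a loop with 'break' so that a single release point runs can be sketched as follows. This is a hypothetical stand-alone example, not the genksyms function: `demo_find_token` and its scratch buffer stand in for the list that free_list() releases.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Return the index of the first token equal to 'wanted', or -1. */
static int demo_find_token(const char *const *tokens, size_t n,
			   const char *wanted)
{
	int found = -1;
	size_t len;
	char *scratch;

	assert(wanted != NULL);
	len = strlen(wanted);
	scratch = malloc(len + 1);	/* resource acquired before the loop */
	if (!scratch)
		return -1;

	for (size_t i = 0; i < n; i++) {
		if (strlen(tokens[i]) != len)
			continue;
		memcpy(scratch, tokens[i], len + 1);
		if (memcmp(scratch, wanted, len) == 0) {
			found = (int)i;
			break;	/* leave the loop instead of returning here */
		}
	}

	free(scratch);	/* the only release point, reached on every path */
	return found;
}
```

Because no path returns from inside the loop, a later change that adds another exit condition cannot skip the cleanup.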
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
|
|
To improve readability, reduce the indentation as follows:
- Use 'continue' earlier when the symbol does not match
- Flip !sym->is_declared to flatten the if-else chain
No functional changes are intended.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
|
|
When a symbol that is already registered is read again from a *.symref
file, __add_symbol() removes the previous one from the hash table without
freeing it.
[Test Case]
$ cat foo.c
#include <linux/export.h>
void foo(void);
void foo(void) {}
EXPORT_SYMBOL(foo);
$ cat foo.symref
foo void foo ( void )
foo void foo ( void )
When a symbol is removed from the hash table, it must be freed along
with its ->name and ->defn members. However, sym->name cannot be freed
because it is sometimes shared with node->string, but not always. If
sym->name and node->string share the same memory, free(sym->name) could
lead to a double-free bug.
To resolve this issue, always assign a strdup'ed string to sym->name.
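The ownership rule can be sketched in isolation. The structure and helper names below are hypothetical and much simpler than the genksyms ones; the sketch only shows that once the symbol always holds its own strdup'ed copies, freeing it is safe no matter where the caller's string came from.

```c
#define _POSIX_C_SOURCE 200809L	/* for strdup() */
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical symbol record that owns both of its strings. */
struct demo_symbol {
	char *name;
	char *defn;
};

/* Always copy the inputs, so the symbol never shares memory with the caller. */
static struct demo_symbol *demo_symbol_new(const char *name, const char *defn)
{
	struct demo_symbol *sym;

	assert(name != NULL && defn != NULL);
	sym = malloc(sizeof(*sym));
	if (!sym)
		return NULL;

	sym->name = strdup(name);
	sym->defn = strdup(defn);
	if (!sym->name || !sym->defn) {
		free(sym->name);
		free(sym->defn);
		free(sym);
		return NULL;
	}
	return sym;
}

/* Safe unconditionally: every member was allocated by demo_symbol_new(). */
static void demo_symbol_free(struct demo_symbol *sym)
{
	if (!sym)
		return;
	free(sym->name);
	free(sym->defn);
	free(sym);
}
```

The cost is one extra allocation per symbol. In exchange, the free path no longer has to know whether the name was shared.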
Fixes: 64e6c1e12372 ("genksyms: track symbol checksum changes")
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
|
|
When a symbol that is already registered is added again, __add_symbol()
returns without freeing the symbol definition, making it unreachable.
The following test cases demonstrate different memory leak points.
[Test Case 1]
Forward declaration with exactly the same definition
$ cat foo.c
#include <linux/export.h>
void foo(void);
void foo(void) {}
EXPORT_SYMBOL(foo);
[Test Case 2]
Forward declaration with a different definition (e.g. attribute)
$ cat foo.c
#include <linux/export.h>
void foo(void);
__attribute__((__section__(".ref.text"))) void foo(void) {}
EXPORT_SYMBOL(foo);
[Test Case 3]
Preserving an overridden symbol (compile with KBUILD_PRESERVE=1)
$ cat foo.c
#include <linux/export.h>
void foo(void);
void foo(void) { }
EXPORT_SYMBOL(foo);
$ cat foo.symref
override foo void foo ( int )
The memory leaks in Test Case 1 and 2 have existed since the introduction
of genksyms into the kernel tree. [1]
The memory leak in Test Case 3 was introduced by commit 5dae9a550a74
("genksyms: allow to ignore symbol checksum changes").
When multiple init_declarators are reduced to an init_declarator_list,
the decl_spec must be duplicated. Otherwise, the following Test Case 4
would result in a double-free bug.
[Test Case 4]
$ cat foo.c
#include <linux/export.h>
extern int foo, bar;
int foo, bar;
EXPORT_SYMBOL(foo);
In this case, 'foo' and 'bar' share the same decl_spec, 'int'. It must
be unshared before being passed to add_symbol().
[1]: https://git.kernel.org/pub/scm/linux/kernel/git/history/history.git/commit/?id=46bd1da672d66ccd8a639d3c1f8a166048cca608
Fixes: 5dae9a550a74 ("genksyms: allow to ignore symbol checksum changes")
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
|
|
I do not think the '#' flag is useful here because adding the explicit
'0x' is clearer. Add the '0' flag to zero-pad the CRC values.
This change gives better alignment in the generated *.mod.c files.
There is no impact on the compiled modules.
[Before]
$ grep -A5 modversion_info fs/efivarfs/efivarfs.mod.c
static const struct modversion_info ____versions[]
__used __section("__versions") = {
{ 0x907d14d, "blocking_notifier_chain_register" },
{ 0x53d3b64, "simple_inode_init_ts" },
{ 0x65487097, "__x86_indirect_thunk_rax" },
{ 0x122c3a7e, "_printk" },
[After]
$ grep -A5 modversion_info fs/efivarfs/efivarfs.mod.c
static const struct modversion_info ____versions[]
__used __section("__versions") = {
{ 0x0907d14d, "blocking_notifier_chain_register" },
{ 0x053d3b64, "simple_inode_init_ts" },
{ 0x65487097, "__x86_indirect_thunk_rax" },
{ 0x122c3a7e, "_printk" },
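The difference between the two printf styles can be reproduced with a small sketch. The helper names are hypothetical, and the format strings are only illustrative of the '#' flag versus an explicit '0x' with the '0' flag; they are not copied from the tool.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* '#' flag: the "0x" prefix is added by printf, with no zero padding. */
static int demo_format_alt(char *out, size_t size, uint32_t crc)
{
	assert(out != NULL);
	return snprintf(out, size, "%#" PRIx32, crc);
}

/* Explicit "0x" plus the '0' flag and a width of 8: always 10 characters. */
static int demo_format_padded(char *out, size_t size, uint32_t crc)
{
	assert(out != NULL);
	return snprintf(out, size, "0x%08" PRIx32, crc);
}
```

With the value 0x0907d14d from the listing above, the first helper yields `0x907d14d` and the second yields `0x0907d14d`, so only the second keeps the columns aligned.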
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
|
|
Commit 71810db27c1c ("modversions: treat symbol CRCs as 32 bit
quantities") changed the CRC fields to s32 because the __kcrctab and
__kcrctab_gpl sections contained relative references to the actual
CRC values stored in the .rodata section when CONFIG_MODULE_REL_CRCS=y.
Commit 7b4537199a4a ("kbuild: link symbol CRCs at final link, removing
CONFIG_MODULE_REL_CRCS") removed this complexity. Now, the __kcrctab
and __kcrctab_gpl sections directly contain the CRC values in all cases.
The genksyms tool outputs unsigned 32-bit CRC values, so u32 is preferred
over s32.
No functional changes are intended.
Regardless of this change, the CRC value is assigned to the u32 variable
'crcval' before the comparison, as seen in kernel/module/version.c:
crcval = *crc;
This assignment was previously mandatory (it is now optional) to avoid
sign extension, because the following line used to compare an
'unsigned long' with an 's32':
if (versions[i].crc == crcval)
return 1;
versions[i].crc is still 'unsigned long' for backward compatibility.
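The sign-extension hazard mentioned above can be shown with a stand-alone sketch. The function names and the sample value are hypothetical; `int32_t` and `uint32_t` stand in for the kernel's s32 and u32.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Direct comparison: the signed value is converted to 'unsigned long',
 * which sign-extends it where long is wider than 32 bits.
 */
static int demo_match_direct(unsigned long stored, int32_t crc)
{
	return stored == (unsigned long)crc;
}

/* Comparison through a 32-bit unsigned copy, as with 'crcval'. */
static int demo_match_via_u32(unsigned long stored, int32_t crc)
{
	uint32_t crcval = (uint32_t)crc;

	return stored == crcval;
}

/* Usage example with a CRC whose top bit is set. */
static void demo_self_check(void)
{
	const unsigned long stored = 0xdeadbeefUL;
	const int32_t crc = INT32_C(-559038737);	/* bit pattern 0xdeadbeef */

	assert(demo_match_via_u32(stored, crc));
	/* The direct form only matches where long is 32 bits wide. */
	assert(demo_match_direct(stored, crc) == (sizeof(unsigned long) == 4));
}
```

With an unsigned 32-bit CRC field, the conversion to 'unsigned long' zero-extends, so the intermediate copy is no longer required for correctness.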
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Petr Pavlu <petr.pavlu@suse.com>
|