Add BuildID Support
Summary
- Make core dumps self-identifying enough to find the exact correct versions of all relevant binaries and debuginfo
- Optimize debuginfo validation to avoid current whole-file checksumming
Owner
- Name: RolandMcGrath
Current status
- Targeted release: Fedora 8
- Last updated: 2007-10-04
- Percentage of completion: 100%
Detailed Description
The Problems
Quick debuginfo validation
The current .gnu_debuglink ties a stripped object (executable, DSO, .ko) to
its .debug file with a name (in practice always foobar.debug for a file
originally called /some/thing/foobar), and a (weak) CRC32 checksum of the
.debug file. The only way to find the .debug file is by name based on
knowing the name of the stripped object, the two having come in parallel
from rpm or whatever. The only way to verify it's really the right one is to
read the entire contents of the .debug file from disk to checksum it. This
is obviously poor for memory and I/O performance, especially as the DWARF
consumers keep improving to avoid reading in as much debuginfo at once.
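To make the cost concrete, here is a minimal sketch of whole-file CRC32 validation. It assumes the stored checksum is the standard CRC-32 as computed by zlib; the function name and chunk size are illustrative only and not taken from any existing tool.

```python
import zlib


def debuglink_crc32(path, chunk_size=1 << 16):
    """Checksum an entire file with CRC-32, reading it chunk by chunk.

    Every byte of the file has to come off the disk, which is the
    cost the build ID scheme is meant to avoid.
    """
    crc = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            # Feed the running value back in so the result matches
            # a single CRC-32 pass over the whole file.
            crc = zlib.crc32(chunk, crc)
    return crc & 0xFFFFFFFF
```

For a debuginfo file of several hundred megabytes this loop touches all of it just to answer a yes/no question.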
A while ago, a few of us spoke about improving this. What we discussed was
switching to a strong checksum (sha1sum) so that comparing two stored
checksums is sufficient for confidence in validation. The important change
is to have the checksum stored in both the .debug file and the stripped
file, so that there is a small amount of data to read from the file that is
sufficient to do the validation. We had in mind simply a new kind of
debuglink section (still not allocated) that eu-strip/objcopy would insert
into both output files. We didn't really change the checksum plan aside
from the algorithm, i.e. using sha1sum of the entire unstripped file or of
the .debug file with an all-zeros debuglink section.
In later contemplation, I became dissatisfied with using a checksum on the
entire contents of the .debug file. I would like to be able to perform
transformations on the DWARF data after the fact (fancy compression et al)
and still say "this is the DWARF data for your binary". It feels wrong to
have to edit the stripped binary to make it match transformed debug data.
(In the abstract, I also think one should always be able to do spurious ELF
file layout juggling that doesn't change the semantics of the data.) So I
thought about using sha1sum of the loaded segments and phdrs or something.
But that's not right. Your main(){} and my main(){} are going to produce
the same stripped binaries, but my binaries should get me the source code
with my comments in it, and not yours just because it's identical modulo
pontification. (It's all about the pontification!)
But, the real plan behind using a strong checksum was never actually to
compute a checksum from the data ever again after build time, because you
really just rely on the comparison of a strongly unique embedded identifier.
Finding binaries for dumps
People dealing with core files often confront the problem of not knowing exactly which executable and DSO binaries were in use when the program was running. It may be hours or months after the crash when a person attempts earnest post-mortem analysis, and one really needs to have the right binaries to get anywhere. Normally the text segments are elided from the dump, so you don't have the original code to examine or to compare to a file on disk you think was the one in use.
For post-mortem debugging now, the only way to find the DSOs involved is to examine the dynamic linker's data in the dump's memory image where quasi-standard data structures give the file names the dynamic linker used to open the DSOs. That process requires you know the right address in the executable's data to look at, or know the right dynamic linker file to consult and grok its internal symbols. So even to get bootstrapped, you need to have the right executable file (or perhaps the right dynamic linker). Then, all you have are names, with no clear way to know whether the files by those names now match what existed there when this dump was made.
The case of kernel-mode/whole-machine dumps is not fundamentally different from user core dump files, except that usually all the memory is dumped so you have the text to look at.
File storage space being what it is, perhaps many people would be happy to
have core dumps include the full text segments (see below); the data being
dumped is often far larger than the text already. Full text segments are
very handy to make e.g. disassembly and unwind info (if in .eh_frame)
available immediately without consulting any other files at all. But, often
you really want to know the detailed provenance of the binary, or at least
quickly find the debuginfo that goes with it. What you need from the dump
is an unambiguous identifier that you can associate with each
DSO/executable that went into the crash. Having the whole text is not so
important, because you really just rely on the lookup and comparison of a
strongly unique embedded identifier.
The Plan
Unique build ID
What we really want is a unique build ID. At first I liked the canonical
UUID generation for this (128 bits of random or of something time and
host-based). But that has the very undesirable property of making for
unreproducible builds, where it's between difficult and impossible to start
from the same conditions and repeat the procedure of making binaries from
all the same constituents to get binaries without gratuitous differences
from the original build. Perhaps something like sha1sum of the unstripped
file is what we want to use as the basis of a reproducible identifier unique
to completely identical builds. But I'd like to specify it explicitly as
being a unique identifier good only for matching, not any kind of checksum
that can be verified against the contents. (There are external general
means for content verification, and I don't think debuginfo association
needs to do that.)
To embed an ID into both the stripped object and its .debug file, I've
chosen to use an ELF note section. strip et al can keep the section intact
in both files when its type is SHT_NOTE. The new section is canonically
called .note.gnu.build-id, but the name is not normative, and the section
can be merged with other SHT_NOTE sections. The ELF note headers give name
"GNU" and type 3 (NT_GNU_BUILD_ID) for a build ID note, of which there can
be only one in a linked object (or an ET_REL file of the .ko style). The
descsz and the following data can be any nonzero number of bytes chosen by
the producer. It should be long enough to be plausibly truly unique while
not being unreasonably long to use as a shorthand. The bits should be
chosen in a fashion that makes it a useful approximation of true uniqueness
across all binaries that might be used by overlapping sets of people.
Likely common sizes are 16 (UUID or MD5) and 20 (SHA1). The section
canonically has SHF_ALLOC in sh_flags, meaning it appears in the loaded
memory image, and the normal link order puts allocated note sections very
early in the memory image.
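The note layout described above can be sketched as a small parser over the raw bytes of a note section. This is only an illustration: it assumes the usual ELF note record layout (namesz, descsz, type, then name and descriptor each padded to 4 bytes) and little-endian data by default. The function names are made up for this example.

```python
import struct

NT_GNU_BUILD_ID = 3  # note type given in the text above


def iter_notes(data, byteorder="<"):
    """Yield (name, type, desc) for each note record in raw note bytes.

    Assumes 4-byte padding of name and desc; byteorder is "<" for
    little-endian or ">" for big-endian objects.
    """
    header = struct.Struct(byteorder + "III")
    offset = 0
    while offset + header.size <= len(data):
        namesz, descsz, note_type = header.unpack_from(data, offset)
        offset += header.size
        name = data[offset:offset + namesz].rstrip(b"\0")
        offset += (namesz + 3) & ~3
        desc = data[offset:offset + descsz]
        offset += (descsz + 3) & ~3
        yield name, note_type, desc


def find_build_id(data, byteorder="<"):
    """Return the build ID as a hex string, or None if no such note exists."""
    for name, note_type, desc in iter_notes(data, byteorder):
        if name == b"GNU" and note_type == NT_GNU_BUILD_ID and desc:
            return desc.hex()
    return None
```

Because the section may be merged with other SHT_NOTE sections, the parser walks every record rather than assuming the build ID note comes first.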
Use build ID to validate debuginfo
DWARF consumers that look at separate debuginfo files to go with a stripped
file now validate their association by doing CRC32 on the whole debuginfo
file (sometimes huge) to compare with the .gnu_debuglink checksum. They can
change to check for a build ID note in the stripped file. When a build ID
is present, check that the debuginfo file contains an identical
corresponding SHT_NOTE section. This requires reading only a small part of
each file, probably the first and last pages of each.
Include build IDs in core dumps
In the kernel's core dump code, it's easy to detect the mappings likely to
be the first mapping for an executable or DSO: MAP_PRIVATE to a file at
offset 0, with ELFMAG in the first word of the mapped contents. For each of
those, include the first page of the mapping in the core dump (p_filesz =
PAGE_SIZE, p_memsz = larger total). The build ID note is normally near the
beginning of the image and will be in the first page unless there are an
unreasonable number of phdrs or other notes. (The kernel should not deal
with anything more complex than a simple rule like the first page, so the
odd binary with its notes in the wrong places will just lose.) If it was
some innocent mmap'ing and not really a loaded DSO or the executable, then
no harm done. No one minds several extra pages in a core dump (it just
shouldn't have whole huge text segments).
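The selection rule above can be expressed as a small predicate. This is a user-space sketch of the logic only, not kernel code: the Mapping record and its field names are hypothetical, and the page size of 4096 is an assumption.

```python
from dataclasses import dataclass

ELFMAG = b"\x7fELF"  # the four ELF magic bytes
PAGE_SIZE = 4096     # assumption; the real value depends on the machine


@dataclass
class Mapping:
    """Hypothetical description of one memory mapping of a process."""
    file_backed: bool
    private: bool       # MAP_PRIVATE rather than MAP_SHARED
    file_offset: int
    first_bytes: bytes  # the first few bytes of the mapped contents


def dump_first_page(m):
    """Apply the simple rule: private file mapping at offset 0 with ELF magic."""
    return (
        m.file_backed
        and m.private
        and m.file_offset == 0
        and m.first_bytes[:4] == ELFMAG
    )


def dumped_file_size(m):
    """Bytes of this mapping written to the dump under the rule (p_filesz)."""
    return PAGE_SIZE if dump_first_page(m) else 0
```

A false positive costs one extra page in the dump, which is why the rule can stay this simple.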
It's possible that existing consumers are confused by a core file PT_LOAD
segment with p_filesz < p_memsz but not p_filesz == 0. If need be, we can
write a simple tool to extract ELF ident notes and list ID:address
associations while removing the extra pages from the core file. GDB used to
load the remaining part of such code segments as zeros; the fix has been
accepted upstream and is present in Rawhide since gdb-6.6-21.fc8.
The Work
Work I have done so far that is not already upstream is in http://people.redhat.com/roland/build-id/.
Put a build ID into every binary
- Compiler toolchain
ld: new option --build-id DONE
This adds an option to ld to synthesize a .note.gnu.build-id section with
type SHT_NOTE and flags SHF_ALLOC (read-only data), that contains an ELF
note header and the build ID bits. This then goes into the link as if it
were part of the first object file (so it may be placed or merged by the
linker script). The build ID bits are determined as the very last thing ld
does before writing out the linked file. You can give --build-id=style to
choose md5, uuid (128 random bits), or 0xabcdef (your chosen bytes in hex).
Just --build-id defaults to md5, which computes a 128-bit MD5 signature
based on all the ELF header bits and section contents in the file--i.e.,
an ID that is unique among the set of meaningful contents for ELF files
and identical when the output file would otherwise have been identical.
The Linux binutils-2.17.50.0.17 release includes this, in f8test1.
binutils-2.17.50.0.17-3 and later in Rawhide have the feature.
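The three styles can be mimicked in a few lines to show how they differ in reproducibility. This is a rough sketch, not what ld does internally: ld hashes the ELF header bits and section contents, whereas this hypothetical helper simply hashes whatever bytes it is handed.

```python
import hashlib
import os


def make_build_id(style, contents=b""):
    """Return build ID bytes for a given style (illustrative helper only).

    "md5"      -> 16 bytes derived from the contents (reproducible)
    "uuid"     -> 16 random bytes (different on every build)
    "0x<hex>"  -> the caller's chosen bytes
    """
    if style == "md5":
        return hashlib.md5(contents).digest()
    if style == "uuid":
        return os.urandom(16)
    if style.startswith("0x"):
        return bytes.fromhex(style[2:])
    raise ValueError("unknown build ID style: " + style)
```

Only the content-derived style yields the same ID when the same inputs are linked again, which is the property the text asks for.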
eu-strip: keep allocated notes in debuginfo DONE
The -f option used for separate debuginfo generation has been changed to
preserve all SHT_NOTE sections intact in the debuginfo file. The version
released in elfutils-0.128 does this.
objcopy: keep allocated notes in debuginfo DONE
The --only-keep-debug option is used for separate debuginfo generation in
regimes not using elfutils. I have committed a change upstream to keep all
SHT_NOTE sections intact in the debuginfo file.
The Linux binutils-2.17.50.0.17 release includes this, in f8test1.
- compiler passes option by default POSTED, RAWHIDE
So that this can be useful all the time, the default compiler settings
should pass --build-id for every normal final link of a DSO or executable.
Probably %{!nostartfiles:--build-id} or something. We can try this first in
Fedora 8, but I think it is a reasonable default for upstream gcc once ld
supports --build-id.
f8test1 gcc has this. I've sent a patch upstream, but it has not been
integrated yet.
- Packaging
debugedit DONE
The program /usr/lib/rpm/debugedit is used in rpm's separate debuginfo
creation. It modifies DWARF data to replace the build-time absolute
directory names with consistent names. This makes the installed debuginfo's
source references usable, and it makes for reproducible rpm builds from
identical constituents to produce identical binaries. The build ID computed
by ld was affected by the name of the rpm build directory, so it will
differ between two otherwise identical builds that used a different
$RPM_BUILD_ROOT.
My patch adds the -i option to make debugedit recompute the build ID based
on the contents of the file after the transformation. It also prints out
the build ID bits in hex.
This is in rpm-4.4.2.1, in f8-test1.
find-debuginfo.sh DONE
The /usr/lib/rpm/find-debuginfo.sh script is what runs debugedit. I've
modified the script to pass the new option to debugedit. Since my goal for
Fedora 8 is to have a build ID in every binary in the distribution, I've
made it more sensitive to errors from debugedit and it will fail if there
was no build ID. It also adds to the debuginfo package a symlink to the
binary and to the debuginfo file from /usr/lib/debug/.build-id/ (see
below).
The new script is part of rpm-build >= 4.4.2.1-4.fc8, in Rawhide after test1.
Linux kernel changes
- core dump code SUBMITTED,RAWHIDE
The upstream 2.6.23 kernel includes new controls via /proc/pid/coredump_filter. My patch adds a new bit there, which makes core dumping include the first page of an ELF file mapping that would otherwise have been elided. This is turned on by default in F8, but can be disabled for a process and its children via /proc/pid/coredump_filter.
kernel-2.6.23-0.30.rc0.git6.fc8 and later in Rawhide have the feature.
- use --build-id on the kernel DONE
My patch series changes the kernel linker script on several machines to
handle allocated note sections, and uses --build-id when ld supports it.
Any other oddball programs that have a really good reason to use the linker
directly instead of letting the compiler choose the options also need to be
changed.
This is incorporated upstream now.
- xen NEEDED
Use --build-id when linking the hypervisor.
- assembly debuginfo POSTED,RAWHIDE
My patch series turns on -g for assembly files in the kernel, so the
debuginfo is useful for even more of the code (to read comments in the
source and so forth).
This is in -mm, and is enabled in Rawhide kernels now.
- vDSO debuginfo POSTED,RAWHIDE
My patch series also has the kernel's make install copy unstripped versions
of the vDSO images into /lib/modules/.../vdso alongside the kernel's
modules. These are linked as normal DSOs via gcc, so they will get build
IDs implicitly. Having these on disk means that when a build ID is found in
the vDSO image embedded in a core dump, the same means used for normal DSOs
can find the kernel debuginfo package and lead to seeing the vDSO's
assembly in original source form.
This is in -mm, and is enabled in Rawhide kernels now.
- /sys/kernel/notes DONE
The bonus feature in my patch series adds the magic file /sys/kernel/notes.
Reading this gives you the binary contents of the ELF notes section built
into the kernel. Here you can find the build ID of the running kernel. This
gives a solution to a problem that has arisen for systemtap users, where
nothing prevents them from using the kernel-debuginfo.i586 data to drive
Systemtap's probe details while actually running the kernel from the
kernel.i686 rpm. This is a failure on many levels, but some simple
sanity-checking at the bottom always helps. Now it is easy to verify you
have the right debuginfo file for the kernel you are running.
This is incorporated upstream now.
Replace debuginfo CRC32 check
The consumers of debuginfo files that now use the CRC32 value in the
.gnu_debuglink section can look for build IDs and compare those instead.
- gdb (bfd) RAWHIDE
- elfutils (libdwfl) UPSTREAM
Find files by build ID
There needs to be some standard way to take a build ID gleaned from a memory dump and look it up to learn its proper name and find its debuginfo file. A large organization producing lots of builds might want to incorporate build ID info into their own database tracking the details of all their builds. Such complex things are beyond the scope of what I've designed.
- Packager/consumer convention. now de facto standard
My current thinking is to use some simple filesystem conventions for looking up build IDs. This uses everybody's favorite backend database for file-sized objects keyed by fixed-size bitstrings, the humble Unix directory. At least for system installed binaries, this seems appropriately simple and adequate for the number of unique IDs in use on one system.
My plan is that the debuginfo directory contains a .build-id/ subdirectory
of symlinks named by the build ID bits rendered in ASCII hex, one symlink to
the stripped file and one to the debuginfo file. For example,
/usr/lib/debug/.build-id/ab/cdef1234 for abcdef1234 (real ones are 32 or
more chars long, not 10), so /usr/bin/foo might yield:
/usr/lib/debug/.build-id/ab/cdef1234 -> ../../../../bin/foo
/usr/lib/debug/.build-id/ab/cdef1234.debug -> ../../usr/bin/foo.debug
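The naming convention in this example can be sketched as a small path helper. The function name is hypothetical, and the default directory is simply the one used in the example above; a real consumer would use whatever debuginfo directories it is already configured with.

```python
import posixpath


def build_id_paths(build_id_hex, debug_dir="/usr/lib/debug"):
    """Return (binary symlink path, debuginfo symlink path) for a build ID.

    The first two hex digits name a subdirectory and the remaining
    digits name the symlink, following the example in the text.
    """
    build_id_hex = build_id_hex.lower()
    bytes.fromhex(build_id_hex)  # raises ValueError if not valid hex
    if len(build_id_hex) < 4:
        raise ValueError("build ID too short to split into dir/file")
    base = posixpath.join(
        debug_dir, ".build-id", build_id_hex[:2], build_id_hex[2:]
    )
    return base, base + ".debug"
```

Given the ID bits, a consumer can compute the path directly and open it, with no search through the filesystem.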
The advantages of this are:
- Use existing configuration for debuginfo directory or path
Consumers can just start looking for .build-id/... in the directories where
they now look for debuginfo files by name; no new user configuration is
required.
- Simple convention for all users, packagers, consumers to follow
- Optimal for consumers
Go from build ID bits to open debuginfo file in one system call.
- Simple addition to existing packaging
For rpm, the changes in find-debuginfo.sh are straightforward, so build-ID
symlinks are included in the debuginfo rpm automatically along with the
debuginfo files and source files.
- Leverage existing packaging support for files
yum install /usr/lib/debug/.build-id/... works with no special support.
- If something goes awry in the toolchain, existing mechanisms for e.g.
finding file names included in two different packages in the same
repository can flag it for the debuginfo repository.
- Fedora 8 rebuild with build IDs in binaries, symlinks in debuginfo DONE
- libdwfl (elfutils) build ID-aware consumer
- build ID-driven find_debuginfo UPSTREAM
- build ID recovery from core files WORK IN PROGRESS
This is in upstream elfutils and may still make F8 release. If not, it will be in an elfutils update shortly after F8.
- bfd (gdb) build ID-aware consumer RAWHIDE
Jan Kratochvil is working on this for F8.
Compatibility testing
DONE: old gdb no good
Existing versions of gdb are confused by a core file with segments that are only partially present in the dump but not elided completely. With my patch the kernel produces these, which have not been seen before. The meaning of the ELF header fields is perfectly clear and valid for this case, but it is a new wrinkle compared to tradition. This is why the new format option won't be enabled by default in upstream kernels for some time to come.
User Experience
End users don't really notice an impact. The big help is that development tools will be more efficient in finding their debuginfo.
Dependencies
Self-contained, but a fair number of tools need changing (see above)
Contingency Plan
Things have to continue to work whether or not they're built with the new information, so the contingency is just that this is only partially implemented.
Documentation
See above.
Release Notes
See above.