Add BuildID Support

Summary

Make core dumps self-identifying enough to find the exact correct versions of all relevant binaries and debuginfo
Optimize debuginfo validation to avoid current whole-file checksumming

Owner

Name: RolandMcGrath

Current status

Targeted release: Fedora 8
Last updated: 2007-10-04
Percentage of completion: 100%

Detailed Description

The Problems

Quick debuginfo validation

The current .gnu_debuglink ties a stripped object (executable, DSO, .ko) to its .debug file with a name (in practice always foobar.debug in a file originally called /some/thing/foobar), and a (weak) CRC32 checksum of the .debug file. The only way to find the .debug file is by name based on knowing the name of the stripped object, the two having come in parallel from rpm or whatever. The only way to verify it's really the right one is to read the entire contents of the .debug file from disk to checksum it. This is obviously poor for memory and I/O performance, especially as the DWARF consumers keep improving to avoid reading in as much debuginfo at once.
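To make the cost concrete, here is a minimal sketch of the whole-file check a .gnu_debuglink consumer has to perform. It assumes the debuglink CRC is the same CRC-32 that zlib computes; the function name debuglink_crc32 and the chunk size are illustrative, not taken from any tool.

```python
import zlib


def debuglink_crc32(path, chunk_size=1 << 20):
    """Checksum an entire .debug file, as a .gnu_debuglink check requires."""
    crc = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            # Running CRC: every byte of the file has to be read from disk.
            crc = zlib.crc32(chunk, crc)
    return crc & 0xFFFFFFFF
```

The loop touches every byte of the file, however little of the DWARF data the consumer goes on to use.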

A while ago, a few of us spoke about improving this. What we discussed was switching to a strong checksum (sha1sum) so that comparing two stored checksums is sufficient for confidence in validation. The important change is to have the checksum stored in both the .debug file and the stripped file, so that there is a small amount of data to read from the file that is sufficient to do the validation. We had in mind simply a new kind of debuglink section (still not allocated) that eu-strip/objcopy would insert into both output files. We didn't really change the checksum plan aside from the algorithm, i.e. using sha1sum of the entire unstripped file or of the .debug file with an all-zeros debuglink section.

In later contemplation, I became dissatisfied with using a checksum on the entire contents of the .debug file. I would like to be able to perform transformations on the DWARF data after the fact (fancy compression et al) and still say "this is the DWARF data for your binary". It feels wrong to have to edit the stripped binary to make it match transformed debug data. (In the abstract, I also think one should always be able to do spurious ELF file layout juggling that doesn't change the semantics of the data.) So I thought about using sha1sum of the loaded segments and phdrs or something. But that's not right. Your main(){} and my main(){} are going to produce the same stripped binaries, but my binaries should get me the source code with my comments in it, and not yours just because it's identical modulo pontification. (It's all about the pontification!) But, the real plan behind using a strong checksum was never actually to compute a checksum from the data ever again after build time, because you really just rely on the comparison of a strongly unique embedded identifier.

Finding binaries for dumps

People dealing with core files often confront the problem of not knowing exactly which executable and DSO binaries were in use when the program was running. It may be hours or months after the crash when a person attempts earnest post-mortem analysis, and one really needs to have the right binaries to get anywhere. Normally the text segments are elided from the dump, so you don't have the original code to examine or to compare to a file on disk you think was the one in use.

For post-mortem debugging now, the only way to find the DSOs involved is to examine the dynamic linker's data in the dump's memory image where quasi-standard data structures give the file names the dynamic linker used to open the DSOs. That process requires you know the right address in the executable's data to look at, or know the right dynamic linker file to consult and grok its internal symbols. So even to get bootstrapped, you need to have the right executable file (or perhaps the right dynamic linker). Then, all you have are names, with no clear way to know whether the files by those names now match what existed there when this dump was made.

The case of kernel-mode/whole-machine dumps is not fundamentally different from user core dump files, except that usually all the memory is dumped so you have the text to look at.

File storage space being what it is, perhaps many people would be happy to have core dumps include the full text segments (see below); the data being dumped is often far larger than the text already. Full text segments are very handy to make e.g. disassembly and unwind info (if in .eh_frame) available immediately without consulting any other files at all. But, often you really want to know the detailed provenance of the binary, or at least quickly find the debuginfo that goes with it. What you need from the dump is an unambiguous identifier that you can associate with each DSO/executable that went into the crash. Having the whole text is not so important, because you really just rely on the lookup and comparison of a strongly unique embedded identifier.

The Plan

Unique build ID

What we really want is a unique build ID. At first I liked the canonical UUID generation for this (128 bits of random or of something time and host-based). But that has the very undesirable property of making for unreproducible builds, where it's between difficult and impossible to start from the same conditions and repeat the procedure of making binaries from all the same constituents to get binaries without gratuitous differences from the original build. Perhaps something like sha1sum of the unstripped file is what we want to use as the basis of a reproducible identifier unique to completely identical builds. But I'd like to specify it explicitly as being a unique identifier good only for matching, not any kind of checksum that can be verified against the contents. (There are external general means for content verification, and I don't think debuginfo association needs to do that.)

To embed an ID into both the stripped object and its .debug file, I've chosen to use an ELF note section. strip et al can keep the section intact in both files when its type is SHT_NOTE. The new section is canonically called .note.gnu.build-id, but the name is not normative, and the section can be merged with other SHT_NOTE sections. The ELF note headers give name "GNU" and type 3 (NT_GNU_BUILD_ID) for a build ID note, of which there can be only one in a linked object (or an ET_REL file of the .ko style). The descsz and the following data can be any nonzero number of bytes chosen by the producer. It should be long enough to be plausibly truly unique while not being unreasonably long to use as a shorthand. The bits should be chosen in a fashion that makes it a useful approximation of true uniqueness across all binaries that might be used by overlapping sets of people. Likely common sizes are 16 (UUID or MD5) and 20 (SHA1). The section canonically has SHF_ALLOC in sh_flags, meaning it appears in the loaded memory image, and the normal link order puts allocated note sections very early in the memory image.
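The note layout described above can be sketched as follows. This is an illustration only: it assumes a little-endian target and the usual 4-byte padding of ELF note fields, and pack_build_id_note is a hypothetical helper name rather than part of any toolchain.

```python
import struct

NT_GNU_BUILD_ID = 3  # note type given in the text above


def pack_build_id_note(build_id: bytes) -> bytes:
    """Encode one build ID note: header, then name "GNU", then the ID bits."""
    if not build_id:
        raise ValueError("descsz must be nonzero")
    name = b"GNU\0"  # namesz counts the terminating NUL

    def pad4(b):
        # Note fields are padded with zeros to a 4-byte boundary.
        return b + b"\0" * (-len(b) % 4)

    # Header is three 32-bit words: namesz, descsz, type
    # (little-endian is assumed here).
    header = struct.pack("<III", len(name), len(build_id), NT_GNU_BUILD_ID)
    return header + pad4(name) + pad4(build_id)
```

With a 20-byte ID the whole note is 36 bytes (12 of header, 4 of name, 20 of ID), which is why it fits comfortably at the front of the memory image.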

Use build ID to validate debuginfo

DWARF consumers that look at separate debuginfo files to go with a stripped file now validate their association by doing CRC32 on the whole debuginfo file (sometimes huge) to compare with the .gnu_debuglink checksum. They can change to check for a build ID note in the stripped file. When a build ID is present, check that the debuginfo file contains an identical corresponding SHT_NOTE section. This requires reading only a small part of each file, probably the first and last pages of each.
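A consumer-side check along these lines might look like the sketch below. It is not the code used by gdb or libdwfl. It assumes 4-byte note alignment, ignores the extended section-count encoding, and the names read_build_id and debuginfo_matches are made up for the example.

```python
import struct

SHT_NOTE = 7          # ELF section type for note sections
NT_GNU_BUILD_ID = 3   # note type given in the text above


def _notes(blob, end):
    """Yield (name, type, desc) for each note record in a section's bytes."""
    pos = 0
    while pos + 12 <= len(blob):
        namesz, descsz, ntype = struct.unpack_from(end + "III", blob, pos)
        pos += 12
        name = blob[pos:pos + namesz]
        pos += (namesz + 3) & ~3   # name is padded to 4 bytes
        desc = blob[pos:pos + descsz]
        pos += (descsz + 3) & ~3   # so is the descriptor
        yield name, ntype, desc


def read_build_id(path):
    """Return the build ID bytes from an ELF file's SHT_NOTE sections, or None."""
    with open(path, "rb") as f:
        ident = f.read(16)
        if len(ident) < 16 or ident[:4] != b"\x7fELF":
            return None
        is64 = ident[4] == 2                  # EI_CLASS: 2 means 64-bit
        end = "<" if ident[5] == 1 else ">"   # EI_DATA: 1 means little-endian
        # Remainder of the ELF header; only e_shoff, e_shentsize, e_shnum matter.
        hdr_fmt = end + ("HHIQQQIHHHHHH" if is64 else "HHIIIIIHHHHHH")
        hdr = struct.unpack(hdr_fmt, f.read(struct.calcsize(hdr_fmt)))
        shoff, shentsize, shnum = hdr[5], hdr[10], hdr[11]
        # Leading section header fields: name, type, flags, addr, offset, size.
        sh_fmt = end + ("IIQQQQ" if is64 else "IIIIII")
        for i in range(shnum):
            f.seek(shoff + i * shentsize)
            sh = struct.unpack(sh_fmt, f.read(struct.calcsize(sh_fmt)))
            if sh[1] != SHT_NOTE:
                continue
            f.seek(sh[4])
            for name, ntype, desc in _notes(f.read(sh[5]), end):
                if name == b"GNU\0" and ntype == NT_GNU_BUILD_ID:
                    return desc
    return None


def debuginfo_matches(stripped_path, debug_path):
    """True only if both files carry a build ID and the two are identical."""
    build_id = read_build_id(stripped_path)
    return build_id is not None and build_id == read_build_id(debug_path)
```

Only the ELF header, the section header table, and the note sections are read, so the cost does not grow with the size of the DWARF data.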

Include build IDs in core dumps

In the kernel's core dump code, it's easy to detect the mappings likely to be the first mapping for an executable or DSO: MAP_PRIVATE to a file at offset 0, with ELFMAG in the first word of the mapped contents. For each of those, include the first page of the mapping in the core dump (p_filesz = PAGE_SIZE, p_memsz = larger total). The build ID note is normally near the beginning of the image and will be in the first page unless there are an unreasonable number of phdrs or other notes. (The kernel should not deal with anything more complex than a simple rule like the first page, so the odd binary with its notes in the wrong places will just lose.) If it was some innocent mmap'ing and not really a loaded DSO or the executable, then no harm done. No one minds several extra pages in a core dump (it just shouldn't have whole huge text segments).
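The rule is simple enough to model in a few lines. The sketch below is a user-space illustration, not kernel code: the Mapping record, the function name, and the 4096-byte page size are all assumptions made for the example, and it only covers file-backed mappings whose contents would otherwise be left out of the dump.

```python
from dataclasses import dataclass

ELFMAG = b"\x7fELF"
PAGE_SIZE = 4096  # assumed page size for this sketch


@dataclass
class Mapping:
    private: bool       # MAP_PRIVATE rather than shared
    file_offset: int    # offset into the backing file
    first_bytes: bytes  # start of the mapped contents
    length: int         # total size of the mapping (p_memsz)


def dumped_file_size(m: Mapping) -> int:
    """Return p_filesz for a mapping that would otherwise be elided entirely."""
    looks_like_elf = (
        m.private
        and m.file_offset == 0
        and m.first_bytes[:4] == ELFMAG
    )
    # Keep just the first page, where the build ID note normally lives.
    return min(PAGE_SIZE, m.length) if looks_like_elf else 0
```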

It's possible that existing consumers are confused by a core file PT_LOAD segment with p_filesz < p_memsz but not p_filesz == 0. If need be, we can write a simple tool to extract ELF ident notes and list ID:address associations while removing the extra pages from the core file. GDB used to load the remaining part of such code segments as zeros; a fix has been accepted upstream and is present in Rawhide since gdb-6.6-21.fc8.

The Work

Work I have done so far that is not already upstream is in http://people.redhat.com/roland/build-id/.

Put a build ID into every binary

Compiler toolchain

ld: new option --build-id DONE

This adds an option to ld to synthesize a .note.gnu.build-id section with type SHT_NOTE and flags SHF_ALLOC (read-only data), that contains an ELF note header and the build ID bits. This then goes into the link as if it were part of the first object file (so it may be placed or merged by the linker script). The build ID bits are determined as the very last thing ld does before writing out the linked file. You can give --build-id=style to choose md5, uuid (128 random bits), or 0xabcdef (your chosen bytes in hex). Just --build-id defaults to md5, which computes a 128-bit MD5 signature based on all the ELF header bits and section contents in the file, i.e., an ID that is unique among the set of meaningful contents for ELF files and identical when the output file would otherwise have been identical.

The Linux binutils-2.17.50.0.17 release includes this, in f8test1.

binutils-2.17.50.0.17-3 and later in Rawhide have the feature.
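The difference between the md5 and uuid styles comes down to reproducibility, which a small sketch can show. This is not how ld computes the ID (ld hashes header bits and section contents, as described above). The sketch hashes an arbitrary byte string instead, and both function names are invented for the example.

```python
import hashlib
import uuid


def content_build_id(contents: bytes) -> bytes:
    """md5-style ID: a function of the contents, so identical builds agree."""
    return hashlib.md5(contents).digest()


def random_build_id() -> bytes:
    """uuid-style ID: 128 random bits, different on every build."""
    return uuid.uuid4().bytes
```

Two builds from identical inputs get the same content-derived ID, while the random style yields a fresh ID every time, which is the unreproducible-build problem raised earlier.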

eu-strip: keep allocated notes in debuginfo DONE

The -f option used for separate debuginfo generation has been changed to preserve all SHT_NOTE sections intact in the debuginfo file. The version released in elfutils-0.128 does this.

objcopy: keep allocated notes in debuginfo DONE

The --only-keep-debug option is used for separate debuginfo generation in regimes not using elfutils. I have committed a change upstream to keep all SHT_NOTE sections intact in the debuginfo file. The Linux binutils-2.17.50.0.17 release includes this, in f8test1.

compiler passes option by default POSTED, RAWHIDE

So this can be useful all the time, the default compiler settings should pass --build-id for every normal final link of a DSO or executable. Probably %{!nostartfiles:--build-id} or something. We can try this first in Fedora 8, but I think it is a reasonable default for upstream gcc once ld supports --build-id.

f8test1 gcc has this. I've sent a patch upstream, but it has not been integrated yet.

Packaging

debugedit DONE

The program /usr/lib/rpm/debugedit is used in rpm's separate debuginfo creation. It modifies DWARF data to replace the build-time absolute directory names with consistent names. This makes the installed debuginfo's source references usable, and it makes for reproducible rpm builds from identical constituents to produce identical binaries. The build ID computed by ld was affected by the name of the rpm build directory, so it will differ between two otherwise identical builds that used a different $RPM_BUILD_ROOT.

My patch adds the -i option to make debugedit recompute the build ID based on the contents of the file after the transformation. It also prints out the build ID bits in hex.

This is in rpm-4.4.2.1, in f8-test1.

find-debuginfo.sh DONE

The /usr/lib/rpm/find-debuginfo.sh script is what runs debugedit. I've modified the script to pass the new option to debugedit. Since my goal for Fedora 8 is to have a build ID in every binary in the distribution, I've made it more sensitive to errors from debugedit and it will fail if there was no build ID. It also adds to the debuginfo package a symlink to the binary and to the debuginfo file from /usr/lib/debug/.build-id/ (see below).

The new script is part of rpm-build >= 4.4.2.1-4.fc8, in Rawhide after test1.

Linux kernel changes

core dump code SUBMITTED,RAWHIDE

The upstream 2.6.23 kernel includes new controls via /proc/pid/coredump_filter. My patch adds a new bit there, which makes core dumping include the first page of an ELF file mapping that would otherwise have been elided. This is turned on by default in F8, but can be disabled for a process and its children via /proc/pid/coredump_filter.

kernel-2.6.23-0.30.rc0.git6.fc8 and later in Rawhide have the feature.
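Toggling the new bit for a process could be sketched as below. The text above does not give the bit number. The sketch assumes bit 4 (0x10), which is what mainline kernel documentation lists for ELF header pages, so check the documentation for the kernel in use. The helper name is hypothetical.

```python
ELF_HEADERS_BIT = 1 << 4  # assumption: bit 4 selects ELF header pages


def with_elf_headers(mask_text: str, enabled: bool) -> str:
    """Return a new coredump_filter value (hex text) with the bit set or cleared."""
    mask = int(mask_text.strip(), 16)  # the proc file reports the mask in hex
    if enabled:
        mask |= ELF_HEADERS_BIT
    else:
        mask &= ~ELF_HEADERS_BIT
    return format(mask, "x")
```

A process could read /proc/self/coredump_filter, pass the text through this function, and write the result back with a "0x" prefix; children started afterwards inherit the setting.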

use --build-id on the kernel DONE

My patch series changes the kernel linker script on several machines to handle allocated note sections, and uses --build-id when ld supports it. Any other oddball programs that have a really good reason to use the linker directly instead of letting the compiler choose the options also need to be changed.

This is incorporated upstream now.

xen NEEDED

Use --build-id when linking the hypervisor.

assembly debuginfo POSTED,RAWHIDE

My patch series turns on -g for assembly files in the kernel, so the debuginfo is useful for even more of the code (to read comments in the source and so forth).

This is in -mm, and is enabled in Rawhide kernels now.

vDSO debuginfo POSTED,RAWHIDE

My patch series also has the kernel's make install copy unstripped versions of the vDSO images into /lib/modules/.../vdso alongside the kernel's modules. These are linked as normal DSOs via gcc, so they will get build IDs implicitly. Having these on disk means that when a build ID is found in the vDSO image embedded in a core dump, the same means used for normal DSOs can find the kernel debuginfo package and lead to seeing the vDSO's assembly in original source form.

This is in -mm, and is enabled in Rawhide kernels now.

/sys/kernel/notes DONE

The bonus feature in my patch series adds the magic file /sys/kernel/notes. Reading this gives you the binary contents of the ELF notes section built into the kernel. Here you can find the build ID of the running kernel. This gives a solution to a problem that has arisen for systemtap users, where nothing prevents them from using the kernel-debuginfo.i586 data to drive Systemtap's probe details while actually running the kernel from the kernel.i686 rpm. This is a failure on many levels, but some simple sanity-checking at the bottom always helps. Now it is easy to verify you have the right debuginfo file for the kernel you are running.

This is incorporated upstream now.

Replace debuginfo CRC32 check

The consumers of debuginfo files that now use the CRC32 value in the .gnu_debuglink section can look for build IDs and compare those instead.

gdb (bfd) RAWHIDE
elfutils (libdwfl) UPSTREAM

Find files by build ID

There needs to be some standard way to take a build ID gleaned from a memory dump and look it up to learn its proper name and find its debuginfo file. A large organization producing lots of builds might want to incorporate build ID info into their own database tracking the details of all their builds. Such complex things are beyond the scope of what I've designed.

Packager/consumer convention (now de facto standard)

My current thinking is to use some simple filesystem conventions for looking up build IDs. This uses everybody's favorite backend database for file-sized objects keyed by fixed-size bitstrings, the humble Unix directory. At least for system installed binaries, this seems appropriately simple and adequate for the number of unique IDs in use on one system.

My plan is that the debuginfo directory contains a .build-id/ subdirectory of symlinks named by the build ID bits rendered in ASCII hex, one symlink to the stripped file and one to the debuginfo file. For example, /usr/lib/debug/.build-id/ab/cdef1234 for abcdef1234 (real ones are 32 or more chars long, not 10), so /usr/bin/foo might yield:

/usr/lib/debug/.build-id/ab/cdef1234 -> ../../../../bin/foo
/usr/lib/debug/.build-id/ab/cdef1234.debug -> ../../usr/bin/foo.debug
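The mapping from ID bits to symlink names can be sketched as a pure function. The name build_id_links and the default directory argument are illustrative; the layout itself is the one described above.

```python
import posixpath


def build_id_links(build_id: bytes, debug_dir: str = "/usr/lib/debug"):
    """Return the (binary symlink, debuginfo symlink) paths for a build ID."""
    hexid = build_id.hex()  # ID bits rendered in lowercase ASCII hex
    # The first two hex digits name the subdirectory, the rest name the link.
    base = posixpath.join(debug_dir, ".build-id", hexid[:2], hexid[2:])
    return base, base + ".debug"
```

A consumer would try this under each directory it already searches for debuginfo, then simply open the resulting path.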

The advantages of this are:

Use existing configuration for debuginfo directory or path

Consumers can just start looking for .build-id/... in the directories where they now look for debuginfo files by name; no new user configuration is required.

Simple convention for all users, packagers, consumers to follow
Optimal for consumers

Go from build ID bits to open debuginfo file in one system call.

Simple addition to existing packaging

For rpm, the changes in find-debuginfo.sh are straightforward, so build-ID symlinks are included in the debuginfo rpm automatically along with the debuginfo files and source files.

Leverage existing packaging support for files
yum install /usr/lib/debug/.build-id/... works with no special support
If something goes awry in the toolchain, existing mechanisms for e.g. finding file names included in two different packages in the same repository can flag it for the debuginfo repository.

Fedora 8 rebuild with build IDs in binaries, symlinks in debuginfo DONE

libdwfl (elfutils) build ID-aware consumer
build ID-driven find_debuginfo UPSTREAM
build ID recovery from core files WORK IN PROGRESS

This is in upstream elfutils and may still make F8 release. If not, it will be in an elfutils update shortly after F8.

bfd (gdb) build ID-aware consumer RAWHIDE

Jan Kratochvil is working on this for F8.

Compatibility testing

DONE: old gdb no good

Existing versions of gdb are confused by a core file with segments that are only partially present in the dump but not elided completely. With my patch the kernel produces these, which have not been seen before. The meaning of the ELF header fields is perfectly clear and valid for this case, but it is a new wrinkle compared to tradition. This is why the new format option won't be enabled by default in upstream kernels for some time to come.

User Experience

End users don't really notice an impact. The big help is that development tools will be more efficient in finding their debuginfo.

Dependencies

Self-contained, but a fair number of tools need changing (see above)

Contingency Plan

Things have to continue to work whether or not they're built with the new information, so the contingency is just that this is only partially implemented.

Documentation

See above.

Release Notes