Latest revision as of 11:19, 5 January 2017
Regressions introduced by the “Harden All Packages” Fedora Change
This page collects some of the regressions introduced by Changes/Harden_All_Packages. It primarily covers the state of affairs in Fedora 19/20, not the state of the Fedora 23 toolchain, the release in which the change was actually implemented.
In the notes below, “self-compiled code” refers to code compiled with the default (non-hardened) toolchain flags, possibly with optimization added, without using the distribution-specific defaults supplied by the redhat-rpm-config package. All PIE changes are unlikely to affect self-compiled code because static linking is discouraged, and dynamically linked code used by applications is already PIC.
RELRO is not discussed below because we believe it is fully transparent.
Kernel issues
- On x86_64, kernels without commit a3defbe5c337dbc6da911f8cc49ae3cc3b49b453 (binfmt_elf: fix PIE execution with randomization disabled; presently not included in the Red Hat Enterprise Linux 6 kernel) allocate a single 128 MiB memory region for the heap and stack, and will grow the heap into the stack area if ASLR is disabled. As a result, running such binaries under GDB can cause crashes with stack overflows because the kernel effectively does not reserve the configured amount of stack.
Package-specific issues
These issues have to be fixed in the packages themselves.
- Lack of position-independent code. (Affects PIE only, does not affect self-compiled code.) These issues result in errors from the static linker (due to unsupported relocations). On architectures which support text relocations, NX/XD/execmod/W^X enforcement by SELinux can fail at run time.
  - If source code is not compiled as position-independent because CFLAGS is not passed through, the package build system needs to be updated.
  - Hand-written assembly must be ported to be position-independent.
- Problems with enabling position-independent code due to register allocation. (Affects PIE only, does not affect self-compiled code.)
  - On i686 with -fPIE or -fPIC, %ebx is a hard register reserved for the GOT pointer. This means that it cannot be used in GCC extended asm constraints. This can lead to register allocation failures at compile time.
  - The increased register pressure changes register allocation, altering the way GCC extended asm constraints are used. If these constraints are incorrect, builds succeed, but applications may fail at run time. (Example: a "g"-constrained variable now references an on-stack variable using an SP-relative memory operand, where previously a register was used, but the asm statement temporarily modifies the stack pointer, resulting in an incorrect offset being applied. Bug 1312551 is a similar example.)
- BIND_NOW requires all symbols to be defined. (Affects BIND_NOW only, unlikely to affect self-compiled code.) Loading a shared object at run time requires all symbols to be defined, otherwise loading the object will fail. (They do not have to be defined when the static linker created the DSO, though.) This is a consequence of the fact that BIND_NOW resolves all symbols at load time, and cannot be changed without run-time code generation. It means that cycles in the dependency graph between DSOs (whether expressed through DT_NEEDED or not) are not allowed. Applications and libraries may have to be updated to work under these circumstances.
- Additional address space layout randomization (ASLR) interferes with MAP_FIXED mappings and other implicit assumptions about address space layout. (Affects PIE only, unlikely to affect self-compiled code.) This is a problem already with shared objects, and some programs (such as Emacs) disable ASLR as a result. Failures will manifest only at run time, and PIE needs to be disabled manually for such binaries.
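The restriction on dependency cycles under BIND_NOW can be checked mechanically once the dependencies between DSOs are known. The following is a minimal sketch, not part of any existing tool: the find_cycle function and the library names are hypothetical, and it assumes that the dependency graph (including dependencies not expressed through DT_NEEDED) has already been collected into a dictionary by some other means.

```python
def find_cycle(deps):
    """Return one dependency cycle as a list of names, or None if there is none.

    deps maps each object name to the names it depends on.
    This is an illustrative depth-first search, not a real tool.
    """
    IN_PROGRESS, DONE = 1, 2
    state = {}   # objects not present here have not been visited yet
    stack = []   # current depth-first path

    def visit(node):
        state[node] = IN_PROGRESS
        stack.append(node)
        for dep in deps.get(node, ()):
            if state.get(dep) == IN_PROGRESS:
                # dep is already on the current path: a cycle was found.
                return stack[stack.index(dep):] + [dep]
            if dep not in state:
                cycle = visit(dep)
                if cycle:
                    return cycle
        stack.pop()
        state[node] = DONE
        return None

    for node in deps:
        if node not in state:
            cycle = visit(node)
            if cycle:
                return cycle
    return None


# Hypothetical example: two plugins that reference each other's symbols.
cyclic = {"libfoo.so": ["libbar.so"], "libbar.so": ["libfoo.so"]}
print(find_cycle(cyclic))  # ['libfoo.so', 'libbar.so', 'libfoo.so']
```

A returned list starts and ends with the same object name and names the objects that depend on each other. A result of None means that no cycle exists in the graph as supplied, which is only as complete as the dependency information that went into it.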
Toolchain issues
- IFUNC resolvers can start crashing because they reference unresolved symbols. (Affects BIND_NOW, unlikely to affect self-compiled code.) An example is bug 1326903. It is unclear whether we can fix this in the dynamic linker in all cases without run-time code generation because in some cases, the DT_NEEDED entries do not convey sufficient dependency information. However, the dynamic linker needs to be changed to perform non-IFUNC GOT updates before running IFUNC resolvers, which covers a lot of IFUNC usage scenarios.
- GCC and glibc need tweaks to support statically linked PIE binaries. (Affects PIE only.) It is not a problem to link PIC/PIE code into a statically linked executable, but it currently has to be mapped at a fixed address. (It is also unclear to what extent available ELF hardening features such as RELRO are applied to statically linked binaries.)
- Process startup time can degrade significantly with BIND_NOW and shared objects with long dependency chains. (Affects BIND_NOW only, but affects self-compiled code.) Per-symbol binding is very fast, but many shared objects with long symbol chains still introduce noticeable startup delays. Several hundred milliseconds have been observed for qemu-kvm and emacs (the package with GUI support). Previous BIND_NOW benchmarks have focused on raw symbol binding performance, which is still excellent, but deep DSO dependency chains multiply the lookup work performed by the current dynamic loading algorithm.
- Position-independent code is unnecessarily slow on i686. (Affects PIE only, unlikely to affect self-compiled code.) i686 does not support PC-relative addressing, so a register is required to hold the GOT pointer. Older GCC versions require that %ebx holds that pointer, pessimizing register allocation.
- Optimizations for position-independent executables are missing from the toolchain. (Affects PIE only.) The current toolchain uses a hybrid mode to create binaries which are PIE but not PIC, almost completely eliminating the performance overhead of PIE on architectures with PC-relative addressing. The Fedora 19/20 toolchain did not support this.
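The startup-time item above can be illustrated with a toy cost model. This is a sketch under simplifying assumptions, not the dynamic linker's actual algorithm: it assumes that each symbol reference is looked up by probing the loaded objects one after another in search order until a definition is found, and it counts probes only. The count_probes function, the object names, and the symbol names are hypothetical.

```python
def count_probes(scope, references):
    """Count per-object probes needed to resolve all references.

    scope is a list of (object_name, set_of_defined_symbols) in search order.
    references is an iterable of symbol names that need to be bound.
    Toy model only: real lookups use per-object hash tables.
    """
    probes = 0
    for symbol in references:
        for _name, defined in scope:
            probes += 1
            if symbol in defined:
                break  # definition found, stop searching for this symbol
    return probes


# Hypothetical search scope: the executable followed by its dependencies.
scope = [
    ("app", {"main"}),
    ("liba.so", {"a"}),
    ("libb.so", {"b"}),
    ("libc.so.6", {"printf"}),
]
print(count_probes(scope, ["printf"]))            # 4
print(count_probes(scope, ["a", "b", "printf"]))  # 9
```

With BIND_NOW, every reference is resolved at load time, so in this model the probe count grows with both the number of references and the position of the defining object in the search order. Each individual probe can be cheap while the total still becomes noticeable.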