From Fedora Project Wiki

Latest revision as of 11:19, 5 January 2017

Regressions introduced by the “Harden All Packages” Fedora Change

This page collects some of the regressions introduced by Changes/Harden_All_Packages. It primarily covers the state of affairs in Fedora 19/20, not the state of the Fedora 23 toolchain, which is the release in which the change was actually implemented.

In the notes below, “self-compiled code” refers to code compiled with the default (non-hardened) toolchain flags, possibly with optimization added, without using the distribution-specific defaults supplied by the redhat-rpm-config package. All PIE changes are unlikely to affect self-compiled code because static linking is discouraged, and dynamically linked code used by applications is already PIC.
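To tell whether a given binary is subject to the PIE-only notes below, one can inspect the ELF header. The following is a minimal sketch, not part of the original page: the helper names `elf_type` and `looks_position_independent` are made up for illustration, and the check only reads the `e_type` field. `ET_DYN` covers both shared objects and position-independent executables, so this cannot distinguish the two.

```python
import struct

ET_EXEC = 2  # executable linked for a fixed load address
ET_DYN = 3   # shared object or position-independent executable


def elf_type(header: bytes) -> int:
    """Return the e_type field from the first 18 bytes of an ELF file."""
    if len(header) < 18 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    # e_ident[EI_DATA] (byte 5) encodes byte order: 1 = little, 2 = big.
    if header[5] == 1:
        fmt = "<H"
    elif header[5] == 2:
        fmt = ">H"
    else:
        raise ValueError("unknown ELF data encoding")
    # e_type is a 16-bit field directly after the 16-byte e_ident array.
    return struct.unpack_from(fmt, header, 16)[0]


def looks_position_independent(header: bytes) -> bool:
    """True if the file is ET_DYN (PIE or shared object), False otherwise."""
    return elf_type(header) == ET_DYN
```

To use it, read the first 18 bytes of a binary opened in `"rb"` mode and pass them to `looks_position_independent`.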

RELRO is not discussed below because we believe it is fully transparent.

Kernel issues

  • On x86_64, kernels without commit a3defbe5c337dbc6da911f8cc49ae3cc3b49b453 (binfmt_elf: fix PIE execution with randomization disabled; presently not included in the Red Hat Enterprise Linux 6 kernel) allocate a single 128 MiB memory region for the heap and stack, and will grow the heap into the stack area if ASLR is disabled. As a result, running PIE binaries under GDB (which disables ASLR by default) can cause crashes with stack overflows because the kernel effectively does not reserve the configured amount of stack.

Package-specific issues

These issues have to be fixed in packages themselves.

  • Lack of position-independent code. (Affects PIE only, does not affect self-compiled code.) These issues result in errors from the static linker (due to unsupported relocations). On architectures which support text relocations, NX/XD/execmod/W^X enforcement by SELinux can fail at run time.
    • If source code is not compiled as position-independent because CFLAGS are not passed through, the package build system needs to be updated.
    • Hand-written assembly must be ported to be position-independent.
  • Problems with enabling position-independent code due to register allocation. (Affects PIE only, does not affect self-compiled code.)
    • On i686 with -fPIE or -fPIC, %ebx is a hard register reserved for the GOT pointer. This means that it cannot be used in GCC extended asm constraints. This can lead to register allocation failures at compile time.
    • The increased register pressure changes register allocation, altering the way GCC extended asm constraints are used. If these constraints are incorrect, builds succeed, but applications may fail at run time. (Example: a "g"-constrained variable now references an on-stack variable using an SP-relative memory operand, where previously a register was used, but the asm statement temporarily modifies the stack pointer, resulting in an incorrect offset being applied. Bug 1312551 (https://bugzilla.redhat.com/show_bug.cgi?id=1312551) is a similar example.)
  • BIND_NOW requires all symbols to be defined. (Affects BIND_NOW only, unlikely to affect self-compiled code.) When a shared object is loaded at run time, all symbols it references must be defined; otherwise, loading the object fails. (They do not have to be defined when the static linker creates the DSO, though.) This is a consequence of the fact that BIND_NOW resolves all symbols at load time, and cannot be changed without run-time code generation. It means that cycles in the dependency graph between DSOs (whether expressed through DT_NEEDED or not) are not allowed. Applications and libraries may have to be updated to work under these circumstances.
  • Additional address space layout randomization (ASLR) interferes with MAP_FIXED mappings and other implicit assumptions about address space layout. (Affects PIE only, unlikely to affect self-compiled code.) This is a problem already with shared objects, and some programs (such as Emacs) disable ASLR as a result. Failures will manifest only at run time, and PIE needs to be disabled manually for such binaries.
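The BIND_NOW item above can be illustrated with a toy model. This is a simplified sketch, not how the dynamic linker is implemented: the `load` function, the `UndefinedSymbol` exception, and the dictionary-based "symbol table" are all hypothetical. It only models function symbols, which are the ones bound lazily in the non-BIND_NOW case.

```python
class UndefinedSymbol(Exception):
    """Raised when a referenced symbol has no definition."""


def load(needed, defined, bind_now):
    """Toy model of loading a shared object.

    needed:   names of the function symbols the object references
    defined:  dict mapping symbol names to callables available in the process
    bind_now: resolve everything at load time if True, on first call if False
    Returns a dict mapping each needed name to something callable.
    """
    def resolve(name):
        try:
            return defined[name]
        except KeyError:
            raise UndefinedSymbol(name) from None

    if bind_now:
        # Eager binding: every reference is resolved now, so a single
        # missing definition makes the whole load fail.
        return {name: resolve(name) for name in needed}

    # Lazy binding: hand out stubs that resolve only when first called,
    # so an undefined symbol that is never called goes unnoticed.
    def make_stub(name):
        def stub(*args):
            return resolve(name)(*args)
        return stub

    return {name: make_stub(name) for name in needed}
```

In this model, an object that references a symbol it never calls loads fine with `bind_now=False`, while the same object raises `UndefinedSymbol` at load time with `bind_now=True`. This is the situation that requires applications and libraries to be updated.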

Toolchain issues

  • IFUNC resolvers can start crashing because they reference unresolved symbols. (Affects BIND_NOW, unlikely to affect self-compiled code.) An example is bug 1326903 (https://bugzilla.redhat.com/show_bug.cgi?id=1326903). It is unclear whether we can fix this in the dynamic linker in all cases without run-time code generation because in some cases, the DT_NEEDED entries do not convey sufficient dependency information. However, the dynamic linker needs to be changed to perform non-IFUNC GOT updates before running IFUNC resolvers, which covers a lot of IFUNC usage scenarios.
  • GCC and glibc need tweaks to support statically linked PIE binaries. (Affects PIE only.) It is not a problem to link PIC/PIE code into a statically linked executable, but it currently has to be mapped at a fixed address. (It is also unclear to what extent available ELF hardening features such as RELRO are applied to statically linked binaries.)
  • Process startup time can degrade significantly with BIND_NOW and shared objects with long dependency chains. (Affects BIND_NOW only, but affects self-compiled code.) Per-symbol binding is very fast, but many shared objects with long symbol chains still introduce noticeable startup delays. Several hundred milliseconds have been observed for qemu-kvm and emacs (the package with GUI support). Previous BIND_NOW benchmarks have focused on raw symbol binding performance, which is still excellent, but deep DSO dependency chains multiply the lookup work performed by the current dynamic loading algorithm.
  • Position-independent code is unnecessarily slow on i686. (Affects PIE only, unlikely to affect self-compiled code.) i686 does not support PC-relative addressing, so a register is required to hold the GOT pointer. Older GCC versions require that %ebx holds that pointer, pessimizing register allocation.
  • Position-independent executable optimizations are missing from the toolchain. (Affects PIE only.) The current toolchain uses a hybrid mode to create binaries which are PIE but not PIC, almost completely eliminating the performance overhead of PIE on architectures with PC-relative addressing. The Fedora 19/20 toolchain did not support this.
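The startup-time item above says that deep dependency chains multiply lookup work. A rough cost model makes this concrete. It is a sketch under a stated assumption, namely that each symbol lookup probes the loaded objects one after another in search order until a definition is found. The function name `count_probes` and the set-based representation of objects are hypothetical, and the real dynamic linker's per-object lookup is considerably more refined.

```python
def count_probes(scope, references):
    """Count how many objects are probed while binding the given references.

    scope:      list of sets; each set holds the symbols one loaded object
                defines, in the order the objects are searched
    references: iterable of symbol names to bind
    """
    probes = 0
    for name in references:
        for obj in scope:
            probes += 1
            if name in obj:
                break  # definition found, stop searching further objects
    return probes


# Three objects in search order; later objects are more expensive to reach.
scope = [{"a"}, {"b"}, {"c"}]
print(count_probes(scope, ["a", "b", "c"]))  # 1 + 2 + 3 = 6
```

With BIND_NOW, every reference is bound at startup, so the total grows with both the number of references and the number of objects searched. With lazy binding, only the references actually called contribute to the cost.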