Revision as of 20:53, 29 March 2018
Stop building 389-ds-base on i686
Summary
389-ds-base does not handle atomic types correctly on i686 hardware, so it should no longer be built for that architecture.
Owner
- Name: Mark Reynolds
- Email: mreynolds@redhat.com
- Release notes owner:
- Product: 389-ds-base, FreeIPA, slapi-nis
- Responsible WG:
Current status
- Targeted release: Fedora 28
- Last updated: 2018-03-29
- Tracker bug: <will be assigned by the Wrangler>
- https://pagure.io/releng/issue/7412
Detailed Description
The 389-ds project has found an issue that causes system instability in all 1.4.x versions of the server on the i686 platform. This is a hardware limitation of the platform related to how we consume atomic types. Our testing has shown that on i686 the atomic counters can have unpredictable values/behaviour. We use atomic counters throughout the code base, including in many critical objects. This may lead to thread-safety problems and other serious issues.
There is more detail in this bug: https://bugzilla.redhat.com/show_bug.cgi?id=1544386
Here is a snippet from the bug report, written by the engineer who originally found the problem and did all of the investigation work (and who is no longer at Red Hat):
In libc there are a number of __atomic_* types that promise that "They perform atomic manipulations of the data, falling back to a mutex in the case that a native cpu atomic can not be used". On 32-bit platforms this fallback does *not* occur correctly, meaning that only the lower or the upper half is atomically updated. This is due to variable alignment not being correctly emitted by gcc, meaning we would then have to align every piece of data that uses this.
This was discovered due to expansion of our testing capability of the C code base - a server stress test showed that counters could become wildly inaccurate.
Given we use atomics for reference counting in a number of objects, as well as for monitors, this can cause objects to leak, be freed early, or to report incorrect data...
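To make the alignment point concrete, here is a minimal sketch in C. It is not taken from the 389-ds-base sources: the struct and function names are hypothetical, and it only illustrates what "align every piece of data" would mean for a 64-bit reference count. Under the i386 System V ABI a 64-bit integer struct member is typically only guaranteed 4-byte alignment, so the sketch requests 8-byte alignment explicitly and checks it before using the GCC __atomic builtins.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical object layout, for illustration only. On i686 the refcnt
 * member below may land on a 4-byte boundary, because the ABI does not
 * require 8-byte alignment for 64-bit struct members. */
struct refcounted_plain {
    uint32_t flags;
    uint64_t refcnt;
};

/* Same layout, but the 64-bit counter is explicitly aligned to 8 bytes.
 * This also raises the alignment of the whole struct to 8. */
struct refcounted_aligned {
    uint32_t flags;
    alignas(8) uint64_t refcnt;
};

/* Returns non-zero when the pointer sits on an 8-byte boundary. */
static int is_8_byte_aligned(const void *p) {
    return ((uintptr_t)p % 8) == 0;
}

/* Atomically increments the reference count and returns the new value. */
uint64_t ref_acquire(struct refcounted_aligned *obj) {
    assert(is_8_byte_aligned(&obj->refcnt));
    return __atomic_add_fetch(&obj->refcnt, 1, __ATOMIC_SEQ_CST);
}

/* Atomically decrements the reference count and returns the new value.
 * A caller would free the object when this returns 0. */
uint64_t ref_release(struct refcounted_aligned *obj) {
    assert(is_8_byte_aligned(&obj->refcnt));
    return __atomic_sub_fetch(&obj->refcnt, 1, __ATOMIC_SEQ_CST);
}
```

Applying this kind of annotation to every atomically updated field in the server is the per-variable work the bug report refers to; this Change proposes dropping the i686 build instead.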
Also impacted:
- FreeIPA server will not be available on i686 due to this
- slapi-nis set of plugins will not be available on i686 due to this
- Upgrade of i686 instance of Fedora with FreeIPA server will not be possible without fully uninstalling FreeIPA replica
Benefit to Fedora
Stable release of 389-ds-base, FreeIPA, and slapi-nis on remaining architectures
Scope
- Proposal owners:
This only requires a change to the spec file to exclude i686
- Other developers: N/A (not a System Wide Change)
- Release engineering: #Releng issue number (a check of an impact with Release Engineering is needed)
- List of deliverables: N/A (not a System Wide Change)
- Policies and guidelines: N/A (not a System Wide Change)
- Trademark approval: N/A (not needed for this Change)
Upgrade/compatibility impact
N/A (not a System Wide Change)
How To Test
Nothing to test except making sure there are no i686 builds present on f28
N/A (not a System Wide Change)
User Experience
N/A (not a System Wide Change)
Dependencies
N/A (not a System Wide Change)
Contingency Plan
- Contingency mechanism: (What to do? Who will do it?) N/A (not a System Wide Change)
- Contingency deadline: N/A (not a System Wide Change)
- Blocks release? N/A (not a System Wide Change), Yes/No
- Blocks product? product
Somehow flag the build as unstable, use at your own risk.
Documentation
N/A (not a System Wide Change)