This is an evaluation by the Big Data SIG of the issues that need to be addressed in order to get Ambari into Fedora. This work was started on the 1.2.5 branch, which didn't support Hadoop 2.x at the time. The current stable release (1.4.4) does.
Issues To Be Resolved
Missing node.js Dependencies
The Ambari build uses brunch and other node.js parts to generate static web content. A significant portion of the dependency chain for the node.js parts is not in Fedora and would need to be packaged. There are a few ways to handle the node.js dependency chain:
- Package all the node.js dependencies as individual rpms
- Package all the node.js dependencies as a single rpm
- Work with upstream to remove the need to generate the static web content
- Re-implement the node.js parts in source native to Ambari
- Find similar functionality that is already packaged in Fedora and provide support for its use in the Ambari build
- Abandon packaging Ambari for Fedora
Missing Java Dependencies
Ambari needs only two Java dependencies that are not in Fedora, plus a third that is already packaged but needs modification:
- org.xerial:sqlite-jdbc
- org.springframework:spring-mock
- org.powermock:powermock-api-easymock
Dependency Version Issues
Puppet
Ambari uses puppet manifests and directives for provisioning Hadoop components on hosts. At build time, puppet version 2.7.9 is downloaded from Puppet Labs and added to the agent package. However, the version currently available in rawhide is 3.4.3. Because the puppet parser validates a configuration when it is applied, the version difference causes problems in two areas (so far):
- Agent modules have variable declarations like "$core-site=...". Version 3.4.3 forbids hyphens in variable names (alphanumeric and underscore only).
- At install time, the agent retrieves puppet manifests for the selected stack (e.g., HDP 2.0.6). Version 3.4.3 cannot process the structure of those manifests, and validation fails with "Import loop detected" errors.
The last version of puppet 2.7 built for Fedora still builds in rawhide. However, because the two versions have files in common, it would have to replace the incumbent version (3.4.x).
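The hyphenated-variable problem described above lends itself to a quick scan of the agent modules before attempting validation with the newer puppet. The following is a minimal sketch, not part of Ambari or puppet: the function names are hypothetical, and the pattern assumes variable names otherwise consist only of letters, digits, and underscores. It is a rough text scan, so an expression such as "$x-1" would be reported as a false positive.

```python
import re

# Matches a puppet variable whose name contains at least one hyphen,
# e.g. "$core-site". Assumption: apart from the hyphens, names use only
# letters, digits, and underscores.
HYPHENATED_VAR = re.compile(r"\$([A-Za-z0-9_]+(?:-[A-Za-z0-9_]+)+)")


def find_hyphenated_variables(manifest_text):
    """Return (line_number, variable_name) pairs for hyphenated variables."""
    hits = []
    for number, line in enumerate(manifest_text.splitlines(), start=1):
        for match in HYPHENATED_VAR.finditer(line):
            hits.append((number, match.group(1)))
    return hits


def suggest_rename(name):
    """Propose an underscore-only replacement for a hyphenated name."""
    return name.replace("-", "_")
```

Running find_hyphenated_variables over each manifest gives a list of names to rename; suggest_rename shows one possible convention (hyphen to underscore), which would have to be applied consistently wherever the variable is referenced.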
Facter and Ruby
Rawhide currently has Facter 1.7.4 and Ruby 2.0.0 (deps for Puppet) while the Ambari build bundles older versions of both in the agent. These appear to be compatible with the older version of Puppet, but without complete deployment testing on Fedora they remain unknown variables.
Python
The Ambari build and runtime are hard-coded to use python2.6, which needs to be cleaned up. Two upstream JIRAs (AMBARI-1790, AMBARI-1779) carry patches that address these issues, but the patches have bit-rotted somewhat.
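To gauge how much cleanup is involved, one could list every place the interpreter name appears. This is a minimal sketch under the assumption that the hard-coding shows up as the literal string "python2.6" (in shebang lines, build scripts, and so on); the function names are hypothetical and nothing here comes from the upstream patches.

```python
import os
import re

# Assumption: a hard-coded interpreter appears literally as "python2.6".
HARD_CODED = re.compile(r"python2\.6")


def scan_text(text):
    """Return the 1-based line numbers that mention python2.6."""
    return [
        number
        for number, line in enumerate(text.splitlines(), start=1)
        if HARD_CODED.search(line)
    ]


def scan_tree(root):
    """Walk a source tree and map each file path to its offending lines."""
    report = {}
    for directory, _subdirs, files in os.walk(root):
        for name in files:
            path = os.path.join(directory, name)
            try:
                with open(path, encoding="utf-8", errors="replace") as handle:
                    lines = scan_text(handle.read())
            except OSError:
                continue  # unreadable file: skip it rather than abort the scan
            if lines:
                report[path] = lines
    return report
```

Calling scan_tree on a checkout of the source returns a dictionary of file paths and line numbers, which can be compared against what the upstream patches already touch.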
Jetty
The current version of Jetty in Fedora is Jetty 9, but Ambari is coded for Jetty 7. Fedora now has a Jetty 8 compatibility package in rawhide and necessary modifications are here.
Postgres
The version of postgres in Fedora may require updates to the database initialization done by Ambari. An upstream patch addresses this, and the issue appears to be fixed for 1.5.0 (yet to be released).
Easymock
In F19 there are currently 3 easymock packages, one for 1.x, one for 2.x, and one for 3.x. ambari actually needs the newer 3.x line of easymock. Unfortunately the jar resolver is non-deterministic when multiple jars have the same gid:aid, as is the case with the 3 easymock packages. The easymock2 package seems to get top billing which causes the ambari and the powermock easymock module builds to fail.
The fact that none of the easymock packages have been updated to the latest packaging guidelines complicates a resolution because none of them are being registered as compatibility packages and each of them is claiming to be the primary version of easymock. This apparently means Fedora's mechanism for resolving compatibility packages (which requires the build to query for the exact version in the compatibility package) won't work.
On F19, easymock2 gets top billing over easymock3 when easymock is resolved. easymock2 is supposed to be retired in F20, and the long-term goal seems to be to get rid of the easymock3 package and have a single easymock package on the 3.x stream. It is possible this could get done for F21.
It is possible that F20 will be a "good enough" solution to remove the easymock road block for ambari and powermock.
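The clash described above, several packages claiming the same gid:aid, can be illustrated with a small check over package metadata. This is only a sketch: the function name is hypothetical, it does not query Fedora's actual resolver, and the coordinates in the usage note are illustrative values rather than data read from the real packages.

```python
from collections import defaultdict


def find_coordinate_clashes(packages):
    """Group package names by the Maven coordinate (gid:aid) they claim.

    `packages` is an iterable of (package_name, group_id, artifact_id)
    tuples. Only coordinates claimed by more than one package are returned,
    since those are the ones a resolver cannot pick between reliably.
    """
    claims = defaultdict(list)
    for package_name, group_id, artifact_id in packages:
        claims[f"{group_id}:{artifact_id}"].append(package_name)
    return {
        coordinate: sorted(names)
        for coordinate, names in claims.items()
        if len(names) > 1
    }
```

Given three tuples such as ("easymock", "org.easymock", "easymock"), ("easymock2", "org.easymock", "easymock"), and ("easymock3", "org.easymock", "easymock"), the function returns a single entry listing all three packages under one coordinate, which is the situation that leaves the choice of jar to the resolver.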
Fedora Packaging Repository
Ambari has the ability to install packages on a client machine and it pulls those packages from Hortonworks repos that are hard coded in the server. It determines which repositories to use based upon the OS, and Fedora is not recognized as a valid/supported OS. Ambari will need to be modified to not only accept Fedora as a valid OS, but also to pull the packages from Fedora repos and not from Hortonworks. This issue has been logged with upstream.
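The shape of the required change can be sketched as a lookup from OS identifier to repository. Everything in this example is hypothetical: the table keys, the URLs (which use the reserved .invalid domain), and the function names are placeholders and are not taken from Ambari's code.

```python
# Hypothetical table: OS identifier -> base URL of the package repository.
# Neither the keys nor the URLs are taken from Ambari; they are placeholders.
REPOSITORIES = {
    "centos6": "http://repo.example.invalid/centos6",
    "suse11": "http://repo.example.invalid/suse11",
}


class UnsupportedOSError(Exception):
    """Raised when no repository is registered for an OS identifier."""


def repository_for(os_id, repositories=REPOSITORIES):
    """Return the repository URL for os_id, or raise UnsupportedOSError."""
    try:
        return repositories[os_id]
    except KeyError:
        raise UnsupportedOSError(
            f"no repository configured for {os_id!r}"
        ) from None


def with_fedora(repositories, fedora_release, fedora_url):
    """Return a copy of the table that also recognizes a Fedora release."""
    extended = dict(repositories)
    extended[f"fedora{fedora_release}"] = fedora_url
    return extended
```

With the original table, repository_for("fedora20") raises UnsupportedOSError, mirroring Fedora not being recognized. A table built by with_fedora resolves the same identifier to a Fedora repository, which covers both halves of the change: accepting the OS and pointing it at non-Hortonworks packages.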
Compilation Errors
There are various compilation issues due to newer or different versions of Java dependencies. These should be resolved with upstream if possible.
Open Issues
Hadoop 2.x Support
The current Ambari release (1.4.4) supports the Hadoop 2.0.6 HDP release; Fedora has 2.2.0. Due to the nature of executing a downloaded HDP stack from Hortonworks, it is unknown at this time if there are specific compatibility issues with 2.2.0.