If you're wondering what Big Data things are in Fedora, or are interested in working on packaging or reviews to help out the Big Data SIG, this is the page to look at!
If you know of a big-data-related package that is already in Fedora, or have one that you'd like to get into Fedora, be sure to list it here, link to the page describing what needs to be done, or link to the Bugzilla ticket that needs help.
Packages available in Fedora
- HTCondor - since F8, a scalable batch scheduling system
- Savanna - since F20, an OpenStack project for managing Hadoop clusters and workflow
- Apache Hadoop - since F20, the core of the Hadoop ecosystem
- Apache ZooKeeper - since F18, a service for highly reliable distributed coordination
- GlusterFS Hadoop - since F20, an HCFS plugin for Gluster
Packages in review
Packages we're working on
- Ambari - see rsquared
- Spark - see willb
- Apache Hive - see pmackinn and SIGs/bigdata/packaging/Hive
- Apache Mahout - see besser82
- Apache Oozie - see rsquared
- Apache HBase - see rsquared
- tachyon - see tstclair or Packaging Notes
- List your package here!
Packages we'd like to include
Becoming a packager
Not yet a packager? Check out the Package Maintainers page or the Join the package collection maintainers page for more information. You can also ask on the Big Data SIG mailing list for assistance and see if you can find a willing helper or sponsor. If you are packaging Java software, read the Java packaging guidelines first.
Typical workflow (relies on github)
- Clone the original repo if modifications are required.
- Patch where necessary. (Use GitHub tickets where possible if working as a group.)
- Try to organize your patch set into meaningful units, and create tickets to push them upstream where possible.
- Patches that have to be carried should be applied to the raw sources where possible.
- Create a package-rpm repo with specs and system integration files (systemd, custom-conf, etc).
- Use rpmbuild, or hack fedpkg as follows, to enable prototype package building:
  - spectool -g package.spec (will download sources)
  - md5sum package-sources.tar.gz > sources
  - fedpkg local
- Once you feel you have a package ready for review, run the following before submitting it:
  - rpmlint package.spec
  - mock --clean --init -r fedora-rawhide-x86_64 && fedora-review -m fedora-rawhide-x86_64 -n package.srpm
  - fedora-review -n package.srpm --rpm-spec
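The prototype-build and pre-review steps above can be strung together in a small wrapper. This is only a sketch: the helper names run, prototype_build, and pre_review and the DRY_RUN variable are hypothetical and invented here, and package.spec, package-sources.tar.gz, and package.srpm are the placeholder names from the list. The commands themselves are reproduced as listed above.

```shell
#!/bin/sh
# Sketch only: wraps the commands from the workflow list.
# run, prototype_build, pre_review and DRY_RUN are illustrative
# assumptions, not part of any Fedora tool.

# Print a command, then execute it unless DRY_RUN=1.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-0}" != "1" ]; then
        "$@"
    fi
}

# Download sources, write the checksum file, build locally.
prototype_build() {
    spec=$1
    tarball=$2
    run spectool -g "$spec" || return 1
    # The redirection cannot pass through run(), so it is handled here.
    echo "+ md5sum $tarball > sources"
    if [ "${DRY_RUN:-0}" != "1" ]; then
        md5sum "$tarball" > sources || return 1
    fi
    run fedpkg local
}

# Run the checks listed for a package that is ready for review.
pre_review() {
    spec=$1
    srpm=$2
    cfg=${3:-fedora-rawhide-x86_64}
    run rpmlint "$spec" || return 1
    run mock --clean --init -r "$cfg" || return 1
    run fedora-review -m "$cfg" -n "$srpm" || return 1
    run fedora-review -n "$srpm" --rpm-spec
}
```

With DRY_RUN=1 set, calling prototype_build package.spec package-sources.tar.gz or pre_review package.spec package.srpm only prints the commands, which is a cheap way to check the sequence before running it for real.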
Packaging Notes
- Fedora Java rpms cannot bundle dependent jars. Every jar file not created by the build must come from an rpm in the Fedora repository.
- All jars must be built from source.
- Fedora build tools: xmvn-resolve (mvn-local, mvn-rpmbuild, and mvn-build are no longer available in rawhide and are considered private implementation)
- Fedora rpm macros: %pom_*, %mvn_build, %mvn_install, %mvn_file
- xmvn-subst for dependency jars when packaging
- Fedora Java Packaging guidelines: https://fedoraproject.org/wiki/Packaging:Java
- JNI handling: System.load replaces System.loadLibrary, jar file in %{_jnidir}
- Jar files in %{_javadir}
- Fedora build systems have no internet access, so avoid DNS lookups during the build if possible.
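One way to spot files that would break the no-bundling rule is to scan the unpacked upstream sources before building. The following is a minimal sketch that assumes the sources are unpacked into a directory; the function name find_prebuilt_jars is hypothetical, and it relies only on the standard find and sort utilities.

```shell
#!/bin/sh
# Sketch only: list prebuilt .jar and .class files shipped in a source tree.
# find_prebuilt_jars is a hypothetical helper name, not a Fedora tool.
find_prebuilt_jars() {
    srcdir=${1:-.}
    # Match regular files only; sort for stable, readable output.
    find "$srcdir" -type f \( -name '*.jar' -o -name '*.class' \) | sort
}
```

Anything it prints is a candidate for removal from the sources (typically in the spec's %prep section), to be replaced by a dependency that comes from a Fedora rpm.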