Apache Hive
Summary
Apache Hive is a data warehouse built on top of Apache Hadoop.
Owner
- Name: Peter MacKinnon
- Email: pmackinn@redhat.com
- Release notes owner:
Current status
- Targeted release: Fedora 21
- Last updated: 26 March 2014
- Tracker bug: <will be assigned by the Wrangler>
Detailed Description
The Apache Hive data warehouse software facilitates querying and managing large datasets residing in distributed storage. Apache Hive provides a mechanism to project structure onto this data and query the data using a SQL-like language called HiveQL.
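As a rough illustration of what "projecting structure" means, the sketch below holds two HiveQL statements as Python strings: one declares a table over tab-delimited files that already sit in distributed storage, and one queries it. The table name `page_views`, its columns, and the path `/data/page_views` are hypothetical and chosen only for this example. The helper `hive_command` is also an assumption of this sketch; it only builds the argument list for the Hive command-line client's `-e` option and does not run anything.

```python
# Hypothetical table name, columns, and HDFS path, chosen only for illustration.
CREATE_TABLE = """
CREATE EXTERNAL TABLE IF NOT EXISTS page_views (
  view_time STRING,
  user_id   BIGINT,
  url       STRING
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\\t'
LOCATION '/data/page_views'
""".strip()

# A SQL-like aggregate over the files described by the table definition above.
TOP_URLS = """
SELECT url, COUNT(*) AS views
FROM page_views
GROUP BY url
ORDER BY views DESC
LIMIT 10
""".strip()


def hive_command(statement):
    """Build an argument list that passes one statement to the Hive CLI via -e."""
    return ["hive", "-e", statement + ";"]


if __name__ == "__main__":
    # Print the statements instead of executing them, so no cluster is needed.
    for statement in (CREATE_TABLE, TOP_URLS):
        print(" ".join(hive_command(statement)[:2]), "...")
        print(statement)
        print()
```

Because the table is declared `EXTERNAL`, the definition only describes files that are already in place; dropping the table would leave the underlying data untouched.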
Benefit to Fedora
Apache Hive is a data warehouse used by many parts of the Hadoop ecosystem. Including it in Fedora increases the usefulness of the Apache Hadoop package that is already in Fedora.
Scope
- Proposal owners: The Hive package has been accepted into Fedora and provides all the functionality of the upstream release except HBase support, which is omitted because the latest stable versions of the two projects are not currently aligned.
- Other developers: N/A (not a System Wide Change)
- Release engineering: N/A (not a System Wide Change)
- Policies and guidelines: N/A (not a System Wide Change)
Upgrade/compatibility impact
N/A (not a System Wide Change)
How To Test
The upstream quickstart guide describes setup and walks through simple examples.
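A quick smoke test can also be scripted. The sketch below is a minimal example, assuming the `hive` command-line client is on `PATH` and Hadoop is already configured; the helper names `run_hive` and `parse_rows` are hypothetical. It runs a single statement with `hive -e` and splits the tab-separated output into rows.

```python
import shutil
import subprocess


def parse_rows(output):
    """Split Hive CLI output (one row per line, tab-separated columns) into lists."""
    return [line.split("\t") for line in output.splitlines() if line.strip()]


def run_hive(statement):
    """Run one HiveQL statement with `hive -e` and return the parsed rows."""
    result = subprocess.run(
        ["hive", "-e", statement],
        stdout=subprocess.PIPE,   # query results are written to stdout
        stderr=subprocess.PIPE,   # log messages are kept separate on stderr
        universal_newlines=True,
        check=True,               # raise CalledProcessError on a non-zero exit
    )
    return parse_rows(result.stdout)


if __name__ == "__main__":
    if shutil.which("hive") is None:
        print("hive not found on PATH; install the package first")
    else:
        for row in run_hive("SHOW DATABASES;"):
            print(row)
```

On a fresh installation the listing would normally be expected to include the `default` database, though the exact output depends on the local configuration.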
User Experience
Users should be able to write and run applications that use Apache Hive to execute queries on large datasets stored in Hadoop.
Dependencies
N/A (not a System Wide Change)
Contingency Plan
- Contingency mechanism: N/A (not a System Wide Change)
- Contingency deadline: N/A (not a System Wide Change)
- Blocks release? N/A (not a System Wide Change)
- Blocks product? N/A
Documentation
N/A (not a System Wide Change)
Release Notes
Fedora 21 includes Apache Hive, the Hadoop data warehouse.