Apache Hive

Summary

Apache Hive is a data warehouse built on top of Apache Hadoop.

Owner

Name: Peter MacKinnon
Email: pmackinn@redhat.com
Release notes owner:

Current status

Targeted release: Fedora 21
Last updated: 26 March 2014
Tracker bug: <will be assigned by the Wrangler>

Detailed Description

The Apache Hive data warehouse software facilitates querying and managing large datasets residing in distributed storage. Apache Hive provides a mechanism to project structure onto this data and query the data using a SQL-like language called HiveQL.
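
To make "projecting structure onto data" more concrete, here is a minimal Python sketch of the schema-on-read idea. It is only an illustration and does not use Hive or any Hive API. The sample data, the `SCHEMA` list, and the `project` and `select_where` helpers are hypothetical names introduced for this example.

```python
import csv
import io

# Hypothetical raw data as it might sit in storage: plain comma-delimited
# text with no embedded schema.
RAW_LINES = "1,alice,34\n2,bob,27\n3,carol,41\n"

# Hypothetical schema, loosely analogous to a table's column list.
SCHEMA = [("id", int), ("name", str), ("age", int)]


def project(raw_text, schema):
    """Apply a schema to delimited text at read time, yielding dict rows."""
    for fields in csv.reader(io.StringIO(raw_text)):
        yield {name: cast(value) for (name, cast), value in zip(schema, fields)}


def select_where(rows, columns, predicate):
    """Rough analogue of SELECT <columns> ... WHERE <predicate>."""
    return [tuple(row[c] for c in columns) for row in rows if predicate(row)]


# Roughly what a query such as
#   SELECT name FROM people WHERE age > 30;
# would express in a SQL-like language (table name is an assumption).
result = select_where(project(RAW_LINES, SCHEMA), ["name"], lambda r: r["age"] > 30)
print(result)
```

The stored text is never rewritten. The schema is applied only when the data is read, which is the property the description above refers to.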

Benefit to Fedora

Apache Hive is a data warehouse used by many parts of the Hadoop ecosystem. Including it in Fedora increases the usefulness of the Apache Hadoop package that is already in Fedora.

Scope

Proposal owners: The Hive package has been accepted into Fedora and provides all the functionality from the upstream release.
Other developers: N/A (not a System Wide Change)
Release engineering: N/A (not a System Wide Change)
Policies and guidelines: N/A (not a System Wide Change)

Upgrade/compatibility impact

N/A (not a System Wide Change)

How To Test

The upstream Apache Hive quickstart guide describes setup and walks through simple examples.

User Experience

Users should be able to write and run applications that use Apache Hive as their data warehouse.

Dependencies

Apache HBase

Contingency Plan

Contingency mechanism: N/A (not a System Wide Change)
Contingency deadline: N/A (not a System Wide Change)
Blocks release? N/A (not a System Wide Change)
Blocks product? N/A

Documentation