From Fedora Project Wiki

Revision as of 12:46, 23 October 2014 by Sclark

Apache Hive

Summary

Apache Hive is a data warehouse built on top of Apache Hadoop.

Owner

Current status

Detailed Description

The Apache Hive data warehouse software facilitates querying and managing large datasets residing in distributed storage. Apache Hive provides a mechanism to project structure onto this data and query the data using a SQL-like language called HiveQL.
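The idea of projecting structure onto stored data can be illustrated without Hive itself. The sketch below is plain Python, not Hive code: the sample lines, the `SCHEMA` list, and the helper names `project_schema` and `select_where` are all hypothetical and exist only for illustration. It applies a column schema to raw delimited text at read time and then filters it, roughly as a HiveQL `SELECT ... WHERE` would.

```python
import csv
import io

# Hypothetical raw data as it might sit in distributed storage: plain
# comma-delimited text with no embedded type information.
RAW_LINES = """1,alice,34.5
2,bob,12.0
3,carol,58.25
"""

# Hypothetical schema, analogous to the column list of a HiveQL
# CREATE TABLE statement: (column name, Python type) pairs.
SCHEMA = [("id", int), ("name", str), ("amount", float)]


def project_schema(raw_text, schema):
    """Yield one dict per line, casting each field to its declared type."""
    for fields in csv.reader(io.StringIO(raw_text)):
        yield {name: cast(value) for (name, cast), value in zip(schema, fields)}


def select_where(rows, columns, predicate):
    """Rough analogue of SELECT <columns> FROM rows WHERE <predicate>."""
    return [tuple(row[c] for c in columns) for row in rows if predicate(row)]


rows = list(project_schema(RAW_LINES, SCHEMA))
result = select_where(rows, ["name", "amount"], lambda r: r["amount"] > 20)
print(result)
```

The raw text is never rewritten; the types exist only in the schema that is applied while reading. Hive applies the same principle to files in Hadoop storage, with the query executed across the cluster rather than in a single process.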

Benefit to Fedora

Apache Hive is a data warehouse used by many parts of the Hadoop ecosystem. Including it in Fedora increases the usefulness of the Apache Hadoop package that is already in Fedora.

Scope

  • Proposal owners: The Hive package has been accepted into Fedora. It provides all the functionality of the upstream release except HBase support, which is omitted because the latest stable versions are not currently aligned.
  • Other developers: N/A (not a System Wide Change)
  • Release engineering: N/A (not a System Wide Change)
  • Policies and guidelines: N/A (not a System Wide Change)

Upgrade/compatibility impact

N/A (not a System Wide Change)

How To Test

An upstream quickstart guide describes setup and walks through simple examples.
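As a complement to the upstream guide, a basic smoke test could be scripted along the following lines. This is a minimal sketch, not an official test procedure: the helper `build_smoke_test` and the table name `fedora_hive_smoke` are hypothetical, and the code only generates a HiveQL script without contacting Hive.

```python
def build_smoke_test(table="fedora_hive_smoke"):
    """Return a small HiveQL script that creates, inspects and drops a table.

    The table name is a hypothetical placeholder chosen for this example.
    """
    statements = [
        f"CREATE TABLE {table} (foo INT, bar STRING)",
        "SHOW TABLES",
        f"DESCRIBE {table}",
        f"DROP TABLE {table}",
    ]
    # HiveQL statements in a script are terminated with semicolons.
    return ";\n".join(statements) + ";\n"


script = build_smoke_test()
print(script)
```

Assuming the `hive` command-line client from the package is installed and a Hadoop environment is configured, the printed script can be saved to a file and run with `hive -f <file>`. If every statement completes without error, the basic metastore and table operations are working.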

User Experience

Users should be able to write and run applications that use Apache Hive to execute queries on large datasets stored in Hadoop.

Dependencies

N/A (not a System Wide Change)

Contingency Plan

  • Contingency mechanism: N/A (not a System Wide Change)
  • Contingency deadline: N/A (not a System Wide Change)
  • Blocks release? N/A (not a System Wide Change)
  • Blocks product? N/A

Documentation

N/A (not a System Wide Change)

Release Notes

Fedora 21 includes Apache Hive, the Hadoop data warehouse.