Latest revision as of 10:18, 25 October 2014
Apache Pig
Summary
Apache Pig is a data analysis tool built on top of Apache Hadoop.
Owner
- Name: Peter MacKinnon
- Email: pmackinn@redhat.com
- Release notes owner: Simon Clark (sclark)
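Current status
- Targeted release: Fedora 21
- Last updated: 26 March 2014
- Tracker bug: #1089198 (https://bugzilla.redhat.com/show_bug.cgi?id=1089198)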
Detailed Description
Apache Pig is a platform for analysing large data sets. It consists of a high-level language for expressing data analysis programs, coupled with infrastructure for evaluating those programs. The salient property of Pig programs is that their structure is amenable to substantial parallelization, which in turn enables them to handle very large data sets.
Benefit to Fedora
Apache Pig is a data analysis tool used by many parts of the Hadoop ecosystem. Including it in Fedora increases the usefulness of the Apache Hadoop package that is already in Fedora.
Scope
- Proposal owners: The Pig package has been accepted into Fedora and provides all the functionality of the upstream release, with two exceptions: Jython support (because of the Jython version) and Parquet support (Parquet is not packaged).
- Other developers: N/A (not a System Wide Change)
- Release engineering: N/A (not a System Wide Change)
- Policies and guidelines: N/A (not a System Wide Change)
Upgrade/compatibility impact
N/A (not a System Wide Change)
How To Test
The upstream quickstart guide describes setup and walks through simple examples.
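As a rough illustration of what a local smoke test could look like, the sketch below writes a small Pig Latin word-count script and runs it in Pig's local mode. It is not taken from the upstream guide: the file names, the sample input, and the helper functions `build_smoke_test` and `run_smoke_test` are hypothetical, and it assumes the `pig` launcher from the Fedora package is on the PATH.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path

# Pig Latin word count. Relation names and file names are illustrative only.
PIG_SCRIPT = """\
lines = LOAD '{input}' AS (line:chararray);
words = FOREACH lines GENERATE FLATTEN(TOKENIZE(line)) AS word;
grouped = GROUP words BY word;
counts = FOREACH grouped GENERATE group AS word, COUNT(words) AS n;
STORE counts INTO '{output}';
"""


def build_smoke_test(workdir: Path):
    """Write sample input and a Pig script; return the command and output dir."""
    input_path = workdir / "input.txt"
    output_path = workdir / "wordcount_out"
    script_path = workdir / "wordcount.pig"
    input_path.write_text("pig hadoop pig\nfedora pig\n")
    script_path.write_text(PIG_SCRIPT.format(input=input_path, output=output_path))
    # "-x local" runs Pig against the local file system, so no cluster is needed.
    return ["pig", "-x", "local", str(script_path)], output_path


def run_smoke_test():
    """Run the script if pig is installed; return the result lines, else None."""
    with tempfile.TemporaryDirectory() as tmp:
        command, output_path = build_smoke_test(Path(tmp))
        if shutil.which("pig") is None:
            print("pig not found on PATH; would run:", " ".join(command))
            return None
        subprocess.run(command, check=True)
        # Pig stores results as part-* files inside the output directory.
        return sorted(
            line
            for part in output_path.glob("part-*")
            for line in part.read_text().splitlines()
        )


if __name__ == "__main__":
    print(run_smoke_test())
```

If the run succeeds, each returned line should hold a word and its count separated by a tab. Local mode is used here only to keep the check self-contained; running against data stored in Hadoop would use Pig's default MapReduce mode instead.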
User Experience
Users should be able to write and run applications that use Apache Pig for analysis of large data sets stored in Hadoop.
Dependencies
- Apache HBase (https://bugzilla.redhat.com/show_bug.cgi?id=1045556)
Contingency Plan
- Contingency mechanism: N/A (not a System Wide Change)
- Contingency deadline: N/A (not a System Wide Change)
- Blocks release? N/A (not a System Wide Change)
- Blocks product? N/A
Documentation
N/A (not a System Wide Change)
Release Notes
Fedora 21 includes Apache Pig, the Hadoop data analysis tool.