Fedora Environment and Stacks Product Requirements Document

Document Purpose and Overview

Vision Statement

Fedora Environment and Stacks Working Group is a research and development group working on new ways to provide the latest software stacks in Fedora. It also works on transforming Fedora into the preferred environment of software developers.

Mission Statement

Fedora Environment and Stacks Working Group is a group incubating new ideas for packaging, testing and delivering the various existing and new software stacks in Fedora. The successful ideas will graduate and become an officially released and supported part of the Fedora Project.

What this document describes

This is the Product Requirements Document (PRD) of the Fedora Environment and Stacks Working Group. Unlike the working groups in charge of developing the three products, namely Workstation, Server and Cloud, the Environment and Stacks Working Group is not directly developing any Fedora product. Hence, this PRD should not be seen as a real PRD, but rather a document that:

- Provides a list of tasks and goals which fulfil the mission of packaging, testing and delivering the various existing and new software stacks in Fedora. In addition, the listed tasks and goals work towards making the development on Fedora and of Fedora itself easier and friendlier.
- Provides deeper understanding of the types of users that are currently faced with various limitations when trying to use, package and develop the various software stacks on Fedora. It also lists the types of users that would benefit from having an easier to use development environment, which would cater to their diverse needs.

This document does not dictate implementation details. The working group will drive the continued prototyping of the listed goals and tasks and their graduation into officially released and supported parts of the Fedora Project. A schedule for the various goals and tasks will be provided in a separate document.

Definitions and Acronyms

API: Application Programming Interface
ABI: Application Binary Interface
BZ: Bugzilla
CI: Continuous Integration
COPR: Cool Other Package Repositories
CPAN: Comprehensive Perl Archive Network
CS: Computer Science
EPEL: Extra Packages for Enterprise Linux
FAS: Fedora Account System
Koji: the Fedora RPM build system
NTH: Nice-to-have
OS: Operating System
PRD: Product Requirements Document
PyPI: Python Package Index
SCL: Software Collections
WG: Working Group

Tasks and Goals

Testing/additional repositories

Many features from this WG will need additional repositories. It needs to be determined where they will be hosted (maybe on COPR, maybe as Koji tags). Also, a policy for enabling such repositories must be defined.
- A repository with packages intended only for building. It might help some developers build their packages without having to maintain the build dependencies themselves.
- Repositories with a looser policy than the current Fedora Packaging Guidelines. For example, a looser policy could allow bundling libraries or overriding things in Base/Core. Note that these repositories would still need to follow the Legal/Licensing Policy, as the goal for these repositories is to eventually be officially distributed by Fedora.

Automation

Additional repositories with automatically updated SPEC files and RPM packages for packages that are already included in Fedora.
- Maintainers will benefit from having less work with updating the packages, since they would just need to review the automatically updated package and import it into rawhide.
- Users will benefit from having the latest versions of software available. Clearly, the users will need to be informed that the provided packages are a preview available AS IS.
- In a way, such repositories would follow the upstream release schedule:
  - As soon as upstream makes a release available on, e.g., CPAN or PyPI, it would be available in the repository.
  - We would need to handle major updates with an API/ABI bump. Checking for ABI/API breakage could be done automatically with api-checker.
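As a rough illustration of how an automated updater could separate routine updates from major ones, the sketch below classifies an upstream release by comparing version numbers. It assumes upstream uses dotted numeric versions in which a change of the first component signals a possible API/ABI break; that is an assumption, not a rule from this document, and it does not replace an actual API/ABI check. The function names are hypothetical.

```python
import re


def parse_version(version):
    """Split a dotted version string such as '2.10.1' into a tuple of ints."""
    parts = tuple(int(part) for part in re.findall(r"\d+", version))
    if not parts:
        raise ValueError(f"no numeric component in version {version!r}")
    return parts


def classify_update(packaged, upstream):
    """Classify an upstream release relative to the packaged version.

    Assumption: a change in the first version component marks a major
    update that may break API/ABI and therefore needs manual handling.
    """
    old, new = parse_version(packaged), parse_version(upstream)
    if new <= old:
        return "none"   # nothing newer upstream
    if new[0] != old[0]:
        return "major"  # hold back for an API/ABI check and review
    return "minor"      # candidate for an automatic rebuild
```

An updater could rebuild "minor" results automatically and queue "major" results for an API/ABI check before anything is published.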

Automated packaging
- The general idea is to enable easier/quicker packaging of upstream software by generating SPEC files for the packager automatically.
- Various tools for automatic packaging already exist (e.g. the *2spec and *2rpm tools), so we will use them as a starting point.
- Licensing checks are the critical part of the packaging and review process that will remain manual.
- By leveraging COPR, we could have repositories that are generated and updated automatically from upstream sources.
- jzeleny is working on other automatic packaging tools.

Automated package review tools
- Development of a new service where package reviews would happen from start to finish [1]:
  - No Bugzilla
  - Automatic FAS integration (one could automatically mark reviews that need sponsors)
  - Guided (self-)review
  - Inline comments (like gerrit)
  - FedoraReview integration
  - pingou plans to work on such a tool [2]

QA automation
- Join Fedora QA and help them with the Taskotron project concerning the future of QA automation in Fedora.
- Develop new tools for Fedora QA that will help achieve high standards of software in Fedora.
  - An example of such a tool is rpmgrill, a set of analysis tests that run against a particular RPM build. Its main difference from rpmlint is that rpmgrill handles *builds*, that is, the entire set of RPMs built from one source RPM file, instead of single RPM files.
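The notion of a *build* as a unit of analysis can be sketched as follows. This is only an illustration of grouping binary RPMs by the source RPM they came from, not how rpmgrill itself is implemented; the function name and the file names in the usage note are hypothetical.

```python
from collections import defaultdict


def group_by_build(rpms):
    """Group binary RPM file names by the source RPM they were built from.

    `rpms` is an iterable of (rpm_file, source_rpm) pairs. How the pairs
    are obtained (for example, from RPM headers) is outside this sketch.
    """
    builds = defaultdict(list)
    for rpm_file, source_rpm in rpms:
        builds[source_rpm].append(rpm_file)
    # Each value is one build: every RPM produced from one source RPM.
    return {source: sorted(files) for source, files in builds.items()}
```

For example, passing the pairs `("foo-1.0-1.x86_64.rpm", "foo-1.0-1.src.rpm")` and `("foo-devel-1.0-1.x86_64.rpm", "foo-1.0-1.src.rpm")` yields a single build containing both RPMs, which a build-level test could then examine together.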

Integration of Fedora services/tools

We want to lessen/automate the work of packagers by integrating the Fedora services/tools such as fedpkg, git, Bodhi, Koji, Bugzilla.
- An example of a git-Bugzilla integration: packager's git commit referencing a particular bug number in Bugzilla triggers automatic generation of a comment that includes the commit link and commit message and changes the state of the referenced bug to MODIFIED.
- An example of a Koji-git integration: a successful build of a package in Koji triggers automatic generation of a tag in package's git repository so that the packager can easily access the content of the particular build using pure git.
This will be done in co-operation with the Fedora Infrastructure Team.
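The git-Bugzilla example above could look roughly like the following sketch. It only builds the update that a hook would send and does not talk to Bugzilla or git. The `rhbz#NNN` / `bz#NNN` reference syntax, the function name, and the shape of the returned dictionaries are assumptions made for illustration.

```python
import re

# Assumed reference syntax in commit messages: "rhbz#123456" or "bz#123456".
BUG_REF = re.compile(r"(?:rhbz|bz)\s*#\s*(\d+)", re.IGNORECASE)


def bugzilla_updates(commit_message, commit_url):
    """Return one update per bug referenced in the commit message.

    Each update carries a comment with the commit link and message and
    requests the MODIFIED state, as described in the example above.
    """
    bug_ids = sorted({int(bug) for bug in BUG_REF.findall(commit_message)})
    comment = f"Referenced in commit {commit_url}\n\n{commit_message.strip()}"
    return [
        {"bug_id": bug_id, "status": "MODIFIED", "comment": comment}
        for bug_id in bug_ids
    ]
```

A commit without any bug reference produces an empty list, so the hook would simply do nothing for it.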

Build systems

We will work on improving the developer experience with the build systems in Fedora (currently, COPR and Koji). Furthermore, we will work on new features in the build systems that will enable building and delivering the various existing and new software stacks in Fedora.
- Cooperate with the COPR upstream on adding support for building SCLs. This will allow us to provide additional repositories with multiple parallel installable versions of the software stacks.
- Cooperate with the Koji upstream on implementing new features in the build system hierarchy that will give more information to developers and make their work easier, like:
  - getting more files from broken builds (e.g. core dumps)
  - providing information about workers' usage status and their resource utilization
  - supporting easier environment setup for implementing new complicated features (currently, creating a new target for that purpose involves quite a lot of manual work; this could be solved by an ability to create ad-hoc targets that don't need rel-eng interaction at all)

SCL

Work on getting SCLs into Fedora for the Fedora Products.
- For example, Cloud often depends on a specific version of Ruby.
  - Status: SCLs in Fedora are pending approval by the Fedora Packaging Committee (FPC). Most probably, each SCL will be added to Fedora as a system-wide change.

SCL in SCL Upstream from COPR.
- Possible to use 3rd party repos for various projects.
  - Status: almost finished.

scl-utils is a package that makes it possible to build and run Software Collections, which means a user can install two versions of the same software in parallel.
- scl-utils will continue to receive bug fixes.
- scl-utils2 might contain bigger changes.

CI

Continuous integration (CI) allows early detection of bugs or potential issues, thus it is beneficial for improving and maintaining the high quality of software available in Fedora. A number of tools for CI are available such as Travis CI or Jenkins.

We will encourage upstream projects to move toward a CI model. Projects based on GitHub can use the hosted Travis CI instance at https://travis-ci.org/ free of charge. Projects related to Fedora can use the Fedora Infrastructure hosted Jenkins instance at http://jenkins.cloud.fedoraproject.org/.

For CI of RPMs built by Koji/COPR and Fedora updates submitted through Bodhi, we will cooperate with Fedora QA on the AutoQA tool. Future plans for AutoQA are described in the Taskotron project.

Documentation, guidelines

Wiki pages.
- The wiki currently lacks any usable structure and contains lots of abandoned and outdated content, so it is not very helpful.
- Look into creating a better structure, improve the related wiki page categorization.
- Keep the Packaging: namespace, the official Packaging Guidelines are and will be maintained by the Packaging Committee members.
- Archive the outdated/duplicated/unneeded wiki pages (outside the Packaging: namespace).
- Motivate people to create new content (with badges, swag, ...).
- Improve the search on Fedora wiki.
- Add more references to formal documentation (see below).
Formal documentation.
- Some people prefer to use a single document to learn about concepts and tasks rather than browsing through a number of wiki pages. Wiki content may not feel authoritative and may not inspire the same trust as formal documentation.
- pkovar will be working on a new Packager's Guide, to help people get started with RPM packaging for Fedora and EPEL. The goal is to share as much content as possible with downstream RHEL/EPEL documentation, using the same format and toolchain (DocBook, Publican).
- The guide will be mostly based on the Fedora wiki pages/HOWTOs. It will reference additional resources on the wiki (esp. in the Packaging: namespace) and/or other content where appropriate.
- An outdated draft is here: http://docs.fedoraproject.org/en-US/Fedora_Draft_Documentation/0.1/html/Packagers_Guide/index.html
SCL
- Continue to improve the SCL packager guide.
- It currently lives at http://docs.fedoraproject.org/en-US/Fedora_Contributor_Documentation/1/html/Software_Collections_Guide/index.html
- Using the same format and toolchain as the Packager's Guide and other Fedora guides, it currently shares content with the RH version of the same book at access.redhat.com.
- Look into upstreaming the guide to SCL Upstream if it turns out not to be possible to share most of the content with the version at SCL Upstream or access.redhat.com (due to limited resources).

DevAssistant

The aim of the DevAssistant project is to help developers with repetitive everyday tasks, such as kickstarting/scaffolding new projects, installing dependencies, working with SCM, setting up development environments and so on.

DevAssistant is a "best effort" service: we can't solve all the corner cases, but we can do well in the vast majority of cases.

In the current DevAssistant release, 0.8.0, the core is becoming mature and stable. For the upstream release 0.9.0, we will concentrate on providing as much end-user functionality as possible. This includes:

- Providing more assistants for end-users and improving the functionality of the current assistant set
- Support for generating (and perhaps modifying) Docker.io images
- Dependency versioning support

User stories

This section lists a couple of concrete user stories. Each one highlights the problems a type of user faces with the current Fedora and what that user would like to see improved in the future.

Persona #1: Alan the Big Data analyst

Alan is a Big Data analyst and a member of the Fedora Big Data SIG. He uses a number of applications written in different languages to perform the data analysis. He wants to focus his time and effort on the application and the actual data analysis. He wants to minimize the time and hassle spent obtaining, compiling, packaging and maintaining the applications that he needs. The form of packaging (rpm, deb, npm, other) isn't as important to him.

Problem statement: Currently, there is a lot of hassle and pain in dealing with language stacks in Fedora other than the primary ones (i.e. C/C++, Python, Perl). Although Alan wants to focus his time on the application and the actual data analysis, he too often finds himself spending time managing the language-specific toolchain needed by the application.

Cause #1: The upstream application authors usually assume that any developer would just be using the language-specific packaging ecosystem, rather than also taking into account the downstream distribution-based packaging and dependency management systems. This is a problem since there are many differences between language-specific packaging ecosystems and the Fedora way of packaging software.

Cause #2: Many upstream applications use very brittle versioning. In other words, each application expects the user will be able to have its particular versions of the runtime, compiler and libraries available.

Example #1: Applications written in Scala require different versions of Scala: some need v2.9 and some need v2.10. Although Fedora could provide both versions in the same release (via SCLs or by manually maintaining two Scala packages), they would both need to be resolvable via Apache Ivy, rather than just providing both binaries, "scalac29" and "scalac210".

Example #2: The version of Jetty in Fedora does not work with Java 6, and the Apache Hadoop upstream isn't ready to abandon Java 6 yet. So the Big Data SIG ends up maintaining a patch to Apache Hadoop for Jetty 9 that will live purely in Fedora for quite a while to come. Although this could be worked around by using compat packages for Jetty for a couple of Fedora releases, that just hides the fact that there is a mismatch in expectations between Fedora and upstream application authors.

Example #3: Node.js has its package/dependency management tool available in Fedora, but very little of the language ecosystem and its base libraries is packaged and available in Fedora.

Possible solution: Fedora would need a way to provide language-specific ecosystems that aligns with how these languages themselves are used and how applications written in them are developed and distributed.

Persona #2: Student or Corporate developer needing multiple development environments

Billy is a CS student with multiple assignments that need different development environments/stacks to build. Bob is a corporate enterprise developer who is developing new applications using the latest technology stacks while also maintaining older software releases that use older libraries and toolchains.

Problem statement: While they can set up different OSes or environments in separate virtual machines, it is a lot of work and wastes a lot of disk space and time. Being able to install multiple environments or versions of stacks in the same Fedora instance and to switch between them on a per-project basis would be much easier and more efficient.

About this Document

Authors

Contributors to this document include:

Reviewers & Contributors

The following people have contributed to the development of this document, through feedback on IRC, mailing lists, and other points of contact.
