User Level Package Management (draft!)
DISCLAIMER: This is currently just a draft document, rather than an official perspective of the Environments & Stacks WG. If/when that changes, it will move out of my personal space and under the Env_and_Stacks hierarchy.
Problem Description
Package management in Fedora is currently focused on *system* level packaging. This is a very "operations" oriented view of the world, as it emphasises the ability to reproduce systems exactly, while still being able to delegate the task of monitoring for and responding to CVEs to the platform provider, rather than having to track dependencies for security updates directly.
However, for many application developers and data analysts, being able to bring in new dependencies quickly and easily to solve problems is an essential requirement. It's also essential to be able to install a dependency into a local testing or analysis environment *without* affecting the operation of the system itself in any way.
Unfortunately, the relative lack of support for this model at the platform level in Fedora and other Linux distributions has resulted in these communities routing around platform level package management systems, placing significant barriers in the way of effective collaboration between developers, analysts and system administrators.
Solution: endorse layered architectures
The model that has emerged to effectively manage these conflicting requirements is to separate user & developer level dependency management tools from the system level dependency management tools used to create an integrated platform.
The base platform, its dependency management system, and the packages it contains then become primarily the responsibility of system administrators. Security, stability, reliability, & compatibility are the focus at this layer. In a Fedora context, this perspective is best represented by the Base, Server and Cloud WGs (together with the CentOS and EPEL communities).
The Environments & Stacks WG then primarily represents the perspective of application developers and data analysts looking to work on top of that stable foundation.
The Workstation WG takes the work done in the other WGs and brings it all together to provide a more cohesive developer experience for the Fedora/RHEL/CentOS ecosystem.
Directions to be Explored
Two primary areas of exploration have been identified for user level package management:
- Embracing the existing practice of using language ecosystem specific tooling to deploy applications to Linux systems
- Potentially recommending particular language independent tooling for more complex dependency management scenarios
Embracing ecosystem specific tooling
Fully embracing language specific tooling is likely to be the best way to reach the professional developer audience. ~bkabrda has started setting up a [devpi] instance as a proof of concept for running a filtered mirror of an upstream package repository that only includes packages that have been reviewed and determined to at least meet Fedora's licensing guidelines and to not be obviously malicious.
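As a rough illustration of the filtering idea described above, the following sketch keeps only packages that have been reviewed and that carry an approved licence. It does not use devpi's API. The `Package` record, the `APPROVED_LICENSES` set, the `filter_mirror` function, and the sample package names are all hypothetical and exist only to show the shape of the check.

```python
from dataclasses import dataclass

# Hypothetical allowlist; a real one would be derived from Fedora's
# licensing guidelines rather than hard-coded here.
APPROVED_LICENSES = {"MIT", "BSD-3-Clause", "Apache-2.0"}


@dataclass(frozen=True)
class Package:
    name: str
    license: str
    reviewed: bool  # True once a reviewer has found nothing obviously malicious


def filter_mirror(upstream):
    """Return only the upstream packages that pass both checks."""
    return [
        pkg for pkg in upstream
        if pkg.reviewed and pkg.license in APPROVED_LICENSES
    ]


# Made-up sample data standing in for an upstream index.
upstream = [
    Package("alpha", "MIT", reviewed=True),
    Package("beta", "Proprietary", reviewed=True),    # licence not approved
    Package("gamma", "Apache-2.0", reviewed=False),   # not yet reviewed
    Package("delta", "BSD-3-Clause", reviewed=True),
]

mirrored_names = [pkg.name for pkg in filter_mirror(upstream)]
print(mirrored_names)
```

The point of the sketch is that the mirror is a strict subset of upstream: anything unreviewed or under an unapproved licence is simply never published to users of the filtered index.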
However, given the wide range of ecosystem specific packaging tools out there, it seems unlikely that this approach will scale in a sustainable way. To provide a more consistent management interface, it may be better to instead adopt a plugin-based solution like [Pulp]. Adding a new language to the "endorsed" list would then be a matter of writing a suitable Pulp plugin and integrating it with the build system, rather than setting up yet another completely new repo management service within Fedora's infrastructure.
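To make the plugin-based approach more concrete, here is a minimal sketch of a registry where supporting a new ecosystem means adding one class rather than a new service. This is not Pulp's plugin API. The `RepoPlugin` base class, the `register` decorator, the `PLUGINS` table, and the `ExamplePlugin` ecosystem are hypothetical. The only real-world detail is the Python name normalisation rule, which follows PEP 503.

```python
import re
from abc import ABC, abstractmethod

# Hypothetical registry mapping an ecosystem name to its plugin instance.
PLUGINS = {}


class RepoPlugin(ABC):
    """Hypothetical common interface every ecosystem plugin implements."""

    ecosystem: str

    @abstractmethod
    def normalise_name(self, name: str) -> str:
        """Return the canonical form of a package name."""


def register(plugin_cls):
    """Class decorator that adds a plugin to the shared registry."""
    PLUGINS[plugin_cls.ecosystem] = plugin_cls()
    return plugin_cls


@register
class PythonPlugin(RepoPlugin):
    ecosystem = "python"

    def normalise_name(self, name: str) -> str:
        # PEP 503: runs of "-", "_" and "." collapse to a single "-".
        return re.sub(r"[-_.]+", "-", name).lower()


@register
class ExamplePlugin(RepoPlugin):
    # Placeholder ecosystem with a deliberately simplified rule.
    ecosystem = "example"

    def normalise_name(self, name: str) -> str:
        return name.strip().lower()


def normalise(ecosystem: str, name: str) -> str:
    """Dispatch to whichever plugin handles the given ecosystem."""
    return PLUGINS[ecosystem].normalise_name(name)


print(normalise("python", "Zope.Interface"))
```

The management code only ever calls `normalise`, so it does not need to change when another ecosystem is added to the registry.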
Recommending language independent tooling
Even if we manage to integrate language specific tooling into the review and build toolchains, it's unlikely it will be feasible to integrate every such toolset into the system management utilities. The language specific tools also don't do a good job of managing arbitrary external dependencies with associated ABI compatibility requirements.
As such, it's worth considering the possibility of recommending a particular user level dependency management toolchain, that allows software to be installed on a per-user basis, rather than as part of the integrated OS platform.
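The split between per-user and platform-level installation can be seen in miniature in Python's own per-user install scheme. This is a language-specific mechanism, not one of the language-independent tools discussed here, and is shown only to illustrate the separation. The sketch uses the standard library's `site` and `sysconfig` modules, and the variable names are arbitrary.

```python
import site
import sysconfig

# Where packages for this interpreter installation normally go
# (platform or virtual environment level).
system_dir = sysconfig.get_paths()["purelib"]

# Where per-user packages go, for example via "pip install --user".
# No root access is needed to write here.
user_base = site.getuserbase()
user_dir = site.getusersitepackages()

print("interpreter-level:", system_dir)
print("user-level:       ", user_dir)
```

Because the two directories are distinct, a per-user install leaves the interpreter-level directory untouched. A language-independent tool would apply the same idea to dependencies from any ecosystem.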
There are two main possibilities that come to mind on that front: Nix and conda.
Nix has many attractive properties (including support for custom build environments and per-user package installations without root access), but has the limitation of being POSIX-specific. This makes it significantly less interesting to upstream communities that also encompass Windows-based developers and data analysts.
By contrast, conda was created by Continuum Analytics specifically to tackle the problem of allowing scientists and data analysts to easily install the full Python analytical stack, including external C, C++ and FORTRAN dependencies.
While there's no exploration currently happening in this area, it remains a possibility worth investigating in the future.
Other Considerations
Software Collections
Software collections represent a hybrid model for layered architectures, where platform components are selectively upgraded within the context of the existing system level package management system.
This is a useful model, as it automatically integrates with all existing system level auditing tools. However, it is aimed primarily at folks that are already using system level packaging tools, and would like to selectively upgrade particular components. It is less interesting in the context of being able to provide consistent cross-platform dependency management instructions.
From the perspective of user level package management, a software collection becomes just another environment to target, in the same way as the default system environment.
Linux Containers and Docker
In a Docker context, the work of the Environments & Stacks WG largely applies to the way container images are built. From a development perspective, containers don't actually change all that much relative to any other system for dependency management: they simply shift the complexity from deployment time to image build time.
Where Docker helps dramatically is in managing the division of responsibility between folks with an operations focus, who can specialise in providing base images and the infrastructure to run containers, and those with a development focus, who can focus on the creation and deployment of full application containers.