From Fedora Project Wiki

Comps.xml

Draft!

The comps file adds metadata to the repository that cannot be extracted from the packages themselves. Although it is one big tree of items, several different functions are merged into it.

See the Fedora 14 comps file as an example of the structure.

Scheme

The structure of comps looks like this:

  • category
    • id
    • name (translated string)
    • display_order (int)
    • grouplist (groupids)
  • group
    • id
    • name (translated string)
    • description (translated string)
    • display_order (int)
    • default (bool)
    • uservisible (bool)
    • langonly (two letter lang code)
    • packagelist
      • conditional
      • mandatory
      • default
      • optional

Anaconda magic:

  • whiteout
    • requires, package
  • blacklist
    • packagename, arch


So there is a tree with fixed depth: Categories -> Groups -> optional packages.
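
To make the fixed depth concrete, here is a minimal sketch that parses a small comps-like fragment and rebuilds the Categories -> Groups -> packages tree. It assumes the element names found in published Fedora comps files (`packagereq`, `groupid`, `uservisible`); the ids and package names are made up, and `load_tree` is a hypothetical helper, not part of yum or anaconda.

```python
import xml.etree.ElementTree as ET

# Made-up comps fragment; ids and package names are illustrative only.
COMPS_XML = """\
<comps>
  <group>
    <id>games</id>
    <name>Games and Entertainment</name>
    <description>Various ways to relax.</description>
    <default>false</default>
    <uservisible>true</uservisible>
    <packagelist>
      <packagereq type="mandatory">example-game-common</packagereq>
      <packagereq type="default">example-solitaire</packagereq>
      <packagereq type="optional">example-chess</packagereq>
    </packagelist>
  </group>
  <category>
    <id>apps</id>
    <name>Applications</name>
    <display_order>20</display_order>
    <grouplist>
      <groupid>games</groupid>
    </grouplist>
  </category>
</comps>
"""


def load_tree(xml_text):
    """Return {category id: {group id: {level: [package names]}}}."""
    root = ET.fromstring(xml_text)

    # First collect every group with its packages, keyed by level
    # (the "type" attribute of each packagereq element).
    groups = {}
    for group in root.findall("group"):
        levels = {}
        for req in group.findall("packagelist/packagereq"):
            levels.setdefault(req.get("type"), []).append(req.text)
        groups[group.findtext("id")] = levels

    # Then hang the groups below the categories that list them.
    tree = {}
    for category in root.findall("category"):
        ids = [g.text for g in category.findall("grouplist/groupid")]
        tree[category.findtext("id")] = {i: groups.get(i, {}) for i in ids}
    return tree


tree = load_tree(COMPS_XML)
for cat_id, cat_groups in tree.items():
    for group_id, levels in cat_groups.items():
        for level, names in levels.items():
            print(cat_id, "->", group_id, "->", level, names)
```

Note that a group is only reachable through a category that lists its id, which is why the depth never grows beyond these three levels.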

Functions

Comps has several functions that are merged into the same tree:

  • Marking packages as important to the user (as opposed to those only pulled in as dependencies)
  • Grouping similar packages together to make them easier to find
  • Grouping one of each kind together to allow installing them as one unit
  • Defining the default installation and the default installation for variants
  • Defining how the UI for package selection should look
  • Doing magic for language packages (XXX TODO: read details about the yum plugin doing this for F15)

Putting all of these into a single tree has advantages, such as one simple UI and fewer entities that need to be named, translated and displayed, but several of the functions are also limited by this approach.

Grouping similar packages

Using the same groups both for collecting similar packages and for collecting one of each kind makes it difficult to do either well. Groups like "Games" are just a list of hundreds of packages. Assuming that a tree is the right way to sort the packages, at least one or two more levels of depth would be needed. For comparison, the application menu, which typically deals with a much smaller number of entries, offers one additional level for the "Games" submenu. While "Games" may be the most obvious example, similar issues exist in other comps groups.

The other question is whether a tree really is the right way to group the huge number of interesting packages we have right now. Having several orthogonal keywords that split the set of packages into a much larger number of subsets could be a much better solution. Instead of offering access to all packages, a more web-search-like interface that offers only the best matches may be better suited to the number of packages we currently have in Fedora.
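
The keyword idea can be sketched roughly as follows. This is only an illustration: comps has no keyword data today, so the package names, the keywords and the `best_matches` helper are all hypothetical.

```python
# Hypothetical keyword data; nothing like this exists in comps today.
PACKAGES = {
    "example-chess": {"game", "board", "gui"},
    "example-solitaire": {"game", "cards", "gui"},
    "example-nethack-like": {"game", "roguelike", "console"},
    "example-editor": {"editor", "gui"},
}


def best_matches(query, packages, limit=3):
    """Rank packages by how many query keywords they carry; drop non-matches."""
    scored = []
    for name, keywords in packages.items():
        score = len(query & keywords)
        if score:
            # Negative score so that sorting puts the best matches first;
            # ties are broken alphabetically by package name.
            scored.append((-score, name))
    return [name for _, name in sorted(scored)[:limit]]


print(best_matches({"game", "gui"}, PACKAGES))
# prints ['example-chess', 'example-solitaire', 'example-editor']
```

Unlike a tree, a package can carry any number of keywords at once, and the `limit` argument stands in for showing only the best matches instead of every package.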

XXX TODO: Do the math about how many groups/buckets we'd need to come to a sane number of packages/options on the lowest level.
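
A rough first pass at this calculation, as a sketch only. It takes the roughly 20k packages mentioned on this page and assumes, purely for illustration, a perfectly balanced tree with at most 20 entries per level, or yes/no keywords that each split the package set evenly. Real groups and keywords are far from balanced, so these numbers are lower bounds rather than a design.

```python
def levels_needed(total, per_level):
    """Smallest depth d with per_level ** d >= total (integer arithmetic)."""
    depth, capacity = 0, 1
    while capacity < total:
        capacity *= per_level
        depth += 1
    return depth


def keywords_needed(total, per_bucket):
    """Smallest number k of yes/no keywords with 2 ** k * per_bucket >= total."""
    k = 0
    while (2 ** k) * per_bucket < total:
        k += 1
    return k


TOTAL = 20000      # "20k packages" as stated on this page
PER_LEVEL = 20     # assumed number of entries a user can scan at once

print(levels_needed(TOTAL, PER_LEVEL))    # prints 4
print(keywords_needed(TOTAL, PER_LEVEL))  # prints 10
```

Under these assumptions a tree needs four levels of at most 20 entries each, one more than the current Categories -> Groups -> packages layout, while ten independent yes/no keywords would already cut the set down to buckets of about 20 packages.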

Grouping one of each kind

This functionality is very important, as it makes sure you end up with a functional system after installing. Although Requires/Provides at the RPM level make sure every package has everything it needs, this does not extend to system services that are not directly used. E.g. no one needs ntpd to run, but lots of programs rely on the system clock being correct. No program requires a kernel, init or initrd, but without them the system cannot be booted. Similar relations also apply to desktop environments and development tool chains. Even among application programs there are suites that are only really useful as a whole.
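
The gap between RPM dependencies and a functional system can be shown with a toy dependency graph. This is a minimal sketch with made-up package names; `closure` is a hypothetical helper and not how yum resolves dependencies internally.

```python
# Toy Requires graph with made-up package names.
REQUIRES = {
    "example-mail-client": ["example-gui-toolkit", "example-libc"],
    "example-gui-toolkit": ["example-libc"],
    "example-libc": [],
    "example-kernel": [],
    "example-ntpd": ["example-libc"],
}

# Hypothetical group that lists what no package asks for explicitly.
BASE_GROUP = ["example-kernel", "example-ntpd"]


def closure(selected, requires):
    """All packages reachable from `selected` by following Requires."""
    seen = set()
    stack = list(selected)
    while stack:
        name = stack.pop()
        if name not in seen:
            seen.add(name)
            stack.extend(requires.get(name, []))
    return seen


by_deps = closure(["example-mail-client"], REQUIRES)
with_group = closure(["example-mail-client"] + BASE_GROUP, REQUIRES)

# Following Requires alone never reaches the kernel or ntpd.
print(sorted(with_group - by_deps))
# prints ['example-kernel', 'example-ntpd']
```

Nothing in the dependency data pulls in the kernel or ntpd, so only the group definition gets them onto the system.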

Besides such implicit requirements, the huge number of packages and applications becomes much easier to deal with when grouped into larger entities. Unfortunately, the current structure of comps does not make it easy to use arbitrary groups. Defining a group of applications that are useful together is a very individual thing. While it would be technically possible to run one's own repository to offer one's own group definitions to others or to one's own machines, this is currently too complicated to set up.

This means that a large part of the packages are still installed and handled as single packages and not as part of a group.

Defining the default

Because the groups are basically the only way to get a working system, and people have to choose at least some of them, they have always been of high political interest. Controlling the groups means controlling which programs users are going to install, at least in the vast majority of cases. Although users have a choice, exercising it drops them into the much more difficult world of dealing with single packages.

A real choice would mean making it easier for different parties to offer their own group information and letting users choose whose view of the distribution they trust or find most useful.

Defining the UI

Having everything in one big tree looks compelling at first. But the number of packages, and even the number of applications, in Fedora makes it difficult to fit everything in. Maybe a more web-search-like approach is more suitable for the 20k packages we have right now. Details still have to be figured out, but being bound to a given model from comps is not really helping.