From Fedora Project Wiki
 
{{admon/important | Comments and Explanations | The page source contains comments providing guidance to fill out each section. They are invisible when viewing this page. To read it, choose the "view source" link.<br/> '''Copy the source to a ''new page'' before making changes!  DO NOT EDIT THIS TEMPLATE FOR YOUR CHANGE PROPOSAL.'''}}
{{admon/tip | Guidance | For details on how to fill out this form, see the [https://docs.fedoraproject.org/en-US/program_management/changes_guide/ documentation].}}
{{admon/tip | Report issues | To report an issue with this template, file an issue in the [https://pagure.io/fedora-pgm/pgm_docs pgm_docs repo].}}
<!-- The actual name of your proposed change page should look something like: Changes/Your_Change_Proposal_Name.  This keeps all change proposals in the same namespace -->
= DNF: Do not download filelists by default <!-- The name of your change proposal --> =
= DNF: Do not download filelists by default <!-- The name of your change proposal --> =
{{Change_Proposal_Banner}}


== Summary ==
== Summary ==
<!-- A sentence or two summarizing what this change is and what it will do. This information is used for the overall changeset summary page for each release. Note that motivation for the change should be in the Benefit to Fedora section below, and this part should answer the question "What?" rather than "Why?". -->
Change the DNF behavior to not download filelists by default. These metadata, which describe all the files contained within each package, are unnecessary in the majority of use cases. Additionally, these metadata files can be large in size, leading to a significant slowdown in the user experience.


== Owner ==
== Owner ==
This should link to your home wiki page so we know who you are.  
This should link to your home wiki page so we know who you are.  
-->
-->
* Name: [[User:FASAcountName| Your Name]]
* Name: [[User:jkolarik| Jan Kolarik]]
<!-- Include you email address that you can be reached should people want to contact you about helping with your change, status is requested, or technical issues need to be resolved. If the change proposal is owned by a SIG, please also add a primary contact person. -->
<!-- Include you email address that you can be reached should people want to contact you about helping with your change, status is requested, or technical issues need to be resolved. If the change proposal is owned by a SIG, please also add a primary contact person. -->
* Email: <your email address so we can contact you, invite you to meetings, etc. Please provide your Bugzilla email address if it is different from your email in FAS>
* Email: jkolarik@redhat.com
<!--- UNCOMMENT only for Changes with assigned Shepherd (by FESCo)
<!--- UNCOMMENT only for Changes with assigned Shepherd (by FESCo)
* FESCo shepherd: [[User:FASAccountName| Shehperd name]] <email address>
* FESCo shepherd: [[User:FASAccountName| Shehperd name]] <email address>
-->
-->


== Current status ==
== Current status ==
[[Category:ChangePageIncomplete]]
[[Category:ChangeAcceptedF40]]
<!-- When your change proposal page is completed and ready for review and announcement -->
<!-- When your change proposal page is completed and ready for review and announcement -->
<!-- remove Category:ChangePageIncomplete and change it to Category:ChangeReadyForWrangler -->
<!-- remove Category:ChangePageIncomplete and change it to Category:ChangeReadyForWrangler -->


<!-- Select proper category, default is Self Contained Change -->
<!-- Select proper category, default is Self Contained Change -->
[[Category:SelfContainedChange]]
[[Category:SystemWideChange]]
<!-- [[Category:SystemWideChange]] -->
<!-- [[Category:SystemWideChange]] -->


ON_QA -> change is fully code complete
ON_QA -> change is fully code complete
-->
-->
* [<will be assigned by the Wrangler> devel thread]
* [https://lists.fedoraproject.org/archives/list/devel-announce@lists.fedoraproject.org/thread/5UFFIAR5ITBS7YFS4N5HM5GGPXYVPF7E/ Announced]
* FESCo issue: <will be assigned by the Wrangler>
* [https://discussion.fedoraproject.org/t/f40-change-proposal-dnfconditionalfilelists-system-wide/94939 Discussion thread]
* Tracker bug: <will be assigned by the Wrangler>
* FESCo issue: [https://pagure.io/fesco/issue/3097 #3097]
* Release notes tracker: <will be assigned by the Wrangler>
* Tracker bug: [https://bugzilla.redhat.com/show_bug.cgi?id=2254789 #2254789]
* Release notes tracker: [https://pagure.io/fedora-docs/release-notes/issue/1064 #1064]


== Detailed Description ==
== Detailed Description ==
<!-- Expand on the summary, if appropriate. A couple sentences suffices to explain the goal, but the more details you can provide the better. -->
Until now, filelists were always downloaded together with other metadata. This behavior was hardcoded and could not be changed from outside DNF.
 
With these changes, we are proposing to not download the filelists metadata by default. This default behavior can be modified through the new DNF configuration option. Additionally, specific commands can override this behavior and request loading the filelists metadata at runtime using the existing demands object in DNF.
 
Note that after this change, users can still use DNF without filelists metadata when querying file provides located in `/usr/bin`, `/usr/sbin` or `/etc` directories.
 
The proposed behavior has already been incorporated into DNF5, the future successor to DNF, where it was implemented around the beginning of this year (see [https://github.com/rpm-software-management/dnf5/pull/123 this PR] for more details).


== Feedback ==
== Feedback ==


== Benefit to Fedora ==
== Benefit to Fedora ==
<!-- What is the benefit to the distribution?  Will the software we generate be improved? How will the process of creating Fedora releases be improved?
As DNF is integral to various infrastructure tasks like package building and installation, testing environment creation, and server integration tests, this change significantly reduces processing time and resource usage for these processes.
 
 
      Be sure to include the following areas if relevant:
This change reduces the RAM requirements of the DNF process, addressing existing issues when running Fedora on low-memory machines such as the Raspberry Pi (see, for example, [https://bugzilla.redhat.com/show_bug.cgi?id=1907030 Bug 1907030]).
      If this is a major capability update, what has changed?
          For example: This change introduces Python 5 that runs without the Global Interpreter Lock and is fully multithreaded.
      If this is a new functionality, what capabilities does it bring?
          For example: This change allows package upgrades to be performed automatically and rolled-back at will.
      Does this improve some specific package or set of packages?
          For example: This change modifies a package to use a different language stack that reduces install size by removing dependencies.
      Does this improve specific Spins or Editions?
          For example: This change modifies the default install of Fedora Workstation to be more in line with the base install of Fedora Server.
      Does this make the distribution more efficient?
          For example: This change replaces thousands of individual %post scriptlets in packages with one script that runs at the end.
      Is this an improvement to maintainer processes?
          For example: Gating Fedora packages on automatic QA tests will make rawhide more stable and allow changes to be implemented more smoothly.
      Is this an improvement targeted as specific contributors?
          For example: Ensuring that a minimal set of tools required for contribution to Fedora are installed by default eases the onboarding of new contributors.  


    When a Change has multiple benefits, it's better to list them all.
Omitting the filelists metadata download also decreases the overall cost of operating Fedora mirror servers.


    Consider these Change pages from previous editions as inspiration:
As the described behavior already exists in its extended form in DNF5 within the current Fedora release, allowing any optional metadata types to be conditionally loaded, and considering that DNF5 is planned to replace DNF as the main package manager for Fedora 41, implementing these changes will facilitate a smoother and more compatible transition process.
    https://fedoraproject.org/wiki/Changes/Annobin (low-level and technical, invisible to users)
    https://fedoraproject.org/wiki/Changes/ParallelInstallableDebuginfo (low-level, but visible to advanced users)
    https://fedoraproject.org/wiki/Changes/VirtualBox_Guest_Integration (primarily a UX change)
    https://fedoraproject.org/wiki/Changes/NoMoreAlpha (an improvement to distro processes)
    https://fedoraproject.org/wiki/Changes/perl5.26 (major upgrade to a popular software stack, visible to users of that stack)
-->


== Scope ==
== Scope ==
* Proposal owners:
* Proposal owners:
<!-- What work do the feature owners have to accomplish to complete the feature in time for release?  Is it a large change affecting many parts of the distribution or is it a very isolated change? What are those changes?-->
** libdnf
*** Modify the `Repo` object to enable conditional filelists metadata download
*** Introduce a new main configuration option to set the default behavior
** dnf
*** Enable configuration of filelists download from commandline, DNF commands and DNF plugins
*** Implement filename pattern argument detection heuristics


* Other developers: <!-- REQUIRED FOR SYSTEM WIDE CHANGES -->
* Other developers: <!-- REQUIRED FOR SYSTEM WIDE CHANGES -->
<!-- What work do other developers have to accomplish to complete the feature in time for release?  Is it a large change affecting many parts of the distribution or is it a very isolated change? What are those changes?-->
** Because of this change, dependent projects using the existing DNF C interface may need to adapt: if they expect the filelists metadata to be available, they must explicitly request loading filelists using the existing API:
*** PackageKit
*** microdnf
*** API users


* Release engineering: [https://pagure.io/releng/issues #Releng issue number] <!-- REQUIRED FOR SYSTEM WIDE CHANGES -->
* Release engineering: N/A
<!-- Does this feature require coordination with release engineering (e.g. changes to installer image generation or update package delivery)?  Is a mass rebuild required?  include a link to the releng issue.
The issue is required to be filed prior to feature submission, to ensure that someone is on board to do any process development work and testing and that all changes make it into the pipeline; a bullet point in a change is not sufficient communication -->


* Policies and guidelines: N/A (not needed for this Change) <!-- REQUIRED FOR SYSTEM WIDE CHANGES -->
* Policies and guidelines:
<!-- Do the packaging guidelines or other documents need to be updated for this feature?  If so, does it need to happen before or after the implementation is done? If a FPC ticket exists, add a link here. Please submit a pull request with the proposed changes before submitting your Change proposal. -->
** Package maintainers must follow Fedora's packaging guidelines, particularly concerning file dependency specifications (see [https://docs.fedoraproject.org/en-US/packaging-guidelines/#_file_and_directory_dependencies here])
*** Adopting the '''MUST NOT''' rule in these guidelines would help prevent future issues with the installability of such packages.
*** A few packages in the current Fedora developmental release are not following these rules. Pull requests have already been prepared to fix their spec files. Please refer to [https://bugzilla.redhat.com/show_bug.cgi?id=2180842 Bug 2180842] for details.


* Trademark approval: N/A (not needed for this Change)
* Trademark approval: N/A
<!-- If your Change may require trademark approval (for example, if it is a new Spin), file a ticket ( https://pagure.io/Fedora-Council/tickets/issues ) requesting trademark approval from the Fedora Council. This approval will be done via the Council's consensus-based process. -->
<!-- If your Change may require trademark approval (for example, if it is a new Spin), file a ticket ( https://pagure.io/Fedora-Council/tickets/issues ) requesting trademark approval from the Fedora Council. This approval will be done via the Council's consensus-based process. -->


* Alignment with Community Initiatives:  
* Alignment with Community Initiatives: N/A (no currently active initiatives)
<!-- Does your proposal align with the current Fedora Community Initiatives: https://docs.fedoraproject.org/en-US/project/initiatives/ ? It's okay if it doesn't, but it's something to consider -->
<!-- Does your proposal align with the current Fedora Community Initiatives: https://docs.fedoraproject.org/en-US/project/initiatives/ ? It's okay if it doesn't, but it's something to consider -->


== Upgrade/compatibility impact ==
== Upgrade/compatibility impact ==
<!-- What happens to systems that have had a previous versions of Fedora installed and are updated to the version containing this change? Will anything require manual configuration or data migration? Will any existing functionality be no longer supported? -->
In general, applying these changes should not affect any existing user workflows and no additional manual changes are required.


<!-- REQUIRED FOR SYSTEM WIDE CHANGES -->
However, the absence of filelists would cause issues for packages that do '''not''' follow the recommended file dependencies outlined in the [https://docs.fedoraproject.org/en-US/packaging-guidelines/#_file_and_directory_dependencies packaging guidelines]. This change would render such packages uninstallable without the presence of filelists. In the current Fedora release repository, only a few packages are affected, and none of them is critical to the system. Trivial pull requests have already been prepared for each of them and will resolve the issue once merged.


If DNF fails to resolve a transaction due to a missing file dependency, and the filelists metadata are not currently present on the system, users will receive a hint on how to request the download of filelists from the command line. This action may assist in resolving the situation.
For more information, refer to [https://bugzilla.redhat.com/show_bug.cgi?id=2180842 Bug 2180842] and the [https://discussion.fedoraproject.org/t/f40-change-proposal-dnfconditionalfilelists-system-wide/94939 discussion thread] on this proposal.


== How To Test ==
== How To Test ==
<!-- This does not need to be a full-fledged document. Describe the dimensions of tests that this change implementation is expected to pass when it is done.  If it needs to be tested with different hardware or software configurations, indicate them. The more specific you can be, the better the community testing can be.  
When using DNF commands without a filename pattern passed as the argument, filelists metadata should not be downloaded from the remote repositories and should not be needed for the command execution. This can be tested with the following steps:
* Clean the local metadata cache (`dnf clean metadata`)
* Run a DNF command not involving the filename spec (e.g. `dnf repoquery rpm`)
* Verify that no `*-filelists.*` metadata files were downloaded inside the cache subdirectories (by default under `/var/cache/dnf` for root)
* Check the command works as expected
The same should also apply to RPM package arguments (files ending with the `.rpm` extension).


Remember that you are writing this how to for interested testers to use to check out your change implementation - documenting what you do for testing is OK, but it's much better to document what *I* can do to test your change.
When using DNF commands with a filename pattern passed as the argument, filelists metadata should be downloaded from the remote repositories as before.


A good "how to test" should answer these four questions:
== User Experience ==
 
Large filelists can exceed 200 MB in size and can take 1-2 minutes to download, which greatly slows down the user experience.
0. What special hardware / data / etc. is needed (if any)?
1. How do I prepare my system to test this change? What packages
need to be installed, config files edited, etc.?
2. What specific actions do I perform to check that the change is
working like it's supposed to?
3. What are the expected results of those actions?
-->
 
<!-- REQUIRED FOR SYSTEM WIDE CHANGES -->


 
For many operations the filelists metadata are not needed, so downloading them wastes resources. Without the filelists download, DNF performance will improve significantly, mainly in terms of network, CPU and disk space usage. Metadata download size will be reduced by about 60%. The improvement also covers deployments of customer-built RPMs to containers that have no need for filelists-level dependencies.
== User Experience ==
<!-- If this change proposal is noticeable by users, how will their experiences change as a result?
<!-- If this change proposal is noticeable by users, how will their experiences change as a result?




== Dependencies ==
== Dependencies ==
<!-- What other packages (RPMs) depend on this package?  Are there changes outside the developers' control on which completion of this change depends?  In other words, completion of another change owned by someone else and might cause you to not be able to finish on time or that you would need to coordinate?  Other upstream projects like the kernel (if this is not a kernel change)? -->
No changes should be required for any package depending on DNF to implement this behavior.
 
<!-- REQUIRED FOR SYSTEM WIDE CHANGES -->
 


== Contingency Plan ==
== Contingency Plan ==
 
* Contingency mechanism: Change the configuration option to download the filelists by default
<!-- If you cannot complete your feature by the final development freeze, what is the backup plan?  This might be as simple as "Revert the shipped configuration".  Or it might not (e.g. rebuilding a number of dependent packages).  If you feature is not completed in time we want to assure others that other parts of Fedora will not be in jeopardy.  -->
* Contingency deadline: Branch Fedora Linux 40 from Rawhide
* Contingency mechanism: (What to do?  Who will do it?) N/A (not a System Wide Change)  <!-- REQUIRED FOR SYSTEM WIDE CHANGES -->
* Blocks release? No
<!-- When is the last time the contingency mechanism can be put in place?  This will typically be the beta freeze. -->
* Contingency deadline: N/A (not a System Wide Change)  <!-- REQUIRED FOR SYSTEM WIDE CHANGES -->
<!-- Does finishing this feature block the release, or can we ship with the feature in incomplete state? -->
* Blocks release? N/A (not a System Wide Change), Yes/No <!-- REQUIRED FOR SYSTEM WIDE CHANGES -->
 


== Documentation ==
== Documentation ==
<!-- Is there upstream documentation on this change, or notes you have written yourself?  Link to that material here so other interested developers can get involved. -->
A new configuration option, `optional_metadata_types`, was added to allow requesting filelists metadata on demand; see the configuration docs [https://dnf.readthedocs.io/en/latest/conf_ref.html#optional-metadata-types-label here].
 
<!-- REQUIRED FOR SYSTEM WIDE CHANGES -->
N/A (not a System Wide Change)


== Release Notes ==
== Release Notes ==

Latest revision as of 12:34, 9 February 2024

DNF: Do not download filelists by default

Summary

Change the DNF behavior to not download filelists by default. These metadata, which describe all the files contained within each package, are unnecessary in the majority of use cases. Additionally, these metadata files can be large in size, leading to a significant slowdown in the user experience.

Owner

Current status

Detailed Description

Until now, filelists were always downloaded together with other metadata. This behavior was hardcoded and could not be changed from outside DNF.

With these changes, we are proposing to not download the filelists metadata by default. This default behavior can be modified through the new DNF configuration option. Additionally, specific commands can override this behavior and request loading the filelists metadata at runtime using the existing demands object in DNF.

Note that after this change, users can still use DNF without filelists metadata when querying file provides located in /usr/bin, /usr/sbin or /etc directories.
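As a rough illustration of the note above, the following sketch checks whether a file provide falls under one of the directories that remain queryable without filelists. The function name and the exact prefix list are assumptions taken only from the paragraph above; this is not DNF's actual implementation.

```python
# Hypothetical helper: the prefixes mirror the directories named in the text
# above (/usr/bin, /usr/sbin, /etc). The real set used by DNF may differ.
PRIMARY_PREFIXES = ("/usr/bin/", "/usr/sbin/", "/etc/")


def resolvable_without_filelists(path: str) -> bool:
    """Return True if the path lies in a directory that, per the text above,
    can still be queried when filelists metadata are not loaded."""
    return path.startswith(PRIMARY_PREFIXES)
```

Under these assumptions, a query for /usr/bin/bash would not need filelists, while a query for a file under /usr/share would.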

The proposed behavior has already been incorporated into DNF5, the future successor to DNF, where it was implemented around the beginning of this year (see this PR for more details).

Feedback

Benefit to Fedora

As DNF is integral to various infrastructure tasks like package building and installation, testing environment creation, and server integration tests, this change significantly reduces processing time and resource usage for these processes.

This change reduces the RAM requirements of the DNF process, addressing existing issues when running Fedora on low-memory machines such as the Raspberry Pi (see, for example, Bug 1907030).

Omitting the filelists metadata download also decreases the overall cost of operating Fedora mirror servers.

As the described behavior already exists in its extended form in DNF5 within the current Fedora release, allowing any optional metadata types to be conditionally loaded, and considering that DNF5 is planned to replace DNF as the main package manager for Fedora 41, implementing these changes will facilitate a smoother and more compatible transition process.

Scope

  • Proposal owners:
    • libdnf
      • Modify the Repo object to enable conditional filelists metadata download
      • Introduce a new main configuration option to set the default behavior
    • dnf
      • Enable configuration of filelists download from commandline, DNF commands and DNF plugins
      • Implement filename pattern argument detection heuristics
  • Other developers:
    • Because of this change, dependent projects using the existing DNF C interface may need to adapt: if they expect the filelists metadata to be available, they must explicitly request loading filelists using the existing API:
      • PackageKit
      • microdnf
      • API users
  • Release engineering: N/A
  • Policies and guidelines:
    • Package maintainers must follow Fedora's packaging guidelines, particularly concerning file dependency specifications (see here)
      • Adopting the MUST NOT rule in these guidelines would help prevent future issues with the installability of such packages.
      • A few packages in the current Fedora developmental release are not following these rules. Pull requests have already been prepared to fix their spec files. Please refer to Bug 2180842 for details.
  • Trademark approval: N/A
  • Alignment with Community Initiatives: N/A (no currently active initiatives)

Upgrade/compatibility impact

In general, applying these changes should not affect any existing user workflows and no additional manual changes are required.

However, the absence of filelists would cause issues for packages that do not follow the recommended file dependencies outlined in the packaging guidelines. This change would render such packages uninstallable without the presence of filelists. In the current Fedora release repository, only a few packages are affected, and none of them is critical to the system. Trivial pull requests have already been prepared for each of them and will resolve the issue once merged.

If DNF fails to resolve a transaction due to a missing file dependency, and the filelists metadata are not currently present on the system, users will receive a hint on how to request the download of filelists from the command line. This action may assist in resolving the situation.

For more information, refer to Bug 2180842 and the discussion thread on this proposal.

How To Test

When using DNF commands without a filename pattern passed as the argument, filelists metadata should not be downloaded from the remote repositories and should not be needed for the command execution. This can be tested with the following steps:

  • Clean the local metadata cache (dnf clean metadata)
  • Run a DNF command not involving the filename spec (e.g. dnf repoquery rpm)
  • Verify that no *-filelists.* metadata files were downloaded inside the cache subdirectories (by default under /var/cache/dnf for root)
  • Check the command works as expected

The same should also apply to RPM package arguments (files ending with the .rpm extension).
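The verification step above can be automated with a small script. The sketch below searches a cache directory recursively for files matching the *-filelists.* pattern mentioned in the steps; the function name is hypothetical, and the default path assumes the root cache location given above.

```python
from pathlib import Path


def find_filelists(cache_dir="/var/cache/dnf"):
    """Return a sorted list of files under cache_dir whose names match
    the '*-filelists.*' pattern. An empty list means none were found."""
    root = Path(cache_dir)
    if not root.is_dir():
        # No cache directory at all, so no filelists can be present.
        return []
    return sorted(p for p in root.rglob("*-filelists.*") if p.is_file())
```

After cleaning the cache and running a command without a filename argument, find_filelists() should return an empty list if the change works as described.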

When using DNF commands with a filename pattern passed as the argument, filelists metadata should be downloaded from the remote repositories as before.
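To make the distinction between the two cases concrete, here is a minimal sketch of how a command-line argument might be classified. It assumes that a filename pattern is an argument starting with "/" or "*/" and that arguments ending in .rpm are treated as package files rather than patterns. Both rules and both function names are illustrative assumptions, not a description of the heuristics DNF actually ships.

```python
def looks_like_file_pattern(arg: str) -> bool:
    """Assumed rule: paths and '*/'-style globs are filename patterns,
    but local RPM files (ending with .rpm) are not."""
    if arg.endswith(".rpm"):
        return False
    return arg.startswith("/") or arg.startswith("*/")


def needs_filelists(args) -> bool:
    """Return True if any argument would require filelists metadata
    under the assumed rule above."""
    return any(looks_like_file_pattern(a) for a in args)
```

This sketch is deliberately conservative: it flags every path-like argument, including paths under /usr/bin or /etc that could be resolved without filelists.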

User Experience

Large filelists can exceed 200 MB in size and can take 1-2 minutes to download, which greatly slows down the user experience.

For many operations the filelists metadata are not needed, so downloading them wastes resources. Without the filelists download, DNF performance will improve significantly, mainly in terms of network, CPU and disk space usage. Metadata download size will be reduced by about 60%. The improvement also covers deployments of customer-built RPMs to containers that have no need for filelists-level dependencies.

Dependencies

No changes should be required for any package depending on DNF to implement this behavior.

Contingency Plan

  • Contingency mechanism: Change the configuration option to download the filelists by default
  • Contingency deadline: Branch Fedora Linux 40 from Rawhide
  • Blocks release? No

Documentation

A new configuration option, optional_metadata_types, was added to allow requesting filelists metadata on demand; see the configuration docs here.
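As an illustration of how the option might appear in a configuration file, the sketch below parses a sample dnf.conf-style fragment and extracts the requested metadata types. The sample text and the helper name are assumptions made for this example; the code only reads an in-memory string and does not touch the real DNF configuration.

```python
import configparser

# Hypothetical configuration fragment restoring the old behavior
# (always downloading filelists).
SAMPLE_CONF = """\
[main]
gpgcheck=True
optional_metadata_types=filelists
"""


def optional_metadata_types(conf_text: str):
    """Return the list of optional metadata types set in the [main]
    section, or an empty list if the option is absent."""
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    raw = parser.get("main", "optional_metadata_types", fallback="")
    # Accept comma- or whitespace-separated values.
    return [item for item in raw.replace(",", " ").split() if item]
```

For a single run, the same option can presumably be set with DNF's generic --setopt switch, for example --setopt=optional_metadata_types=filelists, instead of editing the configuration file.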

Release Notes