From Fedora Project Wiki

Revision as of 19:06, 2 January 2020 by Chrismurphy (talk | contribs) (F32 system wide change proposal to enable earlyoom.service by default)

Enable EarlyOOM killing

Summary

Install the earlyoom package and enable it by default. This causes the kernel oomkiller to trigger sooner, but does not affect which process it chooses to kill. The idea is to recover from out-of-memory situations sooner, rather than ending up in the typical complete system hang in which the user has no choice but to force power off.
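
To make "sooner" concrete, here is a minimal Python sketch of the kind of check a userspace monitor could run: read /proc/meminfo and report low memory once both available RAM and free swap drop below a threshold. The function names and the 10% thresholds are assumptions chosen for illustration. They are not taken from this proposal, and this is not a description of how earlyoom itself is implemented.

```python
from pathlib import Path


def read_meminfo(path="/proc/meminfo"):
    """Parse a meminfo-style file into a dict of field name -> integer value."""
    fields = {}
    for line in Path(path).read_text().splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts:
            # Most fields are reported in kB; the unit suffix is ignored here.
            fields[name.strip()] = int(parts[0])
    return fields


def is_low_memory(meminfo, mem_min_percent=10.0, swap_min_percent=10.0):
    """Return True when available RAM and free swap are both below the thresholds.

    The default thresholds are illustrative assumptions.
    """
    mem_percent = 100.0 * meminfo["MemAvailable"] / meminfo["MemTotal"]
    swap_total = meminfo.get("SwapTotal", 0)
    # With no swap configured, treat swap as exhausted so only RAM decides.
    if swap_total:
        swap_percent = 100.0 * meminfo.get("SwapFree", 0) / swap_total
    else:
        swap_percent = 0.0
    return mem_percent < mem_min_percent and swap_percent < swap_min_percent
```

On a Linux system, `is_low_memory(read_meminfo())` evaluates the check once; a real monitor would repeat it periodically and act when it returns True.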


Owner

Current status

  • Targeted release: Fedora 32
  • Last updated: 2020-01-02
  • Tracker bug: <will be assigned by the Wrangler>
  • Release notes tracker: <will be assigned by the Wrangler>

Detailed Description

The kernel in Fedora editions and spins enables the in-kernel OOM (out-of-memory) manager. Its concern is keeping the kernel itself functioning; it has no concern for user space function or interactivity. This change attempts to improve the user experience in the short term by triggering the same process-killing mechanism, but sooner. Instead of the system becoming completely unresponsive for tens of minutes, hours, or days, the expectation is that an offending process (determined by oom_score, same as now) will be killed off within seconds to minutes. This is better, but admittedly still suboptimal, and longer-term work is ongoing to improve the user experience in this area.
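
As a rough illustration of the oom_score selection mentioned above, the following Python sketch ranks running processes by the value the kernel exposes in /proc/<pid>/oom_score, highest first. It assumes a Linux /proc layout with per-process oom_score and comm files. The function name rank_by_oom_score is hypothetical, and the sketch only inspects scores; it does not kill anything and is not how the kernel or earlyoom is implemented.

```python
from pathlib import Path


def rank_by_oom_score(proc_root="/proc"):
    """Return a list of (oom_score, pid, name) tuples, highest score first."""
    ranked = []
    for entry in Path(proc_root).iterdir():
        if not entry.name.isdigit():
            continue  # skip non-process entries such as /proc/meminfo
        try:
            score = int((entry / "oom_score").read_text().strip())
            name = (entry / "comm").read_text().strip()
        except (OSError, ValueError):
            continue  # the process exited or its files are unreadable
        ranked.append((score, int(entry.name), name))
    # The process with the highest score is the most likely candidate to be killed.
    ranked.sort(key=lambda item: item[0], reverse=True)
    return ranked
```

Calling `rank_by_oom_score()[:5]` on a running system gives the five most likely candidates at that moment, which can help explain why a particular process was chosen.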

Background information on this complicated problem: https://www.kernel.org/doc/gorman/html/understand/understand016.html https://lwn.net/Articles/317814/

Recent discussion: https://pagure.io/fedora-workstation/issue/98 https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org/message/XUZLHJ5O32OX24LG44R7UZ2TMN6NY47N/

Other in-progress solutions: https://gitlab.freedesktop.org/hadess/low-memory-monitor


Benefit to Fedora

There are two major benefits to Fedora:

- Improved user experience: users regain control of their system more quickly, rather than having to force power off in low-memory situations with aggressive swapping. Once a system becomes unresponsive, it is completely reasonable for the user to assume the system is lost, but a forced power off carries a high potential for data loss.

- Reducing forced power off as the main workaround will increase data collection, improving understanding of low-memory situations and how to handle them better.


Scope

  • Proposal owners:

Include the earlyoom package and enable it by default, both for clean installs and upgrades.

  • Other developers:

Desktop spins may choose to opt out. Server, Cloud, and IoT may choose to opt in.

  • Policies and guidelines: N/A
  • Trademark approval: N/A

Upgrade/compatibility impact

N/A (not a System Wide Change)

How To Test

N/A (not a System Wide Change)

User Experience

Dependencies

N/A (not a System Wide Change)

Contingency Plan

  • Contingency mechanism: (What to do? Who will do it?) N/A (not a System Wide Change)
  • Contingency deadline: N/A (not a System Wide Change)
  • Blocks release? N/A (not a System Wide Change), Yes/No
  • Blocks product? product

Documentation

N/A (not a System Wide Change)

Release Notes