Description
This test case verifies that systemd-oomd kills the cgroup with the most pgscans when memory pressure on user@$UID.service exceeds 10% (or whatever limit is defined in systemd-oomd-defaults).
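As a rough illustration of what "memory pressure exceeds 10%" refers to, the sketch below reads the kernel's pressure-stall figures for a cgroup. It assumes that the relevant number is the "full avg10" value in the cgroup's memory.pressure file and that cgroup v2 is mounted at /sys/fs/cgroup. The function names full_avg10 and pressure_exceeds are made up for this example. systemd-oomd also requires the pressure to persist for some time, which this single-sample check ignores.

```shell
#!/bin/sh
# Hypothetical helpers (not part of systemd): inspect PSI data for a cgroup.

# Print the avg10 value from the "full" line of memory.pressure content on stdin.
full_avg10() {
    awk '$1 == "full" {
        for (i = 2; i <= NF; i++) {
            if ($i ~ /^avg10=/) { sub(/^avg10=/, "", $i); print $i }
        }
    }'
}

# Succeed if the "full avg10" value on stdin is above the limit in percent
# (first argument, default 10). Assumption: 10 mirrors the limit in the text.
pressure_exceeds() {
    limit=${1:-10}
    full_avg10 | awk -v limit="$limit" '
        BEGIN { r = 1 }
        { if ($1 + 0 > limit + 0) r = 0 }
        END { exit r }'
}

# Example use on a live system (path assumes cgroup v2 at /sys/fs/cgroup):
#   uid=$(id -u)
#   full_avg10 < "/sys/fs/cgroup/user.slice/user-$uid.slice/user@$uid.service/memory.pressure"
```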
Setup
- This test case should be performed on either bare-metal or virtual machines.
- Check that you are running systemd 248~rc1 or higher with systemctl --version.
- Ensure the systemd-oomd-defaults package is installed (included with Fedora 34).
- You will also need to install stress-ng.
- Boot the system and log in as a regular user.
- So as not to trigger the swap policy for systemd-oomd, create an override with the following commands (don't forget to remove this file and run systemctl daemon-reload afterwards to restore the settings):
sudo mkdir /etc/systemd/system/-.slice.d/
printf "[Slice]\nManagedOOMSwap=auto" | sudo tee /etc/systemd/system/-.slice.d/99-test.conf
sudo systemctl daemon-reload
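The version requirement from the setup list can also be checked in a script. The following is a minimal sketch, assuming the first line of the systemctl --version output has the form "systemd NNN (...)". The helper names systemd_major and systemd_new_enough are invented for this example. It compares only the major number, so any 248 build, including release candidates, counts as sufficient.

```shell
#!/bin/sh
# Hypothetical helpers: parse the text printed by `systemctl --version`.

# Print the leading digits of the second word on the first line,
# e.g. "systemd 248 (v248.3-1.fc34)" gives 248.
systemd_major() {
    printf '%s\n' "$1" | awk 'NR == 1 { sub(/[^0-9].*/, "", $2); print $2 }'
}

# Succeed if the major version found in $1 is at least 248.
systemd_new_enough() {
    major=$(systemd_major "$1")
    [ -n "$major" ] && [ "$major" -ge 248 ]
}

# Example use on a live system:
#   if systemd_new_enough "$(systemctl --version)"; then
#       echo "systemd is new enough"
#   else
#       echo "systemd is too old for this test"
#   fi
```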
How to test
- Check that systemd-oomd is running:
systemctl status systemd-oomd
- Check that the systemd-oomd-defaults policy was applied by running oomctl and verifying that "/user.slice/user-$UID.slice/user@$UID.service/" is listed as a path under "Memory Pressure Monitored CGroups" along with some stats. "Swap Monitored CGroups" should show no paths since we put in an override.
- Now run the test:
systemd-run --user --scope /usr/bin/stress-ng --brk 0 --stack 0 --bigheap 0 --timeout 120s
- Make sure to clean up the override and reset the test unit when you're done:
sudo rm /etc/systemd/system/-.slice.d/99-test.conf
sudo systemctl daemon-reload
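The oomctl check in the steps above can be automated along these lines. This is only a sketch: it assumes that oomctl prints a "Memory Pressure Monitored CGroups" header followed by the monitored paths, and that a "Swap Monitored CGroups" header, if present, starts a different section. The function name pressure_monitored is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: look for the user service cgroup in oomctl output.
# $1: text printed by oomctl, $2: numeric UID of the logged-in user.
pressure_monitored() {
    printf '%s\n' "$1" |
        awk -v path="/user.slice/user-$2.slice/user@$2.service" '
            /Memory Pressure Monitored CGroups/ { in_section = 1; next }
            /Swap Monitored CGroups/            { in_section = 0 }
            in_section && index($0, path)       { found = 1 }
            END { exit found ? 0 : 1 }'
}

# Example use on a live system:
#   if pressure_monitored "$(oomctl)" "$(id -u)"; then
#       echo "user service is monitored for memory pressure"
#   fi
```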
Expected Results
- The system becomes unresponsive during the test but should respond again once stress-ng is killed.
- systemd-oomd should kill all the stress-ng processes before the 120-second timeout. stress-ng may take some time to build up pressure. If the command runs until the timeout, systemd-oomd did not kill it.
  - You can verify by checking for some of the relevant log lines with journalctl: "Memory pressure for <...> and there was reclaim activity" or "systemd-oomd killed <...> process(es)"
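The journal check can be scripted as well. The sketch below searches log text on standard input for the two message fragments quoted above. The function name oomd_kill_logged is invented for this example, and the patterns deliberately match anything in place of the elided parts of the messages.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the log text on stdin contains one of the
# systemd-oomd messages quoted in this test case.
oomd_kill_logged() {
    grep -E -q 'Memory pressure for .* and there was reclaim activity|systemd-oomd killed .* process\(es\)'
}

# Example use on a live system (current boot, systemd-oomd unit only):
#   if journalctl -b -u systemd-oomd --no-pager | oomd_kill_logged; then
#       echo "systemd-oomd reported a kill"
#   else
#       echo "no kill message found"
#   fi
```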