From Fedora Project Wiki
Revision as of 09:23, 18 March 2021
Description
This test case checks that systemd-oomd kills the cgroup with the most pgscans when memory pressure on user@$UID.service exceeds 10% (or whatever limit was defined in systemd-oomd-defaults).
Setup
- This test case should be performed on either bare-metal or virtual machines.
- Check that you are running systemd 248~rc1 or higher with `systemctl --version`.
- Ensure the systemd-oomd-defaults package is installed (included with Fedora 34).
- You will also need to install `stress-ng`.
- Boot the system and log in as a regular user.
- So as not to trigger the swap policy for systemd-oomd, create an override with the following commands (don't forget to remove this file and run `systemctl daemon-reload` to restore the settings afterwards):
    sudo mkdir /etc/systemd/system/-.slice.d/
    printf "[Slice]\nManagedOOMSwap=auto" | sudo tee /etc/systemd/system/-.slice.d/99-test.conf
    sudo systemctl daemon-reload
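The version requirement above can be checked mechanically. The following is a minimal sketch, assuming that the first line of `systemctl --version` has the form `systemd <version> (...)`; the function name `systemd_version_ok` is hypothetical and not part of systemd.

```shell
# Hypothetical helper: reads `systemctl --version` output on stdin and
# succeeds if the version on the first line is at least 248.
# Assumes the first line looks like "systemd 248 (...)"; a suffix such as
# "~rc1" is ignored because awk uses only the leading number.
systemd_version_ok() {
    awk 'NR == 1 { ok = ($2 + 0 >= 248) } END { exit ok ? 0 : 1 }'
}
```

It could be used as `systemctl --version | systemd_version_ok && echo "systemd is new enough"`, alongside `command -v stress-ng` to confirm that `stress-ng` is on the PATH.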
How to test
- Check that systemd-oomd is running:
    systemctl status systemd-oomd
- Check that the systemd-oomd-defaults policy was applied by running `oomctl` and verifying that "/user.slice/user-$UID.slice/user@$UID.service/" is listed as a path under "Memory Pressure Monitored CGroups" along with some stats. "Swap Monitored CGroups" should show no paths since we put in an override.
- Now run the test:
    systemd-run --user --scope /usr/bin/stress-ng --brk 2 --stack 2 --bigheap 2 --timeout 90s
- Make sure to clean up the override and reset the test unit when you're done:
    sudo rm /etc/systemd/system/-.slice.d/99-test.conf
    sudo systemctl daemon-reload
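The `oomctl` check in the steps above can also be scripted. This is a rough sketch, assuming only what the step describes: that the output has a "Memory Pressure Monitored CGroups" heading and a "Swap Monitored CGroups" heading, each followed by its paths. The function name `pressure_monitored` is hypothetical, and the path is matched without its trailing slash so that either form is accepted.

```shell
# Hypothetical helper: reads `oomctl` output on stdin and succeeds if the
# user service cgroup for the given UID appears in the
# "Memory Pressure Monitored CGroups" section.
pressure_monitored() {
    local uid="$1"
    awk -v path="/user.slice/user-${uid}.slice/user@${uid}.service" '
        /Memory Pressure Monitored CGroups/ { in_section = 1; next }
        /Swap Monitored CGroups/            { in_section = 0 }
        in_section && index($0, path)       { found = 1 }
        END { exit found ? 0 : 1 }
    '
}
```

A possible invocation is `oomctl | pressure_monitored "$(id -u)" && echo "policy applied"`.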
Expected Results
- The system becomes unresponsive during the test but should respond again once `stress-ng` is killed.
- If the main `stress-ng` process was killed, the command will print "Killed" and return a non-zero exit code. If it runs to completion, it will print something about a "successful run" and exit 0.
- This test will invoke the kernel OOM killer in combination with systemd-oomd. The kernel OOM killer will kill worker processes from stress-ng but not the main process, which will continually spawn them until killed by systemd-oomd.
- systemd-oomd should have killed all the processes before the timeout. `stress-ng` may take some time to build up pressure. If the command runs to timeout, it means systemd-oomd did not kill it.
- You can verify by checking for some of the relevant log lines with `journalctl`: "Memory pressure for <...> and there was reclaim activity" or "systemd-oomd killed <...> process(es)"
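The log check in the last item can be done with a simple filter. The sketch below assumes only the two message fragments quoted above; the function name `oomd_log_lines` is hypothetical.

```shell
# Hypothetical helper: reads journal text on stdin and prints only the lines
# that match the two systemd-oomd messages quoted in the expected results.
# Exits non-zero (grep's behavior) when no such line is found.
oomd_log_lines() {
    grep -E 'Memory pressure for .* and there was reclaim activity|systemd-oomd killed .* process\(es\)'
}
```

For example, `journalctl -b --no-pager | oomd_log_lines` would list matching entries from the current boot, and printing nothing would suggest systemd-oomd did not act.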
Optional
- You can also try a variant of this test that is less likely to invoke the kernel OOM killer; the idea is to use up all the free memory and swap, leaving ~0.5GB. This should generate enough pressure on the system without invoking an actual out-of-memory event:
    # MemAvailable is reported in kB; leave roughly 500000 kB (~0.5GB) free
    memavail=$(awk '/MemAvailable/ {print $2}' /proc/meminfo)
    target=$((memavail - 500000))
    systemd-run --user --scope stress-ng -m 1 --vm-bytes "$target"K --vm-keep
- You will have to press Ctrl-C to stop the command if systemd-oomd does not kill it. It can take some time to build up pressure and meet the kill condition. However, if you see from the output of `oomctl` that the "value" of "Pressure: Avg10: <value>" does not go above 10.0, there might have been too much memory available to generate pressure.
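To avoid reading the Avg10 value by eye, the comparison can be sketched as a small filter. This assumes the `oomctl` line has the form `Pressure: Avg10: <value> ...` as quoted above; the function name `avg10_exceeds` and its default threshold argument are hypothetical.

```shell
# Hypothetical helper: reads `oomctl` output on stdin and succeeds if any
# "Pressure: Avg10: <value>" entry is above the given limit (default 10.0).
avg10_exceeds() {
    local limit="${1:-10.0}"
    awk -v limit="$limit" '
        /Pressure: Avg10:/ {
            for (i = 1; i < NF; i++)
                if ($i == "Avg10:" && $(i + 1) + 0 > limit + 0) hit = 1
        }
        END { exit hit ? 0 : 1 }
    '
}
```

While the optional test is running, something like `oomctl | avg10_exceeds 10.0 && echo "pressure above limit"` could be repeated in a second terminal to see whether the pressure threshold has been crossed.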