Latest revision as of 13:26, 28 March 2012
ControlGroups
See the FUDCon 2009 presentation slides (http://people.redhat.com/jsafrane/talk/Fudcon-libcgroup.odp).
Summary
Control Groups consists of two parts:
- an upstream kernel feature that allows system resources to be partitioned among individual processes or groups of processes.
- user-space tools that manage the kernel control groups mechanism. We want to improve the existing tools where necessary and feasible, and to create new ones, e.g. to create or modify cgroups configuration or to display control group data (using the libcgroup package).
Owners
- Linda Wang
  - email: lwang@redhat.com
- Nils Philippsen
  - email: nphilipp@redhat.com
- Ivana Varekova
  - email: varekova@redhat.com
- Jan Šafránek
  - email: jsafrane@redhat.com
Current status
- Targeted release: Fedora 11
- Last updated: 2009-04-14
- Percentage of completion: 100%
Detailed Description
Kernel Part
Control Groups provide a mechanism for aggregating/partitioning sets of tasks, and all their future children, into hierarchical groups with specialized behaviour.
Definitions:
A *cgroup* associates a set of tasks with a set of parameters for one or more subsystems.
A *subsystem* is a module that makes use of the task grouping facilities provided by cgroups to treat groups of tasks in particular ways. A subsystem is typically a "resource controller" that schedules a resource or applies per-cgroup limits, but it may be anything that wants to act on a group of processes, e.g. a virtualization subsystem.
A *hierarchy* is a set of cgroups arranged in a tree, such that every task in the system is in exactly one of the cgroups in the hierarchy, and a set of subsystems; each subsystem has system-specific state attached to each cgroup in the hierarchy. Each hierarchy has an instance of the cgroup virtual filesystem associated with it.
At any one time there may be multiple active hierarchies of task cgroups. Each hierarchy is a partition of all tasks in the system.
User level code may create and destroy cgroups by name in an instance of the cgroup virtual file system, specify and query to which cgroup a task is assigned, and list the task pids assigned to a cgroup. Those creations and assignments only affect the hierarchy associated with that instance of the cgroup file system.
On their own, the only use for cgroups is for simple job tracking. The intention is that other subsystems hook into the generic cgroup support to provide new attributes for cgroups, such as accounting/limiting the resources which processes in a cgroup can access. For example, cpusets (see Documentation/cpusets.txt) allows you to associate a set of CPUs and a set of memory nodes with the tasks in each cgroup.
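The filesystem operations described above can be sketched in shell. This is only a minimal illustration, assuming a cgroup hierarchy is already mounted at the directory passed as the first argument (for example /mnt/cgroup, as in the test configuration later on this page). The function names are hypothetical helpers and are not part of libcgroup.

```shell
# Hypothetical helpers; the first argument is the mount point of one
# cgroup hierarchy, the second is the cgroup name relative to it.

# Create a cgroup by name: on a cgroup filesystem, mkdir is all that is needed.
cg_create() (
    cgroot=$1; name=$2
    mkdir -p "$cgroot/$name"
)

# Assign a task to a cgroup by writing its PID to the group's "tasks" file.
cg_attach() (
    cgroot=$1; name=$2; pid=$3
    echo "$pid" > "$cgroot/$name/tasks"
)

# List the PIDs currently assigned to a cgroup.
cg_list() (
    cgroot=$1; name=$2
    cat "$cgroot/$name/tasks"
)

# Destroy a cgroup; the kernel only allows this once the group is empty.
cg_destroy() (
    cgroot=$1; name=$2
    rmdir "$cgroot/$name"
)
```

For example, `cg_create /mnt/cgroup test && cg_attach /mnt/cgroup test $$` would move the current shell into the group test. With the cpuset controller, cpuset.cpus and cpuset.mems of the new group have to be set before a task can be attached.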
User space tools
libcgroup makes that functionality available to programmers and contains two tools, cgexec and cgclassify, to start processes in a control group or to move existing processes from one control group to another. The libcgroup package is already included in Fedora, but its overall quality is very poor: there is almost no documentation, there are no man pages and no configuration file samples, and the package still needs a code review, additional tools and a better installation.
The goal for Fedora 11 is to improve this package where necessary, i.e.:
- bugfixing
- add/fix documentation and man-pages
- add examples
- fix error handling
- rework logging
- create a display tool (to see which control group a given process is in)
- provide a way to start a service daemon in a given control group
The long term goal is to create new tools to e.g. create or modify persistent cgroups configuration and display control groups data. At the beginning the focus will be on command line tools, but we'll keep in mind that in the long term we'll likely want to have graphical tools. These would offer similar functionality and we should try to make sure that any non-UI code written is usable from both kinds of frontends.
Benefit to Fedora
The implementation of the "control groups" scheme and its improvement should enable users to partition resources among different processes or groups of processes. libcgroup should help them to create a persistent configuration for partitioning resources and to handle cgroups from the user's point of view. This project should help users make the best of the control groups kernel feature.
Scope
- Kernel part:
There are several sub-features under control group:
  - CGROUPS (grouping infrastructure mechanism)
  - CPUSET (cpuset controller, in F10)
  - CPUACCT (cpu account controller, in F10)
  - SCHED (schedule controller, in F10)
  - MEMCTL (memory controller, in F10)
  - DEVICES
  - NETCTL (network controller, new in Rawhide/F11)
- Tools part:
The libcgroup package requires extended testing and fixing. Once libcgroup is stable enough, work can start on additional parts based on the existing ones.
How To Test
There are multiple ways to test and use the control group features in Fedora, depending on the feature set you are interested in.
All of the following tests require a kernel with cgroups support and the libcgroup package:
yum install libcgroup
User space tools
Creating cgroups
- Configure the /etc/cgconfig.conf file; a good example and a man page should be packaged.
- Start/stop the cgconfig service and test whether the created groups are as expected.
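The "created groups are as expected" check can be scripted. The following is a minimal sketch, assuming each configured group shows up as a directory under the hierarchy's mount point; cg_groups_exist is a hypothetical helper name, not a libcgroup tool.

```shell
# Hypothetical check: succeed only if every named group exists as a
# directory under the given mount point.
cg_groups_exist() (
    cgroot=$1; shift
    for group in "$@"; do
        if [ ! -d "$cgroot/$group" ]; then
            echo "missing cgroup: $group" >&2
            exit 1      # leaves the subshell, so the function returns 1
        fi
    done
)
```

After starting the service, `cg_groups_exist /mnt/cgroup test` should succeed for a configuration that mounts a hierarchy at /mnt/cgroup and defines a group named test.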
Moving tasks to groups
- Prepare some cgroups, i.e. prepare /etc/cgconfig.conf and start the cgconfig service.
- Start/stop a new process using cgexec and check that it is in the appropriate cgroup.
- Prepare the cgrules.conf file; a sample and a man page should be available.
- Test the cgrulesengd daemon (it should automatically move processes as written in cgrules.conf).
- Configure the cgroup PAM module and test that it works when a user logs in (again, driven by cgrules.conf).
Finding out which cgroup a task is in
ps -o cgroup
cat /proc/<pid>/cgroup
More tools will be added in the future.
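Until such tools exist, the /proc/<pid>/cgroup output can be filtered by hand. The sketch below assumes the line format "ID:controllers:path" (as in the 12:cpuset:/test example further down this page), where the controllers field may be a comma-separated list; cgroup_of is a hypothetical helper name.

```shell
# Print the cgroup path of a task for one controller.
# Reads lines in the /proc/<pid>/cgroup format "ID:controllers:path" from stdin.
cgroup_of() {
    awk -F: -v ctrl="$1" '
        {
            n = split($2, names, ",")
            for (i = 1; i <= n; i++)
                if (names[i] == ctrl) {
                    sub(/^[^:]*:[^:]*:/, "")   # strip the "ID:controllers:" prefix
                    print
                    exit
                }
        }'
}
```

Usage: `cgroup_of cpuset < /proc/self/cgroup` prints the cpuset group of the current shell.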
Starting a service in control groups
Most services can be configured to start their daemons in specific control groups. Add the following line to the /etc/sysconfig/<service name> script:
CGROUP_DAEMON="cpu:/daemons/foo cpuacct:/foo"
This works only if the service supports reading its configuration from /etc/sysconfig/<name> and the service script uses the daemon() call from /etc/init.d/functions (most services do).
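To illustrate what the CGROUP_DAEMON value means, the following sketch turns it into a cgexec command prefix, with one -g option per "controller:/path" entry. It is an assumption about how a helper such as daemon() might use the variable, not a copy of the code in /etc/init.d/functions; cgexec_prefix is a hypothetical name.

```shell
# Sketch: turn a CGROUP_DAEMON value into a cgexec command prefix.
# Each whitespace-separated entry "controller:/path" becomes one "-g" option.
cgexec_prefix() (
    prefix="cgexec"
    for entry in $1; do        # unquoted on purpose: split on whitespace
        prefix="$prefix -g $entry"
    done
    echo "$prefix"
)
```

With the example value above, `cgexec_prefix "$CGROUP_DAEMON"` prints `cgexec -g cpu:/daemons/foo -g cpuacct:/foo`, which would then be placed in front of the daemon's command line.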
Other tools
- cgclassify should move an existing process to a defined group (see man cgclassify)
- cgexec should start a new process in a defined group (see man cgexec)
Kernel features
Read the kernel documentation (see below); each controller should be documented there.
CPUSET
- Create a group controlled by the cpuset controller, e.g. use the following cgconfig.conf:
mount {
    cpuset = /mnt/cgroup;
}
group test {
    perm {
        task {
            uid = root;
            gid = root;
        }
        admin {
            uid = root;
            gid = root;
        }
    }
# following section is cpuset specific,
# replace with appropriate content when testing other controllers
# allow only the first cpu and the first memory region
    cpuset {
        cpuset.cpus = 0;
        cpuset.mems = 0;
    }
}
- Start the cgconfig service
- Execute a task in this group
$ cgexec -g cpuset:test /bin/bash
- Check that the started bash (and all its children) is in the right group
$ cat /proc/self/cgroup
12:cpuset:/test
$ ps -o cgroup
...
- Check that all children of this bash can use only the first CPU (e.g. compile a kernel with -j3 or so).
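Besides watching the load during a parallel build, the CPU restriction can be read back from /proc. This is a small sketch, assuming the kernel exposes the Cpus_allowed_list field in /proc/<pid>/status; cpus_allowed is a hypothetical helper name.

```shell
# Print the CPUs a task may run on, as listed in a /proc/<pid>/status file.
cpus_allowed() {
    awk '$1 == "Cpus_allowed_list:" { print $2 }' "$1"
}
```

Inside the bash started by cgexec above, `cpus_allowed /proc/self/status` should print 0 if the cpuset restriction is in effect.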
CPUACCT
- Same as before, but use the following cgconfig.conf snippet instead of cpuset { }:
cpuacct {
}
- Start a process in the group as before and check that /mnt/cgroup/test/cpuacct.usage accumulates the CPU time (reported in nanoseconds) consumed by the process and all its future children.
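One way to check that the counter moves is to read it before and after running a command. The sketch below assumes cpuacct.usage holds a single nanosecond counter; cpuacct_delta_ms is a hypothetical helper name.

```shell
# Print the CPU time (in milliseconds) added to a usage counter while a
# command ran. $1 is the path to the group's cpuacct.usage file; the
# remaining arguments are the command to run between the two readings.
cpuacct_delta_ms() (
    usage_file=$1; shift
    before=$(cat "$usage_file")
    "$@" >&2                    # keep the command's output out of the result
    after=$(cat "$usage_file")
    echo $(( ($after - $before) / 1000000 ))
)
```

For example, `cpuacct_delta_ms /mnt/cgroup/test/cpuacct.usage cgexec -g cpuacct:test make -j3` should print a value greater than zero when accounting works.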
Memory Controller
- Use the following cgconfig.conf snippet:
memory {
    memory.limit_in_bytes = 40M;
}
- Again, start something in the group. The processes there can use at most 40 megabytes of memory.
- Look at /mnt/cgroup/test/memory.usage_in_bytes; it should show the current memory usage of all processes in the group.
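The comparison between usage and limit can also be scripted. This is a sketch under the assumption that size suffixes such as M are powers of 1024 (so 40M is 41943040 bytes) and that the usage file contains a plain byte count; both function names are hypothetical.

```shell
# Convert a size such as "40M" to bytes (K, M and G suffixes, powers of 1024).
to_bytes() (
    value=$1
    case $value in
        *[Kk]) echo $(( ${value%?} * 1024 )) ;;
        *[Mm]) echo $(( ${value%?} * 1024 * 1024 )) ;;
        *[Gg]) echo $(( ${value%?} * 1024 * 1024 * 1024 )) ;;
        *)     echo "$value" ;;
    esac
)

# Succeed if the byte count stored in file $1 does not exceed the limit $2.
usage_within_limit() (
    usage=$(cat "$1")
    [ "$usage" -le "$(to_bytes "$2")" ]
)
```

For example, `usage_within_limit /mnt/cgroup/test/memory.usage_in_bytes 40M` should keep succeeding while processes in the group allocate memory.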
Test the other controllers as described in the kernel documentation.
User Experience
End users should find this feature useful for partitioning their server or machine resources into functional units to which those resources can be dedicated.
The control group user interfaces are straightforward: a set of common, easy-to-use command-line operations. Allocating system resources such as the number of CPUs, the amount of memory and network bandwidth should be easy.
The libcgroup package should help users create persistent configuration and should significantly lower the barrier to entry for using control groups on Linux.
Dependencies
The majority of the implementation is done inside the kernel.
The tools part is implemented in the libcgroup package.
Contingency Plan
The contingency plan for a sub-feature that is still under development is simply not to enable its kernel option at the development freeze, so that the incomplete sub-feature is not exposed to the Fedora community.
Currently, nothing depends on libcgroup or the tools which would use it. If things go really wrong, we can always go back to the last working version of libcgroup.
Documentation
- kernel documentation:
  - Documentation/cgroups - the control groups documentation directory
  - Documentation/cgroups/cgroups.txt - overall top-level description of the feature
  - Documentation/cgroups/cpusets.txt - describes assigning CPUs and memory nodes to a set of tasks
  - Documentation/cgroups/cpuacct.txt - describes the CPU accounting controller, which calculates CPU time usage
  - Documentation/cgroups/devices.txt - describes the controller for device files
  - Documentation/cgroups/memory.txt
  - Documentation/cgroups/resource_counter.txt
- libcgroup:
  - upstream site (http://libcg.sourceforge.net/)
  - LWN.net article: libcg: design and plans (http://lwn.net/Articles/271788/)
  - documentation from the source tarball (directories doc and samples)
  - libcgroup man pages
- Resource management via cgroups in general:
  - Fedora Resource Management Guide (http://docs.fedoraproject.org/en-US/Fedora/16/html-single/Resource_Management_Guide/index.html)
Release Notes
Fedora 11 includes a new feature called Control Group, which allows system administrators to partition system resources into sub-groups and to dedicate these sub-groups' resources to the needs of different applications. It can be used to dedicate a set of pre-allocated system resources to specific applications, such as interactive applications; CPU-, memory- or network-bandwidth-intensive applications; or database applications.
There is also the libcgroup tool, which helps to manipulate, control and administer control groups and the associated controllers. Using this tool it is possible to aggregate/partition sets of tasks and their future children into hierarchical groups with specialized access to resources.
Comments and Discussion
- See Talk:Features/ControlGroups