| {{admon/important | Comments and Explanations | The page source contains comments providing guidance to fill out each section. They are invisible when viewing this page. To read it, choose the "edit" link.<br/> '''Copy the source to a ''new page'' before making changes! DO NOT EDIT THIS TEMPLATE FOR YOUR FEATURE.'''}}
| | |
| <!-- All fields on this form are required to be accepted by FESCo.
| |
| We also request that you maintain the same order of sections so that all of the feature pages are uniform. -->
| |
| | |
| <!-- The actual name of your feature page should look something like: Features/ResourceMgt. This keeps all features in the same namespace -->
| |
| | |
| = Feature Name =
| |
| Resource Management
| |
| | |
| == Summary ==
| |
| <!-- A sentence or two summarizing what this feature is and what it will do. This information is used for the overall feature summary page for each release. -->
| |
| Resource Management is an upstream kernel feature that allows system resources to be partitioned among individual processes or groups of processes.
| |
| | |
| == Owner ==
| |
| <!--This should link to your home wiki page so we know who you are-->
| |
| * Name: lwang
| |
| | |
| <!-- Include you email address that you can be reached should people want to contact you about helping with your feature, status is requested, or technical issues need to be resolved-->
| |
| * email: lwang@redhat.com
| |
| | |
| == Current status ==
| |
| * Targeted release: [[Releases/{{FedoraVersion||next}} | {{FedoraVersion|long|next}} ]]
| |
| * Last updated: (DATE)
| |
| * Percentage of completion: 75%
| |
| | |
| <!-- CHANGE THE "FedoraVersion" TEMPLATES ABOVE TO PLAIN NUMBERS WHEN YOU COMPLETE YOUR PAGE. -->
| |
| | |
| == Detailed Description ==
| |
| Resource Management/Control Groups
| |
| | |
| Control Groups provide a mechanism for aggregating/partitioning sets of
| |
| tasks, and all their future children, into hierarchical groups with
| |
| specialized behaviour.
| |
| | |
| Definitions:
| |
|
| |
| A *cgroup* associates a set of tasks with a set of parameters for one
| |
| or more subsystems.
| |
|
| |
| A *subsystem* is a module that makes use of the task grouping
| |
| facilities provided by cgroups to treat groups of tasks in
| |
| particular ways. A subsystem is typically a "resource controller" that
| |
| schedules a resource or applies per-cgroup limits, but it may be
| |
| anything that wants to act on a group of processes, e.g. a
| |
| virtualization subsystem.
| |
|
| |
| A *hierarchy* is a set of cgroups arranged in a tree, such that
| |
| every task in the system is in exactly one of the cgroups in the
| |
| hierarchy, and a set of subsystems; each subsystem has system-specific
| |
| state attached to each cgroup in the hierarchy. Each hierarchy has
| |
| an instance of the cgroup virtual filesystem associated with it.
| |
|
| |
| At any one time there may be multiple active hierarchies of task
| |
| cgroups. Each hierarchy is a partition of all tasks in the system.
| |
|
| |
| User level code may create and destroy cgroups by name in an
| |
| instance of the cgroup virtual file system, specify and query to
| |
| which cgroup a task is assigned, and list the task pids assigned to
| |
| a cgroup. Those creations and assignments only affect the hierarchy
| |
| associated with that instance of the cgroup file system.
| |
|
| |
| On their own, the only use for cgroups is for simple job
| |
| tracking. The intention is that other subsystems hook into the generic
| |
| cgroup support to provide new attributes for cgroups, such as
| |
| accounting/limiting the resources which processes in a cgroup can
| |
| access. For example, cpusets (see Documentation/cpusets.txt) allow
| |
| you to associate a set of CPUs and a set of memory nodes with the
| |
| tasks in each cgroup.
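The relationship between subsystems and hierarchies described above can be inspected on a running system. The following is a minimal sketch, assuming the kernel exposes `/proc/cgroups` with the columns `#subsys_name hierarchy num_cgroups enabled`; the helper name `list_subsystems` is hypothetical and only used for illustration.

```shell
# list_subsystems [FILE]
# Print "name hierarchy-id" for every enabled cgroup subsystem.
# FILE defaults to /proc/cgroups; passing another path is handy for testing.
list_subsystems() {
    cgroups_file="${1:-/proc/cgroups}"
    # Skip the "#subsys_name ..." header and keep rows whose 4th column is 1.
    awk '!/^#/ && NF >= 4 && $4 == 1 { print $1, $2 }' "$cgroups_file"
}
```

Running `list_subsystems` with no argument reads the live kernel table, and `cat /proc/self/cgroup` shows which cgroup the current shell belongs to in each mounted hierarchy.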
| |
| | |
| == Benefit to Fedora ==
| |
| Enabling the cgroup sub-features exposes Fedora to a range of resource-partitioning schemes and lets Fedora users partition their system resources however they want.
| |
| | |
| == Scope ==
| |
| There are several sub-features under control groups:
| |
| | |
| * CGROUPS (grouping mechanism)
| |
| CGROUPS=y
| |
| | |
| * CPUSET (cpuset controller)
| |
| CPUSET=y
| |
| | |
| * CPUACCT (cpu account controller)
| |
| CGROUP_CPUACCT=y
| |
| | |
| * SCHED (schedule controller)
| |
| CGROUP_SCHED=y
| |
| | |
| * MEMCTL (memory controller)
| |
| RESOURCE_COUNTERS=y
| |
| CGROUP_MEM_CONT=y
| |
| (CGROUP_MEM_RES_CTLR???)
| |
| | |
| * DEVICE
| |
| CGROUP_DEVICE=y
| |
| | |
| * NETCTL (network controller)
| |
| NET_CLS_CGROUP=y
| |
| | |
| * IOCTL (I/O controller)
| |
| ?? still under development
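Whether a given kernel was built with these options can be checked against its config file. This is a minimal sketch, assuming the options appear in the file with a `CONFIG_` prefix and that the config for the running kernel is installed at `/boot/config-$(uname -r)`; `check_kconfig` is a hypothetical helper name.

```shell
# check_kconfig FILE OPTION...
# For each OPTION (given without the CONFIG_ prefix), report whether
# FILE contains CONFIG_<OPTION>=y.
check_kconfig() {
    config_file="$1"
    shift
    for opt in "$@"; do
        if grep -q "^CONFIG_${opt}=y\$" "$config_file"; then
            echo "${opt}=y"
        else
            echo "${opt} not set"
        fi
    done
}
```

For example, `check_kconfig "/boot/config-$(uname -r)" CGROUPS CGROUP_CPUACCT CGROUP_SCHED CGROUP_DEVICE NET_CLS_CGROUP` reports on some of the options listed above.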
| |
| | |
| == How To Test ==
| |
| <!-- This does not need to be a full-fledged document. Describe the dimensions of tests that this feature is expected to pass when it is done. If it needs to be tested with different hardware or software configurations, indicate them. The more specific you can be, the better the community testing can be.
| |
| | |
| Remember that you are writing this how to for interested testers to use to check out your feature - documenting what you do for testing is OK, but it's much better to document what *I* can do to test your feature.
| |
| | |
| A good "how to test" should answer these four questions:
| |
| | |
| 0. What special hardware / data / etc. is needed (if any)?
| |
| 1. How do I prepare my system to test this feature? What packages
| |
| need to be installed, config files edited, etc.?
| |
| 2. What specific actions do I perform to check that the feature is
| |
| working like it's supposed to?
| |
| 3. What are the expected results of those actions?
| |
| | |
| -->
| |
| | |
| There are multiple ways to test and use the control group features in Fedora,
| |
| depending on the feature set you are interested in.
| |
| | |
| '''For CPUSET:'''
| |
| | |
| 0. targeted mostly for x86, x86_64
| |
| 1. Documentation/cgroups/cpusets.txt, section 2, Usage Examples and Syntax:
| |
| To start a new job that is to be contained within a cpuset, the steps are:
| |
| | |
| 1) mkdir /dev/cpuset
| |
| 2) mount -t cgroup -ocpuset cpuset /dev/cpuset
| |
| 3) Create the new cpuset by doing mkdir's and write's (or echo's) in
| |
| the /dev/cpuset virtual file system.
| |
| 4) Start a task that will be the "founding father" of the new job.
| |
| 5) Attach that task to the new cpuset by writing its pid to the
| |
| /dev/cpuset tasks file for that cpuset.
| |
| 6) fork, exec or clone the job tasks from this founding father task.
| |
| | |
| For example, the following sequence of commands will set up a cpuset
| |
| named "Charlie", containing just CPUs 2 and 3, and Memory Node 1,
| |
| and then start a subshell 'sh' in that cpuset:
| |
| | |
| mount -t cgroup -ocpuset cpuset /dev/cpuset
| |
| cd /dev/cpuset
| |
| mkdir Charlie
| |
| cd Charlie
| |
| /bin/echo 2-3 > cpus
| |
| /bin/echo 1 > mems
| |
| /bin/echo $$ > tasks
| |
| sh
| |
| # The subshell 'sh' is now running in cpuset Charlie
| |
| # The next line should display '/Charlie'
| |
| cat /proc/self/cpuset
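As an extra check on the example above, the CPUs and memory nodes a task may use can be read back from its status file. This is a minimal sketch, assuming the kernel reports `Cpus_allowed_list` and `Mems_allowed_list` in `/proc/<pid>/status`; the helper name `allowed_lists` is hypothetical.

```shell
# allowed_lists [FILE]
# Print the CPU and memory-node lists a task is allowed to use.
# FILE defaults to the status file of the calling process.
allowed_lists() {
    status_file="${1:-/proc/self/status}"
    awk '$1 == "Cpus_allowed_list:" { print "cpus=" $2 }
         $1 == "Mems_allowed_list:" { print "mems=" $2 }' "$status_file"
}
```

Run inside the subshell started in the "Charlie" cpuset, the output would be expected to match the values written to `cpus` and `mems`.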
| |
| | |
| '''For CPUACCT'''
| |
| | |
| The CPU accounting controller is used to group tasks using cgroups and
| |
| account the CPU usage of these groups of tasks.
| |
| | |
| The CPU accounting controller supports multi-hierarchy groups. An accounting
| |
| group accumulates the CPU usage of all of its child groups and the tasks
| |
| directly present in its group.
| |
| | |
| Accounting groups can be created by first mounting the cgroup filesystem.
| |
| | |
| # mkdir /cgroups
| |
| # mount -t cgroup -ocpuacct none /cgroups
| | |
| With the above step, the initial or the parent accounting group
| |
| becomes visible at /cgroups. At bootup, this group includes all the
| |
| tasks in the system. /cgroups/tasks lists the tasks in this cgroup.
| |
| /cgroups/cpuacct.usage gives the CPU time (in nanoseconds) obtained by
| |
| this group which is essentially the CPU time obtained by all the tasks
| |
| in the system.
| |
| | |
| New accounting groups can be created under the parent group /cgroups.
| |
| | |
| # cd /cgroups
| |
| # mkdir g1
| |
| # echo $$ > g1/tasks
| |
| | |
| The above steps create a new group g1 and move the current shell
| |
| process (bash) into it. CPU time consumed by this bash and its children
| |
| can be obtained from g1/cpuacct.usage and the same is accumulated in
| |
| /cgroups/cpuacct.usage also.
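Because cpuacct.usage is reported in nanoseconds, a small conversion makes the numbers easier to read. This is a minimal sketch, assuming the file holds a single integer; `usage_seconds` is a hypothetical helper name.

```shell
# usage_seconds FILE
# Convert the nanosecond counter stored in FILE to seconds
# with millisecond precision.
usage_seconds() {
    usage_file="$1"
    # LC_ALL=C keeps the decimal separator a dot regardless of locale.
    LC_ALL=C awk '{ printf "%.3f\n", $1 / 1000000000 }' "$usage_file"
}
```

For example, `usage_seconds /cgroups/g1/cpuacct.usage` prints the CPU time consumed by group g1 in seconds.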
| |
| | |
| '''For Memory Controller'''
| |
| 0. Configuration
| |
| | |
| a. Enable CONFIG_CGROUPS
| |
| b. Enable CONFIG_RESOURCE_COUNTERS
| |
| c. Enable CONFIG_CGROUP_MEM_RES_CTLR (still valid??)
| |
| | |
| 1. Prepare the cgroups
| |
| # mkdir -p /cgroups
| |
| # mount -t cgroup none /cgroups -o memory
| |
| | |
| 2. Make the new group and move bash into it
| |
| # mkdir /cgroups/0
| |
| # echo $$ > /cgroups/0/tasks
| |
| | |
| Since we are now in the 0 cgroup,
| |
| we can alter the memory limit:
| |
| # echo 4M > /cgroups/0/memory.limit_in_bytes
| |
| | |
| NOTE: We can use a suffix (k, K, m, M, g or G) to indicate values in kilo,
| |
| mega or gigabytes.
| |
| | |
| # cat /cgroups/0/memory.limit_in_bytes
| |
| 4194304
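The suffix arithmetic behind the value above can be reproduced by hand. This is a minimal sketch, assuming the suffixes are binary multiples (1K = 1024 bytes), which is consistent with 4M reading back as 4194304; `to_bytes` is a hypothetical helper, not part of the cgroup interface.

```shell
# to_bytes VALUE
# Expand an optional k/K, m/M or g/G suffix into a plain byte count.
to_bytes() {
    value="$1"
    number="${value%[kKmMgG]}"   # strip one trailing suffix letter, if any
    case "$value" in
        *[kK]) echo $(( number * 1024 )) ;;
        *[mM]) echo $(( number * 1024 * 1024 )) ;;
        *[gG]) echo $(( number * 1024 * 1024 * 1024 )) ;;
        *)     echo "$number" ;;
    esac
}
```

`to_bytes 4M` prints 4194304, matching the value read back from memory.limit_in_bytes.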
| |
| | |
| NOTE: The interface has now changed to display the usage in bytes
| |
| instead of pages
| |
| | |
| We can check the usage:
| |
| # cat /cgroups/0/memory.usage_in_bytes
| |
| 1216512
| |
| | |
| A successful write to this file does not guarantee a successful set of
| |
| this limit to the value written into the file. This can be due to a
| |
| number of factors, such as rounding up to page boundaries or the total
| |
| availability of memory on the system. The user is required to re-read
| |
| this file after a write to confirm the value committed by the kernel.
| |
| | |
| # echo 1 > memory.limit_in_bytes
| |
| # cat memory.limit_in_bytes
| |
| 4096
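The 4096 above is what rounding up to a page boundary gives on a system with 4096-byte pages. This is a minimal sketch of that arithmetic, assuming the kernel rounds the limit up to the next whole page as the text describes; `round_up_to_page` is a hypothetical helper.

```shell
# round_up_to_page BYTES [PAGE_SIZE]
# Round BYTES up to the next multiple of PAGE_SIZE.
# PAGE_SIZE defaults to the page size of the running system.
round_up_to_page() {
    bytes="$1"
    page="${2:-$(getconf PAGESIZE)}"
    echo $(( (bytes + page - 1) / page * page ))
}
```

`round_up_to_page 1 4096` prints 4096, and a value that is already page-aligned, such as 4194304, is returned unchanged.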
| |
| | |
| The memory.failcnt field gives the number of times that the cgroup limit was
| |
| exceeded.
| |
| | |
| The memory.stat file gives accounting information. It currently reports
| |
| cache, RSS, and active/inactive page counts.
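Individual counters can be pulled out of memory.stat for scripting. This is a minimal sketch, assuming the file consists of `key value` lines with keys such as `cache` and `rss`; `memstat_get` is a hypothetical helper name.

```shell
# memstat_get FILE KEY
# Print the value recorded for KEY in a memory.stat style file.
memstat_get() {
    stat_file="$1"
    key="$2"
    awk -v key="$key" '$1 == key { print $2 }' "$stat_file"
}
```

For example, `memstat_get /cgroups/0/memory.stat rss` prints the RSS counter of cgroup 0; an unknown key prints nothing.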
| |
| | |
| == User Experience ==
| |
| <!-- If this feature is noticeable by its target audience, how will their experiences change as a result? Describe what they will see or notice. -->
| |
| | |
| == Dependencies ==
| |
| <!-- What other packages (RPMs) depend on this package? Are there changes outside the developers' control on which completion of this feature depends? In other words, completion of another feature owned by someone else and might cause you to not be able to finish on time or that you would need to coordinate? Other upstream projects like the kernel (if this is not a kernel feature)? -->
| |
| | |
| == Contingency Plan ==
| |
| <!-- If you cannot complete your feature by the final development freeze, what is the backup plan? This might be as simple as "None necessary, revert to previous release behaviour." Or it might not. If you feature is not completed in time we want to assure others that other parts of Fedora will not be in jeopardy. -->
| |
| | |
| == Documentation ==
| |
| <!-- Is there upstream documentation on this feature, or notes you have written yourself? Link to that material here so other interested developers can get involved. -->
| |
| | |
| | |
| == Release Notes ==
| |
| | |
| <!-- The Fedora Release Notes inform end-users about what is new in the release. Examples of past release notes are here: http://docs.fedoraproject.org/release-notes/ -->
| |
| <!-- The release notes also help users know how to deal with platform changes such as ABIs/APIs, configuration or data file formats, or upgrade concerns. If there are any such changes involved in this feature, indicate them here. You can also link to upstream documentation if it satisfies this need. This information forms the basis of the release notes edited by the documentation team and shipped with the release. -->
| |
| | |
| == Comments and Discussion ==
| |
| | |
| * See [[Talk:Features/YourFeatureName]] <!-- This adds a link to the "discussion" tab associated with your page. This provides the ability to have ongoing comments or conversation without bogging down the main feature page -->
| |
| | |
| | |
| ----
| |
| | |
| [[Category:FeaturePageIncomplete]]
| |
| <!-- When your feature page is completed and ready for review -->
| |
| <!-- remove Category:FeaturePageIncomplete and change it to Category:FeatureReadyForWrangler -->
| |
| <!-- After review, the feature wrangler will move your page to Category:FeatureReadyForFesco... if it still needs more work it will move back to Category:FeaturePageIncomplete-->
| |
| <!-- A pretty picture of the page category usage is at: https://fedoraproject.org/wiki/Features/Policy/Process -->
| |