Introduction to Linux Processes
Suppose you want to write a small program that performs addition: it reads its input from one file and writes the result to another.
Because computers only understand 0s and 1s, whatever language you write the code in, it must eventually be translated into a binary file before it can run on the operating system.
For the code to do its work, we usually also have to supply data, such as the input file our addition program needs. This data, together with the binary file of the code itself, sits on disk and is what we normally call a program, also known as the executable image of the code.
We can then run this program on the computer.
First, the operating system learns from the program that the input is stored in a file, so it loads that data into memory. It also reads the instruction to perform the addition and directs the CPU to carry it out. The CPU works with memory to do the calculation, using registers to hold values and the stack to hold the instructions being executed and their variables. Meanwhile, files are open, and various I/O devices keep changing state as they are called.
So once the program is executed, it changes from a binary on disk into a collection of data in memory, values in registers, instructions on the stack, open files, and the state of various devices. The sum of this execution environment after the program starts running is our protagonist: the process.
For a process, then, the program is its static form, sitting quietly on disk. Once running, it becomes the sum of data and state inside the computer, which is its dynamic form.
Isolation in Linux Containers
A Docker container is essentially a process on the Linux operating system; Docker simply uses namespaces to isolate resources between processes. Put that way it sounds abstract, so let's look at it hands-on.
First, let's create a container:
# docker run -it busybox /bin/sh
/ #
Run the ps command in the container:
/ # ps
PID   USER     TIME  COMMAND
    1 root      0:00 /bin/sh
    6 root      0:00 ps
As you can see, the /bin/sh we started in Docker at the beginning is process number 1 (PID=1) inside the container, and only two processes are running in the container. This means that the /bin/sh we started earlier and the ps we just ran have been isolated by Docker in a world separate from the host.
How does this happen?
Normally, each time we run a /bin/sh program on the host, the operating system assigns it a process ID, say PID=100. This number uniquely identifies the process, like an employee badge. So PID=100 can be roughly understood as: this /bin/sh is employee number 100 in our company, while employee number 1 is someone like Bill Gates, who runs the whole show. Now we run this /bin/sh in a container through Docker. Docker then applies a "smokescreen" to employee 100 so that he can no longer see the 99 employees ahead of him, let alone Bill Gates. He mistakenly believes he is employee number 1. This mechanism isolates the process space of the application, so that its processes only see recalculated process IDs, such as PID=1. In the host operating system, however, it is still process number 100.
This technique is the Linux namespace mechanism. The way namespaces are used is interesting: a namespace is simply an optional flag passed when creating a new process. In Linux, the system call that creates a new process is clone(), for example:
int pid = clone(main_function, stack_top, SIGCHLD, NULL);
This call creates a new process for us and returns its process ID, pid. (The second argument is a pointer to the top of the memory reserved for the child's stack.)
When we create a new process with clone(), we can also pass the CLONE_NEWPID flag, for example:
int pid = clone(main_function, stack_top, CLONE_NEWPID | SIGCHLD, NULL);
The newly created process will then "see" a brand-new process space in which its PID is 1. We say "see" because this is only a smokescreen: in the host's real process space, the process still has its real PID, such as 100.
We can of course call clone() like this multiple times, creating multiple PID namespaces. The application process in each namespace believes it is process number 1 in its own container; it can see neither the host's real process space nor what is happening inside other PID namespaces.
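
The two views of the same process can be observed from the host. A minimal sketch, assuming a kernel that exposes the NSpid field in /proc/<pid>/status (Linux 4.1 or later): the field lists the PID of a process in each nested PID namespace, outermost first. The function name ns_pids and its optional second argument (an alternative proc root, handy for trying it on sample data) are hypothetical helpers, not part of any tool.

```shell
# ns_pids PID [PROC_ROOT]
# Prints the PID of a process as seen from each nested PID namespace,
# outermost (host) first, by reading the NSpid line of its status file.
ns_pids() {
    proc_root=${2:-/proc}
    awk '/^NSpid:/ { $1 = ""; sub(/^ +/, ""); print }' "$proc_root/$1/status"
}

# For the current shell; skipped when /proc is not available.
if [ -r "/proc/$$/status" ]; then
    ns_pids "$$"
fi
```

For a process that is not in a nested PID namespace, a single number is printed. For a containerized process inspected from the host, two numbers would be expected, such as 100 followed by 1 in the example used in this article.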
Besides the PID namespace we just used, Linux also provides Mount, UTS, IPC, Network, and User namespaces, which apply the same smokescreen to other parts of a process's context. For example, a Mount namespace lets the isolated process see only the mount points of its own namespace, and a Network namespace lets it see only the network devices and configuration of its own namespace.
This is the most basic principle behind Linux containers.
So a Docker container, mysterious as it sounds, is really just a process created with a set of namespace flags enabled. The container can then only "see" the resources, files, devices, state, and configuration that its namespaces delimit. The host and other unrelated programs are completely invisible to it.
A container, in other words, is just a special kind of process.
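
Which namespaces a process belongs to can be checked through the symbolic links under /proc/<pid>/ns: two processes share a namespace exactly when the corresponding links point to the same identifier. The sketch below is only an illustration. The helper names ns_id and same_ns and the optional proc-root argument are hypothetical, and reading another user's process may require root.

```shell
# ns_id TYPE PID [PROC_ROOT]
# Prints the namespace identifier of a process, in the form "pid:[<inode>]".
# TYPE is one of the link names under /proc/<pid>/ns: pid, mnt, uts, ipc, net, user.
ns_id() {
    readlink "${3:-/proc}/$2/ns/$1"
}

# same_ns TYPE PID_A PID_B [PROC_ROOT]
# Succeeds when both processes are in the same namespace of the given type.
same_ns() {
    a=$(ns_id "$1" "$2" "$4") || return 2
    b=$(ns_id "$1" "$3" "$4") || return 2
    [ -n "$a" ] && [ "$a" = "$b" ]
}

# Show the namespaces of the current shell.
for t in pid mnt uts ipc net user; do
    ns_id "$t" "$$" || echo "$t: unavailable"
done
```

Run on the host as root, comparing a container's process with the host shell (for example `same_ns pid <container-host-pid> $$`) would be expected to fail, because the container process was created in its own PID namespace.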
Resource Limits for Linux Containers
Why do we need to limit containers?
Although process number 1 in the container can only see what is inside the container because of the smokescreen, on the host it is process number 100 and still competes with every other process. In other words, although process 100 appears isolated, the resources it can use (such as CPU and memory) can be taken at any time by other processes (or other containers) on the host. Process 100 may also consume all the resources itself. Neither is reasonable behavior for a sandbox.
What Are Linux Cgroups?
cgroups, short for control groups, is the Linux mechanism for limiting the resources of a process or a group of processes. It allows fine-grained control over CPU, memory, and other resources. Docker, for example, implements its resource limits on Linux on top of cgroups. Developers can also use cgroups directly to control process resources: on an 8-core machine running a web service and a computing service, the web service can be restricted to 6 cores, leaving the remaining 2 for the computing service. The cgroups CPU controls not only limit how much CPU can be used and which cores a process may run on; they can also set a CPU share ratio. (Note that the ratio only applies when every group is fully busy: if one cgroup is idle and another is busy, the busy cgroup may occupy the entire CPU.)
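
To make the share ratio concrete, here is a small sketch of the arithmetic, assuming the ratio is expressed as relative weights in each group's cpu.shares file. The function name share_percent and the weights 3072 and 1024 are made up for illustration.

```shell
# share_percent WEIGHT [OTHER_WEIGHT...]
# Prints the integer percentage of CPU time the first group is entitled to
# when every listed group is busy at the same time.
share_percent() {
    mine=$1
    total=0
    for w in "$@"; do
        total=$((total + w))
    done
    echo $((mine * 100 / total))
}

share_percent 3072 1024   # both groups busy: prints 75
share_percent 3072        # the other group is idle: prints 100
```

The second call reflects the note above: when the competing group is idle, the busy group is not held to its ratio.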
In Linux, cgroups expose their interface to users as a file system, organized as files and directories under the /sys/fs/cgroup path. On a CentOS machine (which uses the cgroup v1 layout shown throughout this article), we can display them with the mount command:
/ # mount -t cgroup
cgroup on /sys/fs/cgroup/systemd type cgroup (ro,seclabel,nosuid,nodev,noexec,relatime,xattr,release_agent=/usr/lib/systemd/systemd-cgroups-agent,name=systemd)
cgroup on /sys/fs/cgroup/hugetlb type cgroup (ro,seclabel,nosuid,nodev,noexec,relatime,hugetlb)
cgroup on /sys/fs/cgroup/cpu,cpuacct type cgroup (ro,seclabel,nosuid,nodev,noexec,relatime,cpuacct,cpu)
cgroup on /sys/fs/cgroup/freezer type cgroup (ro,seclabel,nosuid,nodev,noexec,relatime,freezer)
cgroup on /sys/fs/cgroup/net_cls,net_prio type cgroup (ro,seclabel,nosuid,nodev,noexec,relatime,net_prio,net_cls)
cgroup on /sys/fs/cgroup/blkio type cgroup (ro,seclabel,nosuid,nodev,noexec,relatime,blkio)
cgroup on /sys/fs/cgroup/cpuset type cgroup (ro,seclabel,nosuid,nodev,noexec,relatime,cpuset)
cgroup on /sys/fs/cgroup/perf_event type cgroup (ro,seclabel,nosuid,nodev,noexec,relatime,perf_event)
cgroup on /sys/fs/cgroup/memory type cgroup (ro,seclabel,nosuid,nodev,noexec,relatime,memory)
cgroup on /sys/fs/cgroup/devices type cgroup (ro,seclabel,nosuid,nodev,noexec,relatime,devices)
cgroup on /sys/fs/cgroup/pids type cgroup (ro,seclabel,nosuid,nodev,noexec,relatime,pids)
As you can see, /sys/fs/cgroup contains many subdirectories such as cpuset, cpu, and memory, also called subsystems. These are the kinds of resources that cgroups can limit on this machine. Under each subsystem you can see the specific ways in which that resource can be limited.
For the CPU subsystem, for example, we can see the following configuration files:
/ # ls -l /sys/fs/cgroup/cpu/
total 0
-rw-r--r-- 1 root root 0 Aug 12 10:55 cgroup.clone_children
--w--w--w- 1 root root 0 Aug 12 10:55 cgroup.event_control
-rw-r--r-- 1 root root 0 Aug 12 10:55 cgroup.procs
-rw-r--r-- 1 root root 0 Aug 12 10:55 cpu.cfs_period_us
-rw-r--r-- 1 root root 0 Aug 12 10:55 cpu.cfs_quota_us
-rw-r--r-- 1 root root 0 Aug 12 10:55 cpu.rt_period_us
-rw-r--r-- 1 root root 0 Aug 12 10:55 cpu.rt_runtime_us
-rw-r--r-- 1 root root 0 Aug 12 10:55 cpu.shares
-r--r--r-- 1 root root 0 Aug 12 10:55 cpu.stat
-r--r--r-- 1 root root 0 Aug 12 10:55 cpuacct.stat
-rw-r--r-- 1 root root 0 Aug 12 10:55 cpuacct.usage
-r--r--r-- 1 root root 0 Aug 12 10:55 cpuacct.usage_percpu
-rw-r--r-- 1 root root 0 Aug 12 10:55 notify_on_release
-rw-r--r-- 1 root root 0 Aug 12 10:55 tasks
Readers familiar with Linux CPU management will notice keywords such as cfs_period and cfs_quota. The two parameters are used together: they limit a process to a total of cfs_quota microseconds of CPU time within each period of cfs_period microseconds.
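
As a rough illustration of how the two values combine, the sketch below converts a quota and period pair into a percentage of one CPU. The function name cpu_limit_percent is hypothetical. A quota of -1 is the value cpu.cfs_quota_us holds when no limit is set.

```shell
# cpu_limit_percent QUOTA_US PERIOD_US
# Prints the CPU limit as a percentage of one core (integer arithmetic),
# or "unlimited" when the quota is negative (no limit configured).
cpu_limit_percent() {
    quota_us=$1
    period_us=$2
    if [ "$quota_us" -lt 0 ]; then
        echo "unlimited"
    else
        echo "$((quota_us * 100 / period_us))%"
    fi
}

cpu_limit_percent 20000 100000   # 20 ms out of every 100 ms: prints 20%
cpu_limit_percent 20000 10000    # quota larger than the period: prints 200%
cpu_limit_percent -1 100000      # no quota set: prints unlimited
```

A result above 100% means the group may use more than one core's worth of CPU time in each period.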
Next, let's put this configuration to use.
First, we create a directory under the corresponding subsystem:
# cd /sys/fs/cgroup/cpu
# mkdir container
# cd container/
# ll
total 0
-rw-r--r--. 1 root root 0 Aug 12 19:38 cgroup.clone_children
--w--w--w-. 1 root root 0 Aug 12 19:38 cgroup.event_control
-rw-r--r--. 1 root root 0 Aug 12 19:38 cgroup.procs
-r--r--r--. 1 root root 0 Aug 12 19:38 cpuacct.stat
-rw-r--r--. 1 root root 0 Aug 12 19:38 cpuacct.usage
-r--r--r--. 1 root root 0 Aug 12 19:38 cpuacct.usage_percpu
-rw-r--r--. 1 root root 0 Aug 12 19:38 cpu.cfs_period_us
-rw-r--r--. 1 root root 0 Aug 12 19:38 cpu.cfs_quota_us
-rw-r--r--. 1 root root 0 Aug 12 19:38 cpu.rt_period_us
-rw-r--r--. 1 root root 0 Aug 12 19:38 cpu.rt_runtime_us
-rw-r--r--. 1 root root 0 Aug 12 19:38 cpu.shares
-r--r--r--. 1 root root 0 Aug 12 19:38 cpu.stat
-rw-r--r--. 1 root root 0 Aug 12 19:38 notify_on_release
-rw-r--r--. 1 root root 0 Aug 12 19:38 tasks
This directory is called a control group. You will find that the operating system automatically generates the subsystem's resource limit files in the newly created container directory.
Now we run an infinite loop, which drives the CPU to 100%:
# while : ; do : ; done
# top
  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
 7996 root      20   0    1320    256    212 R   100  0.0   1:12.75 sh
As the top output (taken from another terminal) shows, CPU usage has reached 100%.
Looking at the files in the container directory, we can see that the CPU quota of the container control group is not yet limited (the value is -1):
# cat /sys/fs/cgroup/cpu/container/cpu.cfs_quota_us
-1
Next, we set a limit by modifying these files.
Write 20 ms (20000 us) to the cfs_quota file of the container group:
# echo 20000 > /sys/fs/cgroup/cpu/container/cpu.cfs_quota_us
With cfs_period at its default of 100 ms (100000 us), this means that in every 100 ms, processes restricted by this control group can use only 20 ms of CPU time. In other words, the process can use only 20% of one CPU's bandwidth.
Next, we write the PID of the process to be limited into the tasks file of the container group, and the setting above takes effect for that process:
# echo 7996 > /sys/fs/cgroup/cpu/container/tasks
Then check with top again:
  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
 7996 root      20   0  119484   6140   1652 R  20.3  0.2   3:45.10 sh
As you can see, the process's CPU usage immediately dropped to about 20%.
Isn't that amazing?
Besides the CPU subsystem, every cgroups subsystem has its own resource-limiting capabilities, for example:
- blkio: sets I/O limits for block devices, typically disks
- cpuset: assigns dedicated CPU cores and the corresponding memory nodes to processes
- memory: sets memory usage limits for processes
Linux cgroups are designed to be easy to use. Put simply and crudely, a control group is a subsystem directory combined with a set of resource limit files. For Linux container projects such as Docker, all that is needed is to create a control group (a directory) for each container under each subsystem and, after starting the container process, write its PID into the tasks file of the corresponding control group.
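
The steps above can be sketched as one small function, assuming the cgroup v1 file names used in this article (cpu.cfs_quota_us and tasks). The name limit_cpu and its arguments are hypothetical. So that the sketch can be tried safely without root, the example call below points it at a scratch directory instead of the real /sys/fs/cgroup/cpu.

```shell
# limit_cpu CGROUP_CPU_ROOT GROUP QUOTA_US PID
# Creates the control group directory, sets its CFS quota,
# and moves PID into the group by writing it to the tasks file.
limit_cpu() {
    group_dir="$1/$2"
    mkdir -p "$group_dir" || return 1
    echo "$3" > "$group_dir/cpu.cfs_quota_us" || return 1
    echo "$4" > "$group_dir/tasks" || return 1
    echo "pid $4 limited to $3 us per period in $group_dir"
}

# Dry run against a scratch directory. On a real cgroup v1 host (as root),
# the first argument would be /sys/fs/cgroup/cpu.
scratch=$(mktemp -d)
limit_cpu "$scratch" container 20000 7996
cat "$scratch/container/cpu.cfs_quota_us" "$scratch/container/tasks"
rm -rf "$scratch"
```

On a real host the kernel creates the limit files when the directory is made, so the two writes only update existing files. In the scratch directory they simply create ordinary files with the same names.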
The values in the resource files of these control groups are specified by the user through docker run parameters, as in the following command:
# docker run -it --cpu-period=10000 --cpu-quota=20000 ubuntu /bin/bash
After starting the container, we can confirm this by checking the resource limit files of the docker control group under the CPU subsystem of the cgroup file system:
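# cat /sys/fs/cgroup/cpu/docker/0712c3d12935b9a3f69ac976b9d70309b78cb7db9a5a5c8a612742370b7453e4/cpu.cfs_period_us
10000
# cat /sys/fs/cgroup/cpu/docker/0712c3d12935b9a3f69ac976b9d70309b78cb7db9a5a5c8a612742370b7453e4/cpu.cfs_quota_us
20000
With these values the quota is twice the period, so the container may use up to two CPUs' worth of time in each period.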