[Reprint] Principles of Docker Images

title: [Reprint] Docker Image Principles
date: 2021-08-16 09:24:01
comment: false
toc: true
category:
- Docker
tags:
- Reprint
- Docker
- Image
- Principles

This article is reprinted from: Docker Image Principles

Docker Image Principles#

An image is a lightweight, standalone, executable software package that bundles a runtime environment together with the software built on it. It contains everything the software needs to run: code, runtime libraries, environment variables, and configuration files.

UnionFS (Union File System)#

UnionFS (Union File System) is a layered, lightweight, high-performance file system. It records each modification to the file system as a commit stacked on top of the previous ones, and it can mount different directories under a single virtual file system (unite several directories into a single virtual filesystem). The union file system is the foundation of Docker images: images inherit from one another through layers, so a variety of application images can be built on top of a base image (an image with no parent).

Features: several file systems can be loaded at the same time, but from the outside only one file system is visible. Union loading overlays the layers, so the final file system contains the files and directories of all the underlying layers.

The layer-by-layer download you see when pulling a Docker image is a direct manifestation of the union file system.
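The overlay behavior described above can be sketched in a few lines of Python. This is only a toy model, not how Docker or any kernel union file system is implemented: each layer is a plain dictionary, and the layer names, paths, and contents are all made up for illustration.

```python
from collections import ChainMap

# Each layer is modelled as a mapping of path -> file content (hypothetical data).
base_layer = {"/etc/os-release": "debian", "/bin/sh": "shell binary"}
runtime_layer = {"/usr/bin/java": "jvm binary"}
app_layer = {"/etc/os-release": "debian (patched)", "/app/app.jar": "application"}

# ChainMap searches its mappings left to right, so the topmost layer goes first.
merged_view = ChainMap(app_layer, runtime_layer, base_layer)

# From the outside there is a single file system containing every path once.
print(sorted(merged_view))

# A file present in several layers is served from the uppermost one.
print(merged_view["/etc/os-release"])

# The lower layer itself is left untouched.
print(base_layer["/etc/os-release"])
```

The merged view lists all four paths, and `/etc/os-release` resolves to the copy in the top layer while the base layer keeps its original content.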

Docker Image Loading Principles#

A Docker image is composed of file systems stacked in layers; this layered file system is UnionFS. The bootfs (boot file system) mainly contains the bootloader and the kernel, and the bootloader's main job is to load the kernel. Linux loads the bootfs when it starts, and the bottom layer of a Docker image is the bootfs. This layer is the same as in a typical Linux/Unix system. Once the boot process is complete, the entire kernel is in memory; at that point control of memory passes from the bootfs to the kernel, and the system unmounts the bootfs.

The rootfs (root file system) sits above the bootfs. It contains the standard directories and files of a typical Linux system, such as /dev, /proc, /bin, and /etc. The rootfs is what differs between operating system distributions such as Ubuntu and CentOS.

A CentOS installation in a virtual machine usually takes several gigabytes, while the corresponding Docker image is only a few hundred megabytes.

A streamlined OS image can have a very small rootfs that contains only the most basic commands, tools, and libraries, because the container uses the host's kernel directly and the image only has to provide the rootfs. Across Linux distributions the bootfs is essentially the same while the rootfs differs, so different distributions can share the bootfs.

Understanding Layers#

The most intuitive demonstration of layering is the layer-by-layer download when pulling an image (layers that are already present locally are not downloaded again).

$ docker pull redis  
Using default tag: latest  
latest: Pulling from library/redis  
33847f680f63: Already exists  
26a746039521: Pull complete  
18d87da94363: Pull complete  
5e118a708802: Pull complete  
ecf0dbe7c357: Pull complete  
46f280ba52da: Pull complete  
Digest: sha256:cd0c68c5479f2db4b9e2c5fbfdb7a8acb77625322dd5b474578515422d3ddb59  
Status: Downloaded newer image for redis:latest  
docker.io/library/redis:latest
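The "Already exists" line in the output above comes from a local layer cache. The following Python sketch mimics that decision with the layer IDs from the transcript. The `pull` function and the cache set are hypothetical stand-ins; a real pull also verifies digests and extracts the layers.

```python
def pull(layer_ids, local_cache):
    """Return one status line per layer, fetching only layers missing from the cache."""
    statuses = []
    for layer_id in layer_ids:
        if layer_id in local_cache:
            statuses.append(f"{layer_id}: Already exists")
        else:
            local_cache.add(layer_id)  # stand-in for the real download and extraction
            statuses.append(f"{layer_id}: Pull complete")
    return statuses


# Layer IDs copied from the pull output above.
redis_layers = [
    "33847f680f63",
    "26a746039521",
    "18d87da94363",
    "5e118a708802",
    "ecf0dbe7c357",
    "46f280ba52da",
]

# Assumption: an earlier pull of another image already stored the first layer.
cache = {"33847f680f63"}

for line in pull(redis_layers, cache):
    print(line)
```

Calling `pull` a second time with the same cache would report every layer as already existing, which is why repeated pulls of related images are fast.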

We can also use the inspect command mentioned in the previous article.

docker image inspect redis:latest

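The output shows the image's layer information: 6 layers, matching the 6 layers listed when the image was pulled. The hashes differ from the IDs in the pull output because the pull output shows digests of the compressed layers, while RootFS lists digests of the uncompressed layer contents.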

"RootFS": {  
    "Type": "layers",  
    "Layers": [  
        "sha256:814bff7343242acfd20a2c841e041dd57c50f0cf844d4abd2329f78b992197f4",  
        "sha256:dd1ebb1f5319785e34838c7332a71e5255bda9ccf61d2a0bf3bff3d2c3f4cdb4",  
        "sha256:11f99184504048b93dc2bdabf1999d6bc7d9d9ded54d15a5f09e36d8c571c32d",  
        "sha256:e461360755916af80821289b1cbc503692cf63e4e93f09b35784d9f7a819f7f2",  
        "sha256:45f6df6342536d948b07e9df6ad231bf17a73e5861a84fc3c9ee8a59f73d0f9f",  
        "sha256:262de04acb7e0165281132c876c0636c358963aa3e0b99e7fbeb8aba08c06935"  
    ]  
},
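To check the layer count without reading the JSON by eye, the fragment above can be parsed programmatically. The sketch below embeds the fragment as a string and wraps it in braces so that it is valid JSON on its own; the actual `docker image inspect` output contains many more fields around it.

```python
import json

# The RootFS fragment shown above, wrapped in braces to form a valid JSON object.
rootfs_fragment = """
{
  "RootFS": {
    "Type": "layers",
    "Layers": [
      "sha256:814bff7343242acfd20a2c841e041dd57c50f0cf844d4abd2329f78b992197f4",
      "sha256:dd1ebb1f5319785e34838c7332a71e5255bda9ccf61d2a0bf3bff3d2c3f4cdb4",
      "sha256:11f99184504048b93dc2bdabf1999d6bc7d9d9ded54d15a5f09e36d8c571c32d",
      "sha256:e461360755916af80821289b1cbc503692cf63e4e93f09b35784d9f7a819f7f2",
      "sha256:45f6df6342536d948b07e9df6ad231bf17a73e5861a84fc3c9ee8a59f73d0f9f",
      "sha256:262de04acb7e0165281132c876c0636c358963aa3e0b99e7fbeb8aba08c06935"
    ]
  }
}
"""

info = json.loads(rootfs_fragment)
layers = info["RootFS"]["Layers"]

print(len(layers))  # 6

# Print each layer in bottom-to-top order with a shortened digest.
for index, digest in enumerate(layers, start=1):
    print(index, digest[:19])
```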

Benefits of Layering:

The biggest benefit is resource sharing.

For example, if multiple images are built from the same base image, the Docker host only needs to keep one copy of the base image on disk, and only one copy needs to be loaded into memory to serve all containers. Every layer of an image can be shared in this way.
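A rough calculation makes the saving concrete. In the sketch below the image names, layer names, and sizes are all invented for illustration; the point is only that a layer shared by several images is stored once.

```python
# Hypothetical images, each listed as (layer_id, size_in_mb) from bottom to top.
images = {
    "app-a": [("base", 120), ("runtime", 300), ("app-a-code", 20)],
    "app-b": [("base", 120), ("runtime", 300), ("app-b-code", 35)],
}


def naive_size(images):
    """Disk usage if every image stored its own private copy of each layer."""
    return sum(size for layers in images.values() for _, size in layers)


def shared_size(images):
    """Disk usage when identical layers are stored only once on the host."""
    unique_layers = {}
    for layers in images.values():
        for layer_id, size in layers:
            unique_layers[layer_id] = size
    return sum(unique_layers.values())


print(naive_size(images))   # 895
print(shared_size(images))  # 475
```

With these made-up numbers, sharing the two common layers cuts storage from 895 MB to 475 MB.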

At this point, someone might ask: If multiple containers share a base image, when one container modifies the contents of the base image, such as files under /etc, will the /etc of other containers also be modified? The answer is: No! Because modifications are confined to a single container.

This is the copy-on-write behavior of containers, which we look at next.

Writable Layer of Containers#

When a container starts, a new writable layer is added on top of the image. This layer is commonly called the "container layer," and everything below it is called the "image layers." All changes are made in the container layer, which is the only writable layer; the image layers below it are read-only. When a container modifies a file that comes from an image layer, the file is first copied up into the container layer and changed there (copy-on-write), so the image layers stay untouched.
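The isolation between containers that share an image can be modelled in Python as follows. This is a simplified sketch: the `Container` class, the paths, and the file contents are hypothetical, and a write here simply stores the whole file in the container layer rather than copying it up first.

```python
class Container:
    """Toy model: shared image layers plus one private writable layer."""

    def __init__(self, image_layers):
        self.image_layers = image_layers  # shared between containers, bottom to top
        self.container_layer = {}         # private to this container

    def read(self, path):
        # The container layer is on top, so it is checked first.
        if path in self.container_layer:
            return self.container_layer[path]
        for layer in reversed(self.image_layers):
            if path in layer:
                return layer[path]
        raise FileNotFoundError(path)

    def write(self, path, content):
        # Writes never touch the image layers.
        self.container_layer[path] = content


image_layers = [{"/etc/hosts": "127.0.0.1 localhost"}]

a = Container(image_layers)
b = Container(image_layers)

a.write("/etc/hosts", "10.0.0.5 db")

print(a.read("/etc/hosts"))  # 10.0.0.5 db
print(b.read("/etc/hosts"))  # 127.0.0.1 localhost
```

Container `a` sees its own change, while container `b` and the image itself still see the original file.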

Commit Image#

docker commit: commit a container as a new image

Let's illustrate this with an example.

I previously downloaded the Tomcat image.

$ docker images  
REPOSITORY   TAG       IMAGE ID       CREATED        SIZE  
tomcat       latest    710ec5c56683   7 days ago     668MB  
redis        latest    aa4d65e670d6   3 weeks ago    105MB  
mysql        latest    c60d96bd2b77   3 weeks ago    514MB  
centos       centos7   8652b9f0cb4c   9 months ago   204MB

Start Tomcat.

# Run the image and map container port 8080 to host port 8080  
docker run -it -p 8080:8080 tomcat

In another terminal, check the currently running containers; Tomcat is running.

$ docker ps  
CONTAINER ID   IMAGE            COMMAND             CREATED          STATUS          PORTS                                       NAMES  
8b294cd7074e   tomcat           "catalina.sh run"   29 seconds ago   Up 28 seconds   0.0.0.0:8080->8080/tcp, :::8080->8080/tcp   zealous_cohen  
f42ae22e4b72   centos:centos7   "/bin/bash"         3 weeks ago      Up 46 hours                                                 centos-test

Then we enter the Tomcat container.

$ docker exec -it 8b294cd7074e /bin/bash  
root@8b294cd7074e:/usr/local/tomcat# ls  
BUILDING.txt  CONTRIBUTING.md  LICENSE	NOTICE	README.md  RELEASE-NOTES  RUNNING.txt  bin  conf  lib  logs  native-jni-lib  temp  webapps  webapps.dist  work

Next, we look at webapps (where deployed applications live); as shown, the directory is empty.

root@8b294cd7074e:/usr/local/tomcat# cd webapps  
root@8b294cd7074e:/usr/local/tomcat/webapps# ls  
root@8b294cd7074e:/usr/local/tomcat/webapps#

If we access Tomcat now, we get a 404: Tomcat has started normally but cannot find a page to serve, which makes sense because webapps is empty.

Now let's create an image whose webapps directory has content.

We copy the files from webapps.dist into webapps.

root@8b294cd7074e:/usr/local/tomcat# cp -r webapps.dist/* webapps  
root@8b294cd7074e:/usr/local/tomcat# cd webapps  
root@8b294cd7074e:/usr/local/tomcat/webapps# ls  
ROOT  docs  examples  host-manager  manager

After this change, Tomcat serves its pages normally.

$ docker ps  
CONTAINER ID   IMAGE            COMMAND             CREATED          STATUS          PORTS                                       NAMES  
8b294cd7074e   tomcat           "catalina.sh run"   27 minutes ago   Up 27 minutes   0.0.0.0:8080->8080/tcp, :::8080->8080/tcp   zealous_cohen

We have now made a simple modification to this Tomcat container. I think my version is slightly better than the official one, so I will commit it as a new image.

$ docker commit -a="cb" -m "add init file" 8b294cd7074e newtomcat:1.0  
sha256:44cf4d44be664d9704a3fc38ddef1f03fa7f113ad83f4049cced322a14dc216b

Running docker images shows that a new image has been created. Looking closely, the committed newtomcat is slightly larger than the official tomcat.

$ docker images  
REPOSITORY   TAG       IMAGE ID       CREATED          SIZE  
newtomcat    1.0       44cf4d44be66   46 seconds ago   673MB  
tomcat       latest    710ec5c56683   7 days ago       668MB  
redis        latest    aa4d65e670d6   3 weeks ago      105MB  
mysql        latest    c60d96bd2b77   3 weeks ago      514MB  
centos       centos7   8652b9f0cb4c   9 months ago     204MB
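Two details in the listing above can be checked with a small calculation: the IMAGE ID column is the first 12 hexadecimal characters of the digest that docker commit printed, and the size difference is the content copied into webapps. The values below are copied from the transcripts; the sizes are the rounded figures that docker images displays, so the difference is approximate.

```python
# Digest printed by the docker commit command above.
commit_digest = "sha256:44cf4d44be664d9704a3fc38ddef1f03fa7f113ad83f4049cced322a14dc216b"

# Drop the "sha256:" prefix and keep the first 12 hex characters.
short_id = commit_digest.split(":", 1)[1][:12]
print(short_id)  # 44cf4d44be66

# Rounded sizes in MB as listed by docker images.
newtomcat_mb = 673
tomcat_mb = 668
print(newtomcat_mb - tomcat_mb)  # 5
```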

From now on we can run containers directly from our own image and also share it with others; we will cover that later.

This example ties back to the layering concept above. newtomcat is the image produced by committing the changes made in the container layer, much like taking a snapshot of a virtual machine.
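In terms of layers, a commit can be pictured as freezing the container layer and stacking it on top of the existing image layers. The Python sketch below models that idea with hypothetical data; the `commit` function, the paths, and the contents are illustrative only and say nothing about how Docker stores layers internally.

```python
def commit(image_layers, container_layer):
    """Return a new image: the original layers plus a frozen copy of the container layer."""
    frozen_layer = dict(container_layer)  # copy so later container writes do not leak in
    return tuple(image_layers) + (frozen_layer,)


# Toy stand-in for the tomcat image, listed from bottom to top.
tomcat_image = (
    {"/usr/local/tomcat/bin/catalina.sh": "startup script"},
    {"/usr/local/tomcat/webapps.dist/ROOT/index.jsp": "default page"},
)

# Changes made inside the running container (the copy into webapps).
container_layer = {"/usr/local/tomcat/webapps/ROOT/index.jsp": "default page"}

newtomcat_image = commit(tomcat_image, container_layer)

print(len(tomcat_image), len(newtomcat_image))  # 2 3
print(newtomcat_image[0] is tomcat_image[0])    # True
```

The new image has one more layer than the original, and the lower layers are the very same objects, which mirrors how newtomcat reuses the tomcat layers and only adds the changes on top.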

That's all for Docker images. Now we have a simple understanding of Docker, and we will continue to learn more about Docker-related content. Let's keep up the good work!

Day by day, there is no end; efforts will not be in vain and will eventually pay off.