Docker image layers are the fundamental building blocks of Docker images. Each layer captures a set of filesystem changes and is stacked read-only on top of the previous layers. A union filesystem merges these layers into the single filesystem presented to containers. Understanding the layer structure is essential for efficient image building and storage optimization.

Docker Image Layer Overview

What is a Docker Image Layer?

A Docker image layer is a filesystem snapshot generated by each instruction in a Dockerfile. Like Git commits, it stores only the changes from the previous state, maximizing space efficiency.

History and Background of the Layer System

Docker’s layer system is rooted in the Linux Union File System concept. This technology originated from Unionfs, developed in 2004 by Professor Erez Zadok’s research team at SUNY Stony Brook. Docker initially used AUFS (Another Union File System), but overlay2 is now the default storage driver; it is based on OverlayFS, which was merged into Linux kernel 3.18, and the overlay2 driver itself requires kernel 4.0 or later.

| Year | Event | Description |
|------|-------|-------------|
| 2004 | Unionfs development | First Union File System implementation |
| 2006 | AUFS release | Improved version of Unionfs, used in early Docker |
| 2013 | Docker release | Adopted AUFS-based layer system |
| 2014 | overlay introduction | OverlayFS integrated into Linux kernel 3.18 |
| 2016 | overlay2 becomes default | overlay2 recommended as the driver in Docker 1.12 |
| 2020 | Optimization complete | Performance improvements and stabilization of overlay2 |

Why Layers are Necessary

Storing a Docker image as a single monolithic filesystem would cause several inefficiencies. The layer system solves the following problems.

| Problem | Limitation of Single Structure | Layer System Solution |
|---------|--------------------------------|-----------------------|
| Storage waste | Full duplication even for identical base images | Deduplication through shared common layers |
| Increased build time | Full rebuild required even for a single-line change | Only changed layers are rebuilt |
| Network bandwidth consumption | Full file transfer when distributing images | Only missing layers are downloaded |
| Difficult version management | Cannot track change history | Each layer serves as change history |

Union File System Operation Principles

What is a Union File System?

A Union File System is a technology that mounts multiple directories (branches) as a single unified filesystem view. Each branch has either read-only or read-write permissions, and upper branches overlay (shadow) lower branches.
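The shadowing behavior can be sketched in a few lines of Python. This is a plain-directory simulation for illustration only; a real union filesystem does this at the VFS layer, and the `union_list`/`union_read` helpers are hypothetical:

```python
import os
import tempfile

# Simulate two branches: a read-only lower dir and a writable upper dir.
root = tempfile.mkdtemp()
lower = os.path.join(root, "lower"); os.makedirs(lower)
upper = os.path.join(root, "upper"); os.makedirs(upper)

with open(os.path.join(lower, "os-release"), "w") as f:
    f.write("ubuntu 22.04")
with open(os.path.join(lower, "motd"), "w") as f:
    f.write("lower motd")
with open(os.path.join(upper, "motd"), "w") as f:
    f.write("upper motd")  # same path as in lower: shadows the lower copy

def union_list(upper, lower):
    """Unified directory listing: entries from both branches appear once."""
    return sorted(set(os.listdir(upper)) | set(os.listdir(lower)))

def union_read(name, upper, lower):
    """Read through the union view: search upper first, then lower."""
    for branch in (upper, lower):
        path = os.path.join(branch, name)
        if os.path.exists(path):
            with open(path) as f:
                return f.read()
    raise FileNotFoundError(name)

print(union_list(upper, lower))          # ['motd', 'os-release']
print(union_read("motd", upper, lower))  # 'upper motd' — upper shadows lower
```

Both branches remain ordinary directories on disk; only the unified view changes which copy of `motd` is visible.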

overlay2 Storage Driver

The current default storage driver for Docker, overlay2, is based on the Linux kernel’s OverlayFS. It consists of four directories: lowerdir (read-only lower layers), upperdir (read-write upper layer), workdir (internal working directory), and merged (unified view).

(Figure: overlay2 storage driver structure — lowerdir layers beneath an upperdir, combined into the merged view)

Layer Stack Operation

When multiple layers are stacked, the filesystem operates by the following rules: files in an upper layer shadow files at the same path in lower layers, file deletion is recorded with whiteout files, and a file read searches the layers from top to bottom and returns the first match.
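The lookup rules can be modeled with a hypothetical layer-as-dict structure (the `WHITEOUT` marker and `read` helper are illustrative, not Docker internals):

```python
# Hypothetical model: each layer maps path -> content; WHITEOUT marks a deletion.
WHITEOUT = object()

layers = [
    {"/etc/motd": "base", "/bin/sh": "shell"},  # layer 1 (lowest)
    {"/etc/motd": "patched"},                   # layer 2: shadows /etc/motd
    {"/bin/sh": WHITEOUT},                      # layer 3: deletes /bin/sh
]

def read(path, layers):
    """Search from the top layer down; first hit wins, whiteout means deleted."""
    for layer in reversed(layers):
        if path in layer:
            if layer[path] is WHITEOUT:
                raise FileNotFoundError(path)
            return layer[path]
    raise FileNotFoundError(path)

print(read("/etc/motd", layers))  # 'patched' — layer 2 shadows layer 1
try:
    read("/bin/sh", layers)
except FileNotFoundError:
    print("/bin/sh deleted by whiteout")
```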

```dockerfile
FROM ubuntu:22.04            # Layer 1: ~77MB - Ubuntu base filesystem
RUN apt-get update           # Layer 2: ~45MB - Package cache creation
RUN apt-get install -y nginx # Layer 3: ~60MB - nginx binaries and config
COPY nginx.conf /etc/nginx/  # Layer 4: ~1KB - Config file only
COPY app /var/www/html/      # Layer 5: Variable - Application files
```

Each layer stores only the changes (delta) from the previous layer. Layer 3 contains only the files added by nginx installation, and references Layer 1 for Ubuntu base files.

Copy-on-Write Mechanism

What is Copy-on-Write (CoW)?

Copy-on-Write is an optimization technique that delays resource duplication until actual modification occurs. In Docker, when a container modifies a file from an image layer, the file is copied to the container layer before modification.

CoW Operation Process

When modifying a file in a container, Copy-on-Write operates in the following steps:

  1. File write request: Container requests to modify the /etc/nginx/nginx.conf file
  2. File search: Search in upper layer (upperdir), then in lower layer (lowerdir) if not found
  3. File copy: Copy from lowerdir to upperdir when found (copy-up)
  4. File modification: Apply modification to the copy in upperdir
  5. Subsequent access: All subsequent accesses go to the modified file in upperdir
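The copy-up steps above can be simulated with ordinary directories (a sketch only; overlay2 performs copy-up inside the kernel, and the `write_file` helper is hypothetical):

```python
import os
import shutil
import tempfile

root = tempfile.mkdtemp()
lowerdir = os.path.join(root, "lower"); os.makedirs(lowerdir)  # read-only image layer
upperdir = os.path.join(root, "upper"); os.makedirs(upperdir)  # writable container layer

with open(os.path.join(lowerdir, "nginx.conf"), "w") as f:
    f.write("worker_processes 1;")

def write_file(name, data):
    """Copy-on-write: copy up from lowerdir on first modification, then write in upperdir."""
    upper_path = os.path.join(upperdir, name)
    lower_path = os.path.join(lowerdir, name)
    if not os.path.exists(upper_path) and os.path.exists(lower_path):
        shutil.copy2(lower_path, upper_path)  # step 3: copy-up
    with open(upper_path, "a") as f:          # step 4: modify the copy
        f.write(data)

write_file("nginx.conf", " worker_connections 1024;")

# The image layer is untouched; the container layer holds the modified copy.
print(open(os.path.join(lowerdir, "nginx.conf")).read())  # original content
print(open(os.path.join(upperdir, "nginx.conf")).read())  # original + change
```

Subsequent writes find the file already in `upperdir` and skip the copy, which is why only the first modification pays the copy-up cost.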

CoW Performance Characteristics

The Copy-on-Write mechanism has advantages of space efficiency and fast container startup. However, there is copy overhead on the first modification of large files, and performance degradation can occur in write-intensive workloads. In such cases, it is recommended to bypass CoW overhead using volume mounts.

```yaml
# docker-compose.yml - Use volumes for write-intensive directories
services:
  database:
    image: postgres:15
    volumes:
      - db-data:/var/lib/postgresql/data  # Bypass CoW

volumes:
  db-data:
```

Layer Caching Strategies

What is Build Cache?

Docker build cache stores layers generated from previous builds and reuses cached layers instead of building new ones when the same instruction and context are detected. This dramatically reduces build time.

Cache Invalidation Rules

Docker invalidates the cache for a layer and all subsequent layers when any of the following conditions are met. This means layer order directly affects build performance.

| Condition | Description | Example |
|-----------|-------------|---------|
| Instruction change | Dockerfile instruction text changed | `RUN apt-get install nginx` → `RUN apt-get install -y nginx` |
| COPY/ADD file change | Content or metadata of copied files changed | Source code modification |
| ARG value change | Build argument value changed | `--build-arg VERSION=2.0` |
| Previous layer invalidation | A preceding layer was rebuilt | When Layer 1 changes, Layers 2~N are all rebuilt |
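A simplified mental model of cache keying shows why invalidation cascades: each layer's cache key folds in its parent's key, so changing any input changes every key downstream. This is not BuildKit's actual algorithm, and `layer_cache_key` with its parameters is hypothetical:

```python
import hashlib

def layer_cache_key(parent_key, instruction, context_digest=""):
    """Simplified model: a cached layer is reusable only if the parent key,
    the instruction text, and the digest of any copied files all match."""
    h = hashlib.sha256()
    h.update(parent_key.encode())
    h.update(instruction.encode())
    h.update(context_digest.encode())
    return h.hexdigest()

base = layer_cache_key("", "FROM node:20-alpine")
deps = layer_cache_key(base, "COPY package.json ./", context_digest="abc123")
build = layer_cache_key(deps, "RUN npm ci")

# Changing a copied file's content invalidates that layer AND every layer after it,
# because the new key propagates through each child's parent_key.
deps2 = layer_cache_key(base, "COPY package.json ./", context_digest="def456")
build2 = layer_cache_key(deps2, "RUN npm ci")
print(deps != deps2 and build != build2)  # True — invalidation cascades
```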

Cache-Optimized Dockerfile Writing

To maximize cache efficiency, place layers with low change frequency first and place frequently changing layers at the bottom of the Dockerfile.

Inefficient Dockerfile:

```dockerfile
FROM node:20-alpine
WORKDIR /app
COPY . .                    # Copy all files - Cache invalidated on source change
RUN npm install             # Runs every time
RUN npm run build
```

Optimized Dockerfile:

```dockerfile
FROM node:20-alpine
WORKDIR /app

# Copy only dependency files first (low change frequency)
COPY package.json package-lock.json ./
RUN npm ci --only=production    # Runs only when dependencies change

# Copy source code (high change frequency)
COPY . .
RUN npm run build
```

In the optimized version, even if the source code changes, as long as package.json and package-lock.json are unchanged, the npm ci layer is reused from cache, significantly reducing build time.

Layer Consolidation and Optimization

Minimizing Layer Count

Each RUN, COPY, and ADD instruction in a Dockerfile creates a new layer. Consolidating related instructions reduces layer count and image size.
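As a rough illustration, the layer count can be estimated by counting the layer-creating instructions. The `count_layers` helper is hypothetical and deliberately simplified (it ignores line continuations, heredocs, and multi-stage boundaries):

```python
# Hypothetical helper: estimate how many filesystem layers a Dockerfile creates.
# RUN/COPY/ADD produce layers; most other instructions only add image metadata.
def count_layers(dockerfile: str) -> int:
    layer_instructions = ("RUN", "COPY", "ADD")
    return sum(
        1
        for line in dockerfile.splitlines()
        if line.strip().upper().startswith(layer_instructions)
    )

dockerfile = """\
FROM node:20-alpine
WORKDIR /app
COPY package.json ./
RUN npm ci
COPY . .
RUN npm run build
EXPOSE 3000
"""
print(count_layers(dockerfile))  # 4 — two COPY and two RUN instructions
```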

Command Consolidation Techniques

Connecting multiple RUN instructions with && and deleting temporary files within the same layer reduces final image size.

Inefficient approach:

```dockerfile
RUN apt-get update
RUN apt-get install -y nginx
RUN apt-get install -y curl
RUN apt-get clean
RUN rm -rf /var/lib/apt/lists/*
```

Optimized approach:

```dockerfile
RUN apt-get update && \
    apt-get install -y --no-install-recommends \
        nginx \
        curl && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists/*
```

The optimized version creates only 1 layer instead of 5. Temporary files (apt cache) are deleted in the same layer, so they are not included in the final image.
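The size arithmetic can be made concrete. Since an image's size is the sum of its layer sizes, a file deleted in a *later* layer still ships with the image; the layer sizes below are hypothetical round numbers for illustration:

```python
# Each list entry is one layer's size in MB (hypothetical values).
separate_layers = [
    77,  # FROM ubuntu:22.04
    45,  # RUN apt-get update               (apt cache written to this layer)
    60,  # RUN apt-get install -y nginx
    0,   # RUN rm -rf /var/lib/apt/lists/*  (whiteout only; 45MB still stored below)
]
consolidated_layers = [
    77,  # FROM ubuntu:22.04
    60,  # RUN update && install && clean in ONE layer (cache never committed)
]

print(sum(separate_layers))      # 182 — the apt cache is baked into the image
print(sum(consolidated_layers))  # 137 — saved by deleting within the same layer
```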

Using .dockerignore

Using a .dockerignore file prevents unnecessary files from being included in the build context, improving build speed and reducing image size.

```
# .dockerignore
node_modules
npm-debug.log
.git
.gitignore
README.md
docker-compose*.yml
.env*
*.test.js
coverage/
.nyc_output/
```

Multi-Stage Builds

What is Multi-Stage Build?

Multi-stage build is a technique that uses multiple FROM instructions in a single Dockerfile to separate build and runtime environments. By excluding build tools and intermediate artifacts from the final image, image size can be dramatically reduced.

Multi-Stage Build Example

This Go application multi-stage build example demonstrates how separating build and runtime environments reduces image size.

```dockerfile
# ===== Build Stage =====
FROM golang:1.21-alpine AS builder

WORKDIR /app

# Copy and download dependencies
COPY go.mod go.sum ./
RUN go mod download

# Copy source code and build
COPY . .
RUN CGO_ENABLED=0 GOOS=linux go build -a -installsuffix cgo -o main .

# ===== Runtime Stage =====
FROM alpine:3.19

# Create non-root user for security
RUN adduser -D -g '' appuser

WORKDIR /app

# Copy only binary from build stage
COPY --from=builder /app/main .

# Switch to non-root user
USER appuser

EXPOSE 8080
CMD ["./main"]
```

In this example, the build stage’s golang:1.21-alpine image is about 300MB, but the final runtime image based on alpine:3.19 is about 10MB, reducing image size by roughly 97%.

Node.js Multi-Stage Build

This is a multi-stage build example for frontend applications.

```dockerfile
# ===== Dependencies Stage =====
# (Note: the nginx runtime stage below does not use this stage's output;
# it applies when serving with a Node server instead of static files.)
FROM node:20-alpine AS deps
WORKDIR /app
COPY package.json package-lock.json ./
RUN npm ci --only=production

# ===== Build Stage =====
FROM node:20-alpine AS builder
WORKDIR /app
COPY package.json package-lock.json ./
RUN npm ci
COPY . .
RUN npm run build

# ===== Runtime Stage =====
FROM nginx:alpine
COPY --from=builder /app/dist /usr/share/nginx/html
COPY nginx.conf /etc/nginx/nginx.conf
EXPOSE 80
CMD ["nginx", "-g", "daemon off;"]
```

Layer Analysis Tools

docker history Command

The docker history command displays information about each layer of an image, allowing you to identify which instructions occupy how much size.

```bash
# Check layer history
docker history nginx:latest

# Display full commands (not truncated)
docker history --no-trunc nginx:latest

# Format for size-based sorting
docker history --format "table {{.Size}}\t{{.CreatedBy}}" nginx:latest
```

Output example:

```
IMAGE          CREATED       CREATED BY                                      SIZE
a8758716bb6a   2 weeks ago   CMD ["nginx" "-g" "daemon off;"]                0B
<missing>      2 weeks ago   STOPSIGNAL SIGQUIT                              0B
<missing>      2 weeks ago   EXPOSE 80                                       0B
<missing>      2 weeks ago   ENTRYPOINT ["/docker-entrypoint.sh"]            0B
<missing>      2 weeks ago   COPY 30-tune-worker-processes.sh /docker-ent…   4.62kB
<missing>      2 weeks ago   COPY 20-envsubst-on-templates.sh /docker-ent…   3.02kB
<missing>      2 weeks ago   COPY 10-listen-on-ipv6-by-default.sh /docker…   2.12kB
<missing>      2 weeks ago   COPY docker-entrypoint.sh / # buildkit          1.62kB
<missing>      2 weeks ago   RUN /bin/sh -c set -x     && groupadd --syst…   112MB
<missing>      2 weeks ago   ENV DYNPKG_RELEASE=1~bookworm                   0B
```

docker inspect Command

The docker inspect command provides detailed image metadata in JSON format, allowing you to check layer IDs, environment variables, volume settings, and more.

```bash
# Check layer ID list
docker inspect --format '{{json .RootFS.Layers}}' nginx:latest | jq .

# Check total image size
docker inspect --format '{{.Size}}' nginx:latest

# Check virtual size (including shared layers)
docker inspect --format '{{.VirtualSize}}' nginx:latest
```

dive Tool

dive is a third-party tool for visually exploring and analyzing Docker image layers. It shows files added/modified/deleted in each layer and provides an image efficiency score.

```bash
# Install dive (Ubuntu/Debian)
wget https://github.com/wagoodman/dive/releases/download/v0.12.0/dive_0.12.0_linux_amd64.deb
sudo dpkg -i dive_0.12.0_linux_amd64.deb

# Analyze image
dive nginx:latest

# Efficiency check in CI/CD
CI=true dive nginx:latest --ci-config .dive-ci.yml
```

Conclusion

The Docker image layer system is a core technology for efficient storage usage, fast image builds, and rapid container startup. It operates based on Union File System and Copy-on-Write mechanisms. By understanding the layer structure and applying techniques such as cache optimization, command consolidation, and multi-stage builds, you can reduce build time, decrease image size, and improve overall development and deployment efficiency.

For effective Dockerfile writing, it is important to place layers with low change frequency first, consolidate related commands, separate build and runtime environments with multi-stage builds, and analyze layers with tools like docker history and dive to discover optimization opportunities.