Docker - Use COPY --chown instead of RUN chown after COPY in Dockerfile


Docker best practice:

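Use the --chown option of Docker's COPY command instead of running chown in a separate RUN step. The extra RUN chown creates a new layer that duplicates every copied file, which increases both build time and image size.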

 # manually changing owner
 COPY . $APP_HOME
 RUN chown -R app:app $APP_HOME

 # using --chown option
 COPY --chown=app:app . $APP_HOME

Docker and Python Virtual Environments


Docker tip:

You can use a virtual environment instead of building wheels in multi-stage builds.

For example:

# temp stage
FROM python:3.9-slim AS builder

WORKDIR /app

ENV PYTHONDONTWRITEBYTECODE=1
ENV PYTHONUNBUFFERED=1

RUN apt-get update && \
    apt-get install -y --no-install-recommends gcc

RUN python -m venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"

COPY requirements.txt .
RUN pip install -r requirements.txt


# final stage
FROM python:3.9-slim

COPY --from=builder /opt/venv /opt/venv

WORKDIR /app

ENV PATH="/opt/venv/bin:$PATH"

Note: This is one of the few use cases for a Python virtual environment in Docker.

  1. Install the dependencies in the builder image within a virtual environment.
  2. Copy the virtual environment with its dependencies over to the final image.

This reduces the size of the final image significantly.

Docker Logging Best Practices - stdout and stderr


Docker best practice:

Your Docker applications should log to standard output (stdout) and standard error (stderr) rather than to a file.

You can then configure the Docker daemon to send your log messages to a centralized logging solution (like CloudWatch or Papertrail).
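
As a minimal sketch of what this looks like in a Python application, assuming the standard logging module: the function name configure_logging and the split of levels between the two streams are illustrative choices, not something the tip prescribes.

```python
import logging
import sys


def configure_logging(level=logging.INFO):
    """Route records below WARNING to stdout and the rest to stderr."""
    root = logging.getLogger()
    root.setLevel(level)
    root.handlers.clear()  # drop any previously attached handlers (e.g. file handlers)

    formatter = logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s")

    stdout_handler = logging.StreamHandler(sys.stdout)
    # Only let DEBUG and INFO records through on stdout.
    stdout_handler.addFilter(lambda record: record.levelno < logging.WARNING)
    stdout_handler.setFormatter(formatter)

    stderr_handler = logging.StreamHandler(sys.stderr)
    stderr_handler.setLevel(logging.WARNING)
    stderr_handler.setFormatter(formatter)

    root.addHandler(stdout_handler)
    root.addHandler(stderr_handler)
    return root
```

Calling configure_logging() once at startup is enough; no log file is written inside the container, and docker logs shows both streams.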

Set Docker Memory and CPU Limits


Docker best practice:

Limit CPU and memory for your containers so that a single container cannot starve the other containers on the machine.

Examples:

# using docker run
$ docker run --cpus=2 -m 512m nginx


# using docker-compose
version: "3.9"
services:
  redis:
    image: redis:alpine
    deploy:
      resources:
        limits:
          cpus: "2"
          memory: 512M
        reservations:
          cpus: "1"
          memory: 256M

Sign and Verify Docker Images


Docker best practice:

Sign and verify your Docker images to prevent running images that have been tampered with.

To verify the integrity and authenticity of an image, set the DOCKER_CONTENT_TRUST environment variable so that the Docker client only pulls and runs signed images:

export DOCKER_CONTENT_TRUST=1

Lint and Scan Your Dockerfiles and Images


Docker best practice:

Lint and scan your Dockerfiles and images to catch programmatic errors, stylistic errors, and bad practices that could lead to flaws.

One option is hadolint:

👇

$ hadolint Dockerfile

Dockerfile:1 DL3006 warning: Always tag the version of an image explicitly
Dockerfile:7 DL3042 warning: Avoid the use of cache directory with pip. Use `pip install --no-cache-dir <package>`
Dockerfile:9 DL3059 info: Multiple consecutive `RUN` instructions. Consider consolidation.
Dockerfile:17 DL3025 warning: Use arguments JSON notation for CMD and ENTRYPOINT arguments

Use a .dockerignore File


A properly structured .dockerignore file can help:

  1. Decrease the size of the Docker image
  2. Speed up the build process
  3. Prevent unnecessary cache invalidation
  4. Prevent leaking secrets

Example:

**/.git
**/.gitignore
**/.vscode
**/coverage
**/.env
**/.aws
**/.ssh
Dockerfile
README.md
docker-compose.yml
**/.DS_Store
**/venv
**/env

Don't Embed Secrets in Docker Images


Docker best practice:

Don't store secrets in Docker images.

Instead, they should be injected via:

  1. Environment variables (at run-time)
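  2. Build-time arguments (at build-time); note that these can remain visible via docker history unless they are used only in an intermediate stage of a multi-stage build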
  3. An orchestration tool like Docker Swarm (via Docker secrets) or Kubernetes (via Kubernetes secrets)

For more, along with examples, check out Don't Store Secrets in Images.
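
A minimal sketch of reading an injected secret at run time in Python. It assumes the orchestrator mounts secrets as files under /run/secrets (the default location for Docker Swarm secrets) and falls back to an environment variable. The helper name read_secret and the upper-casing convention for the variable name are hypothetical.

```python
import os
from pathlib import Path


def read_secret(name, secrets_dir="/run/secrets"):
    """Return a secret from a mounted file, else from an environment variable."""
    secret_file = Path(secrets_dir) / name
    if secret_file.is_file():
        # Secret files often end with a trailing newline; strip it.
        return secret_file.read_text().strip()

    env_name = name.upper()  # assumed convention: db_password -> DB_PASSWORD
    value = os.environ.get(env_name)
    if value is None:
        raise RuntimeError(
            f"secret {name!r} not found in {secrets_dir} or in ${env_name}"
        )
    return value
```

Either way, the secret value lives outside the image and never appears in a Dockerfile instruction.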

Docker tagging best practices


Docker best practice:

Version Docker images to know which version of your code is running and to simplify rollbacks. Avoid the latest tag.

Example:

docker build -t web-prod:a072c4e-0.1.4 .

Docker - include a HEALTHCHECK instruction


Docker best practice:

Use HEALTHCHECK to verify that the process running inside the container is healthy.

For example, call the health check endpoint of your web app:

HEALTHCHECK CMD curl --fail http://localhost:8000 || exit 1
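
Slim base images may not ship curl, so the same check can be done with the Python standard library. A minimal sketch, assuming the app answers on http://localhost:8000 as in the example above; the names is_healthy and main are hypothetical.

```python
import sys
import urllib.error
import urllib.request

# Ignore any proxy environment variables: the check talks to the local app.
_opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def is_healthy(url, timeout=2.0):
    """Return True if the URL answers with a non-error HTTP status."""
    try:
        with _opener.open(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except (urllib.error.URLError, OSError):
        # HTTPError (4xx/5xx) is a subclass of URLError; OSError covers
        # timeouts and refused connections.
        return False


def main(argv=None):
    """Return 0 when healthy and 1 otherwise, the exit codes HEALTHCHECK expects."""
    args = sys.argv[1:] if argv is None else argv
    url = args[0] if args else "http://localhost:8000"
    return 0 if is_healthy(url) else 1
```

Ending the script with sys.exit(main()) turns the result into the process exit status, so HEALTHCHECK CMD could then invoke the script with python instead of calling curl.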

Docker - array vs string based CMD


Docker best practice:

Use array over string syntax in your Dockerfiles to handle signals properly:

# array (exec)
CMD ["gunicorn", "-w", "4", "-k", "uvicorn.workers.UvicornWorker", "main:app"]

# string (shell)
CMD gunicorn -w 4 -k uvicorn.workers.UvicornWorker main:app

Using the string form causes Docker to run your process as a child of /bin/sh -c, so the shell, not your application, becomes PID 1. Since most shells don't forward signals to child processes, CTRL-C (which generates a SIGINT) or docker stop (which sends a SIGTERM) may not stop the child process.
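
The exec form only helps if the process that becomes PID 1 actually reacts to the signal. Servers such as gunicorn handle this themselves; for a custom Python entrypoint, a minimal sketch might look like the following. The names shutdown_requested, install_handlers, and run_until_stopped are hypothetical, and the loop body stands in for real work.

```python
import signal
import threading

# Set by the signal handler; checked by the main loop.
shutdown_requested = threading.Event()


def _handle_signal(signum, frame):
    """Record that a stop was requested instead of dying mid-task."""
    shutdown_requested.set()


def install_handlers():
    """Register handlers for SIGTERM (docker stop) and SIGINT (CTRL-C)."""
    for sig in (signal.SIGTERM, signal.SIGINT):
        signal.signal(sig, _handle_signal)


def run_until_stopped(poll_interval=0.5):
    """Loop until a stop signal arrives, then return so cleanup can run."""
    install_handlers()
    while not shutdown_requested.is_set():
        # Placeholder for one unit of real work.
        shutdown_requested.wait(poll_interval)
```

With CMD in exec form, the script receives the SIGTERM directly and can finish its current unit of work before exiting.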

Docker - run only one process per container


Docker best practice:

Run only one process per container to make it easier to reuse and scale each of the individual services:

  1. Scaling - With each service being in a separate container, you can scale one of your web servers horizontally as needed to handle more traffic.
  2. Reusability - Perhaps you have another service that needs a containerized database. You can simply reuse the same database container without bringing two unnecessary services along with it.
  3. Logging - Coupling containers makes logging much more complex.
  4. Portability and Predictability - It's much easier to make security patches or debug an issue when there's less surface area to work with.

Docker ADD vs COPY


Docker best practice:

Prefer COPY over ADD when copying files from a location to a Docker image.

Use ADD to:

  1. download external files
  2. extract an archive to the destination

👇

# copy local files on the host to the destination
COPY /source/path  /destination/path
ADD /source/path  /destination/path

# download external file and copy to the destination
ADD http://external.file/url  /destination/path

# copy and extract local compressed files
ADD source.file.tar.gz /destination/path

Docker - use unprivileged containers


Docker best practice:

Always run a container as a non-root user. Unless user namespace remapping is enabled, root inside the container is the same user as root on the Docker host. If an attacker gains access to your container, they have root privileges to work with and can attempt several attacks against the Docker host.

👇

RUN addgroup --system app && adduser --system --group app

USER app

Dockerfile - Multiple RUN commands v. single chained RUN command


Docker best practice:

In your Dockerfile, combine commands to minimize the number of layers and therefore reduce the image size.

# 2 commands
RUN apt-get update
RUN apt-get install -y netcat


# single command
RUN apt-get update && apt-get install -y netcat

Results:

# docker history to see layers

$ docker images
REPOSITORY   TAG       IMAGE ID       CREATED          SIZE
dockerfile   latest    180f98132d02   51 seconds ago   259MB

$ docker history 180f98132d02

IMAGE          CREATED              CREATED BY                                      SIZE      COMMENT
180f98132d02   58 seconds ago       COPY . . # buildkit                             6.71kB    buildkit.dockerfile.v0
<missing>      58 seconds ago       RUN /bin/sh -c pip install -r requirements.t…   35.5MB    buildkit.dockerfile.v0
<missing>      About a minute ago   COPY requirements.txt . # buildkit              58B       buildkit.dockerfile.v0
<missing>      About a minute ago   WORKDIR /app
...

Which Docker base image should you use?


Docker best practice:

Use smaller base images for your application. *-slim is usually a good choice.

Smaller images mean:

  • faster building
  • faster pushing
  • faster pulling

REPOSITORY   TAG                 IMAGE ID       CREATED      SIZE
python       3.9.6-alpine3.14    f773016f760e   3 days ago   45.1MB
python       3.9.6-slim          907fc13ca8e7   3 days ago   115MB
python       3.9.6-slim-buster   907fc13ca8e7   3 days ago   115MB
python       3.9.6               cba42c28d9b8   3 days ago   886MB
python       3.9.6-buster        cba42c28d9b8   3 days ago   886MB

Pay close attention to the order of your Dockerfile commands to leverage layer caching


Docker best practice:

Order Dockerfile commands appropriately to better leverage caching.

Example:

# sample.py is copied before requirements.txt
# dependencies will be installed for every change to sample.py

FROM python:3.9-slim

WORKDIR /app

COPY sample.py .

COPY requirements.txt .

RUN pip install -r requirements.txt


# sample.py is copied after requirements.txt
# dependencies will be installed only for changes to requirements.txt
# when there are no changes, Docker cache will be used

FROM python:3.9-slim

WORKDIR /app

COPY requirements.txt .

RUN pip install -r requirements.txt

COPY sample.py .

Docker multi-stage builds


Docker best practice:

Use multistage builds to reduce the size of the production image.

# temp stage
FROM python:3.9-slim AS builder

WORKDIR /app

ENV PYTHONDONTWRITEBYTECODE=1
ENV PYTHONUNBUFFERED=1

RUN apt-get update && \
    apt-get install -y --no-install-recommends gcc

COPY requirements.txt .
RUN pip wheel --no-cache-dir --no-deps --wheel-dir /app/wheels -r requirements.txt


# final stage
FROM python:3.9-slim

WORKDIR /app

COPY --from=builder /app/wheels /wheels
COPY --from=builder /app/requirements.txt .

RUN pip install --no-cache-dir /wheels/*