Faster CI Builds with Docker Cache

This post takes a quick look at how to speed up your Docker-based CI builds on Travis, Circle, and GitLab with Docker Cache.

Docker Cache

Docker caches each layer as an image is built, and a layer is only rebuilt if it, or any layer before it, has changed since the last build. A warm cache can therefore speed up builds significantly. Let's take a look at a quick example.

Dockerfile:

# pull base image
FROM python:3.7.3-slim

# install netcat
RUN apt-get update && \
    apt-get -y install netcat && \
    apt-get clean

# set working directory
WORKDIR /usr/src/app

# install requirements
COPY ./requirements.txt /usr/src/app/requirements.txt
RUN pip install -r requirements.txt

# add app
COPY . /usr/src/app

# run server
CMD gunicorn -b 0.0.0.0:5000 manage:app

You can find the full source code for this project in the docker-ci-cache repo on GitHub.

The first Docker build can take several minutes to complete, depending on your connection speed. Subsequent builds should only take a few seconds since the layers were cached after that first build:

Step 1/7 : FROM python:3.7.3-slim
 ---> 6ba8638f69d7
Step 2/7 : RUN apt-get update &&     apt-get -y install netcat &&     apt-get clean
 ---> Using cache
 ---> 66e45dd8223c
Step 3/7 : WORKDIR /usr/src/app
 ---> Using cache
 ---> 8608535877f4
Step 4/7 : COPY ./requirements.txt /usr/src/app/requirements.txt
 ---> Using cache
 ---> e32e9df9a534
Step 5/7 : RUN pip install -r requirements.txt
 ---> Using cache
 ---> 45b05d6f5f0c
Step 6/7 : COPY . /usr/src/app
 ---> Using cache
 ---> 688bcc98f54a
Step 7/7 : CMD gunicorn -b 0.0.0.0:5000 manage:app
 ---> Using cache
 ---> f1545746c538
Successfully built f1545746c538

Even if you change the source code, the build should still take only a few seconds, since the dependencies do not need to be downloaded again. In other words, only the last two layers have to be rebuilt:

Step 6/7 : COPY . /usr/src/app
 ---> 520ecfd308e9
Step 7/7 : CMD gunicorn -b 0.0.0.0:5000 manage:app
 ---> Running in 8bdbe1a3349f
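
The invalidation rule can be sketched as a toy model in shell: every step before the first changed instruction is served from the cache, and the changed step plus everything after it is rebuilt. This is only an illustration under that assumption, not how Docker computes cache keys internally, and `layer_status` is a hypothetical helper name.

```shell
#!/bin/sh
# Toy model of layer cache invalidation (not Docker's real algorithm).
# Usage: layer_status CHANGED_STEP TOTAL_STEPS
layer_status() {
  changed="$1"
  total="$2"
  step=1
  while [ "$step" -le "$total" ]; do
    if [ "$step" -lt "$changed" ]; then
      # Steps before the first change can be reused as-is.
      echo "Step $step/$total: cached"
    else
      # The changed step and every later step must be rebuilt.
      echo "Step $step/$total: rebuilt"
    fi
    step=$((step + 1))
  done
}

# A source-code change first affects step 6 (COPY . /usr/src/app) of 7.
layer_status 6 7
```

With the Dockerfile above, a change to requirements.txt would correspond to `layer_status 4 7`, which is why the dependency install is placed before the application code is copied in.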

To avoid invalidating the cache:

  1. Start your Dockerfile with commands that are less likely to change
  2. Place commands that are more likely to change (like COPY . /usr/src/app) as late as possible
  3. Add only the necessary files (use a .dockerignore file)
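
As a starting point for the third item, a .dockerignore file might exclude version-control data and local Python artifacts so that they never reach the build context. The entries below are assumptions about a typical Python project, not taken from the docker-ci-cache repo, so adjust them to your own tree.

```shell
#!/bin/sh
# Write a minimal .dockerignore next to the Dockerfile.
# The patterns are illustrative assumptions, not the repo's actual file.
cat > .dockerignore <<'EOF'
# version control and editor data
.git
.gitignore

# local Python artifacts
__pycache__/
*.pyc
.venv/
EOF
```

Files matched by these patterns are left out of the build context, so changes to them cannot invalidate the `COPY . /usr/src/app` layer.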

CI Environments

Since CI platforms provide a fresh environment for every build, you'll need to use an existing Docker image as the source of the cache.

Steps:

  1. Pull the existing image from an image registry (like Docker Hub)
  2. Use Docker build's --cache-from option to use the existing image as the cache source
  3. Push the new image back to the registry if the build is successful
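
The three steps above can be sketched as a small shell function before wiring them into a specific CI platform. This is a minimal sketch: `cached_build`, `run`, and the `DRY_RUN` switch are hypothetical names introduced here, and the image name in the usage note is a placeholder.

```shell
#!/bin/sh
# Sketch of the pull -> build -> push cycle.
# Set DRY_RUN=1 to print the docker commands instead of executing them.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Usage: cached_build IMAGE [TAG]
cached_build() {
  image="$1"
  tag="${2:-latest}"
  # 1. Pull the previous image; on the very first build it will not exist,
  #    so a failed pull must not fail the job.
  run docker pull "$image:$tag" || true
  # 2. Build, using the pulled image as the cache source.
  run docker build --cache-from "$image:$tag" --tag "$image:$tag" . || return 1
  # 3. Push only if the build succeeded.
  run docker push "$image:$tag"
}
```

For example, `DRY_RUN=1 cached_build example/app` prints the three commands without touching the Docker daemon, which is handy for checking the order before running it in CI. Logging in to the registry is assumed to happen before the push.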

Let's look at how to do this on Travis, Circle, and GitLab, using both single and multi-stage Docker builds with and without Docker Compose.

Single-stage

Travis:

# _config-examples/single-stage/.travis.yml

sudo: required

services:
  - docker

env:
  global:
    CACHE_IMAGE: mjhea0/docker-ci-cache

before_script:
  - docker pull $CACHE_IMAGE:latest || true

script:
  - docker build --cache-from $CACHE_IMAGE:latest --tag $CACHE_IMAGE:latest .

after_success:
  - docker login -u $REGISTRY_USER -p $REGISTRY_PASS
  - docker push $CACHE_IMAGE:latest

Circle:

# _config-examples/single-stage/circle.yml

version: 2

jobs:
  job1:
    docker:
      - image: docker:stable
    environment:
      CACHE_IMAGE: mjhea0/docker-ci-cache
    steps:
      - checkout
      - setup_remote_docker
      - run:
          name: Pull image from docker hub
          command: docker pull $CACHE_IMAGE:latest || true
      - run:
          name: Build from dockerfile
          command: docker build --cache-from $CACHE_IMAGE:latest --tag $CACHE_IMAGE:latest .
      - run:
          name: Log in to docker hub
          command: docker login -u $REGISTRY_USER -p $REGISTRY_PASS
      - run:
          name: Push to docker hub
          command: docker push $CACHE_IMAGE:latest


workflows:
  version: 2

  tests_to_run:
    jobs:
    - job1

GitLab:

# _config-examples/single-stage/.gitlab-ci.yml

image: docker:stable
services:
  - docker:dind

variables:
  # use the overlay storage driver
  # https://docs.gitlab.com/ce/ci/docker/using_docker_build.html#using-the-overlayfs-driver
  DOCKER_DRIVER: overlay
  CACHE_IMAGE: mjhea0/docker-ci-cache

stages:
  - build

docker-build:
  stage: build
  script:
    - docker pull $CACHE_IMAGE:latest || true
    - docker build --cache-from $CACHE_IMAGE:latest --tag $CACHE_IMAGE:latest .
    - docker login -u $REGISTRY_USER -p $REGISTRY_PASS
    - docker push $CACHE_IMAGE:latest

Compose

If you're using Docker Compose, you can add the cache_from option to the compose file, which maps to docker build --cache-from <image> when you run docker-compose build.

Example:

version: '3.7'

services:

  web:
    build:
      context: .
      cache_from:
        - mjhea0/docker-ci-cache:latest
    image: mjhea0/docker-ci-cache:latest

Travis:

# _config-examples/single-stage/compose/.travis.yml

sudo: required

services:
  - docker

env:
  global:
    DOCKER_COMPOSE_VERSION: 1.23.2
    CACHE_IMAGE: mjhea0/docker-ci-cache

before_install:
  - sudo rm /usr/local/bin/docker-compose
  - curl -L https://github.com/docker/compose/releases/download/${DOCKER_COMPOSE_VERSION}/docker-compose-`uname -s`-`uname -m` > docker-compose
  - chmod +x docker-compose
  - sudo mv docker-compose /usr/local/bin

before_script:
  - docker pull $CACHE_IMAGE:latest || true

script:
  - docker-compose build
  - docker-compose up -d
  - docker-compose exec web flake8
  - docker-compose exec web python manage.py test

after_success:
  - docker login -u $REGISTRY_USER -p $REGISTRY_PASS
  - docker push $CACHE_IMAGE:latest

Circle:

# _config-examples/single-stage/compose/circle.yml

version: 2

jobs:
  job1:
    docker:
      - image: docker/compose:1.23.2
    environment:
      CACHE_IMAGE: mjhea0/docker-ci-cache

    steps:
      - checkout
      - setup_remote_docker
      - run:
          name: Pull image from docker hub
          command: docker pull $CACHE_IMAGE:latest || true
      - run:
          name: Build Docker images
          command: docker-compose build
      - run:
          name: Spin up containers
          command: docker-compose up -d
      - run:
          name: Run flake8
          command: docker-compose exec web flake8
      - run:
          name: Run tests
          command: docker-compose exec web python manage.py test
      - run:
          name: Log in to docker hub
          command: docker login -u $REGISTRY_USER -p $REGISTRY_PASS
      - run:
          name: Push to docker hub
          command: docker push $CACHE_IMAGE:latest


workflows:
  version: 2

  tests_to_run:
    jobs:
    - job1

GitLab:

# _config-examples/single-stage/compose/.gitlab-ci.yml

image: docker:stable
services:
  - docker:dind

variables:
  # use the overlay storage driver
  # https://docs.gitlab.com/ce/ci/docker/using_docker_build.html#using-the-overlayfs-driver
  DOCKER_DRIVER: overlay
  CACHE_IMAGE: mjhea0/docker-ci-cache

stages:
  - build

docker-build:
  stage: build
  before_script:
    - apk add --no-cache py-pip python-dev libffi-dev openssl-dev gcc libc-dev make
    - pip install docker-compose
  script:
    - docker pull $CACHE_IMAGE:latest || true
    - docker-compose build
    - docker-compose up -d
    - docker login -u $REGISTRY_USER -p $REGISTRY_PASS
    - docker push $CACHE_IMAGE:latest

Multi-stage

With the multi-stage build pattern, you'll have to apply the same workflow (pull, build, push) for each intermediate stage since those images are discarded before the final image is created. The --target option can be used to build each stage of the multi-stage build separately.

Dockerfile.multi:

# base
FROM python:3.7.3 as base
COPY ./requirements.txt /
RUN pip wheel --no-cache-dir --no-deps --wheel-dir /wheels -r requirements.txt

# stage
FROM python:3.7.3-slim
RUN apt-get update && \
    apt-get -y install netcat && \
    apt-get clean
WORKDIR /usr/src/app
COPY --from=base /wheels /wheels
COPY --from=base requirements.txt .
RUN pip install --no-cache /wheels/*
COPY . /usr/src/app
CMD gunicorn -b 0.0.0.0:5000 manage:app
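
The per-stage workflow can be sketched as a shell loop over the named stages. This is a sketch under assumptions: `multi_stage_build`, `run`, and `DRY_RUN` are hypothetical names, and passing every earlier stage's image as an additional `--cache-from` source for later builds is a choice made here so that those stages' layers can be reused as well.

```shell
#!/bin/sh
# Set DRY_RUN=1 to print the docker commands instead of executing them.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Usage: multi_stage_build IMAGE DOCKERFILE FINAL_TAG STAGE...
multi_stage_build() {
  image="$1"
  dockerfile="$2"
  final_tag="$3"
  shift 3
  cache_flags=""
  # Build each named intermediate stage on its own with --target.
  for stage in "$@"; do
    run docker pull "$image:$stage" || true
    cache_flags="$cache_flags --cache-from $image:$stage"
    # $cache_flags is left unquoted on purpose so it splits into arguments.
    run docker build --target "$stage" $cache_flags \
      --tag "$image:$stage" --file "$dockerfile" . || return 1
  done
  # Build the final stage, reusing the earlier stages as cache sources.
  run docker pull "$image:$final_tag" || true
  run docker build $cache_flags --cache-from "$image:$final_tag" \
    --tag "$image:$final_tag" --file "$dockerfile" . || return 1
  # Push every stage so the next CI run can pull it back as cache.
  for stage in "$@" "$final_tag"; do
    run docker push "$image:$stage" || return 1
  done
}
```

For the Dockerfile above, `DRY_RUN=1 multi_stage_build example/app Dockerfile.multi stage base` prints the pull, build, and push commands for the base stage and the final image.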

Travis:

# _config-examples/multi-stage/.travis.yml

sudo: required

services:
  - docker

env:
  global:
    CACHE_IMAGE: mjhea0/docker-ci-cache

before_script:
  - docker pull $CACHE_IMAGE:base || true
  - docker pull $CACHE_IMAGE:latest || true

script:
  - docker build
      --target base
      --cache-from $CACHE_IMAGE:base
      --tag $CACHE_IMAGE:base
      --file ./Dockerfile.multi
      "."
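  # list the base image as a cache source too, so the first stage's layers are reused
  - docker build
      --cache-from $CACHE_IMAGE:base
      --cache-from $CACHE_IMAGE:latest
      --tag $CACHE_IMAGE:latest
      --file ./Dockerfile.multi
      "."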

after_success:
  - docker login -u $REGISTRY_USER -p $REGISTRY_PASS
  - docker push $CACHE_IMAGE:base
  - docker push $CACHE_IMAGE:latest

Circle:

# _config-examples/multi-stage/circle.yml

version: 2

jobs:
  job1:
    docker:
      - image: docker:stable
    environment:
      CACHE_IMAGE: mjhea0/docker-ci-cache
    steps:
      - checkout
      - setup_remote_docker
      - run:
          name: Pull base image from docker hub
          command: docker pull $CACHE_IMAGE:base || true
      - run:
          name: Pull stage image from docker hub
          command: docker pull $CACHE_IMAGE:stage || true
      - run:
          name: Build base from dockerfile
          command: |
            docker build \
              --target base \
              --cache-from $CACHE_IMAGE:base \
              --tag $CACHE_IMAGE:base \
              --file ./Dockerfile.multi \
              "."
      - run:
          name: Build stage from dockerfile
          command: |
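            # list the base image as a cache source too, so the first stage's layers are reused
            docker build \
              --cache-from $CACHE_IMAGE:base \
              --cache-from $CACHE_IMAGE:stage \
              --tag $CACHE_IMAGE:stage \
              --file ./Dockerfile.multi \
              "."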
      - run:
          name: Log in to docker hub
          command: docker login -u $REGISTRY_USER -p $REGISTRY_PASS
      - run:
          name: Push base image to docker hub
          command: docker push $CACHE_IMAGE:base
      - run:
          name: Push stage image to docker hub
          command: docker push $CACHE_IMAGE:stage


workflows:
  version: 2

  tests_to_run:
    jobs:
    - job1

GitLab:

# _config-examples/multi-stage/.gitlab-ci.yml

image: docker:stable
services:
  - docker:dind

variables:
  # use the overlay storage driver
  # https://docs.gitlab.com/ce/ci/docker/using_docker_build.html#using-the-overlayfs-driver
  DOCKER_DRIVER: overlay
  CACHE_IMAGE: mjhea0/docker-ci-cache

stages:
  - build

docker-build:
  stage: build
  script:
    - docker pull $CACHE_IMAGE:base || true
    - docker pull $CACHE_IMAGE:stage || true
    - docker build
        --target base
        --cache-from $CACHE_IMAGE:base
        --tag $CACHE_IMAGE:base
        --file ./Dockerfile.multi
        "."
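    # list the base image as a cache source too, so the first stage's layers are reused
    - docker build
        --cache-from $CACHE_IMAGE:base
        --cache-from $CACHE_IMAGE:stage
        --tag $CACHE_IMAGE:stage
        --file ./Dockerfile.multi
        "."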
    - docker login -u $REGISTRY_USER -p $REGISTRY_PASS
    - docker push $CACHE_IMAGE:base
    - docker push $CACHE_IMAGE:stage

Compose

Example compose file:

version: '3.7'

services:

  web:
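    build:
      context: .
      # build the multi-stage Dockerfile and reuse both pulled images as cache
      dockerfile: Dockerfile.multi
      cache_from:
        - mjhea0/docker-ci-cache:base
        - mjhea0/docker-ci-cache:stage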
    image: mjhea0/docker-ci-cache:stage

Travis:

# _config-examples/multi-stage/compose/.travis.yml

sudo: required

services:
  - docker

env:
  global:
    DOCKER_COMPOSE_VERSION: 1.23.2
    CACHE_IMAGE: mjhea0/docker-ci-cache

before_install:
  - sudo rm /usr/local/bin/docker-compose
  - curl -L https://github.com/docker/compose/releases/download/${DOCKER_COMPOSE_VERSION}/docker-compose-`uname -s`-`uname -m` > docker-compose
  - chmod +x docker-compose
  - sudo mv docker-compose /usr/local/bin

before_script:
  - docker pull $CACHE_IMAGE:base || true
  - docker pull $CACHE_IMAGE:stage || true

script:
  - docker build
      --target base
      --cache-from $CACHE_IMAGE:base
      --tag $CACHE_IMAGE:base
      --file ./Dockerfile.multi
      "."
  - docker-compose -f docker-compose.multi.yml build
  - docker-compose -f docker-compose.multi.yml up -d
  - docker-compose -f docker-compose.multi.yml exec web flake8
  - docker-compose -f docker-compose.multi.yml exec web python manage.py test

after_success:
  - docker login -u $REGISTRY_USER -p $REGISTRY_PASS
  - docker push $CACHE_IMAGE:base
  - docker push $CACHE_IMAGE:stage

Circle:

# _config-examples/multi-stage/compose/circle.yml

version: 2

jobs:
  job1:
    docker:
      - image: docker/compose:1.23.2
    environment:
      CACHE_IMAGE: mjhea0/docker-ci-cache
    steps:
      - checkout
      - setup_remote_docker
      - run:
          name: Pull base image from docker hub
          command: docker pull $CACHE_IMAGE:base || true
      - run:
          name: Pull stage image from docker hub
          command: docker pull $CACHE_IMAGE:stage || true
      - run:
          name: Build base from dockerfile
          command: |
            docker build \
              --target base \
              --cache-from $CACHE_IMAGE:base \
              --tag $CACHE_IMAGE:base \
              --file ./Dockerfile.multi \
              "."
      - run:
          name: Build Docker images
          command: docker-compose -f docker-compose.multi.yml build
      - run:
          name: Spin up containers
          command: docker-compose -f docker-compose.multi.yml up -d
      - run:
          name: Run flake8
          command: docker-compose -f docker-compose.multi.yml exec web flake8
      - run:
          name: Run tests
          command: docker-compose -f docker-compose.multi.yml exec web python manage.py test
      - run:
          name: Log in to docker hub
          command: docker login -u $REGISTRY_USER -p $REGISTRY_PASS
      - run:
          name: Push base image to docker hub
          command: docker push $CACHE_IMAGE:base
      - run:
          name: Push stage image to docker hub
          command: docker push $CACHE_IMAGE:stage


workflows:
  version: 2

  tests_to_run:
    jobs:
    - job1

GitLab:

# _config-examples/multi-stage/compose/.gitlab-ci.yml

image: docker:stable
services:
  - docker:dind

variables:
  # use the overlay storage driver
  # https://docs.gitlab.com/ce/ci/docker/using_docker_build.html#using-the-overlayfs-driver
  DOCKER_DRIVER: overlay
  CACHE_IMAGE: mjhea0/docker-ci-cache

stages:
  - build

docker-build:
  stage: build
  before_script:
    - apk add --no-cache py-pip python-dev libffi-dev openssl-dev gcc libc-dev make
    - pip install docker-compose
  script:
    - docker pull $CACHE_IMAGE:base || true
    - docker pull $CACHE_IMAGE:stage || true
    - docker build
        --target base
        --cache-from $CACHE_IMAGE:base
        --tag $CACHE_IMAGE:base
        --file ./Dockerfile.multi
        "."
    - docker-compose -f docker-compose.multi.yml build
    - docker-compose -f docker-compose.multi.yml up -d
    - docker login -u $REGISTRY_USER -p $REGISTRY_PASS
    - docker push $CACHE_IMAGE:base
    - docker push $CACHE_IMAGE:stage

Make sure to set REGISTRY_USER and REGISTRY_PASS as environment variables in the build environment on each platform (Travis, Circle, and GitLab).
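
A CI script can fail early with a clear message when those variables are missing, rather than at the push step. The sketch below is one way to do that; `require_env` and `registry_login` are hypothetical helper names, and the use of `docker login --password-stdin` in place of `-p` is a substitution made here to keep the password off the command line.

```shell
#!/bin/sh
# Usage: require_env NAME...
# Returns non-zero and reports each variable that is unset or empty.
require_env() {
  missing=0
  for name in "$@"; do
    # Look up the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing environment variable: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}

registry_login() {
  require_env REGISTRY_USER REGISTRY_PASS || return 1
  # Read the password from stdin instead of passing it as an argument.
  printf '%s' "$REGISTRY_PASS" | docker login -u "$REGISTRY_USER" --password-stdin
}
```

Calling `registry_login` before the push step then stops the job immediately if either variable was not configured in the CI settings.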

The code can be found in the docker-ci-cache repo:

  1. Single-stage examples
  2. Multi-stage examples

Cheers!




