Tests and Code Quality

Part 1, Chapter 5


For the CI/CD pipeline, we'll start with tests and code quality. The first thing you need to create is a Docker image with Python and Poetry installed. You'll run your tests and code quality jobs inside containers using this image.

At this point, you may be tempted to use a ready-made image from Docker Hub as-is. Keep in mind that you want your CI/CD pipelines to be fast, to speed up feedback loops, so it's better to use images that contain only the things you need. In other words, you don't want to waste time downloading unnecessary dependencies. You can also bake dependencies specific to your needs into the Docker images, which will save time as well.

Docker Image

Create a new folder called "ci_cd". Inside that folder, create a new folder called "python" and inside that one create a Dockerfile:

ci_cd
└── python
    └── Dockerfile

Dockerfile:

FROM python:3.11-slim
RUN mkdir -p /home/gitlab && addgroup gitlab && useradd -d /home/gitlab -g gitlab gitlab && chown gitlab:gitlab /home/gitlab
RUN apt-get update && apt-get install -y curl
USER gitlab
WORKDIR /home/gitlab
RUN curl -sSL https://install.python-poetry.org | python3 -
ENV PATH=/home/gitlab/.local/bin:$PATH
RUN poetry config virtualenvs.in-project true

What's happening here?

  1. To speed up our builds, we used python:3.11-slim as a base image. Python slim-based images only install the packages required to run Python.
  2. We then created a user called gitlab.
  3. Next, we installed cURL, which is used to download the Poetry installer.
  4. Finally, we installed Poetry, added it to the PATH, and configured it to create virtual environments inside projects. That way you can use GitLab's cache to speed up your jobs.

Running things as root inside a Docker container is not safe. That's why you added a new user. We called it gitlab because, well, we're using GitLab. The name doesn't matter. It will still work fine if you use mark or mary.

With that, let's move on to the CI/CD configuration.

GitLab CI Config

Create a new file in the project root called .gitlab-ci.yml:

stages:
  - docker

variables:
  DOCKER_DRIVER: overlay2
  DOCKER_TLS_CERTDIR: "/certs"


cache:
  key: ${CI_JOB_NAME}
  paths:
    - ${CI_PROJECT_DIR}/services/talk_booking/.venv/

build-python-ci-image:
  image: docker:24.0.6
  services:
    - docker:24.0.6-dind
  stage: docker
  before_script:
    - cd ci_cd/python/
  script:
    - docker login -u "$CI_REGISTRY_USER" -p "$CI_REGISTRY_PASSWORD" $CI_REGISTRY
    - docker build -t registry.gitlab.com/<your-gitlab-username>/talk-booking:cicd-python3.11-slim .
    - docker push registry.gitlab.com/<your-gitlab-username>/talk-booking:cicd-python3.11-slim

Make sure to replace <your-gitlab-username> with your actual GitLab username.

GitLab uses this file to configure a CI/CD pipeline.

A CI pipeline is a series of jobs that must be performed in a specific order to deliver a new version of software. You can think of jobs as steps. They define what to do. This could be running tests, checking for code quality issues, or building a Docker image. Jobs are organized into stages. Stages define when jobs run.

First, we defined a single stage called docker, which will build the Docker image:

stages:
  - docker

Stages are logical groupings of jobs. Jobs within a stage are executed in parallel. Stages are executed sequentially, in the same order as they're defined.
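To make the ordering concrete, here is a minimal Python sketch that models it. The stage and job names (build, check, compile, lint, unit) are hypothetical placeholders, and the function only illustrates the ordering rule; it is not how GitLab schedules jobs internally.

```python
from collections import defaultdict


def execution_plan(stages, jobs):
    """Return [(stage, [job, ...]), ...] in the order the stages are defined.

    `stages` is the ordered list of stage names and `jobs` maps each
    job name to the stage it belongs to.
    """
    by_stage = defaultdict(list)
    for job, stage in jobs.items():
        by_stage[stage].append(job)
    # Stages run one after another; jobs listed together may run in parallel.
    return [(stage, sorted(by_stage[stage])) for stage in stages]


# Hypothetical pipeline used only for illustration.
plan = execution_plan(
    ["build", "check"],
    {"compile": "build", "lint": "check", "unit": "check"},
)
for stage, stage_jobs in plan:
    print(f"{stage}: {', '.join(stage_jobs)}")
```

In this model, lint and unit can run side by side, but neither starts before compile has finished.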

Next, we defined two variables:

variables:
  DOCKER_DRIVER: overlay2
  DOCKER_TLS_CERTDIR: "/certs"

These are global variables available to all stages and jobs, which allow us to run Docker inside Docker.

Refer to the Use Docker to build Docker images blog post for more on running Docker in Docker on GitLab.

It's worth noting that you can also define variables at the job level, in which case they're scoped to that job only.

After that, we defined a job aptly named build-python-ci-image:

build-python-ci-image:
  image: docker:24.0.6
  services:
    - docker:24.0.6-dind
  stage: docker
  before_script:
    - cd ci_cd/python/
  script:
    - docker login -u "$CI_REGISTRY_USER" -p "$CI_REGISTRY_PASSWORD" $CI_REGISTRY
    - docker build -t registry.gitlab.com/<your-gitlab-username>/talk-booking:cicd-python3.11-slim .
    - docker push registry.gitlab.com/<your-gitlab-username>/talk-booking:cicd-python3.11-slim

This job will run in a container using a docker:24.0.6 Docker image. Along with this container, there will also be a running container from a docker:24.0.6-dind Docker image that's used to execute Docker commands. Keep in mind that you should always use Docker images with specific version tags rather than the latest tag since it's likely that something will break after a new version of an image is released.

For example:

  • Use docker:24.0.6 or docker:20.10.2
  • Don't use docker:latest
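If you want to check image references automatically, the rule can be expressed as a small helper. This is a minimal sketch: is_pinned is a hypothetical function, not part of Docker or GitLab, and it only looks at the tag portion of the reference.

```python
def is_pinned(image: str) -> bool:
    """Return True when the image reference has an explicit tag other than 'latest'."""
    # Look only at the last path segment so that a registry port
    # (for example "localhost:5000/docker") is not mistaken for a tag.
    name = image.rsplit("/", 1)[-1]
    if ":" not in name:
        return False  # no tag at all means Docker falls back to "latest"
    tag = name.split(":", 1)[1]
    return tag not in ("", "latest")


for image in ("docker:24.0.6", "docker:latest", "docker"):
    print(image, "->", "pinned" if is_pinned(image) else "not pinned")
```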

As the name suggests, before_script runs before the main commands to set up the environment. In this case, we moved to the folder where the Dockerfile is located.

The script section is where you define the job's main commands. This is where we're building and pushing the image to a private Docker image registry on GitLab.

You can view your Docker registry for your GitLab repo by clicking on Deploy -> Container Registry on the side nav.

Refer to the Keyword reference for the .gitlab-ci.yml file from the official docs for more info on configuring the .gitlab-ci.yml file.

Most of the CI/CD SaaS products (GitLab CI/CD, GitHub Actions, CircleCI, TravisCI, to name a few) have similar YAML configuration files for defining pipelines and jobs. You can check out an example GitHub workflow in the Python Project Workflow article.

Add the changes to git, create a new commit, and push your code up to GitLab:

$ git add -A
$ git commit -m 'CI Python docker image'
$ git push -u origin main

Click on Build -> Pipelines on the side nav of your repository (CI/CD -> Pipelines in older versions of GitLab). You should see your first pipeline running:

First pipeline

Make sure the pipeline succeeds. You now have an image ready for running tests and code quality checks!

Code Quality Checks

Moving along, let's add our code quality checks to the CI/CD pipeline.

First, add a new stage called test to the .gitlab-ci.yml file, which will be used to run code quality checks along with our automated tests with pytest:

stages:
  - docker
  - test  # new

variables:
  DOCKER_DRIVER: overlay2
  DOCKER_TLS_CERTDIR: "/certs"


cache:
  key: ${CI_JOB_NAME}
  paths:
    - ${CI_PROJECT_DIR}/services/talk_booking/.venv/

build-python-ci-image:
  image: docker:24.0.6
  services:
    - docker:24.0.6-dind
  stage: docker
  before_script:
    - cd ci_cd/python/
  script:
    - docker login -u "$CI_REGISTRY_USER" -p "$CI_REGISTRY_PASSWORD" $CI_REGISTRY
    - docker build -t registry.gitlab.com/<your-gitlab-username>/talk-booking:cicd-python3.11-slim .
    - docker push registry.gitlab.com/<your-gitlab-username>/talk-booking:cicd-python3.11-slim

# new
include:
  - local: /services/talk_booking/ci-cd.yml

include is used to include external YAML files in your CI/CD pipeline. This helps to break up large YAML config files to increase readability. The path must start with "/" and it's relative to the repository root.
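The path rule can be illustrated with a short Python sketch. resolve_local_include is a hypothetical helper written for this example, and the repository root used below is a made-up path; GitLab resolves the path itself when it loads the configuration.

```python
from pathlib import PurePosixPath


def resolve_local_include(repo_root: str, include_path: str) -> PurePosixPath:
    """Resolve a local include path against the repository root."""
    if not include_path.startswith("/"):
        raise ValueError(f"local include must start with '/': {include_path!r}")
    # The leading "/" refers to the repository root, not the filesystem root.
    return PurePosixPath(repo_root) / include_path.lstrip("/")


# "/builds/user/talk-booking" is an assumed checkout location.
print(resolve_local_include("/builds/user/talk-booking", "/services/talk_booking/ci-cd.yml"))
```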

Add a new ci-cd.yml config file to "services/talk_booking":

service-talk-booking-code-quality:
  stage: test
  image: registry.gitlab.com/<your-gitlab-username>/talk-booking:cicd-python3.11-slim
  before_script:
    - cd services/talk_booking/
    - poetry install
  script:
    - poetry run flake8 .
    - poetry run black . --check
    - poetry run isort . --check-only --profile black
    - poetry run bandit .
    - poetry run safety check

Here, we registered a new job called service-talk-booking-code-quality.

You'll name all of your projects' jobs using this structure: <project type>-<project name>-<job type>. You're more than welcome to change this structure. Just be consistent with your naming.
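If you ever generate job names programmatically, the convention is easy to encode. The helper below is a hypothetical sketch, not something GitLab provides; it simply joins the three parts and normalizes underscores and spaces to hyphens.

```python
def job_name(project_type: str, project_name: str, job_type: str) -> str:
    """Build a job name as <project type>-<project name>-<job type>."""
    parts = (project_type, project_name, job_type)
    return "-".join(
        part.strip().lower().replace("_", "-").replace(" ", "-") for part in parts
    )


print(job_name("service", "talk_booking", "code-quality"))
```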

Take note of the registry.gitlab.com/<your-gitlab-username>/talk-booking:cicd-python3.11-slim image that we used to run the service-talk-booking-code-quality job. This is the same image that we built in the build-python-ci-image job.

Within before_script, we moved to the appropriate directory and installed the Python dependencies with Poetry. We then defined all of our code quality checks inside script. If any of the checks exit with a non-zero code, the job will fail. Black and isort both run in check mode. We also used the Black profile with isort to ensure compatibility with Black.
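The "fail on the first non-zero exit code" behavior can be sketched in Python as follows. This is only a model of the idea: run_script is a hypothetical helper, and the two stand-in commands use the Python interpreter instead of the real linters so the example runs anywhere.

```python
import subprocess
import sys


def run_script(commands):
    """Run commands in order and return the first non-zero exit code, or 0."""
    for command in commands:
        result = subprocess.run(command)
        if result.returncode != 0:
            # In a CI job, this is the point where the job is marked as failed
            # and the remaining commands are skipped.
            return result.returncode
    return 0


# Stand-in commands (assumptions for illustration): exit with a chosen code.
passing = [sys.executable, "-c", "raise SystemExit(0)"]
failing = [sys.executable, "-c", "raise SystemExit(3)"]

print(run_script([passing, passing]))
print(run_script([passing, failing, passing]))
```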

Before moving on, add a new file called .flake8 inside "services/talk_booking":

[flake8]
max-line-length = 120
exclude =
    .git,
    build,
    dist,
    migrations,
    .venv
max-complexity = 10
docstring_style=sphinx

This configuration makes sure that code formatted with Black passes Flake8 linting.

Before committing your code, to avoid failed code quality jobs, run:

$ poetry run black .
$ poetry run isort . --profile black
$ poetry run flake8 .

Commit and push to the remote:

$ git add -A
$ git commit -m 'Add code quality job'
$ git push -u origin main

If safety check fails, update the problematic dependency via poetry update <package-name> and push your code one more time.

Ensure the pipeline passes.

Tests

Finally, add a job for our automated tests to services/talk_booking/ci-cd.yml:

service-talk-booking-tests:
  stage: test
  image: registry.gitlab.com/<your-gitlab-username>/talk-booking:cicd-python3.11-slim
  before_script:
    - cd services/talk_booking/
    - poetry install
  script:
    - poetry run python -m pytest --junitxml=report.xml --cov=./ --cov-report=xml tests/unit tests/integration
  after_script:
    - curl -Os https://uploader.codecov.io/latest/linux/codecov
    - chmod +x codecov
    - ./codecov -R
  artifacts:
    when: always
    reports:
      junit: services/talk_booking/report.xml

Here, we executed the unit and integration tests with pytest.

Take note of --junitxml=report.xml. This option generates a JUnit XML report called report.xml, which will be stored in the job's artifacts. Artifacts are files and folders that a job saves when it finishes; they can be passed to later jobs and downloaded from the GitLab UI. In essence, JUnit reports make it easier (and faster!) to identify test failures.
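To get a feel for what such a report contains, here is a minimal sketch that lists the failed tests from a JUnit XML document using only the standard library. The sample report is hand-written for illustration (the test names are made up), and a real pytest report carries more attributes than shown here.

```python
import xml.etree.ElementTree as ET

# Hand-written sample, trimmed to the elements this sketch reads.
SAMPLE_REPORT = """<testsuites>
  <testsuite name="pytest" errors="0" failures="1" skipped="0" tests="2">
    <testcase classname="tests.unit.test_example" name="test_passes"/>
    <testcase classname="tests.unit.test_example" name="test_fails">
      <failure message="assert 1 == 2">traceback goes here</failure>
    </testcase>
  </testsuite>
</testsuites>
"""


def failed_tests(report_xml: str):
    """Return 'classname::name' for every test case with a failure or error."""
    root = ET.fromstring(report_xml)
    failed = []
    for case in root.iter("testcase"):
        if case.find("failure") is not None or case.find("error") is not None:
            failed.append(f"{case.get('classname')}::{case.get('name')}")
    return failed


print(failed_tests(SAMPLE_REPORT))
```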

We also generated a coverage report. In after_script, it's uploaded to Codecov, where you can track changes to code coverage.

This job runs in parallel with the code quality job, service-talk-booking-code-quality. It also runs under the same conditions as the code quality job.

You'll need to obtain a token from Codecov before proceeding. Navigate to http://codecov.io/, log in with your GitLab account, and find your repository.

Check out the Quick Start guide for help with getting up and running with Codecov.

To upload the coverage report, you need to set a CODECOV_TOKEN variable. To do so, open "Settings -> CI/CD":

CI/CD variables

Click "Add Variable", and set the name to "CODECOV_TOKEN" with the value of your token:

Add CI/CD variable

Click "Add Variable" to save it.

CI/CD variables are available to all CI/CD jobs inside the pipeline as environment variables.
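From inside a job, any program can read such a variable like a regular environment variable. The sketch below shows one way to do that in Python; codecov_token is a hypothetical helper, and the Codecov uploader reads the variable on its own, so you don't need this code for the upload to work.

```python
import os


def codecov_token(environ=os.environ) -> str:
    """Return the CODECOV_TOKEN value or raise a clear error if it is missing."""
    token = environ.get("CODECOV_TOKEN", "")
    if not token:
        raise RuntimeError("CODECOV_TOKEN is not set for this job")
    return token


# Passing a plain dict makes the helper easy to try outside of CI.
print(codecov_token({"CODECOV_TOKEN": "example-token"}))
```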

Create a new commit, and push to the remote:

$ git add -A
$ git commit -m 'Add tests job'
$ git push -u origin main

Ensure the pipeline passes.

Controlling When Jobs Run

Thus far, we're running every job for each pipeline run. This is unnecessary. It slows your pipeline, costing you time and money. It's also not great for the environment. Fortunately, we can use rules to control when jobs should and should not run.

We currently have the following jobs:

  1. build-python-ci-image
  2. service-talk-booking-code-quality
  3. service-talk-booking-tests

When should these run?

Job                                  Branches                             File Changes
build-python-ci-image                main                                 ci_cd/python/Dockerfile
service-talk-booking-code-quality    main, merge requests against main    services/talk_booking/**/*
service-talk-booking-tests           main, merge requests against main    services/talk_booking/**/*

So, the build-python-ci-image job should only run when the branch is main and changes have been made to ci_cd/python/Dockerfile. The other two jobs should run on the main branch, or when a merge request against main is created or updated, and only when changes occur anywhere inside the "talk_booking" service.

Update the jobs:

build-python-ci-image:
  image: docker:24.0.6
  services:
    - docker:24.0.6-dind
  stage: docker
  before_script:
    - cd ci_cd/python/
  script:
    - docker login -u "$CI_REGISTRY_USER" -p "$CI_REGISTRY_PASSWORD" $CI_REGISTRY
    - docker build -t registry.gitlab.com/<your-gitlab-username>/talk-booking:cicd-python3.11-slim .
    - docker push registry.gitlab.com/<your-gitlab-username>/talk-booking:cicd-python3.11-slim
  rules:  # new
    - if: '$CI_COMMIT_BRANCH == "main"'
      changes:
        - ci_cd/python/Dockerfile

service-talk-booking-code-quality:
  stage: test
  image: registry.gitlab.com/<your-gitlab-username>/talk-booking:cicd-python3.11-slim
  before_script:
    - cd services/talk_booking/
    - poetry install
  script:
    - poetry run flake8 .
    - poetry run black . --check
    - poetry run isort . --check-only --profile black
    - poetry run bandit .
    - poetry run safety check
  rules:  # new
    - if: '($CI_COMMIT_BRANCH == "main") || ($CI_PIPELINE_SOURCE == "merge_request_event")'
      changes:
        - services/talk_booking/**/*

service-talk-booking-tests:
  stage: test
  image: registry.gitlab.com/<your-gitlab-username>/talk-booking:cicd-python3.11-slim
  before_script:
    - cd services/talk_booking/
    - poetry install
  script:
    - poetry run python -m pytest --junitxml=report.xml --cov=./ --cov-report=xml tests/unit tests/integration
  after_script:
    - curl -Os https://uploader.codecov.io/latest/linux/codecov
    - chmod +x codecov
    - ./codecov -R
  artifacts:
    when: always
    reports:
      junit: services/talk_booking/report.xml
  rules:  # new
    - if: '($CI_COMMIT_BRANCH == "main") || ($CI_PIPELINE_SOURCE == "merge_request_event")'
      changes:
        - services/talk_booking/**/*

**/* at the end of a path means any file in that directory or any of its subdirectories.
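The combined effect of if and changes can be modeled with a few lines of Python. This is a simplified sketch, not GitLab's implementation: job_should_run and path_matches are hypothetical helpers, and the pattern matching only understands an exact path or a trailing /**/*, which are the two forms used in this chapter.

```python
def path_matches(path: str, pattern: str) -> bool:
    """Match an exact path, or anything below a directory for 'dir/**/*'."""
    if pattern.endswith("/**/*"):
        # Keep the trailing slash so "services/talk_booking_v2/x" is not matched.
        return path.startswith(pattern[: -len("**/*")])
    return path == pattern


def job_should_run(branch, pipeline_source, changed_files, patterns, allow_merge_requests):
    """Both conditions must hold: the 'if' clause and at least one changed path."""
    on_main = branch == "main"
    on_merge_request = allow_merge_requests and pipeline_source == "merge_request_event"
    if not (on_main or on_merge_request):
        return False
    return any(path_matches(path, pattern) for path in changed_files for pattern in patterns)


# A push to a branch called "test" that touches the service: no job runs.
print(job_should_run("test", "push", ["services/talk_booking/web_app/main.py"],
                     ["services/talk_booking/**/*"], allow_merge_requests=True))
```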

Commit and push your code.

service-talk-booking-code-quality and service-talk-booking-tests should run since we made changes in "services/talk_booking".

Try creating a new branch and pushing your code:

$ git checkout -b test
$ git push origin test

This won't trigger a pipeline since no jobs meet the criteria to run. Nice. Jump back to the main branch.

As you make your way through this course, you may need to comment out the rules section from time to time if you run into errors or just need to force a specific job to run. Make sure to uncomment the section once you fix the issue.

What Have You Done?

It may not seem like a lot of work, but we accomplished a number of important tasks in this chapter.

First, we set up code quality checks to ensure that our code follows a consistent style and is free from known security vulnerabilities. If any of these checks fail, the entire pipeline fails. More importantly, we know we must do something about it. Because the checks run early in the pipeline -- future stages and jobs will run after the test stage -- we get feedback as soon as possible.

Next, we added a job to run our tests. Like the quality checks, the automated tests run early in the pipeline. If any test fails, we can respond immediately. We also generated a JUnit report, which is used by GitLab to show the results of your tests in its UI.

To see that report, click on the "Passed" badge in your last pipeline:

Pipeline on list

Then, within the pipeline details, open the "Tests" tab:

Pipeline details

You should see a list of all jobs that produced test reports. For each job, you can see the total number of tests that failed, produced an error, were skipped, or passed. You can click on a job's name to see a list of all executed tests. Failed tests are colored red, which helps to simplify and speed up feedback loops. Notice a trend yet?

Further, you enabled code coverage tracking via Codecov. Don't worry so much about the exact percentage. Sure, you want that number to be greater than 70, but it's better to track how it changes over time. You can compare the main branch with the PR/MR branch. You can see if it unexpectedly drops or increases. You can also see if it's gradually dropping. By analyzing that, you can take the right action. Again, your feedback loop just became richer and faster.
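As a rough illustration of "track the change, not the number", the sketch below compares two coverage percentages. coverage_change is a hypothetical helper, the 0.5-point tolerance is an arbitrary assumption, and Codecov performs its own comparison; this only shows the idea.

```python
def coverage_change(base_percent: float, head_percent: float, tolerance: float = 0.5):
    """Compare the coverage of a branch (head) against main (base).

    Returns (delta, status), where status is 'drop', 'rise', or 'stable'.
    Differences within `tolerance` percentage points count as stable.
    """
    delta = round(head_percent - base_percent, 2)
    if delta < -tolerance:
        status = "drop"
    elif delta > tolerance:
        status = "rise"
    else:
        status = "stable"
    return delta, status


# Made-up numbers: main is at 82%, the merge request branch at 79.5%.
print(coverage_change(82.0, 79.5))
```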

At this point, your project should have the following structure:

├── .gitignore
├── .gitlab-ci.yml
├── ci_cd
│   └── python
│       └── Dockerfile
└── services
    └── talk_booking
        ├── .flake8
        ├── ci-cd.yml
        ├── poetry.lock
        ├── pyproject.toml
        ├── tests
        │   ├── __init__.py
        │   ├── e2e
        │   │   └── __init__.py
        │   ├── integration
        │   │   └── test_web_app
        │   │       ├── __init__.py
        │   │       └── test_main.py
        │   └── unit
        │       └── __init__.py
        └── web_app
            ├── __init__.py
            └── main.py


