Documenting Python Code and Projects

Last updated December 9th, 2020

Why do you need to document your Python code? What should your project documentation include? How do you write and generate documentation?

Documentation is an important part of software development. Without proper documentation, it can be very difficult or impossible for internal and external stakeholders to use and/or maintain your code. It also makes it much harder to onboard new developers. Taking it a step further, without a culture of documenting and learning in general you'll often make the same mistakes over and over again. Unfortunately, many developers treat documentation as an afterthought -- something sprinkled on, like black pepper, without much consideration.

This article looks at why you should document your Python code and how to go about doing it.

The Complete Python Guide:

  1. Modern Python Environments - dependency and workspace management
  2. Testing in Python
  3. Modern Test-Driven Development in Python
  4. Python Code Quality
  5. Python Type Checking
  6. Documenting Python Code and Projects (this article!)
  7. Python Project Workflow

Comments vs Documentation

What's the difference between code comments and documentation?

Documentation is a standalone resource that helps others use your API, package, library, or framework without having to read the source code. Comments, on the other hand, are there for developers who read your source code. Documentation is something that should always be present but the same cannot be said for comments. They are nice to have but not required. Documentation should tell others how and when to use something while comments should answer the why questions:

  1. Why is this done this way?
  2. Why is this here and not there?

The what questions should then be answered by your clean code:

  1. What is this?
  2. What does this method do?

Type            Answers         Stakeholder
Documentation   When and How    Users
Code Comments   Why             Developers
Clean Code      What            Developers
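
Here's a minimal sketch of how the three fit together in a single function. The function name, the calibration factor, and the reason given for it are all made up for illustration:

```python
from typing import List

# Hypothetical value, used only to illustrate a "why" comment.
SENSOR_CALIBRATION_FACTOR = 0.98


def calibrated_average(temperatures: List[float]) -> float:
    """
    Get the calibrated average of several temperature readings

    Use this when the readings come from the uncalibrated sensor.
    (Documentation: tells users when and how to use the function.)

    :param temperatures: list of raw temperature readings
    :return: calibrated average temperature
    """
    # Why: the (hypothetical) sensor reads about 2% too high, so the
    # average is scaled once instead of correcting every reading.
    raw_average = sum(temperatures) / len(temperatures)

    # No "what" comment is needed here: the names already say what happens.
    return raw_average * SENSOR_CALIBRATION_FACTOR
```

The docstring is aimed at users, the comment is aimed at whoever maintains the function, and the naming carries the rest.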

Docstrings

As specified by PEP-257, a Python documentation string (or docstring) is a special "string literal that occurs as the first statement in a module, function, class, or method definition" to form the __doc__ attribute of the given object. It allows you to embed documentation directly into your source code.

For example, say you have a module called temperature.py with a single function that calculates daily average temperatures. Using docstrings, you can document it like so:

"""
The temperature module: Manipulate your temperature easily

Easily calculate daily average temperature
"""

from typing import List


class HighTemperature:
    """Class representing very high temperatures"""

    def __init__(self, value: float):
        """
        :param value: value of temperature
        """

        self.value = value


def daily_average(temperatures: List[float]) -> float:
    """
    Get average daily temperature

    Calculate average temperature from multiple measurements

    :param temperatures: list of temperatures
    :return: average temperature
    """

    return sum(temperatures)/len(temperatures)

You can see the docstring specified for the daily_average function by accessing its __doc__ attribute:

>>> from temperature import daily_average
>>>
>>> print(daily_average.__doc__)

    Get average daily temperature

    Calculate average temperature from multiple measurements

    :param temperatures: list of temperatures
    :return: average temperature

You can also view the full module-level docstrings by using the built-in help function:

>>> import temperature
>>>
>>> help(temperature)

It's worth noting that you can use the help function with keywords passed as strings (help('def'), for example), built-in types such as int and float, classes, functions, and modules.
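
The __doc__ attribute keeps the indentation from the source file. If you want the text without it, the standard library's inspect.getdoc returns a cleaned-up version. A small sketch, with a shortened copy of the function defined inline so the snippet runs on its own:

```python
import inspect
from typing import List


def daily_average(temperatures: List[float]) -> float:
    """
    Get average daily temperature

    :param temperatures: list of temperatures
    :return: average temperature
    """
    return sum(temperatures) / len(temperatures)


# Raw attribute: starts with a newline and keeps the source indentation.
raw = daily_average.__doc__

# Cleaned version: common indentation and surrounding blank lines removed.
cleaned = inspect.getdoc(daily_average)

print(cleaned.splitlines()[0])
```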

Single vs Multi-line

Docstrings can be single-line or multi-line. Either way, the first line is always treated as a summary. The summary line may be used by automatic indexing tools, so it's important that it fits on one line. When using single-line docstrings, everything should be on the same line: opening quotes, summary, and closing quotes.

class HighTemperature:
    """Class representing very high temperatures"""

    # code starts here

When using multi-line docstrings, structure goes like this: opening quotes, summary, blank line, more elaborate description, and closing quotes.

def daily_average(temperatures: List[float]) -> float:
    """
    Get average daily temperature

    Calculate average temperature from multiple measurements

    :param temperatures: list of temperatures
    :return: average temperature
    """

    return sum(temperatures) / len(temperatures)

Besides describing what a particular function, class, method, or module does, you can also specify:

  1. function arguments
  2. function returns
  3. class attributes
  4. errors raised
  5. limitations
  6. code examples
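
As a sketch, here's what a docstring covering several of these items might look like in the reStructuredText field style used above. The empty-list check and the stated limitation are assumptions added for the example; they're not part of the original temperature.py:

```python
from typing import List


class HighTemperature:
    """
    Class representing very high temperatures

    :ivar value: value of temperature
    """

    def __init__(self, value: float):
        """
        :param value: value of temperature
        """
        self.value = value


def daily_average(temperatures: List[float]) -> float:
    """
    Get average daily temperature

    Calculate average temperature from multiple measurements.
    Limitation: all measurements are weighted equally, regardless of
    when they were taken.

    >>> daily_average([10.0, 12.0, 14.0])
    12.0

    :param temperatures: list of temperatures
    :return: average temperature
    :raises ValueError: if no temperatures are given
    """
    # Assumption for this example: reject empty input explicitly.
    if not temperatures:
        raise ValueError("at least one temperature is required")

    return sum(temperatures) / len(temperatures)
```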

Formats

The four most common formats are:

  1. Google
  2. reStructuredText
  3. NumPy
  4. Epytext

Pick the one that suits you the most and be consistent about it throughout the entire project.
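
For comparison, here's a rough sketch of the same docstring in the Google and NumPy styles. The examples earlier in this article use reStructuredText fields. The function names are suffixed only so both versions can live in one snippet:

```python
from typing import List


def daily_average_google(temperatures: List[float]) -> float:
    """Get average daily temperature.

    Args:
        temperatures: list of temperatures

    Returns:
        average temperature
    """
    return sum(temperatures) / len(temperatures)


def daily_average_numpy(temperatures: List[float]) -> float:
    """
    Get average daily temperature.

    Parameters
    ----------
    temperatures : list of float
        list of temperatures

    Returns
    -------
    float
        average temperature
    """
    return sum(temperatures) / len(temperatures)
```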

By using docstrings you can express your intentions explicitly in plain language to help others (and your future self!) better understand when, where, and how to use certain code.

Linting

You can lint docstrings just as you would your code. Linters ensure that your docstrings are well formatted and that they match the actual implementation, which helps keep your documentation fresh.

Darglint is a popular Python documentation linter.

$ pip install darglint

Let's lint the temperature.py module:

def daily_average(temperatures: List[float]) -> float:
    """
    Get average daily temperature

    Calculate average temperature from multiple measurements

    :param temperatures: list of temperatures
    :return: average temperature
    """

    return sum(temperatures) / len(temperatures)

Lint:

$ darglint --docstring-style sphinx temperature.py

What happens if you change the name of the parameter from temperatures to temperatures_list?

$ darglint --docstring-style sphinx temperature.py

temperature.py:daily_average:27: DAR102: + temperatures
temperature.py:daily_average:27: DAR101: - temperatures_list
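
To get a feel for what such a check involves, here's a rough sketch that compares a function's signature with the :param: fields in its docstring. It is not how Darglint is implemented; the compare_params helper is hypothetical and only handles the simplest case:

```python
import inspect
import re
from typing import List, Set, Tuple


def compare_params(func) -> Tuple[Set[str], Set[str]]:
    """
    Compare a function's signature with the :param: fields in its docstring

    :param func: function to check
    :return: (names only in the docstring, names only in the signature)
    """
    docstring = inspect.getdoc(func) or ""
    documented = set(re.findall(r":param\s+(\w+):", docstring))
    actual = set(inspect.signature(func).parameters)
    return documented - actual, actual - documented


def daily_average(temperatures_list: List[float]) -> float:
    """
    Get average daily temperature

    :param temperatures: list of temperatures
    :return: average temperature
    """
    return sum(temperatures_list) / len(temperatures_list)


extra, missing = compare_params(daily_average)
print(extra, missing)
```

Here extra holds the name that is documented but no longer in the signature, and missing holds the renamed parameter that the docstring doesn't mention yet.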

Code Examples

You can also add code examples to docstrings, showing example usage of the function, method, or class.

For example:

def daily_average(temperatures: List[float]) -> float:
    """
    Get average daily temperature

    Calculate average temperature from multiple measurements

    >>> daily_average([10.0, 12.0, 14.0])
    12.0

    :param temperatures: list of temperatures
    :return: Average temperature
    """

    return sum(temperatures)/len(temperatures)

Code examples can also be executed by pytest like any other test via doctest. Along with linting, this also helps ensure that your documentation stays fresh, in sync with the code.

Check out doctest — Testing Through Documentation for more on doctest.

So, with the above example, pytest will assert daily_average([10.0, 12.0, 14.0]) equals 12.0. To run this code example as a test you just need to run pytest with the doctest-modules option:

$ python -m pytest --doctest-modules temperature.py

==================================================== test session starts =====================================================
platform darwin -- Python 3.9.0, pytest-6.1.2, py-1.9.0, pluggy-0.13.1
rootdir: /Users/michael/repos/testdriven/python_developer_2020/docs
collected 1 item

temperature.py .                                                                                                       [100%]

===================================================== 1 passed in 0.01s ======================================================

What happens if you change the code example to:

>>> daily_average([10.0, 12.0, 14.0])
13.0

$ python -m pytest --doctest-modules temperature.py

==================================================== test session starts =====================================================
platform darwin -- Python 3.9.0, pytest-6.1.2, py-1.9.0, pluggy-0.13.1
rootdir: /Users/michael/repos/testdriven/python_developer_2020/docs
collected 1 item

temperature.py F                                                                                                       [100%]

========================================================== FAILURES ==========================================================
____________________________________________ [doctest] temperature.daily_average _____________________________________________
020
021     Get average daily temperature
022
023     Calculate average temperature from multiple measurements
024
025     >>> daily_average([10.0, 12.0, 14.0])
Expected:
    13.0
Got:
    12.0

/Users/michael/repos/testdriven/python_developer_2020/docs/temperature.py:25: DocTestFailure
================================================== short test summary info ===================================================
FAILED temperature.py::temperature.daily_average
===================================================== 1 failed in 0.02s =====================================================
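
pytest isn't required for this. The standard library's doctest module can run the same examples directly. Below is a minimal sketch; the run_doctests helper is a hypothetical wrapper written for this example, and the function is redefined inline so the snippet is self-contained:

```python
import doctest
from typing import List


def daily_average(temperatures: List[float]) -> float:
    """
    Get average daily temperature

    >>> daily_average([10.0, 12.0, 14.0])
    12.0

    :param temperatures: list of temperatures
    :return: average temperature
    """
    return sum(temperatures) / len(temperatures)


def run_doctests(obj) -> doctest.TestResults:
    """Run the examples found in obj's docstring and return the tally."""
    finder = doctest.DocTestFinder()
    runner = doctest.DocTestRunner(verbose=False)
    # Pass the object in explicitly so the examples can refer to it by name.
    for test in finder.find(obj, name=obj.__name__, globs={obj.__name__: obj}):
        runner.run(test)
    return runner.summarize(verbose=False)


results = run_doctests(daily_average)
print(results.failed, results.attempted)
```

For a whole module, calling doctest.testmod() from the module itself does the same job with less code.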

For more on pytest, check out the Testing in Python article.

Sphinx

Adding docstrings to your code is great, but you still need to present it to your users.

This is where tools like Sphinx, Epydoc, and MkDocs come into play, which will convert your project's docstrings over to HTML and CSS.

Sphinx is by far the most popular. It's used to generate the documentation for a number of open-source projects like Python and Flask. It's also one of the documentation tools supported by Read the Docs, which is used by thousands of open-source projects like Requests, Flake8, and pytest, to name a few.

Let's see it in action. Start by following the official guide to download and install Sphinx.

$ sphinx-quickstart --version

sphinx-quickstart 3.3.1

Create a new project directory:

$ mkdir sphinx_example
$ cd sphinx_example

Next, add a new file called temperature.py:

"""
The temperature module: Manipulate your temperature easily

Easily calculate daily average temperature
"""

from typing import List


class HighTemperature:
    """Class representing very high temperatures"""

    def __init__(self, value: float):
        """
        :param value: value of temperature
        """

        self.value = value


def daily_average(temperatures: List[float]) -> float:
    """
    Get average daily temperature

    :param temperatures: list of temperatures
    :return: average temperature
    """

    return sum(temperatures)/len(temperatures)

To scaffold out the files and folders for Sphinx to create the documentation for temperature.py, within the project root, run:

$ sphinx-quickstart docs

You'll be prompted with a few questions:

> Separate source and build directories (y/n) [n]: n
> Project name: Temperature
> Author name(s): Your Name
> Project release []: 1.0.0
> Project language [en]: en

Once done, the "docs" directory should contain the following files and folders:

docs
├── Makefile
├── _build
├── _static
├── _templates
├── conf.py
├── index.rst
└── make.bat

Next, let's update the project config. Open docs/conf.py and replace this:

# import os
# import sys
# sys.path.insert(0, os.path.abspath('.'))

With this:

import os
import sys
sys.path.insert(0, os.path.abspath('..'))

Now, autodoc, which is used to pull in documentation from docstrings, will search for modules in the parent folder of "docs".

Add the following extensions to the extensions list:

extensions = [
    'sphinx.ext.autodoc',
]
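
Put together, the relevant parts of docs/conf.py might look roughly like this. It's a trimmed sketch: the file generated by sphinx-quickstart contains further settings (theme, templates path, and so on) that are left out here, and the project values are the ones entered at the prompts above:

```python
# docs/conf.py (trimmed sketch)
import os
import sys

# Make the parent folder of "docs" importable so autodoc can find temperature.py
sys.path.insert(0, os.path.abspath('..'))

# Values entered during sphinx-quickstart
project = 'Temperature'
author = 'Your Name'
release = '1.0.0'

extensions = [
    'sphinx.ext.autodoc',  # pulls documentation in from docstrings
]
```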

Open docs/index.rst and edit it to look like this:

Welcome to Temperature documentation!
=====================================

.. automodule:: temperature
    :members:



Indices and tables
==================

* :ref:`genindex`
* :ref:`modindex`
* :ref:`search`

The content of index.rst is written in reStructuredText, which is a file format for textual data similar to Markdown but much more powerful since it's designed for writing technical documentation.

Notes:

  1. Titles are created by underlining (and optionally overlining) the title with an = character, at least as long as the text.
  2. The automodule directive is used to gather docstrings from Python modules. So, .. automodule:: temperature tells Sphinx to gather docstrings from the temperature.py module.
  3. The genindex, modindex, and search references link to the general index, the index of documented modules, and the search page, respectively, which Sphinx generates during the build.

From the "docs" directory, build the documentation:

$ make html

Open docs/_build/html/index.html in your browser. You should see:

[Image: Sphinx docs]

Now you can serve up the docs yourself using a tool like Netlify or via a service like Read the Docs.

API Documentation

When talking about documentation don't forget about documentation for your APIs. You have endpoints with their URLs, URL parameters, query parameters, status codes, request bodies, and response bodies. Even a simple API can have a number of parameters that are difficult to remember.

The OpenAPI Specification (formerly the Swagger Specification) provides a standard format for describing, producing, consuming, and visualizing RESTful APIs. The spec is used to generate documentation with Swagger UI or ReDoc. It can also be imported into tools like Postman. You can generate server stubs and client SDKs as well with tools like Swagger Codegen and OpenAPI Generator.

For a full list of editors, linters, parsers, code generators, documentation, testing, and schema/data validation tools for OpenAPI, check out OpenAPI Tools.

The spec itself must be written in either YAML or JSON. For example:

---
openapi: 3.0.2
info:
  title: Swagger Petstore - OpenAPI 3.0
  description: |-
    This is a sample Open API
  version: 1.0.0
servers:
- url: "/api/v3"
paths:
  "/pet":
    post:
      summary: Add a new pet to the store
      description: Add a new pet to the store
      operationId: addPet
      requestBody:
        description: Create a new pet in the store
        content:
          application/json:
            schema:
              "$ref": "#/components/schemas/Pet"
        required: true
      responses:
        '200':
          description: Successful operation
          content:
            application/json:
              schema:
                "$ref": "#/components/schemas/Pet"
        '405':
          description: Invalid input
components:
  schemas:
    Pet:
      required:
      - name
      - photoUrls
      type: object
      properties:
        id:
          type: integer
          format: int64
          example: 10
        name:
          type: string
          example: doggie
        photoUrls:
          type: array
          items:
            type: string
        status:
          type: string
          description: pet status in the store
          enum:
          - available
          - pending
          - sold
  requestBodies:
    Pet:
      description: Pet object that needs to be added to the store
      content:
        application/json:
          schema:
            "$ref": "#/components/schemas/Pet"

Writing such a schema by hand is tedious and error-prone. Fortunately, there are a number of tools that help automate this process, such as Flask-RESTX, which is used later in this article.
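
Because the spec is plain YAML or JSON, it maps directly onto dictionaries and lists in Python. The sketch below rebuilds a trimmed-down version of the Pet schema as a dict, serializes it to JSON, and uses its required list for a very naive payload check. The missing_required_fields helper is hypothetical and is no substitute for a real schema validator:

```python
import json

# A trimmed-down version of the Pet schema from the spec above.
pet_schema = {
    "type": "object",
    "required": ["name", "photoUrls"],
    "properties": {
        "id": {"type": "integer", "format": "int64", "example": 10},
        "name": {"type": "string", "example": "doggie"},
        "photoUrls": {"type": "array", "items": {"type": "string"}},
    },
}


def missing_required_fields(schema: dict, payload: dict) -> list:
    """Return the required field names that are absent from payload."""
    return [name for name in schema.get("required", []) if name not in payload]


# The same schema, serialized as JSON instead of YAML.
print(json.dumps(pet_schema, indent=2))

# A payload without photoUrls is reported as incomplete.
print(missing_required_fields(pet_schema, {"name": "doggie"}))
```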

Tests as Documentation

Thus far, we've talked about documentation for users (project documentation) and developers (code comments). Another type of documentation for developers comes from the tests themselves.

As a developer working on a project you need to know more than just how to use a method. You need to know if it works as expected and how to use it to develop further. While adding code examples to docstrings can help with this, such examples are not meant for anything more than simple examples. You need to add tests to cover more than just a function's happy path.

Tests document three things:

  1. What the expected output for a given input is
  2. How exception paths are handled
  3. How to use a given function, method, or class

As you're writing tests, be sure to use proper naming and clearly specify what it is that you're testing. This will make it much easier for a developer to review the test suite in order to find out how a particular function or method should be used.
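
As a sketch, two small tests for daily_average could document the expected output and one exception path. The test names are only suggestions, and the exception check is written with plain try/except so the snippet has no dependencies (with pytest you'd typically use pytest.raises instead):

```python
from typing import List


def daily_average(temperatures: List[float]) -> float:
    return sum(temperatures) / len(temperatures)


def test_daily_average_returns_mean_of_measurements():
    """The average of 10.0, 12.0, and 14.0 is 12.0."""
    assert daily_average([10.0, 12.0, 14.0]) == 12.0


def test_daily_average_raises_for_empty_list():
    """An empty list of measurements ends in a ZeroDivisionError."""
    try:
        daily_average([])
    except ZeroDivisionError:
        pass
    else:
        raise AssertionError("expected ZeroDivisionError")
```

Reading just the two test names already tells a developer what the function returns and how it behaves with no input.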

What's more, when writing a test, you basically define what should go into your docstrings. The GIVEN, WHEN, THEN structure can be easily converted into a function's docstring.

For example:

  • GIVEN a list of temperature measurements -> :param temperatures: list of temperatures
  • WHEN 'daily_average' is called -> >>> daily_average([10.0, 12.0, 14.0])
  • THEN average temperature is returned -> Get average temperature, :return: Average temperature

def daily_average(temperatures: List[float]) -> float:
    """
    Get average temperature

    Calculate average temperature from multiple measurements

    >>> daily_average([10.0, 12.0, 14.0])
    12.0

    :param temperatures: list of temperatures
    :return: Average temperature
    """

    return sum(temperatures)/len(temperatures)

So, you can treat Test-driven Development (TDD) as a form of documentation-driven development by creating your docstrings as code:

  1. Write a test
  2. Ensure the test fails
  3. Write the code
  4. Ensure the test passes
  5. Refactor and add docstrings

For more on TDD, check out the Modern Test-Driven Development in Python article.

Documenting a Flask REST API

Thus far, we've just covered theory so let's move on to a real example. We'll create a RESTful API with Flask for temperature measurements. Each measurement will have the following attributes: timestamp, temperature, notes. Flask-RESTX will be used for auto-generating an OpenAPI spec.

So, let's begin. First, create a new folder:

$ mkdir flask_temperature
$ cd flask_temperature

Next, initialize your project with Poetry:

$ poetry init
Package name [flask_temperature]:
Version [0.1.0]:
Description []:
Author [Your name <[email protected]>, n to skip]:
License []:
Compatible Python versions [^3.7]:  >3.7

Would you like to define your main dependencies interactively? (yes/no) [yes] no
Would you like to define your development dependencies interactively? (yes/no) [yes] no
Do you confirm generation? (yes/no) [yes]

After that, add Flask and Flask-RESTX:

$ poetry add flask flask-restx

Now, let's create our documented API. Add a file for the Flask app called app.py:

import uuid

from flask import Flask, request
from flask_restx import Api, Resource

app = Flask(__name__)
api = Api(app)

measurements = []


@api.route('/measurements')
class Measurement(Resource):
    def get(self):
        return measurements

    def post(self):
        measurement = {
            'id': str(uuid.uuid4()),
            'timestamp': request.json['timestamp'],
            'temperature': request.json['temperature'],
            'notes': request.json.get('notes'),
        }
        measurements.append(measurement)

        return measurement


if __name__ == '__main__':
    app.run()

Flask-RESTX uses class-based views to organize resources, routes, and HTTP methods. In the above example, the Measurement class supports HTTP GET and POST methods. Other methods will return a MethodNotAllowed error. Flask-RESTX will also generate the OpenAPI schema when the app is run.

$ python app.py

You can see the schema at http://localhost:5000/swagger.json. You'll also be able to view the browsable API at http://localhost:5000.
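
Since the schema is just JSON, any script can consume it. Here's a small sketch that lists the operations found in a parsed spec. The sample_spec value is a hand-written stand-in for what /swagger.json might return, not captured output, and list_operations is a hypothetical helper:

```python
import json

HTTP_METHODS = {"get", "post", "put", "patch", "delete", "head", "options"}


def list_operations(spec: dict) -> list:
    """Return sorted 'METHOD path' strings for every operation in a parsed spec."""
    operations = []
    for path, path_item in spec.get("paths", {}).items():
        for method in path_item:
            # Skip non-method keys such as "parameters".
            if method in HTTP_METHODS:
                operations.append(f"{method.upper()} {path}")
    return sorted(operations)


# Hand-written stand-in for the JSON served at /swagger.json (an assumption).
sample_spec = json.loads(
    '{"paths": {"/measurements": {"get": {}, "post": {}, "parameters": []}}}'
)

print(list_operations(sample_spec))
```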

[Image: Swagger UI]

Currently, the schema only contains endpoints. We can define the request and response bodies to tell our users what's expected from them as well as what will be returned.

Update app.py:

import uuid

from flask import Flask, request
from flask_restx import Api, Resource, fields

app = Flask(__name__)
api = Api(app)

measurements = []

add_measurement_request_body = api.model(
    'AddMeasurementRequestBody', {
        'timestamp': fields.Integer(
            description='Timestamp of measurement',
            required=True,
            example=1606509272
        ),
        'temperature': fields.Float(
            description='Measured temperature',
            required=True, example=22.3),
        'notes': fields.String(
            description='Additional notes',
            required=False, example='Strange day'),
    }
)

measurement_model = api.model(
    'Measurement', {
        'id': fields.String(
            description='Unique ID',
            required=True,
            example='354e405c-136f-4e03-b5ce-5f92e3ed3ff8'
        ),
        'timestamp': fields.Integer(
            description='Timestamp of measurement',
            required=True,
            example=1606509272
        ),
        'temperature': fields.Float(
            description='Measured temperature',
            required=True,
            example=22.3
        ),
        'notes': fields.String(
            description='Additional notes',
            required=False,
            example='Strange day'
        ),
    }
)


@api.route('/measurements')
class Measurement(Resource):
    @api.doc(model=[measurement_model])
    def get(self):
        return measurements

    @api.doc(model=measurement_model, body=add_measurement_request_body)
    def post(self):
        measurement = {
            'id': str(uuid.uuid4()),
            'timestamp': request.json['timestamp'],
            'temperature': request.json['temperature'],
            'notes': request.json.get('notes'),
        }
        measurements.append(measurement)

        return measurement


if __name__ == '__main__':
    app.run()

To define models for our response and request bodies we used api.model. We defined the names and the appropriate fields. For each field, we defined the type, description, example, and if it's required.

[Image: Swagger UI models]

To add the models to endpoints, we used the @api.doc decorator. The body parameter defines the request body while model defines the response body.

[Image: Swagger UI models]
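
With the request body documented, a client knows exactly what to send. The sketch below builds a matching POST request using only the standard library. The build_measurement_request helper is hypothetical, and the request is only constructed here, not sent, so the snippet runs without the app:

```python
import json
import urllib.request
from typing import Optional


def build_measurement_request(
    base_url: str,
    timestamp: int,
    temperature: float,
    notes: Optional[str] = None,
) -> urllib.request.Request:
    """Build (but do not send) a POST request for the /measurements endpoint."""
    payload = {"timestamp": timestamp, "temperature": temperature}
    if notes is not None:
        payload["notes"] = notes  # optional field in AddMeasurementRequestBody

    return urllib.request.Request(
        url=f"{base_url}/measurements",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_measurement_request("http://localhost:5000", 1606509272, 22.3)

# With the app running, the request could be sent like this:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```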

Now you should have the basic idea of how to document your Flask RESTful API with Flask-RESTX. This just scratches the surface. Check out the Swagger documentation for details on how to define auth info, URL parameters, status codes, and much more.

Conclusion

Most, if not all, of us can do a better job at writing documentation. Fortunately, there are plenty of tools available for simplifying the process of writing it. When writing packages and libraries use Sphinx to organize and help generate your documentation from docstrings. When working on a RESTful API, use a tool that generates an OpenAPI schema since the schema can be used by a plethora of tools -- everything from data validators to code generators. Looking for inspiration? Stripe, Flask, Cypress, and FastAPI are excellent examples of documentation done well.

Jan Giacomelli

Jan is a software engineer who lives in Ljubljana, Slovenia, Europe. He is co-founder of typless where he is leading engineering efforts. He loves working with Python and Django. When he's not writing code or deploying to AWS, he's probably skiing, windsurfing, or playing guitar.
