Python Dependency Injection

Last updated August 26th, 2020

Writing clean, maintainable code is a challenging task. Fortunately, there are many patterns, techniques, and reusable solutions available to us to make achieving that task much easier. Dependency Injection is one of those techniques, which is used to write loosely-coupled yet highly-cohesive code.

In this post, we'll show you how to implement Dependency Injection as you develop an app for plotting historic weather data. After developing the initial app, using Test-Driven Development, you'll refactor it using Dependency Injection to decouple pieces of the app to make it easier to test, extend, and maintain.

By the end of this post, you should be able to explain what Dependency Injection is and implement it in Python with Test-Driven Development (TDD).

What is Dependency Injection?

In software engineering, Dependency Injection is a technique in which an object receives other objects that it depends on.

  1. It was introduced to manage the complexity of one's codebase.
  2. It helps simplify testing, extending code, and maintenance.
  3. Most languages that allow objects and functions to be passed as parameters support it. You hear more about Dependency Injection in Java and C#, though, since it's harder to implement in those languages. In Python, on the other hand, dynamic typing and duck typing make it easy to implement and thus less noticeable. Django, Django REST Framework, and FastAPI all utilize Dependency Injection.

Benefits:

  1. Methods are easier to test
  2. Dependencies are easier to mock
  3. Tests don't have to change every time that we extend our application
  4. It's easier to extend the application
  5. It's easier to maintain the application

For more, refer to Martin Fowler's Forms of Dependency Injection article.
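
Before the larger example, here is a minimal sketch of the idea in isolation. The class names (Report, FixedClock, SystemClock) are hypothetical and are not part of the weather app; they only illustrate an object receiving its dependency from the caller instead of creating it.

```python
import datetime


class FixedClock:
    """Hypothetical dependency that always reports the same moment."""

    def __init__(self, moment):
        self.moment = moment

    def now(self):
        return self.moment


class SystemClock:
    """Hypothetical dependency backed by the real system time."""

    def now(self):
        return datetime.datetime.now()


class Report:

    def __init__(self, clock):
        # The dependency is received from the caller, not created here.
        self.clock = clock

    def header(self):
        return f'Report generated at {self.clock.now().isoformat()}'


# Production code could pass SystemClock(); a test can pass a fixed one.
report = Report(FixedClock(datetime.datetime(2020, 8, 26, 12, 0)))
print(report.header())  # Report generated at 2020-08-26T12:00:00
```

Because Report only relies on a now method, either clock can be swapped in without touching the class.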

To see it in action, let's take a look at a few real-world examples.

Plotting Historic Weather Data

Scenario:

  1. You've decided to build an app for drawing plots from weather history data.
  2. You've downloaded 2009 temperature by hour data for London.
  3. Your goal is to draw a plot of that data to see how temperature changed over time.

The basic idea

First, create (and activate) a virtual environment. Then, install pytest and Matplotlib:

(venv)$ pip install pytest matplotlib

It seems reasonable to start with a class with two methods:

  1. read - read data from a CSV
  2. draw - draw a plot

Reading data from a CSV

Since we need to read historic weather data from a CSV file, the read method should meet the following criteria:

  • GIVEN an App class
  • WHEN the read method is called with a CSV file name
  • THEN data from CSV should be returned in a dictionary where the keys are datetime strings in ISO 8601 format ('%Y-%m-%dT%H:%M:%S.%f') and the values are temperatures measured at that moment

Create a file called test_app.py:

import datetime
from pathlib import Path

from app import App


BASE_DIR = Path(__file__).resolve(strict=True).parent


def test_read():
    app = App()
    for key, value in app.read(file_name=Path(BASE_DIR).joinpath('london.csv')).items():
        assert datetime.datetime.fromisoformat(key)
        assert value - 0 == value

So, this test checks that:

  1. every key is an ISO 8601 formatted date-time string (using the fromisoformat method of datetime.datetime)
  2. every value is a number (using the property of numbers that x - 0 = x)

The fromisoformat method was added to the datetime module in Python 3.7. Refer to the official Python docs for more info.
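
To see why the first assertion works as a format check, here is a small sketch. The helper name is_iso_datetime is an assumption made for illustration; it relies only on the fact that fromisoformat raises ValueError for strings it cannot parse.

```python
import datetime


def is_iso_datetime(text):
    """Return True if text can be parsed by datetime.fromisoformat."""
    try:
        datetime.datetime.fromisoformat(text)
    except ValueError:
        return False
    return True


print(is_iso_datetime('2009-01-01T00:00:00'))  # True
print(is_iso_datetime('01/01/2009 00:00'))     # False
```

In the test itself no helper is needed: an unparsable key raises ValueError, which fails the test.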

Run the test to ensure it fails:

(venv)$ python -m pytest .

You should see:

E   ModuleNotFoundError: No module named 'app'

Now, to implement the read method and make the test pass, add a new file called app.py:

import csv
import datetime
from pathlib import Path


BASE_DIR = Path(__file__).resolve(strict=True).parent


class App:

    def read(self, file_name):
        temperatures_by_hour = {}
        with open(Path(BASE_DIR).joinpath(file_name), 'r') as file:
            reader = csv.reader(file)
            next(reader)  # Skip header row.
            for row in reader:
                hour = datetime.datetime.strptime(row[0], '%d/%m/%Y %H:%M').isoformat()
                temperature = float(row[2])
                temperatures_by_hour[hour] = temperature

        return temperatures_by_hour

Here, we added an App class with a read method that takes a file name as a parameter. After opening and reading the contents of the CSV, the appropriate keys (date) and values (temperature) are added to a dictionary which is eventually returned.
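
The parsing step can be tried without the real file. The sketch below assumes the same column positions that read uses (timestamp in column 0, temperature in column 2); the header names, the middle column, and the sample values are made up for illustration.

```python
import csv
import datetime
import io

# Hypothetical sample data; only the column positions mirror the read method.
SAMPLE_CSV = """timestamp,station,temperature
01/01/2009 00:00,hypothetical-station,3.5
01/01/2009 01:00,hypothetical-station,3.1
"""


def parse_temperatures(file):
    temperatures_by_hour = {}
    reader = csv.reader(file)
    next(reader)  # Skip header row.
    for row in reader:
        hour = datetime.datetime.strptime(row[0], '%d/%m/%Y %H:%M').isoformat()
        temperatures_by_hour[hour] = float(row[2])
    return temperatures_by_hour


result = parse_temperatures(io.StringIO(SAMPLE_CSV))
print(result)
# {'2009-01-01T00:00:00': 3.5, '2009-01-01T01:00:00': 3.1}
```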

Assuming that you've downloaded the weather data as london.csv, the test should now pass:

(venv)$ python -m pytest .

================================ test session starts ====================================
platform darwin -- Python 3.8.5, pytest-6.0.1, py-1.9.0, pluggy-0.13.1
rootdir: /Users/michael.herman/repos/testdriven/dependency-injection-python/app
collected 1 item

test_app.py .                                                                 [100%]

================================= 1 passed in 0.11s =====================================

Drawing the plot

Next, the draw method should meet the following criteria:

  • GIVEN an App class
  • WHEN the draw method is called with a dictionary where the keys are datetime strings in ISO 8601 format ('%Y-%m-%dT%H:%M:%S.%f') and the values are temperatures measured at that moment
  • THEN the data should be drawn to a line plot with the time on the X axis and temperature on the Y axis

Add a test for this to test_app.py:

def test_draw(monkeypatch):
    plot_date_mock = MagicMock()
    show_mock = MagicMock()
    monkeypatch.setattr(matplotlib.pyplot, 'plot_date', plot_date_mock)
    monkeypatch.setattr(matplotlib.pyplot, 'show', show_mock)

    app = App()
    hour = datetime.datetime.now().isoformat()
    temperature = 14.52
    app.draw({hour: temperature})

    _, called_temperatures = plot_date_mock.call_args[0]
    assert called_temperatures == [temperature]  # check that plot_date was called with temperatures as second arg
    show_mock.assert_called()  # check that show is called

Update the imports like so:

import datetime
from pathlib import Path
from unittest.mock import MagicMock

import matplotlib.pyplot

from app import App

Since we don't want to show the actual plots during the test runs, we used monkeypatch to mock the plot_date and show functions from matplotlib.pyplot. Then, the method under test is called with a single temperature. At the end, we checked that plot_date was called with the temperatures as its second argument and that show was called.

You can read more about monkeypatching with pytest here and more about mocking here.
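
The same mocking idea can be sketched with the standard library alone. Here, fake_pyplot is a hypothetical stand-in for a plotting module, and patch.object plays the role that monkeypatch.setattr plays in the pytest test; the point is how call_args exposes the positional and keyword arguments of the last call.

```python
import types
from unittest.mock import MagicMock, patch

# Hypothetical stand-in for a plotting module such as matplotlib.pyplot.
fake_pyplot = types.SimpleNamespace(
    plot_date=lambda dates, values, **kwargs: None,
    show=lambda: None,
)


def draw(dates, temperatures):
    fake_pyplot.plot_date(dates, temperatures, linestyle='-')
    fake_pyplot.show()


with patch.object(fake_pyplot, 'plot_date', MagicMock()) as plot_date_mock, \
        patch.object(fake_pyplot, 'show', MagicMock()) as show_mock:
    draw([1.0], [14.52])

    # call_args[0] holds positional arguments, call_args[1] keyword arguments.
    called_dates, called_temperatures = plot_date_mock.call_args[0]
    called_kwargs = plot_date_mock.call_args[1]
    show_was_called = show_mock.called

print(called_temperatures)  # [14.52]
print(called_kwargs)        # {'linestyle': '-'}
```

Once the with block exits, the original attributes are restored, just as monkeypatch undoes its changes at the end of a test.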

Let's move to the method implementation:

  1. It takes a parameter, temperatures_by_hour, which should be a dictionary with the same structure as the output of the read method.
  2. It must transform this dictionary into two vectors that can be used in the plot: dates and temperatures.
  3. Dates should be converted to numbers using matplotlib.dates.date2num so they can be used in the plot.

def draw(self, temperatures_by_hour):
    dates = []
    temperatures = []

    for date, temperature in temperatures_by_hour.items():
        dates.append(datetime.datetime.fromisoformat(date))
        temperatures.append(temperature)

    dates = matplotlib.dates.date2num(dates)
    matplotlib.pyplot.plot_date(dates, temperatures, linestyle='-')
    matplotlib.pyplot.show()

Imports:

import csv
import datetime
from pathlib import Path

import matplotlib.dates
import matplotlib.pyplot
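
The dictionary-to-vectors step of draw does not need Matplotlib and can be checked on its own. In this sketch, split_series is a hypothetical helper name and the readings are made up.

```python
import datetime


def split_series(temperatures_by_hour):
    """Turn {iso_string: temperature} into parallel lists."""
    dates = []
    temperatures = []
    for date, temperature in temperatures_by_hour.items():
        dates.append(datetime.datetime.fromisoformat(date))
        temperatures.append(temperature)
    return dates, temperatures


dates, temperatures = split_series({
    '2009-01-01T00:00:00': 3.5,
    '2009-01-01T01:00:00': 3.1,
})
print(dates[0])      # 2009-01-01 00:00:00
print(temperatures)  # [3.5, 3.1]
```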

The tests should now pass:

(venv)$ python -m pytest .

================================ test session starts ====================================
platform darwin -- Python 3.8.5, pytest-6.0.1, py-1.9.0, pluggy-0.13.1
rootdir: /Users/michael/repos/testdriven/python-dependency-injection
collected 2 items

test_app.py ..                                                                [100%]

================================= 2 passed in 0.37s =====================================

app.py:

import csv
import datetime
from pathlib import Path

import matplotlib.dates
import matplotlib.pyplot


BASE_DIR = Path(__file__).resolve(strict=True).parent


class App:

    def read(self, file_name):
        temperatures_by_hour = {}
        with open(Path(BASE_DIR).joinpath(file_name), 'r') as file:
            reader = csv.reader(file)
            next(reader)  # Skip header row.
            for row in reader:
                hour = datetime.datetime.strptime(row[0], '%d/%m/%Y %H:%M').isoformat()
                temperature = float(row[2])
                temperatures_by_hour[hour] = temperature

        return temperatures_by_hour

    def draw(self, temperatures_by_hour):
        dates = []
        temperatures = []

        for date, temperature in temperatures_by_hour.items():
            dates.append(datetime.datetime.fromisoformat(date))
            temperatures.append(temperature)

        dates = matplotlib.dates.date2num(dates)
        matplotlib.pyplot.plot_date(dates, temperatures, linestyle='-')
        matplotlib.pyplot.show()

test_app.py:

import datetime
from pathlib import Path
from unittest.mock import MagicMock

import matplotlib.pyplot

from app import App


BASE_DIR = Path(__file__).resolve(strict=True).parent


def test_read():
    app = App()
    for key, value in app.read(file_name=Path(BASE_DIR).joinpath('london.csv')).items():
        assert datetime.datetime.fromisoformat(key)
        assert value - 0 == value


def test_draw(monkeypatch):
    plot_date_mock = MagicMock()
    show_mock = MagicMock()
    monkeypatch.setattr(matplotlib.pyplot, 'plot_date', plot_date_mock)
    monkeypatch.setattr(matplotlib.pyplot, 'show', show_mock)

    app = App()
    hour = datetime.datetime.now().isoformat()
    temperature = 14.52
    app.draw({hour: temperature})

    _, called_temperatures = plot_date_mock.call_args[0]
    assert called_temperatures == [temperature]  # check that plot_date was called with temperatures as second arg
    show_mock.assert_called()  # check that show is called

Running the app

You have all that you need to run your application for plotting temperatures by hour from the selected CSV file.

Let's make our app runnable.

Open app.py and add the following snippet to the bottom:

if __name__ == '__main__':
    import sys
    file_name = sys.argv[1]
    app = App()
    temperatures_by_hour = app.read(file_name)
    app.draw(temperatures_by_hour)

When app.py runs, it first reads the CSV file from the command line argument assigned to file_name and then it draws the plot.
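
Reading sys.argv[1] directly raises an IndexError when the argument is missing. As an optional variation, not part of the original app, the argument could be read with argparse so that a usage message is printed instead. The function name parse_args and the help texts are assumptions.

```python
import argparse


def parse_args(argv=None):
    """Parse the command line; argv=None means 'use sys.argv[1:]'."""
    parser = argparse.ArgumentParser(description='Plot temperatures by hour.')
    parser.add_argument('file_name', help='path to the CSV file to plot')
    return parser.parse_args(argv)


args = parse_args(['london.csv'])
print(args.file_name)  # london.csv
```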

Run the app:

(venv)$ python app.py london.csv

You should see a plot like this:

Temperature by hour

If you encounter the error "Matplotlib is currently using agg, which is a non-GUI backend, so cannot show the figure.", check this Stack Overflow answer.

Decoupling the Data Source

Alright. We finished the initial iteration of our app for plotting historical weather data. It's working as expected and we're happy to use it. That said, it's tightly coupled to the CSV format. What if you wanted to use a different data format, like a JSON payload from an API? This is where Dependency Injection comes into play.

Let's separate the reading part from our main app.

First, create a new file called test_urban_climate_csv.py:

import datetime
from pathlib import Path

from app import App
from urban_climate_csv import DataSource


BASE_DIR = Path(__file__).resolve(strict=True).parent


def test_read():
    app = App()
    for key, value in app.read(file_name=Path(BASE_DIR).joinpath('london.csv')).items():
        assert datetime.datetime.fromisoformat(key)
        assert value - 0 == value

The test here is the same as our test for test_read in test_app.py.

Second, add a new file called urban_climate_csv.py. Inside that file, create a class called DataSource with a read method:

import csv
import datetime
from pathlib import Path


BASE_DIR = Path(__file__).resolve(strict=True).parent


class DataSource:

    def read(self, **kwargs):
        temperatures_by_hour = {}
        with open(Path(BASE_DIR).joinpath(kwargs['file_name']), 'r') as file:
            reader = csv.reader(file)
            next(reader)  # Skip header row.
            for row in reader:
                hour = datetime.datetime.strptime(row[0], '%d/%m/%Y %H:%M').isoformat()
                temperature = float(row[2])
                temperatures_by_hour[hour] = temperature

        return temperatures_by_hour

This is the same as the read method in our initial app with one difference: We're using kwargs because we want to have the same interface for all our data sources. So, we could add new readers as necessary based on the source of the data.

For example:

# Each module exposes a class named DataSource, so alias them on import.
from open_weather_csv import DataSource as CsvDataSource
from open_weather_json import DataSource as JsonDataSource
from open_weather_api import DataSource as ApiDataSource


csv_reader = CsvDataSource()
csv_reader.read(file_name='foo.csv')

json_reader = JsonDataSource()
json_reader.read(file_name='foo.json')

api_reader = ApiDataSource()
api_reader.read(url='https://foo.bar')

The test should now pass:

(venv)$ python -m pytest .

================================ test session starts ====================================
platform darwin -- Python 3.8.5, pytest-6.0.1, py-1.9.0, pluggy-0.13.1
rootdir: /Users/michael/repos/testdriven/python-dependency-injection
collected 3 items

test_app.py ..                                                                [ 66%]
test_urban_climate_csv.py .                                                   [100%]

================================= 3 passed in 0.48s =====================================

Now, we need to update our App class.

First, update the test for read in test_app.py:

def test_read():
    hour = datetime.datetime.now().isoformat()
    temperature = 14.52
    temperature_by_hour = {hour: temperature}

    data_source = MagicMock()
    data_source.read.return_value = temperature_by_hour
    app = App(
        data_source=data_source
    )
    assert app.read(file_name='something.csv') == temperature_by_hour

So what changed? We injected data_source into our App. This simplifies testing, as the read method has a single job: to return results from the data source. This is an example of the first benefit of Dependency Injection: Testing is easier since we can inject the underlying dependencies.
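
An injected mock can also confirm that the arguments are forwarded unchanged. The sketch below uses a pared-down App containing only the constructor and read method, and adds an assert_called_once_with check that the test above does not include.

```python
from unittest.mock import MagicMock


class App:
    """Pared-down App: only the parts needed to show the forwarding."""

    def __init__(self, data_source):
        self.data_source = data_source

    def read(self, **kwargs):
        return self.data_source.read(**kwargs)


data_source = MagicMock()
data_source.read.return_value = {'2009-01-01T00:00:00': 3.5}

app = App(data_source=data_source)
result = app.read(file_name='something.csv')

# The mock records how it was called, so the forwarding can be verified.
data_source.read.assert_called_once_with(file_name='something.csv')
print(result)  # {'2009-01-01T00:00:00': 3.5}
```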

Update the test for draw too. Again, we need to inject the data source into App, which can be "anything" with an expected interface -- so MagicMock will do:

def test_draw(monkeypatch):
    plot_date_mock = MagicMock()
    show_mock = MagicMock()
    monkeypatch.setattr(matplotlib.pyplot, 'plot_date', plot_date_mock)
    monkeypatch.setattr(matplotlib.pyplot, 'show', show_mock)

    app = App(MagicMock())
    hour = datetime.datetime.now().isoformat()
    temperature = 14.52
    app.draw({hour: temperature})

    _, called_temperatures = plot_date_mock.call_args[0]
    assert called_temperatures == [temperature]  # check that plot_date was called with temperatures as second arg
    show_mock.assert_called()  # check that show is called

Update the App class as well:

import datetime

import matplotlib.dates
import matplotlib.pyplot


class App:

    def __init__(self, data_source):
        self.data_source = data_source

    def read(self, **kwargs):
        return self.data_source.read(**kwargs)

    def draw(self, temperatures_by_hour):
        dates = []
        temperatures = []

        for date, temperature in temperatures_by_hour.items():
            dates.append(datetime.datetime.fromisoformat(date))
            temperatures.append(temperature)

        dates = matplotlib.dates.date2num(dates)
        matplotlib.pyplot.plot_date(dates, temperatures, linestyle='-')
        matplotlib.pyplot.show(block=True)

First, we added an __init__ method so the data source can be injected. Second, we updated the read method to use self.data_source and **kwargs. Look at how much simpler this interface is. App is no longer coupled with the reading of the data.

Finally, we need to inject our data source into App on instance creation:

if __name__ == '__main__':
    import sys
    from urban_climate_csv import DataSource
    file_name = sys.argv[1]
    app = App(DataSource())
    temperatures_by_hour = app.read(file_name=file_name)
    app.draw(temperatures_by_hour)

Run your app again to ensure it still works as expected:

(venv)$ python app.py london.csv

Update test_read in test_urban_climate_csv.py:

import datetime

from urban_climate_csv import DataSource


def test_read():
    reader = DataSource()
    for key, value in reader.read(file_name='london.csv').items():
        assert datetime.datetime.fromisoformat(key)
        assert value - 0 == value

Do the tests pass?

(venv)$ python -m pytest .

================================ test session starts ====================================
platform darwin -- Python 3.8.5, pytest-6.0.1, py-1.9.0, pluggy-0.13.1
rootdir: /Users/michael/repos/testdriven/python-dependency-injection
collected 3 items

test_app.py ..                                                                [ 66%]
test_urban_climate_csv.py .                                                   [100%]

================================= 3 passed in 0.40s =====================================

Adding a New Data Source

Now that we've decoupled our App from the data source, we can easily add a new source.

Let's use data from the OpenWeather API. Go ahead and download a saved response from the API: here. Save it as moscow.json.

Feel free to register with the OpenWeather API and grab historical data for a different city if you'd prefer.

Add a new file called test_open_weather_json.py, and write a test for a read method:

import datetime

from open_weather_json import DataSource


def test_read():
    reader = DataSource()
    for key, value in reader.read(file_name='moscow.json').items():
        assert datetime.datetime.fromisoformat(key)
        assert value - 0 == value

Since we're using the same interface to apply Dependency Injection, this test should look very similar to test_read in test_urban_climate_csv.

In statically-typed languages, like Java and C#, all data sources should implement the same interface -- i.e., IDataSource. Thanks to duck typing in Python, we can just implement methods with the same name that take the same arguments (**kwargs) for each of our data sources:

def read(self, **kwargs):
    return self.data_source.read(**kwargs)
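
The tutorial relies on duck typing alone. If you want the implicit interface written down, one optional approach (an addition here, not something the app uses) is typing.Protocol, available from Python 3.8. The names DataSourceProtocol, CsvLikeSource, and NotASource are hypothetical.

```python
from typing import Dict, Protocol, runtime_checkable


@runtime_checkable
class DataSourceProtocol(Protocol):
    """Anything with a read(**kwargs) method counts as a data source."""

    def read(self, **kwargs) -> Dict[str, float]:
        ...


class CsvLikeSource:

    def read(self, **kwargs):
        return {'2009-01-01T00:00:00': 3.5}


class NotASource:

    def load(self):
        return {}


# runtime_checkable isinstance checks only that the method exists,
# not its signature or return type.
print(isinstance(CsvLikeSource(), DataSourceProtocol))  # True
print(isinstance(NotASource(), DataSourceProtocol))     # False
```

Neither class inherits from the protocol, so this stays compatible with the duck-typed data sources in this post.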

Next, let's move on to implementation.

Add a new file called open_weather_json.py:

import json
import datetime


class DataSource:

    def read(self, **kwargs):
        temperatures_by_hour = {}
        with open(kwargs['file_name'], 'r') as file:
            json_data = json.load(file)['hourly']
            for row in json_data:
                hour = datetime.datetime.fromtimestamp(row['dt']).isoformat()
                temperature = float(row['temp'])
                temperatures_by_hour[hour] = temperature

        return temperatures_by_hour

So, we used the json module to read and load a JSON file. Then, we extracted the data in a similar manner as we did before. This time we used the fromtimestamp function because times of measurements are written in Unix timestamp format.
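
One detail worth knowing: fromtimestamp without a tz argument converts to the local time zone of the machine, so the resulting keys depend on where the code runs. The sketch below is a variation, not the implementation above, that passes tz=datetime.timezone.utc to get the same keys everywhere. The payload values are made up; only the 'hourly', 'dt', and 'temp' keys come from the code above.

```python
import datetime
import json

# Hypothetical payload; real responses contain many more fields.
SAMPLE_PAYLOAD = (
    '{"hourly": ['
    '{"dt": 1598400000, "temp": 18.2}, '
    '{"dt": 1598403600, "temp": 17.6}'
    ']}'
)


def parse_hourly(payload):
    temperatures_by_hour = {}
    for row in json.loads(payload)['hourly']:
        # An explicit UTC zone makes the result independent of local settings.
        moment = datetime.datetime.fromtimestamp(row['dt'], tz=datetime.timezone.utc)
        temperatures_by_hour[moment.isoformat()] = float(row['temp'])
    return temperatures_by_hour


result = parse_hourly(SAMPLE_PAYLOAD)
print(result)
# {'2020-08-26T00:00:00+00:00': 18.2, '2020-08-26T01:00:00+00:00': 17.6}
```

The keys now carry a +00:00 offset, which fromisoformat can still parse.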

The tests should pass.

Next, update app.py to use this data source instead:

if __name__ == '__main__':
    import sys
    from open_weather_json import DataSource
    file_name = sys.argv[1]
    app = App(DataSource())
    temperatures_by_hour = app.read(file_name=file_name)
    app.draw(temperatures_by_hour)

Here, we just changed the import.

Run your app again with moscow.json as the argument:

(venv)$ python app.py moscow.json

You should see a plot with data from the selected JSON file.

This is an example of the second benefit of Dependency Injection: Extending code is much simpler.

We can see that:

  1. Existing tests didn't change
  2. Writing a test for a new data source is simple
  3. Implementing an interface for a new data source is fairly simple as well (you just need to know the shape of the data)
  4. We didn't have to make any changes to the App class

So, we can now extend the codebase with simple and predictable steps without having to touch tests that are already written or change the main application. That's powerful. You could now have a developer focus solely on adding new data sources without them ever needing to understand or have context on the main application. That said, if you do need to onboard a new developer who does need to have context on the entire project, it can take longer for them to get up to speed due to the decoupling.

Decoupling the Plotting Library

Moving right along, let's decouple the plotting portion from the app so we can more easily add new plotting libraries. Since this will be a similar process to the data source decoupling, think through the steps on your own before reading the rest of this section.

Take a look at the test for the draw method in test_app.py:

def test_draw(monkeypatch):
    plot_date_mock = MagicMock()
    show_mock = MagicMock()
    monkeypatch.setattr(matplotlib.pyplot, 'plot_date', plot_date_mock)
    monkeypatch.setattr(matplotlib.pyplot, 'show', show_mock)

    app = App(MagicMock())
    hour = datetime.datetime.now().isoformat()
    temperature = 14.52
    app.draw({hour: temperature})

    _, called_temperatures = plot_date_mock.call_args[0]
    assert called_temperatures == [temperature]  # check that plot_date was called with temperatures as second arg
    show_mock.assert_called()  # check that show is called

As we can see, it's coupled with Matplotlib. A change to the plotting library will require a change to the tests. This is something you really want to avoid.

So, how can we improve this?

Let's extract the plotting part of our app into its own class, much like we did for reading the data.

Add a new file called test_matplotlib_plot.py:

import datetime
from unittest.mock import MagicMock

import matplotlib.pyplot

from matplotlib_plot import Plot


def test_draw(monkeypatch):
    plot_date_mock = MagicMock()
    show_mock = MagicMock()
    monkeypatch.setattr(matplotlib.pyplot, 'plot_date', plot_date_mock)
    monkeypatch.setattr(matplotlib.pyplot, 'show', show_mock)

    plot = Plot()
    hours = [datetime.datetime.now()]
    temperatures = [14.52]
    plot.draw(hours, temperatures)

    _, called_temperatures = plot_date_mock.call_args[0]
    assert called_temperatures == temperatures  # check that plot_date was called with temperatures as second arg
    show_mock.assert_called()  # check that show is called

To implement the Plot class, add a new file called matplotlib_plot.py:

import matplotlib.dates
import matplotlib.pyplot


class Plot:

    def draw(self, hours, temperatures):

        hours = matplotlib.dates.date2num(hours)
        matplotlib.pyplot.plot_date(hours, temperatures, linestyle='-')
        matplotlib.pyplot.show(block=True)

Here, the draw method takes two arguments:

  1. hours - a list of datetime objects
  2. temperatures - a list of numbers

This is what our interface will look like for all future Plot classes. So, in this case, our test will stay the same as long as this interface and the underlying matplotlib methods stay the same.
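
To illustrate that any object with a matching draw method fits this interface, here is a sketch of a hypothetical TextPlot that writes a crude bar chart to a stream. It is not part of the app; the class name, the stream parameter, and the output format are all assumptions.

```python
import datetime
import io


class TextPlot:
    """Hypothetical plot that renders one text line per reading."""

    def __init__(self, stream):
        self.stream = stream

    def draw(self, hours, temperatures):
        for hour, temperature in zip(hours, temperatures):
            bar = '#' * max(0, round(temperature))
            self.stream.write(f'{hour.isoformat()} {bar} {temperature}\n')


buffer = io.StringIO()
TextPlot(buffer).draw([datetime.datetime(2009, 1, 1, 0, 0)], [3.2])
print(buffer.getvalue(), end='')  # 2009-01-01T00:00:00 ### 3.2
```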

Run the tests:

(venv)$ python -m pytest .

================================ test session starts ====================================
platform darwin -- Python 3.8.5, pytest-6.0.1, py-1.9.0, pluggy-0.13.1
rootdir: /Users/michael/repos/testdriven/python-dependency-injection
collected 5 items

test_app.py ..                                                                [ 40%]
test_matplotlib_plot.py .                                                     [ 60%]
test_open_weather_json.py .                                                   [ 80%]
test_urban_climate_csv.py .                                                   [100%]

================================= 5 passed in 0.38s =====================================

Next, let's update the App class.

First, update test_app.py like so:

import datetime
from unittest.mock import MagicMock

from app import App


def test_read():
    hour = datetime.datetime.now().isoformat()
    temperature = 14.52
    temperature_by_hour = {hour: temperature}

    data_source = MagicMock()
    data_source.read.return_value = temperature_by_hour
    app = App(
        data_source=data_source,
        plot=MagicMock()
    )
    assert app.read(file_name='something.csv') == temperature_by_hour


def test_draw():
    plot_mock = MagicMock()
    app = App(
        data_source=MagicMock(),
        plot=plot_mock
    )
    hour = datetime.datetime.now()
    iso_hour = hour.isoformat()
    temperature = 14.52
    temperature_by_hour = {iso_hour: temperature}

    app.draw(temperature_by_hour)
    plot_mock.draw.assert_called_with([hour], [temperature])

Since test_draw is no longer coupled with Matplotlib, we injected plot to App before calling the draw method. As long as the interface of the injected Plot is as expected the test should pass. Therefore, we can use MagicMock in our test. We then checked that the draw method was called as expected. We also injected the plot into test_read. That's all.

Update the App class:

import datetime


class App:

    def __init__(self, data_source, plot):
        self.data_source = data_source
        self.plot = plot

    def read(self, **kwargs):
        return self.data_source.read(**kwargs)

    def draw(self, temperatures_by_hour):
        dates = []
        temperatures = []

        for date, temperature in temperatures_by_hour.items():
            dates.append(datetime.datetime.fromisoformat(date))
            temperatures.append(temperature)

        self.plot.draw(dates, temperatures)

The refactored draw method is much simpler now. It just:

  1. Converts a dictionary into two lists
  2. Converts ISO date strings to datetime objects
  3. Calls the draw method of the Plot instance

Test:

(venv)$ python -m pytest .

================================ test session starts ====================================
platform darwin -- Python 3.8.5, pytest-6.0.1, py-1.9.0, pluggy-0.13.1
rootdir: /Users/michael/repos/testdriven/python-dependency-injection
collected 5 items

test_app.py ..                                                                [ 40%]
test_matplotlib_plot.py .                                                     [ 60%]
test_open_weather_json.py .                                                   [ 80%]
test_urban_climate_csv.py .                                                   [100%]

================================= 5 passed in 0.39s =====================================

Update the snippet for running the app again:

if __name__ == '__main__':
    import sys
    from open_weather_json import DataSource
    from matplotlib_plot import Plot
    file_name = sys.argv[1]
    app = App(DataSource(), Plot())
    temperatures_by_hour = app.read(file_name=file_name)
    app.draw(temperatures_by_hour)

We added a new import for Plot and injected it into App.

Run your app again to see that it's still working:

(venv)$ python app.py moscow.json

Adding Plotly

Start by installing Plotly:

(venv)$ pip install plotly

Next, add a new test to a new file called test_plotly_plot.py:

import datetime
from unittest.mock import MagicMock

import plotly.graph_objects

from plotly_plot import Plot


def test_draw(monkeypatch):
    figure_mock = MagicMock()
    monkeypatch.setattr(plotly.graph_objects, 'Figure', figure_mock)
    scatter_mock = MagicMock()
    monkeypatch.setattr(plotly.graph_objects, 'Scatter', scatter_mock)

    plot = Plot()
    hours = [datetime.datetime.now()]
    temperatures = [14.52]
    plot.draw(hours, temperatures)

    call_kwargs = scatter_mock.call_args[1]
    assert call_kwargs['y'] == temperatures  # check that Scatter was called with temperatures as the y argument
    figure_mock().show.assert_called()  # check that show is called

It's basically the same as the matplotlib Plot test. The major change is how the objects and methods from Plotly are mocked.
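
The figure_mock().show.assert_called() line works because a MagicMock returns the same return_value object every time it is called. A short standard-library sketch of that behavior (the 'trace' value is a placeholder):

```python
from unittest.mock import MagicMock

figure_mock = MagicMock()

# Code under test would do something like this:
fig = figure_mock(data=['trace'])
fig.show()

# Calling the mock again returns the very same return_value object...
same_object = figure_mock() is fig
print(same_object)  # True

# ...so both spellings below inspect the same recorded call to show.
figure_mock().show.assert_called_once()
figure_mock.return_value.show.assert_called_once()
```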

Next, add a file called plotly_plot.py:

import plotly.graph_objects


class Plot:

    def draw(self, hours, temperatures):

        fig = plotly.graph_objects.Figure(
            data=[plotly.graph_objects.Scatter(x=hours, y=temperatures)]
        )
        fig.show()

Here, we used plotly to draw a plot with dates. That's it.

The tests should pass:

(venv)$ python -m pytest .

================================ test session starts ====================================
platform darwin -- Python 3.8.5, pytest-6.0.1, py-1.9.0, pluggy-0.13.1
rootdir: /Users/michael/repos/testdriven/python-dependency-injection
collected 6 items

test_app.py ..                                                                [ 33%]
test_matplotlib_plot.py .                                                     [ 50%]
test_open_weather_json.py .                                                   [ 66%]
test_plotly_plot.py .                                                         [ 83%]
test_urban_climate_csv.py .                                                   [100%]

================================= 6 passed in 0.46s =====================================

Update the run snippet to use plotly:

if __name__ == '__main__':
    import sys
    from open_weather_json import DataSource
    from plotly_plot import Plot
    file_name = sys.argv[1]
    app = App(DataSource(), Plot())
    temperatures_by_hour = app.read(file_name=file_name)
    app.draw(temperatures_by_hour)

Run your app with moscow.json to see the new plot in your browser:

(venv)$ python app.py moscow.json

Temperature by hour

Adding Configuration

At this point, we can easily add and use different data sources and plotting libraries in our application. Our tests are no longer coupled with the implementation. That being said, we still need to edit the code to switch to a different data source or plotting library:

if __name__ == '__main__':
    import sys
    from open_weather_json import DataSource
    from plotly_plot import Plot
    file_name = sys.argv[1]
    app = App(DataSource(), Plot())
    temperatures_by_hour = app.read(file_name=file_name)
    app.draw(temperatures_by_hour)

Although it's only a short snippet of code, we can take Dependency Injection one step further and eliminate the need for code changes. Instead we'll use a configuration file for selecting the data source and plotting library.

We'll use a simple JSON object to configure our application:

{
  "data_source": {
    "name": "urban_climate_csv"
  },
  "plot": {
    "name": "plotly_plot"
  }
}

Add this to a new file called config.json.

Add a new test to test_app.py:

def test_configure():
    app = App.configure(
        'config.json'
    )

    assert isinstance(app, App)

Here, we checked that an instance of App is returned from the configure method. This method will read the config file and load the selected DataSource and Plot.

Add configure to the App class:

import datetime
import json


class App:

    ...

    @classmethod
    def configure(cls, filename):
        with open(filename) as file:
            config = json.load(file)

        data_source = __import__(config['data_source']['name']).DataSource()

        plot = __import__(config['plot']['name']).Plot()

        return cls(data_source, plot)


if __name__ == '__main__':
    import sys
    from open_weather_json import DataSource
    from plotly_plot import Plot
    file_name = sys.argv[1]
    app = App(DataSource(), Plot())
    temperatures_by_hour = app.read(file_name=file_name)
    app.draw(temperatures_by_hour)

So, after loading the JSON file, we imported DataSource and Plot from the respective modules defined in the config file.

__import__ is used to import modules dynamically. For example, setting config['data_source']['name'] to urban_climate_csv is equivalent to:

import urban_climate_csv

data_source = urban_climate_csv.DataSource()
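Python's documentation recommends importlib.import_module() over calling __import__ directly. One practical difference shows up with dotted names: __import__ returns the top-level package, while import_module returns the submodule itself. That matters if the data sources ever move into a package. The following is a minimal sketch of a loader built on import_module. Note that load_class is a hypothetical helper, not part of this tutorial's code, and standard-library modules stand in for the data source modules so that the snippet runs on its own.

```python
import importlib


def load_class(module_name, class_name):
    """Import module_name by name and return its class_name attribute."""
    try:
        module = importlib.import_module(module_name)
    except ModuleNotFoundError as error:
        raise ValueError(f'Unknown module in config: {module_name!r}') from error
    try:
        return getattr(module, class_name)
    except AttributeError as error:
        raise ValueError(f'{module_name!r} does not define {class_name!r}') from error


# Stand-in demo: load OrderedDict from the standard library by name.
ordered_dict_class = load_class('collections', 'OrderedDict')
instance = ordered_dict_class(a=1)

# Dotted names: import_module returns the submodule,
# while __import__ returns the top-level package.
assert importlib.import_module('xml.dom').__name__ == 'xml.dom'
assert __import__('xml.dom').__name__ == 'xml'
```

Raising ValueError with the offending name is a design choice for this sketch. It turns a typo in the config file into an error message that points at the config rather than at the import machinery.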

Run the tests:

(venv)$ python -m pytest .

================================ test session starts ====================================
platform darwin -- Python 3.8.5, pytest-6.0.1, py-1.9.0, pluggy-0.13.1
rootdir: /Users/michael/repos/testdriven/python-dependency-injection
collected 7 items

test_app.py ...                                                               [ 42%]
test_matplotlib_plot.py .                                                     [ 57%]
test_open_weather_json.py .                                                   [ 71%]
test_plotly_plot.py .                                                         [ 85%]
test_urban_climate_csv.py .                                                   [100%]

================================= 7 passed in 0.46s =====================================

Finally, update the snippet in app.py to use the newly added method:

if __name__ == '__main__':
    import sys
    config_file = sys.argv[1]
    file_name = sys.argv[2]
    app = App.configure(config_file)
    temperatures_by_hour = app.read(file_name=file_name)
    app.draw(temperatures_by_hour)

With the imports eliminated, you can quickly swap one data source or plotting library for another.

Run your app once again:

(venv)$ python app.py config.json london.csv

Update the config to use open_weather_json as the data source:

{
  "data_source": {
    "name": "open_weather_json"
  },
  "plot": {
    "name": "plotly_plot"
  }
}

Run the app:

(venv)$ python app.py config.json moscow.json

A Different View

The main App class started as an all-knowing object responsible for reading data from a CSV and drawing a plot. We used Dependency Injection to decouple the reading and drawing functionality. The App class is now a container with a simple interface that connects the reading and drawing parts. The actual reading and drawing logic is handled in specialized classes that are responsible for one thing only.

Benefits:

  1. Methods are easier to test
  2. Dependencies are easier to mock
  3. Tests don't have to change every time we extend our application
  4. It's easier to extend the application
  5. It's easier to maintain the application
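The "simple interface" between App and its collaborators can also be written down explicitly. The sketch below uses typing.Protocol (available since Python 3.8) to do so. It rests on assumptions not confirmed here: that a data source exposes read(**kwargs), that a plot exposes draw(temperatures_by_hour), and that the data is a dict mapping datetimes to temperatures. InMemorySource and RecordingPlot are hypothetical classes used only to show that any object with the right methods can be injected.

```python
import datetime
from typing import Dict, Protocol, runtime_checkable


@runtime_checkable
class DataSource(Protocol):
    def read(self, **kwargs) -> Dict[datetime.datetime, float]:
        ...


@runtime_checkable
class Plot(Protocol):
    def draw(self, temperatures_by_hour: Dict[datetime.datetime, float]) -> None:
        ...


class App:
    """Container that only connects a data source to a plot."""

    def __init__(self, data_source: DataSource, plot: Plot):
        self.data_source = data_source
        self.plot = plot

    def read(self, **kwargs):
        return self.data_source.read(**kwargs)

    def draw(self, temperatures_by_hour):
        self.plot.draw(temperatures_by_hour)


class InMemorySource:
    """Hypothetical source returning fixed data; no inheritance required."""

    def read(self, **kwargs):
        return {datetime.datetime(2020, 8, 26, 12): 21.5}


class RecordingPlot:
    """Hypothetical plot that records what it was asked to draw."""

    def __init__(self):
        self.drawn = None

    def draw(self, temperatures_by_hour):
        self.drawn = temperatures_by_hour


plot = RecordingPlot()
app = App(InMemorySource(), plot)
app.draw(app.read(file_name='ignored.csv'))
```

Neither stand-in class inherits from the protocols. They satisfy them through duck typing, which is why hand-written fakes like these can replace mocks in tests.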

Did we do something special? Not really. The idea behind Dependency Injection is pretty common in the engineering world, outside of software engineering.

For example, a carpenter who builds house exteriors will generally leave empty slots for windows and doors so that someone who specializes in window and door installation can install them. When the house is complete and the owners move in, do they need to tear down half the house just to change an existing window? No. They can replace just the broken window. As long as the new window has the same interface (e.g., width, height, and depth), it can be installed and used. Can they open the window before it's installed? Of course. Can they test whether the window is broken before they install it? Yes. This is a form of Dependency Injection too.

It may not be as natural to see and use Dependency Injection in software engineering, but it's just as effective there as in any other engineering profession.

Next Steps

Looking for more?

  1. Extend the app to take a new data source type called open_weather_api. This source takes a city, makes the API call, and then returns the data in the correct shape for the draw method.
  2. Add Bokeh for plotting.

Conclusion

This post showed how to implement Dependency Injection in a real-world application.

Although it's a powerful technique, Dependency Injection is not a silver bullet. Think about the house analogy again: the shell of the house and the windows and doors are loosely coupled. Can the same be said for a tent? No. If the door of the tent is damaged beyond repair, you'll probably want to buy a new tent rather than try to fix the damaged door. Thus, you can't decouple everything and apply Dependency Injection everywhere. In fact, it can drag you into premature optimization hell when done too early. Although decoupled code is easier to maintain, it has more surface area and can be harder for newcomers to the project to understand.

So before you jump in, ask yourself:

  1. Is my code a "tent" or a "house"?
  2. What are the benefits (and drawbacks) of using Dependency Injection in this particular area?
  3. How can I explain it to a newcomer to the project?

If you can easily answer these questions and the benefits outweigh the drawbacks, go for it. Otherwise, it may not be suitable to use it at the moment.

Happy coding!

Jan Giacomelli

Jan is a software engineer who lives in Ljubljana, Slovenia, Europe. He is co-founder of typless where he is leading engineering efforts. He loves working with Python and Django. When he's not writing code or deploying to AWS, he's probably skiing, windsurfing, or playing guitar.
