Inefficient database querying is one of the most common performance pitfalls with Django. N+1 queries in particular can degrade your application's performance early on. They occur when you select records from an associated table using an individual query for each record rather than fetching them all in a single query. Such inefficiencies are unfortunately quite easy to introduce with the Django ORM. That said, they're also something you can quickly uncover and prevent via automated testing.
This article looks at how to:
- Test the number of queries executed by a request along with the duration of the queries
- Prevent N+1 queries using the nplusone package
N+1 Queries
The sample app that we'll be working with throughout this article can be found on GitHub.
Say, for example, you're working with a Django application that has the following models:
# courses/models.py

from django.db import models


class Author(models.Model):
    name = models.CharField(max_length=100)

    def __str__(self):
        return self.name


class Course(models.Model):
    title = models.CharField(max_length=100)
    author = models.ForeignKey(Author, on_delete=models.CASCADE)

    def __str__(self):
        return self.title
Now, if you're tasked with creating a new view for returning a JSON response of all courses with the title and author name, you could write the following code:
# courses/views.py

from django.http import JsonResponse

from courses.models import Course


def all_courses(request):
    queryset = Course.objects.all()

    courses = []
    for course in queryset:
        courses.append(
            {"title": course.title, "author": course.author.name}
        )

    return JsonResponse(courses, safe=False)
This code will work, but it's very inefficient since it will make far too many database queries:
- 1 query for obtaining all the courses
- N queries for obtaining the author of each course inside the loop (illustrated below)
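You can see the pattern for yourself in the Django shell (python manage.py shell). The snippet below is a quick sketch that assumes DEBUG is True, since Django only records queries in debug mode:

from django.db import connection, reset_queries

from courses.models import Course

reset_queries()

# Touching course.author triggers a separate query for every course
[(course.title, course.author.name) for course in Course.objects.all()]

for query in connection.queries[:3]:
    print(query["sql"])

# One SELECT against the course table, followed by a separate
# SELECT against the author table for each individual course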
Before addressing this, let's look at just how many queries are made and measure the execution time.
Metrics Middleware
You'll notice that the project includes custom middleware that calculates and logs the execution time of each request:
# core/middleware.py

import logging
import time

from django.db import connection, reset_queries


def metric_middleware(get_response):
    def middleware(request):
        reset_queries()

        # Get beginning stats
        start_queries = len(connection.queries)
        start_time = time.perf_counter()

        # Process the request
        response = get_response(request)

        # Get ending stats
        end_time = time.perf_counter()
        end_queries = len(connection.queries)

        # Calculate stats
        total_time = end_time - start_time
        total_queries = end_queries - start_queries

        # Log the results
        logger = logging.getLogger("debug")
        logger.debug(f"Request: {request.method} {request.path}")
        logger.debug(f"Number of Queries: {total_queries}")
        logger.debug(f"Total time: {total_time:.2f}s")

        return response

    return middleware
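For the middleware to run, it has to be registered in the settings file. A sketch of the relevant settings follows; the module path and logger configuration are assumptions based on the sample app's layout, and note that connection.queries is only populated when DEBUG is True:

# core/settings.py

DEBUG = True  # connection.queries is only recorded in debug mode

MIDDLEWARE = [
    # ...Django's default middleware...
    "core.middleware.metric_middleware",
]

# Route the "debug" logger used by the middleware to the console
LOGGING = {
    "version": 1,
    "disable_existing_loggers": False,
    "handlers": {"console": {"class": "logging.StreamHandler"}},
    "loggers": {
        "debug": {"handlers": ["console"], "level": "DEBUG"},
    },
}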
Run the database seed command to add 10 authors and 100 courses to the database:
$ python manage.py seed_db
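The command ships with the sample app. If you're following along in your own project, a minimal sketch of what such a management command might look like (the file path and counts here are assumptions matching the seed data above):

# courses/management/commands/seed_db.py

from django.core.management.base import BaseCommand
from faker import Faker

from courses.models import Author, Course


class Command(BaseCommand):
    help = "Seed the database with fake authors and courses"

    def handle(self, *args, **options):
        fake = Faker()

        # 10 authors, 100 courses, spread evenly across the authors
        authors = [Author.objects.create(name=fake.name()) for _ in range(10)]
        for i in range(100):
            Course.objects.create(
                title=fake.sentence(nb_words=4),
                author=authors[i % len(authors)],
            )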
With the Django development server up and running, navigate to http://localhost:8000/courses/ in your browser. You should see the JSON response. Back in your terminal, take note of the metrics:
Request: GET /courses/
Number of Queries: 101
Total time: 0.10s
That's a lot of queries! This is very inefficient: each additional course will require an additional database query, so performance will continue to degrade as the database grows. Fortunately, the fix is quite simple: You can add the select_related method to create a SQL JOIN that includes the authors in the initial database query:

queryset = Course.objects.select_related("author").all()
Before making any code changes, let's first start with some tests.
Performance Tests
Start with the following test, which uses the django_assert_num_queries pytest fixture to ensure that the database is hit only once per request, no matter how many author and course records are present in the database:
import json

import pytest
from django.test import override_settings
from faker import Faker

from courses.models import Author, Course


@pytest.mark.django_db
def test_number_of_sql_queries_all_courses(client, django_assert_num_queries):
    fake = Faker()

    author_name = fake.name()
    author = Author(name=author_name)
    author.save()

    course_title = fake.sentence(nb_words=4)
    course = Course(title=course_title, author=author)
    course.save()

    with django_assert_num_queries(1):
        res = client.get("/courses/")
        data = json.loads(res.content)

    assert res.status_code == 200
    assert len(data) == 1

    author_name = fake.name()
    author = Author(name=author_name)
    author.save()

    course_title = fake.sentence(nb_words=4)
    course = Course(title=course_title, author=author)
    course.save()

    # The query count should stay at one even as records are added
    with django_assert_num_queries(1):
        res = client.get("/courses/")
        data = json.loads(res.content)

    assert res.status_code == 200
    assert len(data) == 2
Not using pytest? Use the assertNumQueries test method in place of django_assert_num_queries.
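A rough equivalent using Django's built-in test tooling might look like this (a sketch, reusing the models and route from above):

from django.test import TestCase

from courses.models import Author, Course


class AllCoursesTest(TestCase):
    def test_number_of_sql_queries(self):
        author = Author.objects.create(name="Jane Doe")
        Course.objects.create(title="Django Testing", author=author)

        # Fails if the view issues more (or fewer) than one query
        with self.assertNumQueries(1):
            res = self.client.get("/courses/")
            self.assertEqual(res.status_code, 200)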
What's more, we can also use the nplusone package to prevent the introduction of future N+1 queries. After installing it, you need to hook it into your Django settings.
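Per the nplusone documentation, the Django integration looks roughly like this (the logging settings are optional):

# settings.py

import logging

INSTALLED_APPS = [
    # ...
    "nplusone.ext.django",
]

MIDDLEWARE = [
    "nplusone.ext.django.NPlusOneMiddleware",
    # ...
]

# Log potential N+1 queries while the app runs
NPLUSONE_LOGGER = logging.getLogger("nplusone")
NPLUSONE_LOG_LEVEL = logging.WARN

With that in place, you can make nplusone raise an error in individual tests with the @override_settings decorator: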
...
@pytest.mark.django_db
@override_settings(NPLUSONE_RAISE=True)
def test_number_of_sql_queries_all_courses(client, django_assert_num_queries):
...
Or if you'd like to automatically enable nplusone across the entire test suite, add the following to your test root conftest.py file:
from django.conf import settings


def pytest_configure(config):
    settings.NPLUSONE_RAISE = True
Hop back to the sample app and run the tests. You should see the following error:
nplusone.core.exceptions.NPlusOneError: Potential n+1 query detected on `Course.author`
Now, make the recommended change -- adding the select_related method -- and then run the tests again. They should now pass.
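For reference, here's the updated view; only the queryset line changes from the earlier version:

# courses/views.py

from django.http import JsonResponse

from courses.models import Course


def all_courses(request):
    # select_related("author") adds a SQL JOIN, so each course's
    # author arrives in the same query as the course itself
    queryset = Course.objects.select_related("author").all()

    courses = []
    for course in queryset:
        courses.append(
            {"title": course.title, "author": course.author.name}
        )

    return JsonResponse(courses, safe=False)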
Conclusion
This article looked at how to prevent N+1 queries automatically in your codebase using the nplusone package and how to test the number of queries executed using the django_assert_num_queries pytest fixture.
This should help prevent performance bottlenecks as your application grows and takes on more users. If you're adding this to an existing codebase, you'll probably need to spend some time fixing broken queries so that they have a constant number of database hits. If you're still running into performance issues after these fixes and optimizations, you may need to add additional caching layers, denormalize parts of the database, and/or configure database indexes.