Why Do We Test?
Part 1, Chapter 3
Testing often sparks heated debates about methodologies, coverage metrics, best practices, and so forth. Yet these discussions frequently miss the fundamental question: Why test at all? Before diving into testing techniques, let's take a step back and remind ourselves why we test in the first place.
It's All About the Money
Most of us are in the software industry to make money. For us, as software engineers, to get a paycheck, companies need to make a profit. To make a profit, a company needs to build a product that others are willing to pay for. That means bringing your customers value they are willing to pay for.
Value comes in all shapes and forms: entertainment, solving a problem, making someone's life easier or more fun. Whichever form it takes, for the product to be valuable, it needs to meet customer expectations. In other words, the nicest UI won't help much if you're selling a stock trading app where users can't actually buy or sell stocks. The same goes for any product out there: if it doesn't meet expectations, it has little value. To make sure users are getting the value they're paying for, we need to make sure the product works as expected at all times. That's where testing comes into play. With testing (in whichever form), we verify that those expectations are met.
The Only Constant is Change
The world is changing. Fast. In this AI-driven world, even faster! And so are the requirements. Something that brought value yesterday, might not bring any value today.
We do our best to keep up with the changes, to make sure we always bring value to customers. The faster we can adapt to change, the better. Nevertheless, there's a catch: adapting to change can be hard, especially if you want to make sure that the things you didn't change aren't negatively impacted. For example, if the tax rate for certain foods changes, you need to apply the change only to those specific food items. Not to cars. Not to electronics. So the question is: how do we adapt to changes quickly while making sure that the things we didn't change stay the same?
Long story short: We want to develop in an environment where changing things imposes low risk and low cost.
As I like to say: "Software is of high quality when you can safely change it faster than businesses can change their mind!"
Testing Takes Time and It's Unreliable
Whether we like it or not, testing takes time. And it's unreliable as well. So the question is: How do we deal with that?
The answer is simple: we want to minimize the time testing takes and maximize its reliability. When we achieve that, we ensure low cost and low risk of changes.
Automated tests are certainly the way to do that. That doesn't mean automated tests achieve that by default. You need to invest time and effort into them. With valuable automated tests and great code design, you can get there.
Does that mean manual and exploratory testing are useless? Not at all. They certainly have their place. But wherever we can automate testing and make it part of the regular development process, we should.
Valuable Tests
To achieve low cost and low risk of changes, we need valuable automated tests. Tests are valuable when they are:
- fast
- reliable
- repeatable
- resistant to refactoring
- readable
- resistant to unrelated changes
- thorough
If you feel like tests are slowing you down, it's most likely because one of the requirements above isn't met. To mention a few examples:
- Your CI/CD test jobs take forever -> tests are not fast enough
- You are restarting your CI/CD jobs because they are failing randomly -> tests are not reliable
- Tests are passing on your computer, but failing on CI/CD -> tests are not repeatable
- You're changing tests every time you touch implementation, even when only doing refactoring -> tests are not resistant to refactoring
- You're searching for the defect all over the place when a test fails -> tests are not readable
- You add a new property to your class and 100 tests fail -> tests are not resistant to unrelated changes
- Tests are passing, but there are bugs in production -> tests are not thorough enough
- And so forth
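To make the "resistant to refactoring" point concrete, here's a minimal sketch (the `Cart` class and its field names are hypothetical): a test that asserts on internal state breaks the moment you rename a private field, while a test that asserts only observable behavior survives the refactoring untouched.

```python
class Cart:
    def __init__(self):
        self._items = []

    def add(self, price):
        self._items.append(price)

    def total(self):
        return sum(self._items)


# Fragile: asserts on an internal detail. Renaming `_items` during a
# refactoring breaks this test even though the behavior is unchanged.
def test_total_fragile():
    cart = Cart()
    cart.add(3)
    assert cart._items == [3]


# Refactoring-resistant: asserts only the observable behavior.
def test_total_behavior():
    cart = Cart()
    cart.add(3)
    cart.add(4)
    assert cart.total() == 7
```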
Throughout this course, you'll learn how to write valuable tests -- to help us keep the cost and risk of changes low. This way we can react to changes quickly without a fear of breaking things that already bring value to our customers.
Unit vs. Integration vs. End-to-end Tests
There are many different types of tests. When discussing testing in software engineering, we usually hear the most about unit, integration, and end-to-end tests. While there's some common understanding about what end-to-end tests are, discussions around "What's a unit test and what's an integration test?" can become quite heated. Most automated testing resources agree on these definitions:
- Unit tests: test a single unit of code in isolation (e.g., module A, class B, function C). All external dependencies should be replaced by test doubles (e.g., mocks, stubs).
- Integration tests: test how multiple units work together. Most of the external dependencies are real (e.g., database, external API), though unstable external APIs might still be replaced by test doubles.
- End-to-end tests: test the whole system (e.g., by calling the deployed API endpoint). External dependencies are real (e.g., database, external API).
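A minimal sketch of the "unit test" definition above, using Python's standard library (the `PriceService` class and its rates client are hypothetical): the external dependency is replaced by a test double so the test runs in isolation.

```python
from unittest.mock import Mock


# Hypothetical production code: price conversion depends on an
# external exchange-rates API, accessed through `rates_client`.
class PriceService:
    def __init__(self, rates_client):
        self.rates_client = rates_client

    def price_in_eur(self, amount_usd):
        rate = self.rates_client.get_rate("USD", "EUR")
        return round(amount_usd * rate, 2)


# Unit test: the external dependency is replaced by a test double,
# so no network call is made and the unit is tested in isolation.
def test_price_in_eur_converts_using_rate():
    fake_client = Mock()
    fake_client.get_rate.return_value = 0.5
    service = PriceService(fake_client)
    assert service.price_in_eur(10) == 5.0
    fake_client.get_rate.assert_called_once_with("USD", "EUR")
```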
If I'm being honest, I was never a big fan of these definitions. They're quite vague and can be interpreted in many different ways. Unfortunately, more often than not, software developers use these definitions in ways that make automated testing cumbersome and painful.
I prefer the following separation:
- End-to-end tests: tests that interact with the system the same way users do (e.g., calling a deployed API, clicking through the web app using a headless browser). They require a deployed application, include dependencies that we don't control (e.g., external APIs), and are the most thorough but also by far the slowest and most complicated to set up.
- I/O tests: tests that interact with external systems. They need running dependencies that we can control (e.g., test database, test Redis cache, and so forth).
- Tests: all the tests that can run entirely in memory. They don't interact with any external system (e.g., database, Redis cache, external API, and so forth).
This gives us much better decision points. We write I/O tests when we're testing things like data stores/repositories or caches. We write end-to-end tests for the most critical paths in our application. And we write tests for everything else. If we realize it's more efficient to test something using test doubles, we do that. Otherwise, we just test things as they are. There's no hard rule that forces us to replace something just so we can call the test a "unit test", even when doing so makes no sense. This way we can run most of our tests at any time: they are fast, and they don't need any external dependencies. We can run I/O tests only when necessary, because they're slower and require more setup. You'll learn more details throughout this course.
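As a small illustration of the separation (all names here are hypothetical), a memory-only test exercises a repository without touching any external system. A real database-backed implementation of the same interface would be covered by the slower I/O tests, run against a test database only when necessary.

```python
# Hypothetical repository with an in-memory variant. A real
# PostgresUserRepository would expose the same add/get interface and be
# exercised by the (slower) I/O tests against a test database.
class InMemoryUserRepository:
    def __init__(self):
        self._users = {}

    def add(self, user_id, name):
        self._users[user_id] = name

    def get(self, user_id):
        return self._users.get(user_id)


# Memory-only test: no external system needed, so it can run
# anytime, anywhere, and it runs fast.
def test_added_user_can_be_fetched():
    repo = InMemoryUserRepository()
    repo.add(1, "Ada")
    assert repo.get(1) == "Ada"
```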
We're Engineers, not Developers from Hell
One common dilemma in software engineering is: "How many tests should I write?" or "Which things should be covered by automated tests?" There's no simple answer to that. You've maybe heard about a 100% code coverage requirement, or the Enterprise Developer from Hell, or that every method should be tested. Instead of following such rules, my assumptions are the following:
- We're engineers, not developers from hell -> We'll try to make things work, not just make the tests pass
- We and AI tools are writing tests to get faster feedback loops and to protect ourselves and the AI tools from breaking things that already work
- We don't want to spend time or tokens updating tests if nothing relevant has changed
- We don't want to spend time or tokens dealing with flaky tests
So where does that leave us? In a place where we need to "do -> measure -> learn -> adapt". One can think of tests as road fences. When the road is straight and fog is rare, a simple white line marking the edge of the road is enough. Where fog is common, you need to add additional markers. The same goes for very curvy mountain roads. In some places, there are even high concrete walls to prevent cars from falling off a cliff. It's similar with automated testing: you adapt to the situation. In parts with dense logic, you put multiple tests in place for protection. In parts that just plug things together, you might put none. And things don't have to stay the same over time: when you have new data available (e.g., from monitoring), you can add more tests or remove redundant ones.
My rules of thumb for different areas of the application are the following:
- API endpoints: Test the happy path and all the expected errors.
- Data stores and caches: Test that added data can be fetched, plus one test for every filter used inside the query.
- Third-party integrations: Test the happy path and all the expected errors.
- Business logic: Again, test the happy paths and all the expected errors.
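A sketch of the "happy path plus expected errors" rule for business logic, reusing the food tax example from earlier (the rates and function names are made up for illustration):

```python
# Hypothetical business rule: food gets a reduced tax rate (rates invented).
TAX_RATES = {"food": 0.05, "electronics": 0.20}


def gross_price(net_price, category):
    if net_price < 0:
        raise ValueError("net_price must be non-negative")
    if category not in TAX_RATES:
        raise KeyError(f"unknown category: {category}")
    return round(net_price * (1 + TAX_RATES[category]), 2)


# Happy path: the reduced rate is applied to food.
def test_food_uses_reduced_rate():
    assert gross_price(100, "food") == 105.0


# Expected error: negative prices are rejected.
def test_negative_price_is_rejected():
    try:
        gross_price(-1, "food")
        assert False, "expected ValueError"
    except ValueError:
        pass
```

If the tax rate for food changes later, only the food-specific test needs updating; the electronics behavior stays pinned down by its own test.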
There are also some things that I do not write tests for. Some examples:
- Job queues (e.g., wrapper around Celery job submissions)
- API payload validation when Pydantic is used to define the schemas. I trust Pydantic to do its job: if I expect a datetime and someone sends an arbitrary string, I expect Pydantic to raise an error without me testing every possible scenario
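The same "trust the library" principle, illustrated with only the standard library (substituting `datetime.fromisoformat` for Pydantic purely so the sketch is self-contained): we rely on the library to reject malformed input rather than re-testing its validation logic.

```python
from datetime import datetime


def parse_event_time(raw: str) -> datetime:
    # datetime.fromisoformat already raises ValueError on malformed
    # input; we rely on that instead of re-testing the library.
    return datetime.fromisoformat(raw)


# One test that a valid value flows through is plenty; we don't
# enumerate every malformed string the library already rejects.
def test_valid_timestamp_is_parsed():
    assert parse_event_time("2024-01-02T03:04:05").year == 2024
```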
Obviously, these things vary from one case to the other. So feel free to adapt these rules to your situation. For example, things are certainly different when using FastAPI vs. when using Django. So once again: "do -> measure -> learn -> adapt". Don't worry, things will become clearer as you progress through this course and see multiple different examples.
What Have You Learned?
In this chapter, we've discussed why we are testing in the first place: To keep the cost and risk of changes low. We've also reviewed what the properties of valuable tests are and the different types of tests that we usually hear about. In the next chapters, we'll look into several different techniques that can help us write valuable tests.