For more than half my career, I've been a practitioner of TDD, test-driven development: the practice of writing test code before writing production code. I strongly believe it is by far the most important skill any developer can learn.
Unfortunately, TDD is often poorly understood, even by many who practice it; yet even poorly practiced TDD is often beneficial. A challenge developers can face is that management may perceive TDD as an added cost, when the exact opposite is true: TDD reduces cost because it reduces waste.
Waste
Agile software development is sometimes compared to Lean Manufacturing, which is highly influenced by the Toyota Production System. In a 1988 booklet, Toyota describes it as:
The TPS is a framework for conserving resources by eliminating waste.
Examples of waste in a production facility could be excessive inventory, overproduction, moving or transporting parts or equipment, underutilised workers, idle time, relearning, and obviously, defective products.
While civil engineering and industrial production are both poor analogies for software development, the concept of "waste" carries over, although the sources of waste are different.
Some examples of waste in software development are: long-running abandoned branches (unproductive paths), merge conflict resolution, knowledge loss through handoffs, unnecessary complexity, and obviously, product defects (bugs).
Waste means added costs, and reduction of waste translates directly to reduced cost.
How TDD reduces waste
The type of waste that is commonly understood to be reduced by a good test strategy is product defects/bugs. Thorough testing may detect defects, but TDD can prevent them; many defects would never be committed to source control in the first place. A good TDD process can cover the majority of the required behaviour. TDD itself is not a complete solution to this problem, and there are tests worth writing after the code is written.1
But defects are not the only source of waste reduced by TDD.
Truncating an unproductive path
As developers, we are continuously making assumptions, often not realising it. We assume that we can use component or technique X to implement functionality Y. We assume that the path we take will solve the problem. We may write many lines of code assuming it has the intended effect.
Occasionally, the assumption is flawed, and we must adjust the current work-in-progress accordingly. Occasionally, the assumption is flat-out invalid: pursuing X was so misguided to begin with that all work based on the assumption is useless for the purpose of Y. The more work we have carried out, the more waste an unproductive path produces.
The way to reduce unproductive paths is to validate assumptions as early as possible. If an assumption is technical in nature, continuously executing snippets of code is often the most efficient way to validate it.
When using TDD, a developer will quickly work towards a simple working solution using X, initially with many aspects uncovered, and even with incomplete tests. TDD effectively validates the design.
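For example, if the assumption is that a third-party system behaves a particular way, a first "test" can simply exercise that one call. Here is a minimal sketch; the scenario and names are invented for illustration:

package assumption

import (
	"testing"
	"time"
)

// Assumption under validation: the vendor sends RFC 3339 timestamps,
// so time.Parse with time.RFC3339 should accept them.
func TestVendorTimestampsParseAsRFC3339(t *testing.T) {
	ts, err := time.Parse(time.RFC3339, "2024-01-02T15:04:05Z")
	if err != nil {
		t.Fatalf("assumed RFC 3339, but parsing failed: %v", err)
	}
	if ts.Year() != 2024 {
		t.Fatalf("unexpected year: %d", ts.Year())
	}
}

Running this on every save costs milliseconds; discovering the same flawed assumption after building a feature on top of it costs hours.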
Reduce time debugging
At the time of writing, I have not launched a debugger for 4 years! A debugger is a tool you may reach for when your code does not have the intended effect and you can't immediately identify why.
When working in small increments, running tests early, maybe even after one line of change at a time, you get immediate feedback on whether a change had the intended effect.
A poorly understood aspect of TDD is that it is a feedback tool. When practicing TDD, the first iteration of a "test" may be nothing like a test at all; the test itself will change while the feature is developed and will eventually mature into a test describing the desired behaviour. Just as the production code evolves slowly, so may the test code.
The granularity of the process can vary greatly. When taking the first step into a new area of the system, a very granular approach is likely. The more confident you are with the way forward, the greater the step you might take.
But the role of the test is to describe the next step you intend to take towards the desired behaviour, and to provide automated feedback that your work had the intended effect. And as issues are revealed early, debugging is rarely necessary; the issue is easy to identify.
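As a small sketch of what "describing the next step" can look like (the Slugify function is hypothetical, invented for illustration), the test below is written first, fails until the behaviour exists, and then remains as a regression check:

package slug

import (
	"strings"
	"testing"
)

// The next step I intend to take: spaces become hyphens, letters lowercase.
func TestSlugifyReplacesSpacesWithHyphens(t *testing.T) {
	if got := Slugify("Hello World"); got != "hello-world" {
		t.Fatalf("got %q, want %q", got, "hello-world")
	}
}

// The simplest implementation that makes the test pass.
func Slugify(s string) string {
	return strings.ReplaceAll(strings.ToLower(s), " ", "-")
}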
Decrease unnecessary complexity
TDD can help avoid unnecessary complexity. How effectively depends on how well the test suite facilitates refactoring. Developers or teams new to TDD have a tendency to write test suites closely coupled to implementation details, making the test suite resist refactoring. As developers gain more TDD experience, their test suites tend to describe the system at a higher level of abstraction, reflecting system behaviour rather than design decisions.
The test suite now serves as a safety harness facilitating refactoring. Developers who fully embrace this will happily commit a simple, naïve solution to the problem, deferring complex design decisions and refactoring to patterns until new behaviour dictates it.
Often, the elegant solution doesn't present itself immediately, and by being able to defer the design decision we buy ourselves time, increasing the likelihood of finding a simple solution before the features demand it.
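To illustrate the difference between coupling to implementation details and describing behaviour, here is a hedged sketch using a hypothetical shopping cart; both tests pass today, but only the second survives refactoring the internal slice into a map:

package cart

import "testing"

type Cart struct {
	items []item // internal detail; may later become a map
}

type item struct {
	sku      string
	quantity int
}

func (c *Cart) Add(sku string, quantity int) {
	c.items = append(c.items, item{sku, quantity})
}

func (c *Cart) TotalQuantity() int {
	total := 0
	for _, it := range c.items {
		total += it.quantity
	}
	return total
}

// Coupled to the implementation: breaks the moment items becomes a map.
func TestAddAppendsToItemsSlice(t *testing.T) {
	c := &Cart{}
	c.Add("sku-1", 2)
	if len(c.items) != 1 {
		t.Fatal("expected one entry in the items slice")
	}
}

// Behavioural: survives any refactoring that preserves the observable total.
func TestAddedItemsCountTowardsTotal(t *testing.T) {
	c := &Cart{}
	c.Add("sku-1", 2)
	if got := c.TotalQuantity(); got != 2 {
		t.Fatalf("expected total quantity 2, got %d", got)
	}
}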
Note: Behaviour does not necessarily mean business rules. Behaviour can be technical in nature. Access token renewal is a behaviour of the system. It's not a behaviour domain experts may care about, but they may care about how fault-tolerant the system is overall; making developers add new behaviour such as retrying failed attempts using an exponential backoff mechanism. Writing tests for this in a way that can describe interaction with an external system, as well as the passing of time, is almost essential.2 Manually testing access token expiration against a real identity provider is almost guaranteed to be flaky if you need to wait 2 hours for access tokens to expire.3
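One way to make the passing of time testable is to inject it as a dependency. The sketch below (a hypothetical retry helper, not production code) records the delays instead of sleeping, so the backoff schedule can be verified in milliseconds:

package retry

import (
	"errors"
	"testing"
	"time"
)

// retry calls op up to maxAttempts times, doubling the delay between
// attempts. The sleep function is injected so tests can observe the
// waits instead of actually waiting.
func retry(op func() error, maxAttempts int, base time.Duration, sleep func(time.Duration)) error {
	delay := base
	var err error
	for attempt := 0; attempt < maxAttempts; attempt++ {
		if err = op(); err == nil {
			return nil
		}
		if attempt < maxAttempts-1 {
			sleep(delay)
			delay *= 2
		}
	}
	return err
}

func TestRetriesWithExponentialBackoff(t *testing.T) {
	var waits []time.Duration
	calls := 0
	op := func() error {
		calls++
		if calls < 3 {
			return errors.New("transient failure")
		}
		return nil
	}
	err := retry(op, 5, 100*time.Millisecond, func(d time.Duration) {
		waits = append(waits, d) // record instead of sleeping
	})
	if err != nil {
		t.Fatalf("expected success after retries, got %v", err)
	}
	if len(waits) != 2 || waits[0] != 100*time.Millisecond || waits[1] != 200*time.Millisecond {
		t.Fatalf("unexpected backoff schedule: %v", waits)
	}
}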
Conclusion
TDD is often misunderstood. The word "test" is somewhat misleading, as TDD is much more than testing.
TDD is a process that incorporates instant feedback into the daily work of the developer. This helps reduce unproductive paths as a source of waste while coding.
An outcome of an effective TDD process is a test suite that helps detect behavioural regressions, reducing defects as a source of waste, while also facilitating refactoring.
This permits developers to choose the simplest possible solution initially, having confidence they can add complexity when changing requirements dictate it, reducing excessive complexity as a source of waste.
Effective TDD is a skill that needs to be honed like any other skill. Once mastered, teams practicing TDD are able to reduce cost due to the reduction of waste.
Appendix: Examples
These are two real examples from code bases I've worked on where a very granular process helped build the system and solve one problem at a time. In both cases, fast feedback was essential to moving forward quickly.
Discovering a wrong assumption with GORM
I was new to GORM, an ORM for Go. I started by creating a very simple database schema by hand: a single table with a UUID primary key column. I created an entity type in code where the ID was initially represented by a string, partly because Go doesn't natively have a UUID datatype.
To see if I could insert and read, I created a roundtrip test: a test that inserts the entity, then reads it back and verifies that the new copy is identical to the original. I wrote the following function to read an Entity:
func Get(id string) (entity Entity, err error) {
err = db.Take(&entity, id).Error
return
}
But the function didn't work! It returned an error, not an entity. After reading the documentation, I learned that when using a string identifier, a different syntax is necessary:
func Get(id string) (entity Entity, err error) {
err = db.Take(&entity, "id=?", id).Error
return
}
And this worked.
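The surrounding roundtrip test looked roughly like the sketch below. This is a reconstruction: the Name field stands in for the entity's other columns, and db is the package-level *gorm.DB from the snippets above.

package storage

import (
	"testing"

	"github.com/google/uuid"
)

func TestEntityRoundtrip(t *testing.T) {
	// Insert a fresh entity with a random UUID string as its primary key.
	original := Entity{ID: uuid.NewString(), Name: "example"}
	if err := db.Create(&original).Error; err != nil {
		t.Fatalf("insert failed: %v", err)
	}

	// Read it back and verify the copy is identical to the original.
	got, err := Get(original.ID)
	if err != nil {
		t.Fatalf("read failed: %v", err)
	}
	if got != original {
		t.Fatalf("roundtrip mismatch: got %+v, want %+v", got, original)
	}
}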
Because I created a test as a means of fast feedback, and started with just basic static functions and a hardcoded connection, I was able to identify my incorrect assumption immediately. This significantly reduced the debugging process, compared to detecting the issue in the context of a more complex business rule implementation and having to figure out where the problem lies.
Over time I would rearrange code, grouping functionality into a dedicated repository with configurable connection options. But code structure was a different problem, and as I was venturing into unknown territory, I used a very granular approach and a tight feedback cycle.
I first solved the problem of writing the actual lines of code that read and write data, before addressing the problem of how to organise code in a larger codebase.
First time working with RabbitMQ
The first time I had to work with RabbitMQ I had a similar problem. There were plenty of unknowns. How do I even connect? What is the valid connection string? How do I segregate data?
The first couple of iterations involved visual inspection in the admin UI. For example, I tried to send a message to a queue. Messages should not be sent directly to queues but to exchanges; however, that was a different problem to solve. The first queue was created manually in the RabbitMQ admin UI. I wrote the code to send a message to the queue and saved, letting the test framework run the code automatically. Then I manually inspected the admin UI to see that the message would appear.
After dealing with message publishing issues, I could then start to write code that would consume messages from a queue. The test would now send a message, and wait for one to appear. Next, I could add a verification step that the received message had an identical message body. I now had something that started to look like a test.
Now I would manually delete the queue I had created, and the test would of course fail. I wrote the code to create missing queues, and the test verified that the queue was created. Running the test again verified that creating the queue would not fail when it already existed.
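By this stage, the test looked roughly like the following sketch, using the github.com/rabbitmq/amqp091-go client; the connection URL and queue name are placeholders:

package messaging

import (
	"bytes"
	"context"
	"testing"
	"time"

	amqp "github.com/rabbitmq/amqp091-go"
)

func TestMessageRoundtrip(t *testing.T) {
	conn, err := amqp.Dial("amqp://guest:guest@localhost:5672/")
	if err != nil {
		t.Fatalf("connect: %v", err)
	}
	defer conn.Close()

	ch, err := conn.Channel()
	if err != nil {
		t.Fatalf("channel: %v", err)
	}
	defer ch.Close()

	// Declaring is idempotent: it creates the queue if missing and
	// succeeds if it already exists with the same properties.
	q, err := ch.QueueDeclare("test-roundtrip", false, true, false, false, nil)
	if err != nil {
		t.Fatalf("declare: %v", err)
	}

	body := []byte("hello")
	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel()
	if err := ch.PublishWithContext(ctx, "", q.Name, false, false,
		amqp.Publishing{Body: body}); err != nil {
		t.Fatalf("publish: %v", err)
	}

	msgs, err := ch.Consume(q.Name, "", true, false, false, false, nil)
	if err != nil {
		t.Fatalf("consume: %v", err)
	}
	// Wait for the message to come back and verify the body is identical.
	select {
	case msg := <-msgs:
		if !bytes.Equal(msg.Body, body) {
			t.Fatalf("body mismatch: %q", msg.Body)
		}
	case <-ctx.Done():
		t.Fatal("timed out waiting for message")
	}
}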
At that point in time, I no longer needed the admin UI. It had served as the best way to get feedback in the very early iterations, when I didn't have the code to read messages; writing both the publishing and consumption code in one sitting before getting any feedback would have required significantly more debugging time.
Up to this point in time, all code existed only in a test file. I had dealt with the problem of how to write the lines of code that actually interact with the RabbitMQ server. The test provided feedback measured in milliseconds, and now I could rearrange the code into a Publisher and a Subscriber component in the system. The tests would at the same time be reorganised to describe publishing messages through the Publisher and consuming them through the Subscriber, and every time I saved, a message would pass through RabbitMQ, confirming that the refactored code worked.
Finally, I could address the problem at a higher level of abstraction: how domain events in one area of the system eventually trigger another process in another part of the system. For the most part, domain logic would be tested using a mocked Publisher, verifying the domain logic alone.
Messages were still passed through RabbitMQ as part of the test suite. The actual message routing was controlled by RabbitMQ configuration, so if this wasn't tested, an essential part of the system behaviour would go untested.
Domain event publishing and handling worked reliably, and I never had to debug the code.
To this day, whenever I add RabbitMQ behaviour to a project, I follow the same path.
1. I will typically write TDD tests covering application and business logic as a whole, HTTP endpoints as a whole (described using HTTP headers, body, status codes, and mocking application logic), and data storage as a whole (using a real database). This leaves at least one type of test to add after the code is written: Do all layers connect properly? E.g., do IoC containers wire up dependencies correctly? Can a user complete a basic flow from login to checkout in a complete system?
2. Libraries like lolex for JavaScript, and the experimental synctest package in Go 1.24, allow tests to simulate the passing of time, verifying behaviour without actually having to wait.
3. This is the standard access token expiration for one major identity provider.