Sunday, March 18, 2012

Not all TDDs are created equal.

There is little or nothing that has not been said about TDD. As you probably know if you are a practitioner, there are lots of articles describing step by step the whole process behind it and arguably also the state of mind that TDD aims to promote. However, there is something that I find again and again when I go to Coding Dojos, events like the Global Day of Code Retreat, or even when I am pair programming or mentoring someone.

There seem to be, at least from what I have seen, two main ways to approach TDD. I'm going to call them the Holistic and Reductionist approaches. Both have their advantages and disadvantages, and since this comes up a lot with beginners and seasoned professionals alike, it seemed like a good idea to pay some attention to it and write this post.

So in order to understand it better, and just as you would when practicing yoga, music or anything else you want to get really good at, I started paying attention to the way I was doing things, particularly the behavior and thought processes I followed when practicing TDD from each of these perspectives.

Note: The rest of the post addresses a lot of abstract concepts, and I'm terrible at explaining myself. If you feel like this post would benefit from more examples, or needs some editing, let me know and I will add them or change it.

The Holistic approach.
What do I mean by "Holistic approach"? Holism is described as the idea that natural systems should be seen and studied as a whole and not as a collection of their parts. The underlying principle is that the whole is not just equal to the sum of its parts. 

This is mainly a top-down approach to solving the problem.

When you take this approach, you tend to focus on solving the problem at hand directly instead of its sub problems.

With this approach you think of things like outputting all prime numbers among the first n natural numbers starting at 2, instead of first working out how to find out whether a given number is prime, or how to output all numbers that satisfy a condition to the screen.
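To make the holistic starting point concrete, here is a minimal sketch in Python. The name primes_among_first and the trial-division loop are my own assumptions for illustration, not the only way to do it. The point is just that the very first test targets the whole problem, and nothing like a separate is_prime() exists yet.

```python
def primes_among_first(n):
    """Return the primes among the first n natural numbers starting at 2.

    Hypothetical name, written as one piece on purpose: the primality
    check is inlined because no sub problem has been extracted yet.
    """
    primes = []
    for candidate in range(2, 2 + n):
        # Trial division: candidate is prime if no d in 2..candidate-1 divides it.
        if all(candidate % d != 0 for d in range(2, candidate)):
            primes.append(candidate)
    return primes


def test_primes_among_first_ten():
    # The first test drives the whole feature: the numbers 2..11.
    assert primes_among_first(10) == [2, 3, 5, 7, 11]


test_primes_among_first_ten()
```

Notice that the test says nothing about how primality is decided. It only pins down the result you would actually deliver.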

What I have seen is that since you tend to focus on the functionality and results of the principal issue instead of the sub problems, you get results that you can use faster. Most of the time they are partial or incomplete results, but the bottom line is you get more "deliverable" value from the start. Things like: I can create a user, but the process doesn't check all fields that need to be validated, etc.

However, as a side effect, one tends to add "extras" to the code just to help with testing. I am talking about things like object properties that exist only so your unit tests can inspect inner state, or methods left virtual, when they could be marked as final, only because you need to override them in tests.
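A small sketch of what those "extras" can look like, again in Python and again with made-up names (PrimePrinter, checked, emit and CapturingPrinter are all hypothetical). Python has no final keyword by default, so the overridable emit() method stands in for the "virtual method kept only for tests" case.

```python
class PrimePrinter:
    """Prints the primes among the first n natural numbers starting at 2."""

    def __init__(self):
        self._checked = []  # inner state, not needed by any real caller

    @property
    def checked(self):
        # "Extra" number one: exists only so a unit test can inspect inner state.
        return list(self._checked)

    def emit(self, text):
        # "Extra" number two: kept overridable only so a test can capture output.
        print(text)

    def run(self, n):
        for candidate in range(2, 2 + n):
            self._checked.append(candidate)
            if all(candidate % d != 0 for d in range(2, candidate)):
                self.emit(str(candidate))


class CapturingPrinter(PrimePrinter):
    """Test-only subclass that records output instead of printing it."""

    def __init__(self):
        super().__init__()
        self.lines = []

    def emit(self, text):
        self.lines.append(text)


printer = CapturingPrinter()
printer.run(4)
assert printer.lines == ["2", "3", "5"]
assert printer.checked == [2, 3, 4, 5]
```

Neither the property nor the overridable method is needed by production code. They are there because the test had no other way in.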

Another common side effect I have noticed is that the frequent mantra of "the emerging architecture", although still true, usually kicks in only at the end, when you think you have a solid grip on your main problem and start refactoring your code.

The inconvenience I see with this is that if you don't refactor as aggressively and as often as you should, this "architecture" may not emerge at all, or it may be deficient. I am not sure if this is good or not, but I personally prefer having at least "some" architecture from the start over having no architecture or a purely "emerging" one, mainly because both of those can turn out to be a recipe for disaster if not everyone on your team is on the same page, for example when you have a junior developer on the team or not everyone has the same skill level in TDD or refactoring.

One last thing is that as you are developing, you tend to guard against your target problem's corner cases, but not the sub problems' corner cases. This may lead to bugs when the functionality gets refactored out or gets used by others, since they will tend to assume that it works as expected not just in the context of the main problem, but in the context of the sub problem on its own.

To give an idea, following the prime numbers example: if another coder wanted to use your is_prime() function, it would be reasonable for them to expect false when it gets passed 0 or 1. But if those values were not part of the main problem's constraints, the original developer may never have considered them, leading in turn to new bugs.
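Here is a minimal Python sketch of that trap. Both function bodies are my own assumption of what such code might look like (simple trial division), and the name is_prime_extracted is made up to tell the two versions apart.

```python
def is_prime_extracted(n):
    # Pulled out of a loop that only ever passed n >= 2, so nobody
    # asked what happens below 2. For n < 2 the range is empty and
    # all() of an empty sequence is True.
    return all(n % d != 0 for d in range(2, n))


def is_prime(n):
    # The same check, but treated as a problem in its own right:
    # 0, 1 and negative numbers are not prime.
    if n < 2:
        return False
    return all(n % d != 0 for d in range(2, n))


print(is_prime_extracted(0))  # True (the hidden bug)
print(is_prime_extracted(1))  # True (the hidden bug)
print(is_prime(0))            # False
print(is_prime(1))            # False
print(is_prime(7))            # True
```

Inside the original main problem both versions behave identically, which is exactly why the tests written for the main problem never catch the difference.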

The Reductionist approach.
Reductionism, on the other hand, is sometimes considered the opposite of Holism. In this case you try to understand the whole by looking at its parts and their interactions. Applied to TDD, this is inherently a bottom-up approach.

You focus on sub-problems of the problem and when you have solved them, you focus on the interactions. Finally, you refine the interactions so that the results combined provide the solution you are looking for.
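As a rough Python sketch of that order of work, using the prime numbers example: first each sub problem gets its own tests, then the interaction gets one. The names select_matching and primes_among_first are hypothetical, and the decomposition is just one assumption of how the pieces could be split.

```python
def is_prime(n):
    # Sub problem 1: decide whether a single number is prime.
    if n < 2:
        return False
    return all(n % d != 0 for d in range(2, n))


def select_matching(numbers, predicate):
    # Sub problem 2: keep the numbers that satisfy a condition.
    return [x for x in numbers if predicate(x)]


def primes_among_first(n):
    # The interaction: the first n natural numbers starting at 2,
    # filtered through the primality check.
    return select_matching(range(2, 2 + n), is_prime)


def test_is_prime_corner_cases():
    assert not is_prime(0)
    assert not is_prime(1)
    assert is_prime(2)
    assert not is_prime(9)


def test_select_matching():
    assert select_matching([1, 2, 3, 4], lambda x: x % 2 == 0) == [2, 4]
    assert select_matching([], is_prime) == []


def test_pieces_work_together():
    # Integration-style test: validates how the parts are combined.
    assert primes_among_first(10) == [2, 3, 5, 7, 11]


test_is_prime_corner_cases()
test_select_matching()
test_pieces_work_together()
```

The corner cases for 0 and 1 show up naturally here because is_prime() is tested on its own, before anyone knows which inputs the final feature will pass to it.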

Contrary to holism, in this case architecture emerges from the start, driven by the need to decouple particular sub problems. For me, though, this approach makes it easy to lose focus on what your general aim is. It requires a great deal of self-control and experience to know when enough is enough and you should move on to another area.

However, code gets tested more thoroughly. Corner cases for each sub problem are more evident and they tend to be addressed from the start. So the problems that arise when the clients of your API make assumptions about how the code works diminish a great deal.

Nevertheless, focusing on the sub problems usually narrows your thinking. As a consequence, you need to write more integration tests to validate that the sub parts work together the way they are supposed to be used. On the other hand, you tend to write a more developer-friendly API, because you are always thinking as the client of your API when you are writing the unit tests.

This way of doing things seems to work better when you happen to have a roadmap or well-defined plan of how the pieces fit together. For instance, when you are implementing algorithms, which are usually decomposed into very specific sub problems.

Well, this is most of what I have observed myself, combined with a lot of input from conversations I have had with friends. In the end, to me at least, no one method is applicable to every problem. In my day-to-day job I usually jump from one mode to the other depending on how it "feels". I'm very interested in hearing what others have to say about this subject, so please leave a comment below.

Until next time and happy coding!!
