A Philosophy of Software Design: Complexity
This is the first post in a series about the book A Philosophy of Software Design. The book proposes one more way to approach software design. It isn’t the one way that must be followed, but it isn’t one to avoid either. As I said, it is just one more way. A guide. Enjoy.
If you have been around the tech industry for a while, you have probably had discussions with your colleagues about how to build software and how to create layers between classes. There are many approaches. One of the most common is to simplify each class, at an atomic level, down to its responsibility. It does not hesitate to extend classes or implement interfaces to get there. On one side, it makes it easier to split responsibilities into modules. On the other, it might increase complexity at the big-picture level. For example, when a method or class gets too big, this approach says it should probably be split. Although this can produce simpler code in isolation, once the pieces are linked into a whole process it might increase the cognitive load for others trying to understand it. A Philosophy of Software Design starts out by swimming in the opposite direction of that proposal.
The book's proposal starts by properly defining what complexity is, because that is what software design is all about: complexity. The book defines complexity as “anything related to the structure of a software system that makes it hard to understand and modify the system”. The goal of every software engineer should be to reduce the complexity of their code as much as possible. This not only makes their own job easier, when refactoring for example, but every engineer who comes after will also need less time to ramp up on the code. However, as a software engineer, I know it isn’t an easy task. Most of the time, business logic keeps growing and you have to patch it into the existing code. Add time restrictions for delivering a given task and there you go: complexity increases. The only time one might have control over complexity is when delivering a brand new feature. And please, focus on MIGHT. If time constraints are in place, this MIGHT not happen. The following are some factors that are supposed to signal that the complexity of a system is increasing.
Change amplification
It indicates the number of places that must change when a modification is made at a given point of the code. Following the clean code approach, building a complex linked set of classes often means also creating factories and mappers. Suppose we introduce a new field at an entry point of the system. We need to modify every one of the class types mentioned above. This increases the change amplification factor. In the end, the approach is a trade-off: it decreases complexity at an atomic level, but increases complexity in the big picture.
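To make the idea concrete, here is a minimal Python sketch of what adding one field can look like. All names (UserRequest, UserEntity, UserMapper, UserFactory) are hypothetical and only illustrate the shape of the problem; they are not taken from the book.

```python
from dataclasses import dataclass


@dataclass
class UserRequest:          # entry point payload (hypothetical)
    name: str
    email: str              # (1) the new field is added here


@dataclass
class UserEntity:           # persistence model (hypothetical)
    name: str
    email: str              # (2) ...and repeated here


class UserMapper:
    @staticmethod
    def to_entity(request: UserRequest) -> UserEntity:
        # (3) ...and copied by hand here
        return UserEntity(name=request.name, email=request.email)


class UserFactory:
    @staticmethod
    def create(name: str, email: str) -> UserRequest:
        # (4) ...and threaded through the factory signature
        return UserRequest(name=name, email=email)


entity = UserMapper.to_entity(UserFactory.create("Ada", "ada@example.com"))
print(entity.email)
```

One conceptual change ("users have an email") turned into four edits in four classes. Each class is simple on its own, which is exactly the trade-off described above.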
Cognitive load
One of the principles recommended by clean code is that “each class or method should do one thing and one thing only”. If that phrase is taken to the extreme, we get the cognitive load issue. A main method that has to query the database, compute some information and save the new data should be split into at least 3 methods according to clean code. How does this book approach the same issue? Most of the time, complexity is measured by the number of lines of code. However, we should not focus on lines of code. An engineer should consider each layer of the application as an API. Each layer doesn’t need to know about the implementation of the next or the previous layer. Neither does the person coding it. However, this doesn’t mean you should have multiple classes to represent each layer. Imagine each layer as a feature or capability of the system. Focus on context awareness in each layer. If the context changes between layers, maybe you need another representation of its capabilities.
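As a rough sketch of a layer treated as an API, the Python class below keeps the query, compute and save steps behind a single method. The class name OrderTotals, its refresh method and the in-memory dictionaries standing in for a database are all assumptions made for this illustration.

```python
from __future__ import annotations


class OrderTotals:
    """A layer seen as an API: callers only need to know `refresh`."""

    def __init__(self, orders: dict[str, list[float]]):
        self._orders = orders                    # stand-in for a database table
        self._totals: dict[str, float] = {}      # stand-in for saved results

    def refresh(self, customer_id: str) -> float:
        amounts = self._orders.get(customer_id, [])   # query
        total = round(sum(amounts), 2)                # compute
        self._totals[customer_id] = total             # save
        return total


totals = OrderTotals({"c1": [10.0, 5.5]})
print(totals.refresh("c1"))  # prints 15.5
```

The caller has a single thing to keep in mind, which is what keeps the cognitive load low. Whether the three steps inside deserve their own methods is a judgment call about that layer's context, not a rule.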
Unknown unknowns
I once worked with a system that used a factory every time it needed to load a class into the application. The factory had a config class linked to it. The config class loaded its parameters from a YAML config file. If a new engineer joined the team and was asked to change a piece of code in that chain, they had to be aware that the change would impact at least 3 other classes. Despite following the single-responsibility principle, this chain increased the number of hidden changes required when modifying a single place in it.
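The Python sketch below is a loose reconstruction of that kind of chain, not the actual system. The names (RAW_CONFIG, Config, CsvExporter, ExporterFactory) are made up, and a plain dictionary stands in for the parsed YAML file so the example runs without extra libraries.

```python
# Stand-in for the parsed YAML config file (hypothetical content).
RAW_CONFIG = {"exporter": "csv"}


class Config:
    def __init__(self, raw: dict):
        # Hidden link 1: the key name must match the config file.
        self.exporter = raw["exporter"]


class CsvExporter:
    def export(self, rows):
        return "\n".join(",".join(str(cell) for cell in row) for row in rows)


class ExporterFactory:
    # Hidden link 2: these keys must match the values written in the file.
    _registry = {"csv": CsvExporter}

    def __init__(self, config: Config):
        self._config = config

    def create(self):
        return self._registry[self._config.exporter]()


exporter = ExporterFactory(Config(RAW_CONFIG)).create()
print(exporter.export([(1, "a"), (2, "b")]))
```

Nothing in CsvExporter tells a newcomer that the registry key and the config file also have to change together. A config value such as "json" only fails at runtime, with a KeyError raised inside the factory.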
What these 3 factors tell us is that complexity is intrinsically related to dependency. When we define method signatures, for example, we create dependencies between layers. When we create a class to be used by another, we create a dependency. Dependencies cannot be eliminated completely. However, our job is to reduce how tightly things are tied together. One change should not impact a chain of other classes. A new field, for example, should not break an unrelated piece of code. That obscurity between changes can be avoided. The design should be loose enough that each part can be well understood in isolation, but also clear enough that the required changes are visible when needed. Finding this balance is not simple and is tough to reach. But it should be the main objective.
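Here is one small way to picture a looser dependency, as a Python sketch with hypothetical names (Order, format_receipt, format_receipt_fragile). The first function depends only on the attributes it reads; the second depends on the exact number and order of fields.

```python
from dataclasses import dataclass


@dataclass
class Order:
    order_id: str
    amount: float
    currency: str = "USD"   # field added later, with a default


def format_receipt(order: Order) -> str:
    # Depends only on the two attributes it reads.
    return f"{order.order_id}: {order.amount:.2f}"


def format_receipt_fragile(row: tuple) -> str:
    # Depends on the full shape of the row: a third field breaks the unpacking.
    order_id, amount = row
    return f"{order_id}: {amount:.2f}"


print(format_receipt(Order("A1", 9.5)))  # prints A1: 9.50
```

Adding currency leaves format_receipt untouched, while format_receipt_fragile(("A1", 9.5, "USD")) raises a ValueError even though receipts have nothing to do with currencies. The dependency still exists in both versions; only how tightly it is tied differs.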
Complexity is not a one-time thing. It is the sum of minor unresolved issues. The backlog keeps growing and growing. And new features keep coming. Refactors are always in the background. Engineers should know when a refactor is needed and propose it when the time comes. Otherwise, the point of no return can be reached, where a new refactor is so complex that it is no longer feasible.
Next up: A Philosophy of Software Design: Modules deepness
About the book: A Philosophy of Software Design.