14 March 2009

Law of Code Quality: Consistency

Imagine you have some MDBs (Message Driven Beans). You want to get rid of them because they are still EJB2 and they suck. You want to use Spring's JMS capabilities instead. Sounds quite good. But after some time, due to some budget problem or pending deadline, you stop converting the old stuff "because it does not add value" (which is almost guaranteed to happen). So you end up having MDBs and Spring JMS mixed throughout the code, and maintenance people have to know both of them. It's difficult enough to know one of them, but now you are stuck with both solutions jumbled together in the long run.

Copy & Paste
In brownfield development you always have some code before you start. The look, quality and design of this code are very important for further development. Extending complex legacy stuff often involves a copy-and-paste style of coding. (Whether this is preferable is a story in its own right and will not be discussed here.) In the context of maintenance I consider copy and paste a good habit, because this way the existing conventions and patterns are obeyed even if they are not documented. Of course the positive effects of copy and paste are only achieved if the "right" piece of the system is copied. Like reference solutions in generator development, aka templates, the copied piece has to be of the highest quality, i.e. it has to follow the conventions and guidelines defined for the application.

Implicit Conventions and Uniformity
If the guidelines and conventions of some software are not written down properly, the only documentation is the code itself. Even if there is decent documentation, capturing all aspects of software development is very difficult at best and may not be possible at all. There are always some implicit conventions that are only available in the code. The more uniformly an application follows these conventions and designs, the more pieces of it serve as "safe" templates for further extensions and modifications. This helps new members of the project to find their way around. Good (maintenance) developers see these implicit conventions in the surrounding style and patterns, adapt to them and work according to them. Uniform code makes it easier for them to adapt to the new code base and "get up to speed".

Broken Windows
As the Pragmatic Programmers taught us in rule 4, broken windows cause problems in real as well as virtual life. If the conventions are visible and clear, who would dare to stand out by breaking them? On the other hand, if e.g. a piece of code is formatted in three different ways, there is no shame in introducing a fourth style. People tend to stick with the things they know, because it is faster and feels more secure. (And usually we think them superior to things we do not know.) This leads to patchworks in the code. This mixture gets maintained (read: copied) from time to time, so all the broken windows spread throughout the code base, and the differences live on and grow.

So I postulate the 1st Law of Code Quality - Code Consistency. This applies to source layout, naming and other coding conventions, typical code fragments (also called idioms, most of the time some boilerplate code), design concerns, layering, architecture, the libraries and technologies used etc. Consistency in the code is the most important issue. Failing to have a consistent code base will cause all the troubles known from mixed designs and mixed technologies, making it difficult to maintain the code and to get new people into the team.

Living with Changes
Of course we have to change things again and again. It's easy to have consistency in simple conventions like formatting: just run the Eclipse formatter on the whole source tree and put Checkstyle into your build. Short code idioms, like getting a database connection, are more difficult. These are rarely documented, but once you have them unified, a Ruby or Groovy script can do almost any syntactical change to your code using powerful regular expressions. More complex changes, e.g. replacing EJB 2 with something else, are more involved. However, just don't make the mistake of converting only part of the code and leaving the rest of the old stuff as it is. No excuses about small budgets, needed retests, pending deadlines and such! If you can't convert the old stuff, leave it as it is and don't bring in the new technology at all. If you are too "weak" to change it, you have earned staying with it. Bringing in new technologies needs a strong plan, or better a script, to convert everything existing to the new style. All remaining code has to be changed to use the new technologies as they are supposed to be used. Typical idioms and best practices of that technology should be obvious at the end. You don't want a Java program to be coded like old C: a few classes with lots of static, monolithic methods, static data etc. The same is true for any refactoring.
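
To give an idea of what such a scripted idiom change could look like, here is a minimal sketch. It is written in Java instead of Ruby or Groovy just to stay in the language of the code base. The old idiom (a direct DriverManager.getConnection call) and the new one (a ConnectionFactory.open() helper) are assumptions made up for illustration; your own idioms will look different, and a plain regular expression like this one will not cope with nested parentheses or calls spread over several lines.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class IdiomRewriter {

    // Assumed old idiom: DriverManager.getConnection(...) with simple
    // arguments (no nested parentheses, all on one line).
    private static final Pattern OLD_IDIOM =
            Pattern.compile("DriverManager\\.getConnection\\([^()]*\\)");

    // Assumed new idiom: a hypothetical project-wide helper.
    private static final String NEW_IDIOM = "ConnectionFactory.open()";

    /** Replaces every occurrence of the old idiom in the given source text. */
    public static String rewrite(String source) {
        return OLD_IDIOM.matcher(source)
                .replaceAll(Matcher.quoteReplacement(NEW_IDIOM));
    }

    public static void main(String[] args) {
        String before =
                "Connection c = DriverManager.getConnection(URL, USER, PASSWORD);";
        System.out.println(rewrite(before));
    }
}
```

Run over every file of the source tree in one go, a script like this leaves no old variant behind that somebody could copy later.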

A Brighter Future?
To keep your code base consistent you need help. Especially in the beginning you need someone who flags new inconsistencies and reminds you of the conventions. Use static code analysis as soon as you have identified a consistency target. (That might be as simple as grep *.java.) The proper (consistent) way to do something should be enforced from day one. Unfortunately documenting it is not enough. For layout use tools like Checkstyle. Other conventions and boilerplate code can be checked with tools that support custom rule definitions, e.g. FindBugs or PMD. Nowadays many tools come with a large number of base rules that cover common stuff. Most likely some of them are of use. Modularity and layering are enforced with reference checking, e.g. Macker or SonarJ. With some imagination (and enough computing power on your build machine) you can create quite sophisticated checks.
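
A check on the "as simple as grep" level could look roughly like the following sketch. It is not a replacement for Checkstyle or PMD rules. The forbidden pattern (a direct DriverManager.getConnection call) and the default source folder "src" are assumptions chosen for illustration only, and the files are assumed to be readable as UTF-8.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class ConsistencyCheck {

    // Assumed consistency target: nobody opens connections directly any more.
    private static final Pattern FORBIDDEN =
            Pattern.compile("DriverManager\\.getConnection\\(");

    /** Returns the 1-based numbers of all lines containing the forbidden idiom. */
    public static List<Integer> findViolations(List<String> lines) {
        List<Integer> hits = new ArrayList<>();
        for (int i = 0; i < lines.size(); i++) {
            if (FORBIDDEN.matcher(lines.get(i)).find()) {
                hits.add(i + 1);
            }
        }
        return hits;
    }

    public static void main(String[] args) throws IOException {
        // The source root is the first argument, "src" is only a guess.
        Path root = Paths.get(args.length > 0 ? args[0] : "src");
        List<Path> javaFiles;
        try (Stream<Path> paths = Files.walk(root)) {
            javaFiles = paths
                    .filter(p -> p.toString().endsWith(".java"))
                    .collect(Collectors.toList());
        }
        int total = 0;
        for (Path file : javaFiles) {
            for (int line : findViolations(Files.readAllLines(file))) {
                System.out.println(file + ":" + line
                        + ": use the agreed connection idiom");
                total++;
            }
        }
        // A non-zero exit code lets the build fail on new inconsistencies.
        if (total > 0) {
            System.exit(1);
        }
    }
}
```

Hooked into the build, even such a crude check reminds everybody of the convention the moment it is broken, not weeks later in a review.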
