11 January 2018

Compliance with Object Calisthenics

During my work as Code Cop I run many workshops. Sometimes I use constraints to make exercises more focused or more intense. Some constraints, like the Brutal Coding Constraints, are composite or aggregate constraints, which means that they are a combination of several simpler or low level constraints. (Here simple does not mean that they are easy to follow, rather that they focus on a single thing.) Today I want to discuss Object Calisthenics.

Object Calisthenics
Jeff Bay's Object Calisthenics is an aggregate constraint combined of the following nine rules:
  1. Use only one level of indentation per method.
  2. Don't use the else keyword.
  3. Wrap all primitives and strings (in public API).
  4. Use only one dot per line.
  5. Don't abbreviate (long names).
  6. Keep all entities small.
  7. Don't use any classes with more than two instance variables.
  8. Use first-class collections.
  9. Don't use any getters/setters/properties.
If you are not familiar with these rules, I recommend you read Jeff Bay's original essay published in The ThoughtWorks Anthology in 2008. William Durand's post is an exact copy of that essay, so no need to buy the book for that. Several people have followed up and discussed their interpretation and experience with Object Calisthenics, e.g. Mark Needham, Jeff Pace, Vasiliki Vockin and Juan Antonio.

Object Calisthenics is an exercise in object orientation. How is that? One of the core concepts of OOP is Abstraction: A class should capture one and only one key abstraction. Obviously primitive and built-in types lack abstraction. Wrapping primitives (rule #3) and wrapping collections (rule #8) drive the code towards more abstractions. Small entities (rule #6) help to keep our abstractions focused.
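As a minimal sketch of how rules #3 and #8 push code towards abstractions, consider the following. The class names Points and Scores are hypothetical and chosen only for illustration; they are not taken from Jeff Bay's essay.

```java
import java.util.ArrayList;
import java.util.List;

// Rule #3: the raw int is wrapped in a small domain type (hypothetical name).
final class Points {
    private final int value;

    Points(int value) {
        this.value = value;
    }

    Points plus(Points other) {
        return new Points(value + other.value);
    }

    boolean isAtLeast(Points other) {
        return value >= other.value;
    }
}

// Rule #8: the collection is the only field of its own class,
// so behaviour related to the collection has an obvious home.
final class Scores {
    private final List<Points> entries = new ArrayList<>();

    void add(Points points) {
        entries.add(points);
    }

    Points total() {
        Points sum = new Points(0);
        for (Points entry : entries) {
            sum = sum.plus(entry);
        }
        return sum;
    }
}
```

Callers never see an int or a List; they work with Points and Scores only, for example `scores.total().isAtLeast(new Points(7))`.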

Further, objects are defined by what they do, not what they contain. All data must be hidden within its class. This is Encapsulation. Rule #9, No Properties, forces you to stay away from accessing the fields from outside of the class.

Next to Abstraction and Encapsulation, these nine rules help Loose Coupling and High Cohesion. Loose Coupling is achieved by minimizing the number of messages sent between a class and its collaborator. Rule #4, One Dot Per Line, reduces the coupling introduced by a single line. This rule is misleading, because what Jeff really meant was the Law Of Demeter. The law is not about counting dots per line, it is about dependencies: "Only talk to your immediate friends." Sometimes even a single dot in a line will violate the law.
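The difference between counting dots and the Law Of Demeter can be sketched as follows. Car, Engine and Driver are hypothetical types invented for this illustration. The first Driver method uses only one dot per line and still reaches through its argument to a stranger.

```java
// Hypothetical domain types, used only to illustrate the Law Of Demeter.
final class Engine {
    private boolean running;

    void start() {
        running = true;
    }

    boolean isRunning() {
        return running;
    }
}

final class Car {
    private final Engine engine = new Engine();

    // Hands out an internal part; callers who use it talk to a stranger.
    Engine engine() {
        return engine;
    }

    // Car talks to its own immediate part, which the law allows.
    void start() {
        engine.start();
    }

    boolean isStarted() {
        return engine.isRunning();
    }
}

final class Driver {
    // One dot per line, but still a violation: 'engine' is neither a field
    // of Driver, nor an argument, nor created here. It was handed out by 'car'.
    void driveReachingThrough(Car car) {
        Engine engine = car.engine();
        engine.start();
    }

    // Compliant: the driver only sends a message to its argument.
    void drive(Car car) {
        car.start();
    }
}
```

Both methods start the car, but only drive keeps Driver independent of how Car is built internally.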

High Cohesion means that related data and behaviour should be in one place. This means that most of the methods defined on a class should use most of the data members most of the time. Rule #5, Don't Abbreviate, addresses this: When a name of a field or method gets long and we wish to shorten it, obviously the enclosing scope does not provide enough context for the name, which means that the element is not cohesive with the other elements of the class. We need another class to provide the missing context. Next to naming, small entities (rule #6) have a higher probability of being cohesive because there are fewer fields and methods. Limiting the number of instance variables (rule #7) also keeps cohesion high.

The remaining rules #1 and #2, One Level Of Indentation and No else, aim to make the code simpler by avoiding nested code constructs. After all, who wants to be a PHP Street Fighter? ;-)
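A small sketch of what rules #1 and #2 ask for. The Discount class and its ten percent member discount are made up for this example, and primitives are kept for brevity, so the sketch deliberately ignores rule #3.

```java
// Hypothetical example: a ten percent discount for members.
final class Discount {
    // Before: two levels of indentation and an else branch.
    static int nested(int[] prices, boolean member) {
        int total = 0;
        for (int price : prices) {
            if (member) {
                total += price * 90 / 100;
            } else {
                total += price;
            }
        }
        return total;
    }

    // After: the loop body is extracted, so each method nests only once.
    static int flat(int[] prices, boolean member) {
        int total = 0;
        for (int price : prices) {
            total += priceFor(price, member);
        }
        return total;
    }

    // The early return replaces the else branch.
    private static int priceFor(int price, boolean member) {
        if (member) {
            return price * 90 / 100;
        }
        return price;
    }
}
```

Both variants compute the same result; the second one spreads the decisions over two small methods.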

Checking Code for Compliance with Object Calisthenics
When facilitating coding exercises with composite constraints, I noticed how easy it is to overlook certain violations. We are so used to conditionals or dereferencing pointers that we might not notice them when reading code. Some rules like the Law Of Demeter or a maximum size of classes need a detailed inspection of the code to verify. To check Java code for compliance with Object Calisthenics I use PMD. PMD contains several rules we can use:
  • Rule java/coupling.xml/LawOfDemeter for rule #4.
  • Rule #6 can be checked with NcssTypeCount. An NCSS count of 30 is usually around 50 lines of code.
    <rule ref="rulesets/java/codesize.xml/NcssTypeCount">
        <properties><property name="minimum" value="30" /></properties>
    </rule>
  • And there is TooManyFields for rule #7.
    <rule ref="rulesets/java/codesize.xml/TooManyFields">
        <properties><property name="maxfields" value="2" /></properties>
    </rule>
I work a lot with PMD and have created custom rules in the past. I added rules for Object Calisthenics. At the moment, my Custom PMD Rules contain a rule set file object-calisthenics.xml with these rules:
  • java/constraints.xml/NoElseKeyword is very simple. All else keywords are flagged by the XPath expression //IfStatement[@Else='true'].
  • java/codecop.xml/FirstClassCollections looks for fields of known collection types and then checks the number of fields.
  • java/codecop.xml/OneLevelOfIntention
  • java/constraints.xml/NoGetterAndSetter needs a more elaborate XPath expression. It is checking MethodDeclarator and its inner Block/ BlockStatement/ Statement/ StatementExpression/ Expression/ PrimaryExpressions.
  • java/codecop.xml/PrimitiveObsession is implemented in code. It checks PMD's ASTConstructorDeclaration and ASTMethodDeclaration for primitive parameters and return types.
For the nitty-gritty details of all the rules have a look at codecop.xml and constraints.xml.

Automatic Interpretation of Rules: Indentation
When I read Jeff Bay's original essay, the rules were clear. At least I thought so. Verifying them automatically showed some areas where different interpretations are possible. Different people see Object Calisthenics in different ways. In comparison, the Object Calisthenics rules for PHP_CodeSniffer implement One Level Of Indentation by allowing a nesting of one. For example, there can be conditionals and there can be loops, but no conditional inside of a loop. So the code is either formatted at method level or indented one level deep. My PMD rule is more strict: Either there is no indentation - no conditional, no loop - or everything is indented once: for example, if there is a loop, then the whole method body must be inside this loop. This does not allow more than one conditional or loop per method. My rule follows Jeff's idea that each method does exactly one thing. Of course, I like my strict version, while my friend Aki Salmi said that I went too far as it is more like Zero Level Of Indentation. Probably he is right and I will recreate this rule and keep the Zero Level Of Indentation for the (upcoming) Brutal version of Object Calisthenics. ;-)
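To make the two readings concrete, here is a sketch based only on the description above, not on the actual rule implementations. The Report class is hypothetical. The first method should pass the relaxed reading (nothing is nested deeper than one level) but fail the strict one, because a loop and a conditional share the same method.

```java
import java.util.List;

// Hypothetical class used to contrast the two readings of rule #1.
final class Report {
    private final StringBuilder lines = new StringBuilder();

    // Relaxed reading: fine, the deepest nesting is one level.
    // Strict reading: not allowed, a loop and a conditional share one method.
    void printRelaxed(List<String> names) {
        for (String name : names) {
            lines.append(name).append('\n');
        }
        if (names.isEmpty()) {
            lines.append("(none)\n");
        }
    }

    // Strict reading: this method is flat, it only delegates.
    void printStrict(List<String> names) {
        appendAll(names);
        appendPlaceholderIfEmpty(names);
    }

    // Strict reading: the whole body is inside the single loop.
    private void appendAll(List<String> names) {
        for (String name : names) {
            lines.append(name).append('\n');
        }
    }

    // Strict reading: the whole body is inside the single conditional.
    private void appendPlaceholderIfEmpty(List<String> names) {
        if (names.isEmpty()) {
            lines.append("(none)\n");
        }
    }

    String text() {
        return lines.toString();
    }
}
```

The strict version ends up with three tiny methods that each do one thing, which is the effect the stricter interpretation is after.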

Wrap All Primitives
There is no PHP_CodeSniffer rule for that, as Tomas Votruba considers it "too strict, vague or annoying". Indeed, this rule is very annoying if you use primitives all the way and your only data structure is an associative array or hash map. All containers like java.util.List, Set or Map are considered primitive as well. Samir Talwar said that every type that was not written by yourself is primitive because it is not from your domain. This prohibits the direct usage of Files and URLs to name a few, but let's not go there. (Read more about the issue of primitives in one of my older posts.)

My rule allows primitive values in constructors as well as getters to implement the classic Value Object pattern. (The rule's implementation is simplistic and it is possible to cheat by passing primitives to constructors. And the getters will be flagged by rule #9, so no use for them in Object Calisthenics anyway.)

I agree with Tomas that this rule is too strict, because there is no point in wrapping primitive payloads, e.g. strings that are only displayed to the user and not acted on by the system. These will be false positives. There are certain methods with primitives in their signatures like equals and hashCode that are required by Java. Further, we might have plain numbers in our domain or use indexing of some sort; both will be false positives, too.

One Dot Per Line
As I said before, I use PMD's LawOfDemeter to verify rule #4. The law allows sending messages to objects that are
  • the immediate parts of this or
  • the arguments of the current method or
  • objects created inside the current method or
  • objects in global variables.
I did not look at PMD's source code to check the implementation of this rule - but it complains a lot. For me this is the most difficult rule of all nine rules. (I code according to #1, #3, #5 and #6 and can easily adapt to strictly follow #2, #7, #8 and #9.) Although it complains a lot, I found every violation correct. I learned much about Law Of Demeter by checking my code for violations. For example, calling methods on an element of an array is a violation. The indexed array access is similar to a pointer access. (In Ruby this is obvious because Array defines a method def [](index).) Another interesting fact is that (at least in PMD) the law flags calling methods on enums. The enum instances are not created locally, so we cannot send them messages. On the other hand, an enum is a global variable, so maybe it should be allowed to call methods on it.

The PHP_CodeSniffer rule follows the rule's name and checks that there is only one dot per line. This creates better code, because train wrecks will be split into explaining variables which make debugging easier. Also Tomas is checking for known fluent interfaces. Fluent interfaces - by definition - look like they are violating the Law Of Demeter. As long as the fluent interface returns the same instance, as for example basic builders do, there is no violation. When following a more relaxed version of the law, the Class Version of Law Of Demeter, then different implementations of the same type are still possible. The Java Stream API, where many calls return a new Stream instance of a different class - or the same class with a different generic type - is likely to violate the law. It does not matter. Fluent interfaces are designed to improve readability of code. Law Of Demeter violations in fluent interfaces are false positives.
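A basic builder that returns the same instance might look like the following sketch. The Greeting class is hypothetical. A chain of calls on it has several dots on one line, yet every message goes to the same object.

```java
// Hypothetical fluent builder: every step returns 'this'.
final class Greeting {
    private final StringBuilder text = new StringBuilder();

    Greeting hello() {
        text.append("Hello");
        return this;
    }

    Greeting to(String name) {
        text.append(", ");
        text.append(name);
        return this;
    }

    String build() {
        return text.toString();
    }
}
```

A call like `new Greeting().hello().to("World").build()` would be reported by a pure dot counter, although the caller only ever talks to the Greeting it created itself.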

Don't Abbreviate
I found it difficult to check for abbreviations, so rule #5 is not enforced. I thought of implementing this rule using a dictionary, but that is prone to false positives as the dictionary cannot contain all terms from all domains we create software for. The PHP_CodeSniffer rules check for names shorter than three characters and allow certain exceptions like id. This is a good start but is not catching all abbreviations, especially as the need to abbreviate arises from long names. Another option would be to analyse the name for its camel case patterns, requiring all names to contain lowercase characters between the uppercase ones. This would flag acronyms like ID or URL but not real abbreviations like usr or loc.
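The two heuristics discussed above can be sketched in plain Java. This is not the PHP_CodeSniffer implementation and not a PMD rule; the class NameHeuristics and its exception list are assumptions for illustration. The sketch also shows the limitation: usr passes both checks.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.regex.Pattern;

// Hypothetical helper sketching two name checks for rule #5.
final class NameHeuristics {
    // Assumed exception list; the text above only mentions "id".
    private static final Set<String> ALLOWED_SHORT = new HashSet<>(Arrays.asList("id"));

    // Two or more consecutive uppercase letters, as in "ID" or "URL".
    // Meant for camel case names; constants like MAX_SIZE need separate handling.
    private static final Pattern UPPERCASE_RUN = Pattern.compile("[A-Z]{2,}");

    // Length check: names shorter than three characters, minus exceptions.
    static boolean isTooShort(String name) {
        return name.length() < 3 && !ALLOWED_SHORT.contains(name);
    }

    // Camel case check: uppercase letters without lowercase ones in between.
    static boolean containsAcronym(String name) {
        return UPPERCASE_RUN.matcher(name).find();
    }
}
```

With these checks, `x` is too short and `userID` contains an acronym, while `usr` and `loc` are accepted by both, which is exactly the gap described above.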

Small Entities
Small is relative. Different people use different limits depending on programming language. Jeff Bay's 50 lines work well for Java. Rafael Dohms proposes to use 200 lines for PHP. PHP_CodeSniffer checks function length and number of methods per class, too. Fabian Schwarz-Fritz limits packages to ten classes. All these additional rules follow Jeff Bay's original idea and I will add them to the rule set in the future.

Two Instance Variables
Allowing only two instance variables seems arbitrary - why not have three or five? Some people have changed the rules to allow five fields. I do not see how the choice of language makes a difference. Two is the smallest number that allows composition of object trees.

In PHP_CodeSniffer there is no rule for this because the number depends on the "individual domain of each project". When an entity or value object consists of three or more equal parts, the rule will flag the code but there is no problem. For example, a class BoundingBox might contain four fields top, left, bottom, right. Depending on the values, introducing a new wrapper class Coordinate to reduce these fields to topLeft and bottomRight might make sense.
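The BoundingBox refactoring mentioned above could look like this sketch. The method names and the assumption of screen coordinates (x grows to the right, y grows downwards) are mine and only serve the illustration.

```java
// Wrapper introduced to reduce four fields to two (rule #7).
// Assumes screen coordinates: x grows to the right, y grows downwards.
final class Coordinate {
    private final int x;
    private final int y;

    Coordinate(int x, int y) {
        this.x = x;
        this.y = y;
    }

    // True if this point is not left of and not above the other point.
    boolean isAtOrBeyond(Coordinate other) {
        return x >= other.x && y >= other.y;
    }
}

final class BoundingBox {
    private final Coordinate topLeft;
    private final Coordinate bottomRight;

    BoundingBox(Coordinate topLeft, Coordinate bottomRight) {
        this.topLeft = topLeft;
        this.bottomRight = bottomRight;
    }

    // The box never needs to look at raw numbers; Coordinate compares itself.
    boolean contains(Coordinate point) {
        return point.isAtOrBeyond(topLeft) && bottomRight.isAtOrBeyond(point);
    }
}
```

Whether the extra class pays off depends on the domain: here the comparison logic found a home in Coordinate, which is a hint that the wrapper carries its weight.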

No Properties
My PMD rule finds methods that return an instance field (a getter) or update it (a setter). PHP_CodeSniffer checks for methods using the typical naming conventions. It further forbids the usage of public fields, which is a great idea. As we wrapped all primitives (rule #3) and we have no getters, we can never check their values. So how do we create state based tests? Mark Needham has discussed "whether we should implement equals and hashCode methods on objects just so that we can test their equality. My general feeling is that this is fine although it has been pointed out to me that doing this is actually adding production code just for a test and should be avoided unless we need to put the object into a HashMap or HashSet."
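A sketch of the approach Mark Needham describes: a value object without getters that implements equals and hashCode, so that a state based test can compare whole objects. The Temperature class is hypothetical.

```java
import java.util.Objects;

// Hypothetical value object without any getter.
final class Temperature {
    private final int degrees;

    Temperature(int degrees) {
        this.degrees = degrees;
    }

    Temperature plus(Temperature other) {
        return new Temperature(degrees + other.degrees);
    }

    // Equality by value lets a test compare the result with an expected object.
    @Override
    public boolean equals(Object other) {
        if (!(other instanceof Temperature)) {
            return false;
        }
        return degrees == ((Temperature) other).degrees;
    }

    @Override
    public int hashCode() {
        return Objects.hash(degrees);
    }
}
```

A test can then state its expectation as an object, for example with JUnit's assertEquals: `assertEquals(new Temperature(22), new Temperature(20).plus(new Temperature(2)))`.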

From what I have seen, most object oriented developers struggle with that constraint. Getters and setters are very ingrained. In fact some people have dropped that constraint from Object Calisthenics. There are several ways to live without accessors. Samir Talwar has written why avoiding Getters, Setters and Properties is such a powerful mind shift.

Java Project Setup
I created two repositories containing starting points for the LCD Numbers and Minesweeper Kata. Both are Apache Maven projects. The projects are set up to check the code using the Maven PMD Plugin on each test execution. Here is the relevant snippet from the pom.xml:
You can add this snippet to any Maven project and enjoy Object Calisthenics. The Jar file of pmd-rules is available in my personal Maven repository.

To test your setup there is sample code in both projects and mvnw test will show two violations:
[INFO] PMD Failure: SampleClass.java:2 Rule:TooManyFields Priority:3 Too many fields.
[INFO] PMD Failure: SampleClass:9 Rule:NoElseKeyword Priority:3 No else keyword.
It is possible to check the rules alone with mvnw pmd:check. (Using the Maven Shell the time to run the checks is reduced by 50%.) There are two run_pmd scripts, one for Windows (.bat) and one for Linux (.sh).

Limitations of Checking Code
Obviously code analysis cannot find everything. On the other hand - as discussed earlier - some violations will be false positives, e.g. when using the Stream API. You can use // NOPMD comments and @SuppressWarnings("PMD") annotations to suppress false positives. I recommend using exact suppressions, e.g. @SuppressWarnings("PMD.TooManyFields") to skip violations because other violations at the same line will still be found. Use your good judgement. The goal of Object Calisthenics is to follow all nine rules, not to suppress them.
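As a sketch of an exact suppression, the hypothetical class Rgb below consists of three equal parts and would be flagged by TooManyFields with a limit of two. The NOPMD comment is shown only for its syntax.

```java
// Exact suppression: only TooManyFields is silenced for this class,
// all other rules still apply. Rgb is a hypothetical example.
@SuppressWarnings("PMD.TooManyFields")
final class Rgb {
    private final int red;
    private final int green;
    private final int blue;

    Rgb(int red, int green, int blue) {
        this.red = red;
        this.green = green;
        this.blue = blue;
    }

    boolean isGrey() {
        return red == green && green == blue; // NOPMD - skips every rule on this line
    }
}
```

The annotation documents which rule was switched off and where, whereas the line comment hides anything PMD might report on that line.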

Object Calisthenics is a great exercise. I used it in all of my workshops on Object Oriented Programming and in several exercises I did myself. The verification of the rules helped me and the participants to follow the constraints and made the exercise more strict. (If people were stuck, I sometimes recommended ignoring one PMD violation or another, at least for some time.) People liked it and had insights into object orientation: It is definitely a "different" and "challenging way to code". "It is good to have small classes. Now that I have many classes, I see more structure." You should give it a try, too. Jeff Bay even recommends running an exercise or prototype of at least 1000 lines for at least 20 hours.

The question of whether Object Calisthenics is applicable to real working systems remains. While it is excellent for exercise, it might be too strict to be used in production. On the other hand, in his final note, Jeff Bay talks about a system of 100,000 lines of code written in this style, where the "people working on it feel that its development is so much less tiresome when embracing these rules".

1 comment:

Peter Kofler said...

Checking Python Code for Compliance with Object Calisthenics
To use one of my Object Orientation workshops for a different client, I needed the same for Python. I found that Pylint already contained some checkers I could use out of the box:

* checkers.design_analysis with setting max-branches=1 (rule #1)
* checkers.refactoring with setting max-nested-blocks=1 (rule #2)
* checkers.design_analysis with setting max-attributes=2 (rule #7)

Diving deeper into Pylint's custom checkers, I created some of my own to enforce rules #4, #6, #8 and #9. For rule #4 (One Dot per Line) I used the approach of CodeSniffer and looked for nested expressions, allowing only one dot per statement, ignoring self references. Due to the dynamic nature of Python there is a grey area when working with types. For example for first-class collections (rule #8), only certain uses of collections are identified at the moment. I was not able to check for primitives at object boundaries at all, so rule #3 is not verified.

Python Project Setup
As for Java, I created a repository for the LCD Numbers: lcd-numbers-object-calisthenics-python-setup. To test your setup there is some code in the project and run_pylint will show its violations. Pylint is configured using the objectcalisthenics.pylintrc file in the root directory. The individual checkers are in ./pylint/checkers.