Code Cop: clean code


11 September 2019

Human Needs vs. Bad Code

We human beings have basic needs. There might be a finite, limited number of fundamental needs. We then choose different strategies to meet these needs. Techniques like Nonviolent Communication (NVC) are based on identifying and meeting shared needs. For example, my goal of professional training and coaching is not only knowledge transfer - i.e. teaching people useful tricks. I also want to stretch them, make them think "outside the box" and change their behaviour: e.g. focus on self-improvement, take control of their career and ultimately assume their social responsibility. When I thought about my work as Code Cop, what I liked about it and what inspired me, I arrived at my own needs: learning and growth - I want my clients to learn and grow - and further clarity, consistency and integrity, which is what I need when working with software.

My Needs (Wordle)

Needs are central to our work, our own needs as much as the needs of our colleagues and clients. Earlier this year I attended an unconference, meeting some friends in Grenoble. We discussed how to align technical coaching work, e.g. Technical Agile Coaching, with business goals. We paired up and worked on different approaches. I was not surprised when one team started the whole alignment with discussing basic needs of all people involved. It was very interesting and outside of the scope of what I want to write about today.

Missing Code Quality
Today I want to write about possible reasons for missing code quality. I discussed human factors before and held the strong opinion that developers creating bad code were unskilled, lazy and weak. When considering needs and strategies to meet them, the issue gets more vague. Which needs could be fulfilled by a developer creating some quick and dirty code, duplicating some method or adding some library just to play around with it? When I discussed this with my friend Aki Salmi, Software Crafter and Communication Trainer, he quickly came up with a bunch of needs like acceptance, appreciation, cooperation, consistency, inclusion, respect/self-respect, stability, trust, integrity, autonomy, choice, challenge, competence, contribution, creativity, discovery, effectiveness and purpose. This is a huge list of needs to get started. As an exercise for the reader, try to figure out how the listed needs could be met.

Needs Met When Creating Bad Code (in German)


Three Examples
Now let's motivate some of the listed needs in detail. For example, someone might bring in some technology which is not necessary for the project and introduces additional, unnecessary complexity. Met needs might be creativity (I want to try something new), learning (I want to learn it), joy (I enjoy playing with it), safety (I will add it to my CV - Resume-driven development), autonomy and choice (I choose myself), consistency and integrity (I have always been using it), safety (I already know how to use it) and so on. What about someone who never argues for quality related changes, clean-ups, more time or reduced scope? Needs met might be appreciation (My boss is happy), ease (I do not argue with anybody), security and protection (I keep my job) etc. And for typical fire fighting, quick-and-dirty developers: competency (I can do it), efficacy (I am fast), effectiveness (I can make it work). These are just a few ideas I got while discussing the topic during a workshop.

Needs Met When Creating Good Code (in German)


Opposite
Now let us look at the opposite side. For quality code, I value consistency - which is obviously a good thing. It is also consistency which keeps some people from adapting to new ways of working, as in "this is the way I've always worked" - obviously a bad thing. If I keep my code base clean, I am sure that I can work on it later, which makes me feel safe. The same need for safety might keep someone from trying something new, because it is scary, or it might make people trash their code because their boss is requesting too much in too little time.

Conclusion - If Any
Needs are everywhere. They are universal and can neither be denied nor argued away. We use some strategies to fulfil needs when we mess up the code. There are many needs involved, maybe that is why there is so much bad code written. On the other hand, we use strategies to fulfil needs when keeping our code clean. It scares me that opposite behaviour might be driven by the same needs. How can we condemn these duct-tape and legacy coders when they are just driven by their needs as we are?

15 October 2017

Introducing Code Smells into Code

Code Smells
Code smells are hints that show you potential problems in your code. They are heuristics: Like in real life, if something smells, look at it, think about it, and change it if necessary. In the classic book Refactoring - Improving the Design of Existing Code, Martin Fowler describes 22 code smells like Long Method, Primitive Obsession, Switch Statements, Feature Envy and other anti-patterns that indicate deeper problems in your code.

The Brutal Refactoring Game
Adrian Bolboaca came up with the Brutal Refactoring Coding Game. He explained the history of the game and the game itself on his blog. I attended his workshop at the XP conference 2013 and experienced the game first hand.

In the game participants are asked to write the cleanest code possible. If the facilitator spots any code smell, participants must stop and immediately remove it. Adding functionality is forbidden until the facilitator agrees that the smell has been removed. In his workshop, Adi gave us a numbered list of smells and gave cards with the appropriate number to pairs where he saw a code smell. And he was mercilessly flagging the smallest problems in our code. ;-)

Code Smells Used in the Game
Adi chose these code and test smells for his game:

Lack of tests
Name not from domain
Name not expressing intent
Unnecessary if
Unnecessary else
Duplication of constant
Method does more than one thing
Primitive obsession
Feature envy
Method too long (has more than six lines)
Too many parameters (more than three parameters)
Test is not unitary
Test setup too complex
Test has an unclear Act
Test has more than one assert
Test has no assert
Test has too many paths

Adi told me that he chose these smells because he saw them most often in his clients' code bases. His list definitely misses duplication, deeply nested conditionals and some more. A more complete list might contain 30 items, making it more difficult and potentially frustrating for participants. (Maybe I will come up with the Moar Brutal Refactoring Game in the future...)

Observations during the Brutal Refactoring Game
This article is not about the Brutal Refactoring Game, but about code smells introduced into code. The game lets me observe how and when code smells are introduced (because the whole point is to spot and remove them). As part of my refactoring training I facilitated the game more than ten times. Each session took 3 to 5 hours and had six to eight participants. The teams were average teams with several senior developers and an occasional junior developer. People worked in pairs and implemented Tic-Tac-Toe. Most teams used Java, two teams used C.

Discussion of Introduced Code Smells
Here is the code smells statistic:

Code Smells Introduced

The chart shows the number of problems I flagged during the last ten games. The different colours of the bars show the different teams. Obviously not all smells are introduced equally often. The first smells appear 10 to 15 minutes into the exercise. One team using C had difficulties with the setup and progressed very slowly - they produced little code and very few smells.

The first smell I usually see is 1 - Lack of tests. Even people following the TDD cycle happen to create "more production code than is sufficient to pass the test." This happens in the beginning and also later during the game.

Naming is hard. Not surprisingly the most common smells (number two - Name not from domain - and number three - Name not expressing intent) are naming related. Naming things after the problem domain seems twice as hard as pure technical naming. Any non-trivial method could be named process or execute, but that does not help in understanding the code at all.

Primitive Obsession (number eight) is the most common single code smell I have seen during the game. It is introduced early in development when method signatures are created and APIs are designed. It occurs roughly as often as the naming related smells together. Most Tic-Tac-Toe implementations are (publicly) based on numbers, pairs of numbers, arrays of numbers or the like. Primitive Obsession is very dominant in many (Java) code bases. In my code reviews I am used to method argument lists like String, String, String, String, String, int, long, long etc. Instead of using all these primitive values, they should be wrapped and should not be visible at object boundaries. (I have written more about primitives in the past.) This is an object oriented design smell.
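
To make the idea of wrapping primitives concrete, here is a minimal sketch in Python (the games themselves were played in Java and C). The names Mark, Position and Board are hypothetical and not taken from any participant's code; they only show how raw numbers and strings can be kept away from the object boundary.

```python
from dataclasses import dataclass
from enum import Enum


class Mark(Enum):
    """Hypothetical domain type replacing raw strings like "X" or "O"."""
    X = "X"
    O = "O"


@dataclass(frozen=True)
class Position:
    """Hypothetical value object replacing a raw (row, column) pair of ints."""
    row: int
    column: int

    def __post_init__(self):
        # The range check lives in one place instead of in every caller.
        if not (0 <= self.row < 3 and 0 <= self.column < 3):
            raise ValueError(f"position outside the 3x3 board: {self}")


class Board:
    """The public API speaks in domain types, not in numbers."""

    def __init__(self):
        self._marks = {}  # maps Position -> Mark

    def place(self, mark: Mark, position: Position) -> None:
        if position in self._marks:
            raise ValueError(f"{position} is already taken")
        self._marks[position] = mark

    def mark_at(self, position: Position):
        # Returns None for an empty cell.
        return self._marks.get(position)


board = Board()
board.place(Mark.X, Position(1, 1))
```

A signature like place(mark, position) cannot be called with swapped row and column values by accident, which is the kind of mistake a list of bare int parameters invites.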

The third most often flagged code smell is the long method (number ten). This smell is introduced later, when logic is added to existing methods. I see this smell more in the second part of the game. Even when using TDD, this smell is introduced if the refactoring phase is skipped or taken lightly. Long methods are also very common in legacy code bases and difficult to understand or change. Everyone hates these 1000-line methods, still I find them in every (large) code base I look at.

Code Smells Categories
To conclude this analysis let us have a look at problem categories. I aggregated Adi's 17 code smells into four groups:

Problems in test code
Naming related smells
Missing object orientation
Complexity

It seems that unit testing is the smallest problem - which is definitely not true. Most teams I work with have no automated (unit) tests for their production code. Maybe there were fewer testing issues during the game because the teams had learned about testing smells before. I practice refactoring with my teams after we have worked through all of unit testing.

Initially I was surprised to see missing object orientation high on the list. Now, after writing about it, I think it is also related to my "coaching/learning plan". After refactoring I go for naming and finally object orientation. (Maybe the order of topics is wrong, but unit testing is easily sold to management and refactoring is asked for by developers often, making both topics ideal to start an improvement initiative.) I do not expect fewer naming problems, even after a few sessions on naming, because - as discussed before - naming is hard. I would expect the object orientation of the solutions to improve.

Samir Talwar wrote about his experience with the game. As facilitator he had a different focus, e.g. he was more strict about unnecessary if, treating it more like a No If constraint. He also saw different code smells being introduced. (I recommend reading his summary.) We both agree that naming is hard and causes many problems.

Comparison of Team Performance
While the participants were industry average - maybe even above - they were in need of improving. (Who is not?) The following bar chart shows the number of code smells introduced into the code by each team. (To compare the teams I removed the one with setup problems.) Some teams ran the exercise twice. Some of them improved, some did not.

Code Smell Categories by Team

On average, each team introduced 17 issues into their code base, right from the beginning of a small project, during a few hours of work. I am sure they tried hard because I was watching them, still this result is very disappointing. I am scared of the massive amount of code smells lurking in real world projects.

There is a noticeable difference between individual teams. Some teams created only half as many smells as other ones. Better teams introduced fewer code smells, creating less technical debt.

Conclusion
Adi claims that you can have legacy code after 15 minutes. It is true. In a short time, the teams introduced many code smells into their code. The most common smells were bad names and Primitive Obsession. Different smells were introduced during different development activities. Some teams introduced fewer smells than others.

We need to focus on code smells. Noticing smells in our code is an important skill which can be trained. A good place to start practising are refactoring code katas like Emily Bache's Tennis Game and Yatzy. (Both exercises are available in many programming languages.) "Listening to code smells" improves our design. Finally I want to encourage you to watch out for primitive values on object boundaries as Primitive Obsession seems to be the most common problem in object oriented code.

Final disclaimer: The game is not a scientific, industry-wide experiment. Only a few teams participated and the results are biased. Nevertheless I wanted to share the results.

27 February 2017

Time for Quality

Since one of my very first presentations in 2009 I have been asked how to make time for testing and code quality related activities. Keeping the code of high quality is difficult if your manager or product owner is only interested in deadlines and the number of hours spent on creating the software. People told me that in their organisation there is neither time nor budget for code quality and that their boss does not consider it necessary. While this is the perfect subject for an angry rant, this time I want to list some options instead.

Context
First let me set the context for my answers:

The demand for software developers and IT professionals in general is ever increasing. We are a highly privileged part of the entire workforce. While finding a decent job is always a (subjective but nevertheless real) problem, finding any work in software just to feed the family is not an issue. There is always another job.
Testing is a mindset - you have to want it. There are a thousand excuses why not to write a test or delay that particular cleanup. Some people pretend not to be allowed to create quality code, while they are just lazy. I would rather hear their honest views. I am suspicious of colleagues who claim an interest in quality code but keep delivering crap on a daily basis.
More common than the lack of interest is the lack of knowledge. For example many developers I meet are not sure how to write good automated tests, because they have written only a few in their entire career. Because they are unsure - they do not know how to start - they claim it to be time consuming and drop the idea. If I do something for the first time I will not do it well. Until I master the new skill I will be slow. These developers need training, more specifically they need practice. Every developer needs to be familiar with clean code, refactoring, unit testing, and much more. If you do not feel confident enough to apply these core skills in your daily work, ask for training, attend programming workshops at conferences or get a programming coach. Coding Dojos and Coderetreats are a great place to start.

Actions
So you want to write unit tests and are able to do it. But your boss has no interest in it or there is no team culture to do it. What could you do?

First read my favourite article of Joel Spolsky, Getting Things Done When You're Only a Grunt, which covers some organisational issues. Like Joel, I recommend staying true to yourself - just do it. This is hard in the beginning when you are alone, but others might follow. (It takes a lot of energy and several years to transform a whole team from within.) When complaints about your work start coming in you have several options:

Do not try to defend yourself and avoid arguing. This is the way you work, end of discussion. Again, you need to be fluent and sure to be able to do that. Especially if you are a new employee you have some privileges. The team might accept your new style. If you are applying testing and clean code successfully others might follow.
Argue for code quality. Sure you need some extra time to write these tests, but then you will not have to go back and fix the code again and again, which results in fewer context switches, fewer defects and fewer hot-fix releases. I have not met any manager who would not understand that. But sometimes we (IT) do not use the right vocabulary to be understood by managers. When you talk to your boss, it is better to talk about cost, risk and financial benefit than technical issues. Using the management speak is a skill. Start by collecting hard facts about code quality related activities. Research the estimated cost and later benefit for your particular case and compare it to the risk and later cost of skipping it. The Technical Debt metaphor helps here. Create a simple PowerPoint (yes, blasphemy ;-) with these facts. End with the red traffic light or the estimated downward trend of the team's velocity. For example instead of "the code is ugly" you could say that "because of its inconsistent structure you are more likely to introduce mistakes when changing it." Charts displaying the absence of structure, e.g. class diagrams with edges all over the place, make the missing structure obvious for non-technical people.
Hide the activities. Push for code quality while you create the code. If you write the test before (TDD anyone?) or immediately after the implementation, writing unit tests is not visible as a separate activity. The same is true for refactoring. Do not wait for Friday afternoon to clean up the code you wrote all week, because you might not have time for that. If you improve the code after each green test, there is no bad code even if you are forced to stop early. Further add the time for testing and cleanup to your estimates, but do not talk about it. I know some developers who estimate higher than others. Their estimates are accepted because their solutions are better and have fewer defects.
Make excuses. Even technical managers do not know the exact details about the code you work with. Maybe there were some issues not anticipated during planning. There are always some. Communicate their (exaggerated) impact and argue that you needed some time to deal with them while in reality you had to restructure some legacy class to add to it.
Finally, you can always find another job. Today you have more options than ever due to remote work.

There are several related questions on Stack Exchange which give more options, e.g. what do you do when your boss doesn't care about code quality or how can I convince management to deal with technical debt.

I wish you all the best for your quest for more code quality in your daily work!

16 August 2016

Absolute Priority Premise, an Example

Transformation Priority Premise
I would like to start this article with Uncle Bob Martin's Transformation Priority Premise - TPP for short - which he defined in 2011. With the premise Uncle Bob gave an algorithm for finding the next test when doing Test Driven Development. During (classic) TDD we use transformations to change the code to get from red to green. For example when starting with a new method the very first test is usually red because the empty or generated method returns null. We fake the first test by changing that null to return a constant value, e.g. return 5. This transformation is called nil->constant and has a high priority. A later test might force us to add a conditional to enable another fake. This transformation is called unconditional->if and has a medium priority. Replacing the value of a variable, i.e. variable->assignment, has a low priority.

According to Uncle Bob, these transformations have a preferred ordering. We should prefer higher priority transformations to pass tests and choose tests in a way that they can be passed with higher priority transformations. When an implementation seems to require a low priority transformation, we could backtrack to see if there is another test to pass first which does not need that transformation. The theme behind the TPP is that as the tests get more specific, the code gets more generic. If you want to know more about TPP, see Uncle Bob's cartoon follow-up on the TPP and Sorting and his talk on the Transformation Priority Premise.
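
The two higher priority transformations named above can be illustrated with a tiny sketch in Python. The function names wrap_v0 to wrap_v2 and the test values are hypothetical; each version is what the code might look like after one more test has been made to pass with the simplest available transformation.

```python
def wrap_v0(text, width):
    # Starting point: the generated method returns nothing (nil).
    return None


def wrap_v1(text, width):
    # nil->constant: fake the first test, wrap("word", 10) == "word".
    return "word"


def wrap_v2(text, width):
    # unconditional->if: a second test, wrap("other", 10) == "other",
    # forces a conditional so that another fake becomes possible.
    if text == "word":
        return "word"
    return "other"
```

Both fakes are obviously not the final algorithm. The point is only the order: the constant comes before the conditional, and assignments are avoided for as long as the tests allow.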

Absolute Priority Premise
In 2012 Micah Martin set out to define some heuristics to compare code objectively and prove that some code is better than some other code. Building on the TPP's priorities he defined the Absolute Priority Premise - APP for short. The APP defines six components, i.e. basic building blocks of code, and assigns each of them a mass. The building blocks and their weights were

constant, a value in code has mass of 1.
binding, a name or variable has a mass of 1, too.
invocation, calling a method or function - mass 2.
conditional, any form of if, switch or case - mass 4.
loop, for or while loops - mass 5.
assignment, replacing the value of a variable - mass 6.

More complex building blocks have a higher mass. The total mass of a piece of code is the sum of the masses of its components. A lower value is better. The best set of specific mass values is unknown and I will use Micah's weights. If you choose to count differently, please let us discuss!

A detailed explanation of the Absolute Priority Premise is given in the two presentations of 8th Light University (8LU) - Part One and Part Two. See also Micah's Coin Changer Kata as a complete example of applying the premise.

Measuring the Mass of Code
I like Micah's idea to measure the mass of code. It might not be a direct indication of readability but simpler code is always better. I wrote six different versions of the Word Wrap kata and was wondering which one would be considered the "best". I could have calculated the mass of these algorithms manually, but I followed Terence Parr's advice not to do by hand in five days what I can spend five years of my life automating. I had to create a tool to calculate the mass of Java code - which of course took me much longer, especially as I verified the mass of each algorithm manually to make sure my code worked as expected.

Absolute Priority Counter
I created abpricou, the ABsolute PRIority COUnter. It parses Java source and collects its mass as defined by the Absolute Priority Premise. It is written in Python, because it was Python month when I started working on it. It uses the Python Java Parser plyj. plyj offers a Visitor API for parser events,

class CountingVisitor(m.Visitor):
    def __init__(self):
        super(CountingVisitor, self).__init__()
        ...

Constants and bindings are straightforward, e.g.

    def visit_Literal(self, literal):
        return self._a_constant_value()

    def visit_FormalParameter(self, parameter):
        return self._a_name()

Literals in the code are counted as constants and parameters are only names for values, thus bindings. A method or function counts as a constant for the code it represents and a binding for its name.

    def visit_MethodDeclaration(self, declaration):
        self._a_name()
        return self._code_is_a_constant()

Invocations, conditionals and loops are the same. For example

    def visit_MethodInvocation(self, invocation):
        return self._an_invocation()

    def visit_IfThenElse(self, conditional):
        return self._a_conditional()

    def visit_While(self, loop):
        return self._a_loop()

The only interesting case is the assignment. Not every assignment in Java is counted as an assignment, only re-assignments which modify values are counted. final fields or local variables are just names for expressions similar to parameters.

    def visit_Assignment(self, assignment):
        if ... :
            # code to ignore
            # final field is assigned in constructor
            # final local is assigned in block
            pass
        else:
            return self._an_assignment()
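
The counting approach described above can be tried without plyj. The following is a minimal sketch, not the code of abpricou: it swaps in Python's standard ast.NodeVisitor and measures Python source instead of Java. The class name MassVisitor and the sample function countdown are made up for illustration, the masses are the ones from the component list above, and treating every augmented assignment as a re-assignment while treating a plain assignment as a binding is a simplifying assumption.

```python
import ast
from collections import Counter

# Masses as listed in the component list of this article.
MASS = {"constant": 1, "binding": 1, "invocation": 2,
        "conditional": 4, "loop": 5, "assignment": 6}


class MassVisitor(ast.NodeVisitor):
    """Counts APP components while walking a Python syntax tree."""

    def __init__(self):
        self.counts = Counter()

    def visit_FunctionDef(self, node):
        # A function is a binding for its name and a constant for its code.
        self.counts["binding"] += 1
        self.counts["constant"] += 1
        self.generic_visit(node)

    def visit_arg(self, node):
        self.counts["binding"] += 1

    def visit_Constant(self, node):
        self.counts["constant"] += 1

    def visit_Call(self, node):
        self.counts["invocation"] += 1
        self.generic_visit(node)

    def visit_If(self, node):
        self.counts["conditional"] += 1
        self.generic_visit(node)

    def visit_While(self, node):
        self.counts["loop"] += 1
        self.generic_visit(node)

    def visit_Assign(self, node):
        # Simplification: a plain assignment only introduces a name.
        self.counts["binding"] += 1
        self.generic_visit(node)

    def visit_AugAssign(self, node):
        # Simplification: "n -= 1" always modifies an existing value.
        self.counts["assignment"] += 1
        self.generic_visit(node)

    def total_mass(self):
        return sum(MASS[kind] * n for kind, n in self.counts.items())


SOURCE = """
def countdown(n):
    while n > 0:
        n -= 1
    return n
"""

visitor = MassVisitor()
visitor.visit(ast.parse(SOURCE))
print(dict(visitor.counts), visitor.total_mass())
```

Each visit_ method mirrors one of the plyj callbacks shown above: it records one component and, where child nodes matter, continues the walk with generic_visit.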

Get the tarball and installable egg of the Absolute Priority Counter.

Let's see some code: Word Wrap Kata
As I said before, I developed the counter to calculate the mass of my different implementations of Word Wrap. I was interested in the mass of the algorithm and skipped constructs like class definitions. (I suppose a class definition is a constant for the code and a binding for the name.) I also ignored Annotations, Enumerations and Generics. The first algorithm uses recursion to loop over the blanks to find the end of each line. Its code is

   final char BLANK = ' ';
   final char NEWLINE = '\n';

   String wrap(String line, int maxLineLen) {
      if (line.length() <= maxLineLen) {
         return line;
      }

      int indexOfBlank = line.lastIndexOf(BLANK, maxLineLen);
      int split;
      int offset;
      if (indexOfBlank > -1) {
         split = indexOfBlank;
         offset = 1;
      } else {
         split = maxLineLen;
         offset = 0;
      }
      return line.substring(0, split) + NEWLINE
           + wrap(line.substring(split + offset), maxLineLen);
   }

including the components

| Component   | Mass  | Count |
| constant    |   1   |   7   |
| binding     |   1   |   8   |
| invocation  |   3   |  10   |
| conditional |   4   |   2   |
| loop        |   5   |   0   |
| assignment  |   6   |   0   |

resulting in a total mass of 53. Further implementations have the following masses:

Like the recursive one, the tail recursive solution has neither loops nor assignments. It has a total mass of 71. (The source of this and all further variants is given in Word Wrap Kata Variants.)
The looping variant contains a loop and an assignment resulting in a mass of 68.
The optimised loop which avoids temporary String objects, contains a loop and an assignment and some more invocations. Its mass is 80.
The loop using a buffer saves even more heap allocations and needs more invocations summing up to 105.
The solution using Regular Expressions as shown in the original article has a mass of 55. Merging the parts of the expression into a single expression would save five invocations, but makes the expression less readable. I argue that the Regular Expression itself has some weight, as it contains one conditional ("|" in line 3) and two loops ("(.{1," + maxLineLen + "})" in lines 1 and 4). So a fair weight is 69.
The functional version using flatMap and reduce is quite verbose due to the lack of inferred pair types in Java. As plyj does not support Java 8 I was unable to measure its mass.

First Conclusion on Algorithms
The most basic, recursive version of Word Wrap has the least weight. What does it mean? Is it the best version? It is the most compact version without playing Code golf, at least in Java. It does not mutate any variables but it puts the highest load on the Garbage Collector. All the discussed algorithms have different memory and run time performance characteristics. I had hoped for a clear answer what the best version would be and I am not seeing that. I challenge you to leave a comment with a version of Word Wrap with a smaller weight. Would it be considered better code?

The mass of the basic loop version seems too high compared to the recursive version. The current weights favour functional programming by putting a penalty on loops and assignments. I would argue that mutating a local variable should weigh less than mutating the state of an object. The looping version uses StringBuilder instead of plain String +, which needs two more invocations for its construction. Then its components are

| Component        | Mass  | Count |
| constant         |   1   |   8   |
| binding          |   1   |  10   |
| local assignment |   1   |   3   | like a new local binding
| invocation       |   3   |  10   |
| conditional      |   4   |   1   |
| loop             |   5   |   1   |
| assignment       |   6   |   0   |

and its mass is 60, which is a bit more than the recursive solution.
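
The totals can be checked with a few lines of Python. This is only a sketch of the arithmetic: the rows are copied from the two component tables as printed (including their Mass column), and the names RECURSIVE, LOOPING and total_mass are hypothetical.

```python
# (component, mass, count) rows copied from the tables in this article.
RECURSIVE = [
    ("constant", 1, 7),
    ("binding", 1, 8),
    ("invocation", 3, 10),
    ("conditional", 4, 2),
    ("loop", 5, 0),
    ("assignment", 6, 0),
]

LOOPING = [
    ("constant", 1, 8),
    ("binding", 1, 10),
    ("local assignment", 1, 3),
    ("invocation", 3, 10),
    ("conditional", 4, 1),
    ("loop", 5, 1),
    ("assignment", 6, 0),
]


def total_mass(rows):
    """Sum of mass times count over all components."""
    return sum(mass * count for _, mass, count in rows)


print(total_mass(RECURSIVE), total_mass(LOOPING))  # prints "53 60"
```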

More Structure
The different wrap() functions above are just algorithms. They are not factored into parts that might change at different speeds over time and definitely violate the SRP. They also break the OCP as it is impossible to change the strategy of splitting without touching the logic of collecting. To address this I wrote another recursive version following Tell, don't ask,

interface Page {
   void renderLine(String lineOfProperLength);
}

class Wrapper {

   final char BLANK = ' ';

   final Page page;
   final int maxLineLen;

   Wrapper(Page page, int maxLineLen) {
      this.page = page;
      this.maxLineLen = maxLineLen;
   }

   void wrap(String line) {
      if (line.length() <= maxLineLen) {
         page.renderLine(line);
         return;
      }

      int indexOfBlank = line.lastIndexOf(BLANK, maxLineLen);
      int split;
      int offset;
      if (indexOfBlank > -1) {
         split = indexOfBlank;
         offset = 1;
      } else {
         split = maxLineLen;
         offset = 0;
      }

      page.renderLine(line.substring(0, split));
      wrap(line.substring(split + offset));
   }
}

which is very similar to the tail recursive solution above, with a total weight of 56. The number is smaller because it does not contain the necessary implementation of Page.renderLine(). The shown code does less, which makes it difficult to compare. On the other hand Page.renderLine() might be used by other parts of the code as well, so it is not strictly a part of Word Wrap.
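
To see the Tell, don't ask version at work, here is a Python transliteration of the Java Wrapper above together with a page that simply collects the lines it is told to render. CollectingPage is a hypothetical implementation added for illustration; the article does not show one.

```python
class CollectingPage:
    """Hypothetical Page that just remembers every rendered line."""

    def __init__(self):
        self.lines = []

    def render_line(self, line_of_proper_length):
        self.lines.append(line_of_proper_length)


class Wrapper:
    BLANK = " "

    def __init__(self, page, max_line_len):
        self.page = page
        self.max_line_len = max_line_len

    def wrap(self, line):
        if len(line) <= self.max_line_len:
            self.page.render_line(line)
            return

        # Java's lastIndexOf(ch, fromIndex) includes fromIndex itself,
        # so the end of the search range is max_line_len + 1 here.
        index_of_blank = line.rfind(self.BLANK, 0, self.max_line_len + 1)
        if index_of_blank > -1:
            split, offset = index_of_blank, 1
        else:
            split, offset = self.max_line_len, 0

        self.page.render_line(line[:split])
        self.wrap(line[split + offset:])


page = CollectingPage()
Wrapper(page, 4).wrap("word word")
print(page.lines)  # prints "['word', 'word']"
```

The wrapper never asks the page for anything. It only tells it which lines to render, so a different page, e.g. one writing to a file, can be plugged in without touching the splitting logic.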

Moar Structure!1!!
Later I created an extremely factored and decomposed implementation of Word Wrap that separated concepts of rendering, splitting, accumulating lines and hyphenation rules which should fulfil both SRP and OCP. Ignoring the HyphenationRule because that is not covered by the other solutions, its mass is 167 due to many method invocations and parameter names:

| Component   | Mass  | Count |
| constant    |   1   |  17   |
| binding     |   1   |  33   |
| invocation  |   3   |  30   |
| conditional |   4   |   4   |
| loop        |   5   |   1   |
| assignment  |   6   |   1   |

There is a loop and an assignment because at its core this implementation is a looping one. Using the recursive solution I should be able to get rid of the loop and the assignment.

Conclusion
The Absolute Priority Premise counts the different components of a program and sums their weights. If the weights are correct then the program with the smallest mass would be considered the "best" program. This is not true for algorithms like the Word Wrap because the APP ignores features like memory usage or performance optimisation. For general purpose code the validity of the APP is unclear. It just measures the number of code constructs, favouring more compact, functional code. On the other hand the four rules of simple design encourage us to introduce explaining variables to reveal the intent of the code, which is more important than fewer elements.

9 June 2016

Oracle Code QA

As Code Cop I want all my code to be clean so I keep my sanity when maintaining it. Some basic pillars that support internal code quality regardless of programming language are Coding Conventions, automated (unit) tests, Static Code Analysis and Continuous Integration. I discuss all of them in my Code Quality Assurance lecture (and its latest slides are here). A good development process covers all these and more.

Recently a colleague inherited a bunch of Oracle PL/SQL code and asked me for help. Being used to Java and many tools that help us keep the code in shape, e.g. JUnit, Checkstyle, PMD, Jenkins, he wanted the same for his database code. While some programming language ecosystems are traditionally strong in supporting the things I mentioned earlier, some other languages seem to lag behind. Clearly there are fewer options for less used languages. But that must not stop us from applying the same rigour to our code. Let's get started!

Database Naming Conventions
First we need coding conventions because consistency is important. Unlike Java where most projects follow the Oracle conventions, there is no such thing for Oracle databases. Instead there are several, sometimes contradicting proposals and you have to put together your own set of rules. Here are some reasonable ones for schema objects:

Tim Hall's Oracle Naming Conventions contain a short list of rules for entity names, foreign keys, triggers and PL/SQL variables.
Simon Sheppard's Oracle Naming Conventions are similar to Tim's with additional focus on constraints and more on SQL and PL/SQL variables.
Gints Plivna's Naming conventions for Oracle tables, columns, indexes present good rules on how to combine aliases into meaningful names.

PL/SQL Coding Conventions
The Procedural Language/Structured Query Language (PL/SQL) was introduced by Oracle in 1992. It is a compiled, procedural and structured language. In these attributes it is similar to modern languages like Java or C#, and all the general advice for naming, formatting, commenting, function scope and code size applies. Even object oriented concepts like Encapsulation or Coupling are meaningful (to a certain degree). See my presentation on Clean PL/SQL for more details. Again there are no official conventions from Oracle.

Steven Feuerstein's Naming Conventions and Coding Standards contain a list of naming conventions for PL/SQL variables together with some guidelines and a discussion of rejected conventions. If you do not know Steven, he is probably the authority on PL/SQL programming and knows what he is talking about. He also outlines a way to check the conventions, which I really like.
Philip Greenspun's SQL Style contains a few rules on formatting SQL statements for better readability.
Trivadis' PL/SQL and SQL Coding Guidelines are a complete set of standards regarding naming, formatting, language usage and control structures. It is a very comprehensive document of almost 60 pages and looks really impressive.

How to Choose Your Own Conventions
As there is no standard, you need to roll your own. To get started I recommend reading all the resources above (and even googling for some more) to get an idea of what could and should be defined. Then you look at your existing database objects and source code. Usually developers follow some conventions and some percentage of the code uses similar patterns in formatting or naming. If one of the used conventions is within the limits of the different proposals above - and you like it - then start with it. (Starting from something that is already there reduces your options and the resulting conventions are less optimal, but on the other hand you have a bigger chance to get the code into a consistent state, because some part of the code already follows the rules. If there are no existing patterns in your code, if you are starting from scratch or if all you see is crap, you still need to define different conventions.)

Start with a small set of rules in the beginning. There should be some naming schemes, table aliases and formatting rules. If you define too many rules at once, there will be too many violations in the existing code and people will argue that adhering to the conventions is too much work. Later, when everybody has got used to the rules, it is time to add more of them. You will find more specific rules during the lifetime of a project, e.g. by identifying bug patterns to be avoided in the future. Conventions need to grow. Unless you are beginning a new project and want to start with a full set of conventions, the Trivadis conventions mentioned above might be too comprehensive to start with. But they are an excellent example of what a full-blown conventions document looks like.

Reviewing your code, reading the provided resources and collecting the basic rules that apply and that you like should not take you more than a few hours. It is more important to start with the first version of coding conventions sooner than to start with a complete set later.
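Before adopting a naming rule it helps to see how many existing identifiers would violate it. The following is only a sketch of such a check: the prefixes p_, l_ and g_ are hypothetical rules chosen for illustration, not taken from any of the documents listed above, and the identifiers would have to be extracted from your code by some other means.

```python
import re

# Hypothetical rules for illustration only: kind of identifier -> pattern.
RULES = {
    "parameter": re.compile(r"^p_[a-z0-9_]+$"),
    "local variable": re.compile(r"^l_[a-z0-9_]+$"),
    "global variable": re.compile(r"^g_[a-z0-9_]+$"),
}


def violations(identifiers):
    """Return the (kind, name) pairs whose name does not match its rule."""
    return [(kind, name) for kind, name in identifiers
            if not RULES[kind].match(name)]


found = violations([
    ("parameter", "p_item_id"),
    ("local variable", "counter"),
    ("global variable", "g_max_quality"),
])
print(found)  # prints [('local variable', 'counter')]
```

If a rule produces a very long list on the existing code, it is probably a candidate for a later version of the conventions.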

Unit Test
If you are used to JUnit, RSpec or Jasmine you will be disappointed. There is not much support for unit testing in PL/SQL.

Usually - if at all - developers create stored procedures that call other procedures and check the results programmatically. If these test procedures follow a common convention, e.g. raising an exception on test failure, it is possible to automate calling them from the command line or build server.
Another option is utPLSQL, a basic unit testing framework created by Steven Feuerstein. It works as expected, but lacks the comfort of modern unit testing frameworks. I used it to test my PL/SQL port of Gilded Rose. (Gilded Rose is a testing kata where you need to create a lot of tests. It is an excellent exercise to get a first impression of a unit testing framework.)
Oracle SQL Developer has some support for automated testing. Unlike utPLSQL it is driven through the user interface. Tests are created and executed through the UI of SQL Developer. A Test case is a set of input values - usually rows in one or more tables - and a call to a stored procedure. Then the updated values are compared against a set of expected values. To see this in action, check out Jeff Smith's introduction to Unit Testing Your PL/SQL with Oracle SQL Developer. It is easy to create first tests, but test definition lacks the power of a general purpose programming language. Further I do not like that test definitions are "hidden" in some SQL Developer specific tables, the Unit Test Repository. However if you are a heavy user of SQL Developer, it might still be reasonable to use it for testing, too.
DbFit is an extension to FitNesse, a standalone, acceptance testing framework. DbFit tests are written using tables, making some test scenarios more readable than xUnit-style tests.
Steven has more recommendations for unit testing PL/SQL, including Toad's Code Tester for Oracle. If you have licensed the Toad Suite, it is worth checking out its testing functionality.
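The first option in the list above, test procedures that raise an exception on failure, could be driven from a build server roughly like this. It is a sketch under assumptions: the cursor is expected to offer callproc as described in the Python DB-API, FakeCursor only stands in for a real Oracle connection so the example runs anywhere, and the procedure names are invented.

```python
class FakeCursor:
    """Stand-in for a database cursor so the sketch runs without Oracle."""

    def __init__(self, failing):
        self.failing = failing

    def callproc(self, name):
        # A real driver raises its own error type when the stored procedure
        # raises an exception; a RuntimeError simulates that here.
        if name in self.failing:
            raise RuntimeError(f"{name}: assertion failed")


def run_test_procedures(cursor, procedure_names):
    """Call each test procedure and collect the ones that raised an error."""
    failures = {}
    for name in procedure_names:
        try:
            cursor.callproc(name)
        except Exception as error:
            failures[name] = str(error)
    return failures


def exit_code(failures):
    """A non-zero exit code lets the build server mark the build as failed."""
    return 1 if failures else 0


# Hypothetical test procedure names.
tests = ["test_update_quality", "test_sell_in"]
failures = run_test_procedures(FakeCursor(failing={"test_sell_in"}), tests)
for name, message in failures.items():
    print(f"FAILED {message}")
```

The convention keeps the runner trivial, because it never has to inspect results and only needs to notice whether a call raised an error.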

Unit testing is mandatory, but no single approach or framework looks superior. I believe the best approach is to evaluate the different options, using the tools you already have. Maybe create some tests for the Gilded Rose in each of the testing frameworks to see what works for you and what not.

Static Code Analysis

A vital part of code quality assurance is static code or program analysis. The source or object code is analysed without actually executing it to highlight possible coding errors. I love static analysis but have never used tools for PL/SQL myself. Steven agrees with me that we should use Lint Checkers for PL/SQL - and of course he is right. He lists some tools that add warnings besides the checks provided by Oracle.

A free tool is PMD for PL/SQL. PMD is my favourite code checker for Java and it works great. There are only a few rules for PL/SQL but more can be added easily. (See how I added custom rules to PMD in the past.) PMD is definitely a tool I would use first.
Trivadis PL/SQL Cop looks very promising. I am not sure about its licence, but it seems to be free. Rules must be checked automatically each day, e.g. in the nightly build, so tools must work from the command line. Again I do not know if PL/SQL Cop works like that. The next step would be to experiment with it and see if it can be run from the command line.
Another great tool is the Sonar Source PL/SQL Plugin. The plugin adds PL/SQL support to SonarQube. SonarQube is a free, open platform to manage code quality. It is widely used in the Java community. The plugin is commercial, but if you need to manage a lot of PL/SQL code I would recommend buying it nevertheless.
There are several other commercial tools available, e.g. ClearSQL by CONQUEST, which I did not check.

For static code analysis I follow the rule that more is better. I recommend starting with a basic tool, e.g. PMD, and adding more tools and rules over time. In existing projects you need time to fix the violations, e.g. WHEN OTHERS THEN NULL, and starting with too many rules in the beginning creates a lot of work.
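To show what such a check looks for, here is a deliberately naive sketch that flags WHEN OTHERS THEN NULL with a regular expression. This is not how PMD or any of the tools above work, as they parse the code properly. The sketch would also report matches inside comments or string literals, and the sample source is invented.

```python
import re

# Matches a handler that silently swallows every error, e.g.
# "WHEN OTHERS THEN NULL;", ignoring case and extra whitespace.
SWALLOWED_ERRORS = re.compile(
    r"\bWHEN\s+OTHERS\s+THEN\s+NULL\s*;", re.IGNORECASE)


def find_swallowed_errors(source):
    """Return the 1-based line numbers where such a handler starts."""
    return [source.count("\n", 0, match.start()) + 1
            for match in SWALLOWED_ERRORS.finditer(source)]


# Invented sample, not taken from a real code base.
SAMPLE = """\
BEGIN
  update_stock;
EXCEPTION
  WHEN OTHERS THEN
    NULL;
END;
"""

print(find_swallowed_errors(SAMPLE))  # prints [4]
```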

Putting It All Together
After you have established conventions, added automated tests and configured some static analysis tools, it is time to put it all together and shorten the feedback loop. While you could run tests and checks manually from time to time, it would be more helpful to do so every night, or even better on each check-in/push. (Checking your code on each check-in requires you to put your DDLs and package sources under version control. While this adds some extra steps to your development workflow, I highly recommend doing so.) A tool like Jenkins or another Continuous Integration server could be used to create an empty database instance, execute your DDLs and compile all your packages. Starting with an empty database instance is important to avoid "works on my machine" problems. Then Jenkins should run all tests to verify that the code works as expected. The final step is to analyse your code for violations of coding conventions and potential problems. Many people add more steps like generating documentation or packaging deployment bundles suitable to be deployed by the operations team's DBA.
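The order of the build steps matters, and a failing step should stop everything after it. A minimal sketch of that idea follows; the step functions are placeholders and do not call Jenkins or Oracle. In a real setup each one would invoke your own scripts.

```python
def run_pipeline(steps):
    """Run the steps in order and stop at the first one that fails."""
    completed = []
    for name, action in steps:
        try:
            action()
        except Exception as error:
            return completed, f"{name} failed: {error}"
        completed.append(name)
    return completed, None


def ok():
    """Placeholder for a step that succeeds."""


def failing_tests():
    """Placeholder that simulates a failing test run."""
    raise RuntimeError("2 tests failed")


steps = [
    ("create empty database", ok),
    ("execute DDLs", ok),
    ("compile packages", ok),
    ("run tests", failing_tests),
    ("static analysis", ok),
]
completed, error = run_pipeline(steps)
print(completed)
print(error)
```

In this run the static analysis is never reached, because the pipeline reports the failing tests first.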

Just Do It!
Terence Parr recommends automating anything that you might screw up and he is right. Creating working software is hard enough, so we should not bother with manual tasks but rather automate them. Further, automated checks keep the quality of our software high, resulting in faster maintenance and fewer bugs. This leaves us more time for the interesting parts of software development - solving problems and creating solutions.

20 July 2015

Write the worst code you can

Global Day of Coderetreat 2014
Last Code Retreat I was lucky to co-facilitate the event together with Alexandru Bolboaca from Romania. He wrote a summary of the event in Vienna in his blog, so I will not describe it myself. I rather describe one constraint he introduced. A constraint, also known as an activity, is a challenge during a kata, coding dojo or code retreat designed to help participants think about writing code differently than they would otherwise. I have written about other constraints before, e.g. No naked primitives.

The Fun Session
The last session of a Code Retreat is usually a free session that is supposed to close a great day and it should be fun. Alex likes to give participants many options and he came up with a list of different things they could do:

Alex Bolboaca Fun Session

While I was familiar with all the pairing games, e.g. Silent Evil Pairing, which is also called Mute with Find the Loophole, I had never tried inverting a constraint.

Write the worst code you can
Alex explained that there were many ways to write bad code. All methods could be static, only mutable global state, bad names, too big or too small units (methods, classes), only primitives, only Strings, to name just a few. Some of these bad things are available as their own (inverted) constraints, e.g. all data structures must be hash tables. (PHP I am looking at you ;-) Bad code like this can get out of hand easily. Imagine writing the whole Game of Life in a single, huge method.

But how could this teach us anything? A regular constraint is an exaggeration of a fundamental rule of clean code or object oriented design. People have to figure out the rule and work hard to meet the constraint. The inverted constraint is similar. People have to think about what they would do, and then do the opposite, exaggerating as much as possible. As a bonus, most of them get stung by the crap they just created a few minutes ago. And most developers really enjoy themselves working like that. Maybe they have too much fun, but I will come back to that later.

... or maybe not
I used the constraint in several in-house Coding Dojos when the teams asked for fun sessions. Here is what I learned: The constraint is not hard. Some people get creative but in general it is too easy to write bad code. Most "worst" code I saw had inconsistent naming, a lot of duplication, primitives all over the place and bad tests. In general it looked almost like the day to day code these teams deliver. I particularly loved a dialogue I overheard, when one developer wrote some code and proclaimed it as ugly while his pair denied its ugliness.

One participant complained that he wanted to learn how to write clean code and that he saw this worst case code every day anyway. He was really frustrated. (That is why I said not all people like this constraint.) And this is also pointing out a potential problem of inverted constraints in general - practising doing it wrong might not be a good idea after all.

Conclusion
I am unable to decide if this constraint is worth using or not. I like the idea of inverted constraints and this one is definitely fun and probably good to use once in a while. But I believe the constraint is lacking focus and is therefore too easy to meet. People should not deliver code during a practice session that just looks like their production code. I want them to work harder ;-) I will try more focused, inverted constraints next. Allowing only static methods might be a good start. On the other hand, code created when following the constraint to only use Strings as data structures is too close to the regular (bad) code I see every day. Therefore a good inverted constraint would be one that takes a single, bad coding style to the extreme. Allowing only one character for names comes to my mind, which is an exaggeration of bad naming - or even a missing tool constraint, because it renders naming non-existent.

Have you worked with inverted constraints? If so, please share your constraints and opinions about them in the comments below.

27 October 2013

CodeCopTour Week 7

I think it would speed things up if I cover two weeks of my Pair Programming Tour in one swoop, as I did two weeks ago. I want to share some insights besides the usual diary, so I need to increase my writing throughput. On the other hand I do not like going faster because writing a blog post takes as long as it takes. Although I know people who do speed blogging, the OCD part of my personality does not like short-cuts, obviously.

Last week I was hosted by Thomas Baldauf of the Austrian Environment Agency. I did not know Thomas but had got his name from a friend. After a short email he immediately agreed to host me without any questions, which surprised me a lot. Unfortunately Thomas, the lead of the development team, did not have time to work with me in person but had prepared some developers from his team for my visit. I worked with Nexhat Gashi on a small feature of their newest web application which is based on JSF technology. I did not know JSF before I started my tour, but saw it all the time when pairing with people. Probably I will be fluent with JSF by the end of November ;-)

I also worked with Martin Lackner whom I knew as a long-time attendee of our Eclipse DemoCamps in Vienna. Martin, an Eclipse platform veteran, worked on an MDA prototype using Xtext. I had worked with the older versions 0.7 and 1.0 before and he had prepared all the nitty-gritty details in the previous week. I reviewed his DSL and probably spoilt all his fun by proposing a different, shorter syntax. We moved forward very fast. In only one day we created a technical language to define entities and aggregates, together with full editor support, code completion, validation and proper formatting. Such is the power of Xtext - I love it. If you do not know Xtext, I really encourage you to check it out. The Eclipse Xtext project has excellent documentation and many examples which provide a starting point to get your DSL up and running in no time.

Free Lunch

The only compensation I asked for pairing was food and beverages throughout the day. It worked out well - till now I have been provided with free lunch every day. All my hosts were very polite and offered me coffee or drinks and asked me for my preferences when choosing places to eat. I never asked for anything special but went for lunch where my partners went. I visited take-away noodle stores, staff canteens and fancy restaurants. Having lunch with many people, together with the various places in Vienna I have been to, was a culinary trip of its own.

The two weeks I spent away from home, I stayed in a hotel. The host company refunded me 600 Euro for my expenses per week. It was a nice hotel, expensive but comfortable. The money was sufficient because I saved on food, eating in the company canteen now and then. Staying in a hotel and working all day was acceptable for the two weeks, and I enjoyed the rich breakfast buffet including bacon and scrambled eggs. But I would not like to stay in a hotel for extended periods of time. I am not a travelling person and I never worked like that. I enjoy going home after work, where connectivity is good and wireless network is free ;-)

Can PL/SQL be Clean?
At the end of the week I spoke at the Austrian Oracle User Group. The organizer had planned an event focusing on clean development and three friends had recommended me to him independently. I agreed to give a presentation but was not sure about the topic I should talk about. For talking to an Oracle user group, the first thing that came to my mind was PL/SQL, Oracle's database language. I have seen horrible pieces of PL/SQL and the question was whether it could be written in a clean way. According to Michael Feathers, the author of Working Effectively with Legacy Code, "clean code looks like it was written by someone who cares." I presented his quote as the first rule of clean code. Continuing the discussion I listed several books about clean code, including Code Complete by Steve McConnell. Steve said to "write programs for people first, computers second", which I defined as the second rule of clean code. Following both rules it was obvious that even PL/SQL can be written in a clean way. I got good feedback on my presentation, especially that it was very entertaining, so I encourage you to check out the slides on Slideshare.

9 February 2012

Required Reading: Clean Code

Here is an email that I wrote to my team earlier this year. The team members are able to deliver new features on time but do not care for code quality (at least not as much as I do ;-). This is going to change.

Raising the bar. ⁽¹⁾

Clean Code Assets

Dear team,
as mentioned in the kick-off, we need to get into the right attitude for the upcoming refactoring effort. Many items on our code clean-up list are ongoing changes, e.g.

use proper names for fields and methods
clean up magic numbers
split large classes
add JavaDoc to core classes
fix compiler warnings
remove duplicated code
remove dead code
add JUnit tests

All of these changes are in fact rules on how to produce code that is easy to read and maintain. All these rules are part of a coding style sometimes called "clean code".

The Pragmatic Programmers once wrote that you should read at least four technical books a year to stay sharp and relevant in our fast paced industry. Remember that the half-life of our technical knowledge is only 18 months. (Heinz Kabutz) So as the first technical book to read in 2012 I highly recommend Clean Code by Bob Martin. It's a great book and has raised a new wave of code-consciousness. Read one of its reviews if you do not believe me.

Sooner or later you will have to read it, so why not start now? Go ahead, buy it, read it! Or at least browse it and see what is inside. We cannot afford to have any more cryptic variable names or huge classes.

Regards,
Code Cop

Lowering the bar.

Of course I did not attach the PDF of the book. That would have been illegal, but it might have lowered the bar to get my colleagues into reading it. I know that not many of the team will read it. I would consider my email successful if one or two read the table of contents or browse a chapter.

⁽¹⁾ "Raising the bar" is the subtitle of the Software Craftsmanship Manifesto.