7 May 2010

Custom PMD Rules

Cure PMDLast month I submitted my fourth article from the 'Code Cop' series. After Daily Build, Daily Code Analysis and Automated Testing I wrote about checking architecture rules. One approach described there used PMD. In case you don't know what PMD is, it's a static code analysis tool for Java. It was first developed in 2002 and is updated regularly. It checks the abstract syntax tree (AST) of Java code. (PMD is a great tool. You should definitely use it!)

PMD vs. AST (I just love TLAs)
PMD comes with many predefined rules. Some of them are real life-savers, e.g. EmptyCatchBlock. It's possible to define your own rules. The easiest way to do this is to use XPath expressions to match patterns in the AST. For example, the following trivial class
package org.codecop.myapp.db;

import java.sql.Connection;

public class DBSearch {
   ...
}
is represented inside PMD by the following AST
+ PackageDeclaration
|    + Name:org.codecop.myapp.db
+ ImportDeclaration
|    + Name:java.sql.Connection
+ TypeDeclaration
     + ClassOrInterfaceDeclaration:DBSearch
          + ClassOrInterfaceBody
               + ...
An XPath expression to find imports from the java.sql package looks like
//ImportDeclaration[starts-with(Name/@Image, 'java.sql.')]
It is quite simple (at least when you are familiar with XPath ;-) For further details read the PMD XPath Rule Tutorial.

I've always been enthusiastic about static code analysis and PMD in particular and have been using it successfully since 2004. (For example see my very first presentation on static code analysis with PMD.) I love it; It's a great tool. I have created several custom rules to enforce consistent coding conventions and to prevent bad coding practices. Today I am going to share some of these rules with you in this post.

First Rule
One common bug is public boolean equals(MyClass o) instead of public boolean equals(Object o) which shows that the developer is trying (and failing) to override the equals(Object) method of Object in MyClass. It will work correctly when invoked directly with a MyClass instance but will fail otherwise, e.g. when used inside collections. Such suspiciously close (but still different) equals methods are matched by
//ClassDeclaration//MethodDeclarator
  [@Image = 'equals']
  [count(FormalParameters/*)=1]
  [not ( FormalParameters//Type/Name[
           @Image='Object' or @Image='java.lang.Object' ] ) ]
A Copy is just a CopyThis rule explained in plain English matches all the method declarations that are named equals, and have one parameter which type is neither Object nor java.lang.Object. (Object is mentioned twice because PMD analyses the source and therefore can't know about simple and full qualified class names.) Later, Tom included this SuspiciousEqualsMethodName rule into PMD's default set of rules.

Copy and Paste
The easiest way to define your own rule is to take an existing one and tweak it. Like SuspiciousEqualsMethodName was derived from SuspiciousHashcodeMethodName, the next JumbledIterator is quite similar to JumbledIncrementer. (JumbledIterator was created by Richard Beitelmair, one of my colleagues who appointed me "Code Cop" and presented me with my first Code Cop T-shirt.) So what's wrong with the following line?
for (Iterator it1 = iterator(); it2.hasNext(); ) { ... }
Most likely it2 should be it1, shouldn't it. Richard created the following rule to pick up on these kinds of errors:
//ForStatement[
  ( ForInit//ClassOrInterfaceType/@Image='Iterator' or
    ForInit//ClassOrInterfaceType/@Image='Enumeration'
  ) and
  ( ends-with(Expression//Name/@Image, '.hasNext') or
    ends-with(Expression//Name/@Image, '.hasMoreElements')
  ) and not (
    starts-with(Expression//Name/@Image,
      concat(ForInit//VariableDeclaratorId/@Image, '.'))
  )
]
A Real Environment
After the last two sections we are warmed up and finally ready for some real stuff. On several occasions I have met developers who wondered why their code Long.getLong(stringContainingANumber) would not work. Well it worked, but it did not parse the String as they expected. This is because the Long.getLong() is a shortcut to access System.getProperty(). What they really wanted was Long.parseLong(). Here is the UnintendedEnvUsage rule:
//PrimaryExpression/PrimaryPrefix/Name[
  @Image='Boolean.getBoolean' or
  @Image='Integer.getInteger' or
  @Image='Long.getLong'
]
Chain LinkageCare for Your Tests
PMD provides several rules to check JUnit tests. One of my favourite rules is JUnitTestsShouldIncludeAssert which avoids (the common) tests that do not assert anything. (Such tests just make sure that no Exception is thrown during their execution. This is fair enough but why bother to write them and not add some assert statements to make sure the code is behaving correctly.) Unfortunately, one "quick fix" for that problem is to add assertTrue(true). Rule UnnecessaryBooleanAssertion will protect your tests from such abominations.

A mistake I keep finding in JUnit 3.x tests is not calling super in test fixtures. The framework methods setUp() and tearDown() of class TestCase must always call super.setUp() and super.tearDown(). This is similar to constructor chaining to enable the proper preparation and cleaning up of resources. Whereas Findbugs defines a rule for that, PMD does not. So here is a simple XPath for JunitSetupDoesNotCallSuper rule:
//MethodDeclarator[
  ( @Image='setUp' and count(FormalParameters/*)=0 and
    count(../Block//PrimaryPrefix[@Image='setUp'])=0
  ) or
  ( @Image='tearDown' and count(FormalParameters/*)=0 and
    count(../Block//PrimaryPrefix[@Image='tearDown'])=0
  )
]
Obviously this expression catches more than is necessary: If you have a setUp() method outside of test cases this will also be flagged. Also it does not check the order of invocations, i.e. super.setUp() must be the first and super.tearDown() the last statement. For Spring's AbstractSingleSpringContextTests the methods onSetUp() and onTearDown() would have to be checked instead. So it's far from perfect, but it has still found a lot of bugs for me.

That's all for now. There are more rules that I could share with you, but this blog entry is already far too long. I've set up a repository pmd-rules containing the source code of the rules described here (and many more).

12 comments:

Francisco Philip said...

hi code cop,

do you know how I do to test if the first line of method is a logging statement?

Peter Kofler said...

You would have to identify the exact pattern of methods that are ok, e.g.

public void goodMethod() {
   log.info("Processing");
    ...
}


Then you could use an XPATH to find all methods that do not start exactly like that:

//MethodDeclaration[not(Block/BlockStatement/Statement/StatementExpression/PrimaryExpression/PrimaryPrefix/Name/@Image='log.info')]

But this may be too simplistic: The name of the logger is hard-coded and only info type is supported. But you get the idea?!

Carsten said...

I noticed that you use Object and java.lang.Object in your PMD XPath rule.

Do you know if it is possible to check if a VariableDeclarator uses a class from a specific package?

I tried to find declarations from 3rd party packages but I can only get to the ImportDeclaration.

Peter Kofler said...

Yes, the Object and java.lang.Object are there because the source may contain simple of full qualified names. The same is true for your variable declaration. If I would like to suppress String variables, a first draft would be:

//VariableDeclarator/../Type/ReferenceType/ClassOrInterfaceType[@Image='String' or @Image='java.lang.String']

Note that this works only if the used class name is unique, otherwise you get false positives.

Usman Ismail said...

I found your article while searching for an image for mine. I discuss the same issue but go into use of the PMD Rule Designer tool and some details of AST if anyone is intersted.

http://www.techtraits.ca/custom-pmd-rules-using-xpath/

Peter Kofler said...

Thank you for the link Usman. That's an excellent post because I do not discuss how to develop these rules in detail.

Peter Kofler said...

Usman, I took the rule you published there and your recent one and added them to my custom PMD ruleset in which I collect useful rules as discussed in the post above. Thanks again for sharing.

Peter Kofler said...

Usman contacted me that his site has moved from techtraits.ca to techtraits.com. While I am unable to change old comments, I will repost the comment with the current URL:

PMD Rule Designer tool and some details of AST

Peter Kofler said...

Usman, I took the rule you published there and your recent one and added them to my custom PMD ruleset in which I collect useful rules as discussed in the post above. Thanks again for sharing.

Unknown said...

could anyone help me in writing a sonar rule for java program which states that whenever a particular list is returned by the function then that list must be assigned to a temporary variable only can be used

Peter Kofler said...

Since last year Sonar has deprecated PMD rules. Depending on your version of Sonar you can still use PMD rules, or have to create custom Sonar rules, which follow a different style.

I believe a custom PMD rule implemented as class could possibly check what you need, but it might not be as easy as other rules.

I propose you start your rule on your own and then share what you have on http://stackoverflow.com/ and ask for concrete advice there.

Good luck

Peter Kofler said...
This comment has been removed by the author.