9 June 2016

Oracle Code QA

As Code Cop I want all my code to be clean so I keep my sanity when maintaining it. Some basic pillars that support internal code quality regardless of programming language are Coding Conventions, automated (unit) tests, Static Code Analysis and Continuous Integration. I discuss all of them in my Code Quality Assurance lecture (and its latest slides are here). A good development process covers all these and more.

Recently a colleague inherited a bunch of Oracle PL/SQL code and asked me for help. Being used to Java and many tools that help us keeping the code in shape, e.g. JUnit, Checkstyle, PMD, Jenkins, he wanted the same for his database code. While some programming language ecosystems are traditionally strong in supporting the things I mentioned earlier, some other languages seem to lack behind. Clearly there are fewer options for less used languages. But that must not stop us from applying the same rigour to our code. Let's get started!

Everyone is Marcin Grzejszczak todayDatabase Naming Conventions
First we need coding conventions because consistency is important. Unlike Java where most projects follow the Oracle conventions, there is no such thing for Oracle databases. Instead there are several, sometimes contradicting proposals and you have to put together your own set of rules. Here are some reasonable ones for schema objects:PL/SQL Coding Conventions
The Procedural Language/Structured Query Language (PL/SQL) was introduced by Oracle in 1992. It is a compiled, procedural and structured language. By these attributes it is similar to modern languages like Java or C#, and all the general advice for naming, formatting, commenting, function scope and code size apply. Even object oriented concepts like Encapsulation or Coupling are meaningful (to a certain degree). See my presentation on Clean PL/SQL for more details. Again there are no official conventions from Oracle.
  • Steven Feuerstein's Naming Conventions and Coding Standards contain a list of naming conventions for PL/SQL variables together with some guidelines and a discussion of rejected conventions. If you do not know Steven, he is probably the authority on PL/SQL programming and knows what he is talking about. He also outlines a way to check the conventions, which I really like.
  • Philip Greenspun's SQL Style contains a few rules on formatting SQL statements for better readability.
  • Trivadis' PL/SQL and SQL Coding Guidelines are a complete set of standards regarding naming, formatting, language usage and control structures. It is a very comprehensive document of almost 60 pages and looks really impressive.
How to Choose Your Own Conventions
As there is no standard, you need to roll your own. To get started I recommend reading all the resources above (and even google for some more) and get an idea what could and should be defined. Then you look at your existing database objects and source code. Usually developers follow some conventions and some percentage of the code uses similar patterns in formatting or naming. If one of the used conventions is in the limits of the different proposals above - and you like it - then start with it. (Starting from something that is already there reduces your options and the resulting conventions are less optimal, but on the other hand you have a bigger change to get the code into a consistent state, because some part of the code follows the rules. If there are no existing patterns in your code, if you are starting from scratch or if all you see is crap, you still need to define different conventions.)

Quality ControlStart with a small set of rules in the beginning. There should be some naming schemes, table aliases and formatting rules. If you define too many rules at once, there will be too many violations in the existing code and people will argue that adhering to the conventions is too much work. Later, when everybody got used to the rules, it is time to add more of them. You will find more specific rules during the lifetime of a project, e.g. by identifying bug patterns to be avoided in the future. Conventions need to grow. Unless you are beginning a new project and want to start with a full set of conventions, the Trivadis conventions mentioned above might be too comprehensive to start with. But they are an excellent example how a full blown conventions document looks like.

Reviewing your code, reading the provided resources and collecting the basic rules that apply and that you like should not take you more than a few hours. It is more important to start with the first version of coding conventions sooner than to start with a complete set later.

Unit Test
If you are used to JUnit, RSpec or Jasmine you will be disappointed. There is not much support for unit testing in PL/SQL.
  • Usually - if at all - developers create stored procedures that call other procedures and check the results programmatically. If these test procedures follow a common convention, e.g. raising an exception on test failure, it is possible to automate calling them from the command line or build server.
  • Another option is utPLSQL, a basic unit testing framework created by Steven Feuerstein. It works as expected, but lacks the comfort of modern unit testing frameworks. I used it to test my PL/SQL port of Gilded Rose. (Gilded Rose is a testing kata where you need to create a lot of tests. It is an excellent exercise to get a first impression of a unit testing framework.)
  • Oracle SQL Developer has some support for automated testing. Unlike utPLSQL it is driven through the user interface. Tests are created and executed through the UI of SQL Developer. A Test case is a set of input values - usually rows in one or more tables - and a call to a stored procedure. Then the updated values are compared against a set of expected values. To see this in action, check out Jeff Smith's introduction to Unit Testing Your PL/SQL with Oracle SQL Developer. It is easy to create first tests, but test definition lacks the power of a general purpose programming language. Further I do not like that test definitions are "hidden" in some SQL Developer specific tables, the Unit Test Repository. However if you are a heavy user of SQL Developer, it might still be reasonable to use it for testing, too.
  • DbFit is an extension to FitNesse, a standalone, acceptance testing framework. DbFit tests are written using tables, making some test scenarios more readable than xUnit-style tests.
  • Steven has more recommendations for unit testing PL/SQL, including Toad's Code Tester for Oracle. If you licenced the Toad Suite, it is worth checking out its testing functionality.
Unit testing is mandatory, but no single approach or framework looks superior. I believe the best approach is to evaluate the different options, using the tools you already have. Maybe create some tests for the Gilded Rose in each of the testing frameworks to see what works for you and what not.

Static Code Analysis
magnifyA vital part of code quality assurance is static code or program analysis. The source or object code is analysed without actually executing it to highlight possible coding errors. I love static analysis but have never used tools for PL/SQL myself. Steven agrees with me that we should use Lint Checkers for PL/SQL - and of course he is right. He lists some tools that add warnings besides the checks provided by Oracle.
  • A free tool is PMD for PL/SQL. PMD is my favourite code checker for Java and it works great. There are only a few rules for PL/SQL but more can be added easily. (See how I added custom rules to PMD in the past.) PMD is definitely a tool I would use first.
  • Trivadis PL/SQL Cop looks very promising. I am not sure about its licence, but it seems to be free. Rules must be checked automatically each day, e.g. in the nightly build, so tools must work from the command line. Again I do not know if PL/SQL Cop works like that. The next step would be to experiment with it and see if it can be run from the command line.
  • Another great tool is the Sonar Source PL/SQL Plugin. The plugin adds PL/SQL support to SonarQube. SonarQube is a free, open platform to manage code quality. It is widely used in the Java community. The plugin is commercial, but if you need to manage a lot of PL/SQL code I would recommend buying it nevertheless.
  • There are several other commercial tools available, e.g. ClearSQL by CONQUEST, which I did not check.
For static code analysis I follow the rule that more is better. I recommend to start with a basic tool, e.g. PMD, and keep adding tools and rules over time. In existing projects you need time to fix the violations, e.g. WHEN OTHERS THEN NULL, and starting with too many rules in the beginning creates a lot of work.

Putting It All Together
After you established conventions, added automated tests and configured some static analysis tools, it is time to put it all together and shorten the feedback loop. While you could run tests and checks manually from time to time, it would be more helpful to do so every night, or even better on each check-in/push. (Checking your code on each check-in requires you to put your DDLs and package sources under version control. While this adds some extra steps to your development workflow, I highly recommend doing so.) A tool like Jenkins or another Continuous Integration server could be used to create an empty database instance, execute your DDLs and compile all your packages. Starting with an empty database instance is important to avoid works on my machine problems. Then Jenkins should run all tests to verify that the code works as expected. The final step is to analyse your code for violations of coding convention and potential problems. Many people add more steps like generating documentation or packaging deployment bundles suitable to be deployed by the operations team's DBA.

Just Do It!
Terence Parr recommends to automate anything that you might screw up and he is right. Creating working software is hard enough, we should not bother with manual tasks, rather automate them. Further automated checks keep the quality of our software high, resulting in faster maintenance and less bugs. This leaves us more time for the interesting parts of software development - solving problem and creating solutions.