11 September 2019

Human Needs vs. Bad Code

We human beings have basic needs. There might be a finite number of fundamental needs. We then choose different strategies to meet these needs. Techniques like Nonviolent Communication (NVC) are based on identifying and meeting shared needs. For example, my goal of professional training and coaching is not only knowledge transfer - i.e. teaching people useful tricks. I also want to stretch them, make them think "outside the box" and change their behaviour: e.g. focus on self-improvement, take control of their career and ultimately assume their social responsibility. When I thought about my work as Code Cop, what I liked about it and what inspired me, I arrived at my own needs: learning and growth - I want my clients to learn and grow. Further needs are clarity, consistency and integrity, which are what I need when working with software.

My Needs (Wordle)
Needs are central to our work, our own needs as much as the needs of our colleagues and clients. Earlier this year I attended an unconference, meeting some friends in Grenoble. We discussed how to align technical coaching work, e.g. Technical Agile Coaching, with business goals. We paired up and worked on different approaches. I was not surprised when one team started the whole alignment by discussing the basic needs of all people involved. It was very interesting but outside the scope of what I want to write about today.

Missing Code Quality
Today I want to write about possible reasons for missing code quality. I discussed human factors before and held the strong opinion that developers creating bad code were unskilled, lazy and weak. When considering needs and strategies to meet them, the issue gets more vague. Which needs could be fulfilled by a developer creating some quick and dirty code, duplicating some method or adding some library just to play around with it? When I discussed this with my friend Aki Salmi, Software Crafter and Communication Trainer, he quickly came up with a bunch of needs like acceptance, appreciation, cooperation, consistency, inclusion, respect/self-respect, stability, trust, integrity, autonomy, choice, challenge, competence, contribution, creativity, discovery, effectiveness and purpose. This is a huge list of needs to get started. As an exercise for the reader, try to figure out how the listed needs could be met.

Needs Met When Creating Bad Code (in German)
Three Examples
Now let's motivate some of the listed needs in detail. For example, someone might bring in some technology which is not necessary for the project and introduces additional, unnecessary complexity. Met needs might be creativity (I want to try something new), learning (I want to learn it), joy (I enjoy playing with it), safety (I will add it to my CV - Resume-driven development), autonomy and choice (I choose myself), consistency and integrity (I have always been using it), safety (I already know how to use it) and so on. What about someone who never argues for quality-related changes, clean-ups, more time or reduced scope? Needs met might be appreciation (My boss is happy), ease (I do not argue with anybody), security and protection (I keep my job) etc. And for typical firefighting, quick-and-dirty developers: competency (I can do it), efficacy (I am fast), effectiveness (I can make it work). These are just a few ideas I got while discussing the topic during a workshop.

Needs Met When Creating Good Code (in German)
Now let us look at the opposite side. For quality code, I value consistency - which is obviously a good thing. It is also consistency which keeps some people from adapting to new ways of working, as in "this is the way I've always worked" - obviously a bad thing. If I keep my code base clean, I am sure that I can work on it later, which makes me feel safe. The same need for safety might keep someone from trying something new, because it is scary, or it might make people trash their code because their boss is requesting too much in too short a time.

Conclusion - If Any
Needs are everywhere. They are universal and can be neither denied nor argued. We use some strategies to fulfil needs when we mess up the code. There are many needs involved, maybe that is why there is so much bad code written. On the other hand, we use strategies to fulfil needs when keeping our code clean. It scares me that opposite behaviour might be driven by the same needs. How can we condemn these duct-tape and legacy coders when they are just driven by their needs as we are?

30 August 2019

Visualising Architecture: Neo4j vs. Module Dependencies

Last year I wrote about some work I did for a client using NATURAL (an application development and deployment environment using a proprietary language), for example using NATstyle, adding custom rules and creating custom reports. Today I will share some things I used to visualise the architecture. Usually I want to get the bigger picture of the architecture before I change it.

Industrial Legacy
Dependencies are killing us
Let's start with some context: This is banking with some serious legacy. Groups of "solutions" are bundled together as "domains". Each solution contains 5,000 to 10,000 modules (files), which are either top level applications (executable modules) or subroutines (internal modules). Some modules call code of other solutions. There are some system "libraries" which bundle commonly used modules similar to solutions. A recent cross check lists more than 160,000 calls crossing solution boundaries. Nobody knows which modules outside of one's own solution are calling in, and changes are difficult because all APIs are potentially public. As usual - dependencies are killing us.

Graph Database
To get an idea of what was going on, I wanted to visualise the dependencies. If the data could be converted and imported into standard tools, things would be easier. But there were way too many data points. I needed a database, a graph database, which should be able to deal with hundreds of thousands of nodes, i.e. the modules, and their edges, i.e. the directed dependencies (call or include).

Extract, Transform, Load (ETL)
While ETL is a concept from data warehousing, it was exactly what we needed: to "copy data from one or more sources into a destination system which represented the data differently from the source(s) or in a different context than the source(s)." The first step was to extract the relevant information, i.e. the call site modules and destination modules together with more architectural information like "solution" and "domain". I got this data as CSV from a system administrator. The data needed to be transformed into a format which could be easily loaded. Use your favourite scripting language or some sed&awk-fu. I used a little Ruby script,
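require 'zlib'
# Assumption: the head of the script is missing here; a gzipped CSV named on the command line is a guess.
Zlib::GzipReader.open(ARGV[0]) do |gz|
  gz.readlines.
    map { |line| line.chomp }.
    map { |csv_line| csv_line.split(/;\s*/, 9) }.
    map { |values| values[0..7] }. # drop irrelevant columns
    map { |values| values.join(',') }. # use default field terminator ,
    each { |line| puts line }
end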
to uncompress the file, drop irrelevant data and replace the column separator.
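To see what the transformation steps do to a single row, here is a minimal sketch that runs them on in-memory text instead of a compressed file. The sample rows and the helper names `sample_rows` and `transform` are made up for illustration; the real export had other columns and values.

```ruby
# Hypothetical sample rows, not taken from the real export.
def sample_rows
  <<~CSV
    MODA0001; SOL1; DOM1; MODB0002; SOL2; DOM1; x; y; ignored; also ignored
    MODA0001; SOL1; DOM1; MODC0003; SOL3; DOM2; x; y; ignored
  CSV
end

# Convert one semicolon-separated line into a comma-separated line of eight fields.
def transform(line)
  line.chomp.
    split(/;\s*/, 9). # at most nine parts, the ninth collects the unsplit rest
    first(8).         # keep the first eight columns
    join(',')         # use default field terminator ,
end

sample_rows.each_line { |line| puts transform(line) }
```

The limit of 9 in `split` keeps everything after the eighth separator in one trailing element, so dropping that element removes all irrelevant columns at once.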

Loading into Neo4j
Neo4j is a well-known Graph Platform with a large community. I had never used it and this was the perfect excuse to start playing with it ;-) It took me around three hours to understand the basics and load the data into a prototype. It was easier than I thought. I followed Neo4j's Tutorial on Importing Relational Data. A word of warning: I had no idea how to use Neo4j. I likely used it wrongly and this is not a good example.
CREATE INDEX ON :Module(solution);
CREATE INDEX ON :Module(domain);

// source modules (left column)
LOAD CSV WITH HEADERS FROM "file:///references.csv" AS row
MERGE (ms:Module {name:row.source_module})
ON CREATE SET ms.domain = row.source_domain, ms.solution = row.source_solution;

// target modules (right column)
LOAD CSV WITH HEADERS FROM "file:///references.csv" AS row
MERGE (mt:Module {name:row.target_module})
ON CREATE SET mt.domain = row.target_domain, mt.solution = row.target_solution;

// relation
LOAD CSV WITH HEADERS FROM "file:///references.csv" AS row
MATCH (ms:Module {name:row.source_module})
MATCH (mt:Module {name:row.target_module})
MERGE (ms)-[r:CALLS]->(mt)
ON CREATE SET r.count = 1;
This was my Cypher script. Cypher is Neo4j's declarative query language for querying and updating the graph. I loaded the list of references and created all source modules and then all target modules. Then I loaded the references again, adding the relation between source and target modules. Sure, loading the huge CSV three times was wasteful, but it did the job. Remember, I had no idea what I was doing ;-)
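The logic of the three passes is "create each module once, then create one CALLS relation per distinct pair of modules". As a rough illustration of that logic outside Neo4j, here is a Ruby sketch which builds the same nodes and edges in a single pass. It is not how Neo4j works internally. The header names follow the Cypher script, while the rows and the helper names `sample_references`, `each_row` and `build_graph` are made up, and the parsing assumes plain CSV without quoted fields.

```ruby
# Hypothetical data: header names follow the Cypher script, rows are made up.
def sample_references
  <<~CSV
    source_module,source_solution,source_domain,target_module,target_solution,target_domain
    MODA0001,SOL1,DOM1,MODB0002,SOL2,DOM1
    MODA0001,SOL1,DOM1,MODC0003,SOL3,DOM2
    MODB0002,SOL2,DOM1,MODC0003,SOL3,DOM2
    MODA0001,SOL1,DOM1,MODB0002,SOL2,DOM1
  CSV
end

# Turn CSV text into an array of hashes keyed by header name.
# Assumes there are no quoted fields or embedded commas.
def each_row(csv_text)
  header, *lines = csv_text.lines.map(&:chomp)
  keys = header.split(',')
  lines.map { |line| keys.zip(line.split(',')).to_h }
end

# Build nodes and edges in one pass. Like MERGE with ON CREATE SET,
# a node or edge that already exists is left untouched.
def build_graph(csv_text)
  nodes = {} # module name => { solution:, domain: }
  edges = {} # [source name, target name] => { count: }
  each_row(csv_text).each do |row|
    nodes[row['source_module']] ||= { solution: row['source_solution'], domain: row['source_domain'] }
    nodes[row['target_module']] ||= { solution: row['target_solution'], domain: row['target_domain'] }
    edges[[row['source_module'], row['target_module']]] ||= { count: 1 }
  end
  [nodes, edges]
end

nodes, edges = build_graph(sample_references)
puts "#{nodes.size} modules, #{edges.size} calls"
```

The duplicated last row adds neither a module nor a call, which mirrors what MERGE does for an already existing node or relation.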

Querying and Visualising
Cypher is a query language. For example, which modules had the most cross-solution dependencies? The query
MATCH (mf:Module)-[c:CALLS]->() RETURN mf.name, count(distinct c) as outgoing ORDER BY outgoing DESC LIMIT 25
returned
YYI19N00 36
YXI19N00 36
YWI19N00 32
and so on. (By the way, I just loved the names. Modules in NATURAL can only be named using eight characters. Such fun, isn't it?) Now it got interesting. While usually visualisation is the main issue, with Neo4j it was a no-brainer. The Neo4j Browser comes out of the box, runs Cypher queries and displays the results neatly. For example, here are the 36 (external) dependencies of module YYI19N00:

Called by module YYI19N00
As I said, the whole thing was a prototype. It got me started. For an in-depth analysis I would need to traverse the graph interactively or in a scripted way, as in a Jupyter notebook. In addition, there are several visual tools for Neo4j to make sense of the data and see how it is connected - exactly what I would want to know about the dependencies.
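A scripted analysis could start as small as the following Ruby sketch, which recomputes the ranking of the query above on a plain list of calls. The module names and the helpers `sample_calls` and `top_callers` are made up for illustration. Unlike the Cypher query, ties are broken by module name to keep the result stable.

```ruby
# Made-up edge list as [source module, target module] pairs.
def sample_calls
  [
    %w[MODA0001 MODB0002],
    %w[MODA0001 MODC0003],
    %w[MODB0002 MODC0003],
    %w[MODA0001 MODB0002] # duplicate call, counted once
  ]
end

# Rank modules by their number of distinct outgoing calls, highest first.
def top_callers(calls, limit = 25)
  calls.uniq.
    group_by { |source, _target| source }.
    map { |source, outgoing| [source, outgoing.size] }.
    sort_by { |source, count| [-count, source] }.
    first(limit)
end

top_callers(sample_calls).each { |name, count| puts "#{name} #{count}" }
```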

20 August 2019

Who should go on journeyman tour

Reisender Geselle im Gebirge (Travelling Journeyman in the Mountains)
In our recent, bi-weekly software engineering podcast we talked about my Journeyman Tour. In particular, Christian wanted to know who should go on such a tour. We discussed this for a while; in fact I was talking most of the time and interrupting everybody ;-) It was kind of a hard question which I had not thought about before, and some new ideas came up.

Listen to Developer Melange: Who should do a journeyman tour?