20 March 2021

11 Years of Prime Factors Kata

In this post I want to celebrate the Prime Factors Code Kata. Prime Factors is a small coding exercise first used by Robert C. Martin in 2005. It is my favourite code kata and it has been almost nine years since I last wrote about it - time to change that. Weird enough, the first code kata I ever worked on - outside of university assignments that turned out to be katas later - was Roman Numerals in 2007. The first time I did the Prime Factors was during Christmas holidays 2009 in Java and Ruby:
import java.util.ArrayList;
import java.util.List;

public class PrimeFactors {
  public static List<Integer> generate(int n) {
    List<Integer> primes = new ArrayList<Integer>();

    for (int candidate = 2; candidate <= n; candidate++) {
      for (; n % candidate == 0; n /= candidate) {
        primes.add(candidate);
      }
    }

    return primes;
  }
}
Now the Java code is exactly the code Robert Martin showed, I was following his example. The Ruby version from back then looks pretty similar, just using while instead of for.
module PrimeFactors
  def generate(n)
    prime_factors = []

    candidate = 2
    while n > 1 do
      while n % candidate == 0 do
        prime_factors << candidate
        n /= candidate
      end
      candidate += 1
      candidate = n if candidate > Math.sqrt(n) # performance fix
    end

    prime_factors
  end
end
If you are wondering how I am still able to find the code, I organise my code katas to allow lookup and comparison. Since then I did the kata 141 times and it has many uses.

Learn a language
Prime Factors is one of the first pieces of code I write - Test Driven of course - when revisiting old languages, like Commodore BASIC or looking at a new language, like Forth, using Gforth 0.7:
: prime_factors ( n -- n1 n2 n3 n4 )
  DUP 1 > IF           \ test for ?DO
    DUP 2 ?DO
      BEGIN
        DUP I MOD 0 =  \ test candidate I
      WHILE
        I SWAP I /     \ add candidate, reduce number
      REPEAT
    LOOP
  THEN
  DUP 1 = IF DROP THEN ;

T{ 1 prime_factors -> }T
T{ 2 prime_factors -> 2 }T
T{ 3 prime_factors -> 3 }T
T{ 4 prime_factors -> 2 2 }T
T{ 6 prime_factors -> 2 3 }T
T{ 8 prime_factors -> 2 2 2 }T
T{ 9 prime_factors -> 3 3 }T
Gforth came with a modified testing framework based on John Hayes S1I's tester.fs, defining the functions T{, -> and }T for testing. Note that the given function prime_factors is not realistic as the number of returned arguments is not known by the caller.

When I had a look at Scala, of course I did Prime Factors:
import java.lang.Math.sqrt

object PrimeFactors {
  def generate(number: Int): List[Int] = {

    def fold(current: Pair[Int, List[Int]], candidate: Int): Pair[Int, List[Int]] = {
      if (current._1 % candidate == 0)
        fold((current._1 / candidate, candidate :: current._2), candidate)
      else
        current
    }

    val (remainder, factors) =
      (2 to sqrt(number).intValue).foldLeft((number, List[Int]()))(fold)

    if (remainder > 1)
      (remainder :: factors).reverse
    else
      factors.reverse
  }
}
This is crazy. The code creates a sequence of all candidate primes and folds it starting from left by dividing by the candidate recursively, appending to the begin of the list, which is cheap. Because of that the list is reversed at the end. I have no idea why I created this, probably I was playing around with foldLeft. This is not a good example, please do not copy it. After all these years, the Forth solution seems easier to grasp. ;-)

So which languages are missing? PowerShell looks much like my PHP (shown below) and my Python Prime Factors looks similar too, just with Python specific range(2, number + 1) and //= inside. And of course JavaScript is missing:
PrimeFactors = function() {
  this.factors = [];
};

PrimeFactors.prototype.generate = function(num) {
  var candidate;
  for (candidate = 2; candidate <= num; candidate += 1) {
    num = this.tryCandidate(num, candidate);
  }
  return this.factors;
};

PrimeFactors.prototype.tryCandidate = function(num, candidate) {
  while (num % candidate === 0) {
    num = this.reduceByFactor(num, candidate);
  }
  return num;
};

PrimeFactors.prototype.reduceByFactor = function(num, factor) {
  this.factors.push(factor);
  return num / factor;
};
Isn't that lovely? Again this is not good code, please do not copy it. At least I showed some creativity using prototype functions.

Learn a testing framework
Using TDD to write some known code is a perfect way to learn more about a testing framework. So I explored XSLTunit using Prime Factors in XSLT or NUnit in C#:
using NUnit.Framework;

[TestFixture]
public class PrimeFactorsTest
{
  [TestCase(new int[0], 1)]
  [TestCase(new int[] { 2 }, 2)]
  [TestCase(new int[] { 3 }, 3)]
  [TestCase(new int[] { 2, 2 }, 4)]
  [TestCase(new int[] { 2, 2, 2 }, 8)]
  [TestCase(new int[] { 3, 3 }, 9)]
  public void TestFactors(int[] expected, int number)
  {
    CollectionAssert.AreEqual(expected, PrimeFactors.Generate(number));
  }

  [Test]
  [Timeout(100)]
  public void TestLarge()
  {
    CollectionAssert.AreEqual(new int[] { int.MaxValue },
                              PrimeFactors.Generate(int.MaxValue));
  }
}
Test your own testing framework
Sometimes, when I need to create my own unit testing framework, e.g. TPUnit for old Turbo Pascal or ASM Unit for Windows Assembly, I use Prime Factors as test cases:
_prime_factors:
  mov     esi, 0          ; esi = number of factors
  mov     edi, ebx        ; edi = address of factors
  mov     ecx, eax        ; ecx = current number
  mov     ebx, 1          ; ebx = candidate

  jmp .not_diviseable

.loop_over_candidates:
  inc     ebx             ; next candidate

.break_if_candidate_is_larger_than_square:
; if candidate * candidate <= number then try candidate
  mov     eax, ebx
  mul     ebx
  cmp     eax, ecx
  jbe     .try_candidate

; else number is a (large) prime and we are done
  mov     [edi + esi * register_size], ecx
  add     esi, 1
  jmp     .done

.try_candidate:
; if number % candidate == 0 then add candidate
  mov     eax, ecx
  xor     edx, edx
  div     ebx
  cmp     edx, 0          ; remainder is 0
  jne     .not_diviseable

.is_diviseable:
  mov     [edi + esi * register_size], ebx
                          ; store candidate in factors
  add     esi, 1          ; we found a factor
  mov     ecx, eax        ; number is remainder of division
  jmp     .try_candidate  ; try current candidate again

.not_diviseable:
; if number > 1 then try next candidate
  cmp     ecx, 1
  jne     .loop_over_candidates

.done:
; return number of factors
  mov     eax, esi
  ret
This piece of assembly calcultes the prime factors of the number passed in EAX into in the dword array address EBX.

TDD Demo
In 2012 I started practising Prime Factors as kata performance, minimising the number of keys I pressed. I ran it around 50 times. In the end I used the practice to calm down when I was anxious - it was like mediation. I still have to perform it somewhere, adding music and all... I have used it demoing TDD in uncounted presentations - actually around 40: during my university guest lectures, user group meetings and at my clients. Most demos were in Java and some in PHP,
<?php
class PrimeFactors {
  static function generate($n) {
    $factors = [];
    for ($candidate = 2; $candidate <= $n; $candidate += 1) {
      while ($n % $candidate == 0) {
        $factors[]= $candidate;
        $n /= $candidate;
      }
    }
    return $factors;
  }
}
and a single demo of test driving R code,
primeFactors <- function(number) {
  factors <- vector(mode="integer")

  candidate <- 2
  while (candidate <= sqrt(number)) {
    while (number %% candidate == 0) {
      factors <- c(factors, candidate)
      number <- number / candidate
    }
    candidate = candidate + 1
  }

  if (number > 1) {
    factors <- c(factors, number)
  }

  factors
}
It seems, most programming languages look the same. The last sentence is not true for NATURAL, Cobol's cousin, which is ugly. I will not show it here as it would destroy this lovely post.

Conclusion
By writing this post, I learned that I still need to create Prime Factors in the programming languages Go, Kotlin, OpenOffice Basic, Oracle PL/SQL and of course TypeScript - I could - and I will, it is just a matter of time. So Prime Factors - in fact any small enough code kata - is a great tool for exploring, studying and practising programming languages, testing frameworks, IDE tools and Test Driven Development in general. I will leave you with my latest addition to my collection of Prime Factors, using C99. Have fun!
#include <math.h>

typedef struct {
  unsigned char count;
  unsigned int values[31];
} PrimeFactors;

void PrimeFactors_init(PrimeFactors* factors)
{
  (*factors).count = 0;
}

void PrimeFactors_add(PrimeFactors* factors, const unsigned int factor)
{
  int count = (*factors).count;
  (*factors).values[count] = factor;
  (*factors).count = count + 1;
}

void generate(const unsigned int number, PrimeFactors* factors)
{
  PrimeFactors_init(factors);

  unsigned int remaining = number;
  for (unsigned int candidate = 2; candidate <= sqrtl(remaining); candidate += 1) {
    while (remaining % candidate == 0) {
      PrimeFactors_add(factors, candidate);
      remaining /= candidate;
    }
  }

  if (remaining > 1) {
    PrimeFactors_add(factors, remaining);
  }
}

17 February 2021

Dice Namer Constraint

A constraint is an artificial challenge during an exercise to make the exercise more interesting, challenging or fun. I like constraints and wrote about some of them.. The Dice Namer is such a constraint: Everything but the names of test methods is named using random dices. The French company Arolla created some really nice dices using random, enterprise-y useless names like Processor, Dummy or Factory. I managed to get several sets and to use them in my coding exercises after discussing naming in code. Now with the remote work due to Covid, I had to come up with something new. And here it is, the

Arolla Dice Namer Application

Press the buttons and see the random dices for your name, together with some dices-like sound (if you allow your browser to play it). This is a real fun and it will help you create amazing code like this one using viciously named functions...
What does this code do?

1 February 2021

Working with AI Surveillance

Since several years I am exploring ethics and meaning of my work as software engineer. While it is not a clear cut topic, it lies in the core of our work, similar to all the tech and engineering stuff. Last year Artur Meyster, CTO of Career Karma, contacted me about cooperating on a guest blog post here on my site. (Thank you Artur for your patience.) While guest blogging seems mainly SEO related, I do not mind a fresh perspective on a topic I have been thinking about, much like the series of interviews I am conducting from time to time. This post was mostly written by Maria Elena Gonzalez. Maria is a broadcast journalist and has been working as a tech writer for three years. During this time, her work has been published by companies like TechAccute, Trip University, and Entrepreneur - and now by the Code Cop ;-)

Heavy SurveillanceAI Surveillance
The recent evolution of technology has resulted in many new areas, such as data science, virtual reality or machine learning. As a developer, these are the new and cool topics to explore and dive into. (Maybe start with a data science boot camp.) Applications in these areas have improved our quality of life. They serve the same purpose as AI surveillance: to improve our quality of life. However, as is the case with any invention, AI surveillance has a dark side: we might be losing our right to privacy.

You may think this is a problem only for famous people, but that is not the case. It can also be a problem for you. According to a recent report from the Carnegie Endowment for International Peace, 75 out of 176 countries are using Artificial Intelligence (AI) for surveillance purposes, which makes it hard to know who is safe from AI surveillance. If you would like to know more about the impact of AI surveillance on our daily life, read on.

What is AI Surveillance?
AI surveillance goes beyond AI-driven security cameras. Companies and governmental organizations have been using this technology to track trends and transitions to make better business decisions.

How is it Affecting Us?
AI surveillance has many benefits. For example, it can improve traffic. Have you ever waited for traffic lights to turn green even though there are no cars around? This could be improved by using AI surveillance. AI algorithms can detect movement and change traffic lights phasing based on real-time activity. Financial companies can use AI surveillance to spot malicious activities and minimize fraud. This has the potential to revolutionize the financial industry. The possibilities are endless, and we could continue to show you the positive aspects of AI surveillance, but let's take a look at the darker side.

With the amount of data collected increasing rapidly, the possibility of privacy invasion is rocketing. While some AI surveillance activity falls within the law, others represent privacy violations. Privacy International points out that AI surveillance can be very delicate when it comes to data privacy: "With the proliferation of surveillance cameras, facial recognition, open-source and social media intelligence, biometrics, and data emerging from smart cities, the police now have unprecedented access to massive amounts of data." Governments that take advantage of this technology have access to citizens' personal data, a situation that can quickly turn into a violation of privacy rights if left unchecked.

Which Countries are Adopting AI Surveillance Technology?
Among the leaders in the use of AI surveillance are China and the United States. Out of the two, China is implementing the technology more widely. According to the Carnegie Endowment for International Peace, the organizations that most often use AI surveillance are part of the government. China is not only using the technology to improve traffic, it is also applying AI surveillance to track the activities of the Uighurs, in the northwestern province of Xinjiang, according to LiveWithAI. By using facial recognition based on race and ethnicity characteristics, the Chinese government can easily detect when a Uighur attempts to flee Xinjiang.

Is the US Government using AI surveillance? Yes, of course. Common applications of governmental AI surveillance include military activity and security. For example Palantir might have helped to power Trump's "extreme vetting" of immigrants. However, there are other applications that target improving the performance of cities.

James, I think your cover's blown!Which Companies Provide These Kind of Service?
According to the report from the Carnegie Endowment for International Peace, some of the leading companies providing AI surveillance are Huawei, ZTE and Hikvision. These companies provide AI-powered surveillance to more than 60 countries. A particularly controversial case involving AI surveillance and facial recognition technology is that of Clearview. This company collects and stores publicly available data on every person on the planet. Scraping from social media platforms such as Facebook and Instagram, Clearview has collected more than 3 billion images from social media platforms, making it the largest such library in the world. Using its enormous library of pictures, Clearview has created a search engine for faces that can be used to recognize anyone. This tool has already been used by more than 600 law enforcement agencies, including the FBI and the US Department of Justice. Clearview's willingness to do what no one else dared to do - scrape freely available data and exploit it for profit - has put it in the centre of a moral debate.

Should You Apply at Such Companies?
Now we are getting to the meat of the discussion: As a software professional, should you work for a company that is working on this type of solution? How do you navigate the moral implications of your work? These are complex questions that seem to require deep introspection. However, at the end of the day - at least for the extreme cases mentioned above - the answer is simpler than you expect.

On the personal level, whether working for a company like Clearview is a good idea depends on your values and what you prioritize in life. Some people abhor the whole concept of using free data to spy on people, while others would simply see nothing wrong with this idea. If you are more like the former, then working for a company like Clearview is a bad idea. It goes against your principles - you will simply not be happy in the long run, regardless of the size of that pay check. If, on the other hand, you don't see anything wrong with these practices and everything else about the job looks good, you might think it is OK to do the job. It is not.

In his famous keynote Not just code monkeys Martin Fowler identified two areas where most of our impact and responsibility as a developer lie at the moment: Privacy and avoiding the creation of an alienating atmosphere at our workplaces.

Conclusion
AI surveillance can be beneficial if used wisely and without violating privacy rights. However, there are some organizations that are using this technology to invade our privacy. The key to avoiding this is to create new policies that protect our privacy rights. As a developer, you must be aware of the implication of your work on privacy and steer clear of such violations.