20 March 2021

11 Years of Prime Factors Kata

In this post I want to celebrate the Prime Factors Code Kata. Prime Factors is a small coding exercise first used by Robert C. Martin in 2005. It is my favourite code kata and it has been almost nine years since I last wrote about it - time to change that. Weird enough, the first code kata I ever worked on - outside of university assignments that turned out to be katas later - was Roman Numerals in 2007. The first time I did the Prime Factors was during Christmas holidays 2009 in Java and Ruby:
import java.util.ArrayList;
import java.util.List;

public class PrimeFactors {
  public static List<Integer> generate(int n) {
    List<Integer> primes = new ArrayList<Integer>();

    for (int candidate = 2; candidate <= n; candidate++) {
      for (; n % candidate == 0; n /= candidate) {
        primes.add(candidate);
      }
    }

    return primes;
  }
}
Now the Java code is exactly the code Robert Martin showed, I was following his example. The Ruby version from back then looks pretty similar, just using while instead of for.
module PrimeFactors
  def generate(n)
    prime_factors = []

    candidate = 2
    while n > 1 do
      while n % candidate == 0 do
        prime_factors << candidate
        n /= candidate
      end
      candidate += 1
      candidate = n if candidate > Math.sqrt(n) # performance fix
    end

    prime_factors
  end
end
If you are wondering how I am still able to find the code, I organise my code katas to allow lookup and comparison. Since then I did the kata 141 times and it has many uses.

Learn a language
Prime Factors is one of the first pieces of code I write - Test Driven of course - when revisiting old languages, like Commodore BASIC or looking at a new language, like Forth, using Gforth 0.7:
: prime_factors ( n -- n1 n2 n3 n4 )
  DUP 1 > IF           \ test for ?DO
    DUP 2 ?DO
      BEGIN
        DUP I MOD 0 =  \ test candidate I
      WHILE
        I SWAP I /     \ add candidate, reduce number
      REPEAT
    LOOP
  THEN
  DUP 1 = IF DROP THEN ;

T{ 1 prime_factors -> }T
T{ 2 prime_factors -> 2 }T
T{ 3 prime_factors -> 3 }T
T{ 4 prime_factors -> 2 2 }T
T{ 6 prime_factors -> 2 3 }T
T{ 8 prime_factors -> 2 2 2 }T
T{ 9 prime_factors -> 3 3 }T
Gforth came with a modified testing framework based on John Hayes S1I's tester.fs, defining the functions T{, -> and }T for testing. Note that the given function prime_factors is not realistic as the number of returned arguments is not known by the caller.

When I had a look at Scala, of course I did Prime Factors:
import java.lang.Math.sqrt

object PrimeFactors {
  def generate(number: Int): List[Int] = {

    def fold(current: Pair[Int, List[Int]], candidate: Int): Pair[Int, List[Int]] = {
      if (current._1 % candidate == 0)
        fold((current._1 / candidate, candidate :: current._2), candidate)
      else
        current
    }

    val (remainder, factors) =
      (2 to sqrt(number).intValue).foldLeft((number, List[Int]()))(fold)

    if (remainder > 1)
      (remainder :: factors).reverse
    else
      factors.reverse
  }
}
This is crazy. The code creates a sequence of all candidate primes and folds it starting from left by dividing by the candidate recursively, appending to the begin of the list, which is cheap. Because of that the list is reversed at the end. I have no idea why I created this, probably I was playing around with foldLeft. This is not a good example, please do not copy it. After all these years, the Forth solution seems easier to grasp. ;-)

So which languages are missing? PowerShell looks much like my PHP (shown below) and my Python Prime Factors looks similar too, just with Python specific range(2, number + 1) and //= inside. And of course JavaScript is missing:
PrimeFactors = function() {
  this.factors = [];
};

PrimeFactors.prototype.generate = function(num) {
  var candidate;
  for (candidate = 2; candidate <= num; candidate += 1) {
    num = this.tryCandidate(num, candidate);
  }
  return this.factors;
};

PrimeFactors.prototype.tryCandidate = function(num, candidate) {
  while (num % candidate === 0) {
    num = this.reduceByFactor(num, candidate);
  }
  return num;
};

PrimeFactors.prototype.reduceByFactor = function(num, factor) {
  this.factors.push(factor);
  return num / factor;
};
Isn't that lovely? Again this is not good code, please do not copy it. At least I showed some creativity using prototype functions.

Learn a testing framework
Using TDD to write some known code is a perfect way to learn more about a testing framework. So I explored XSLTunit using Prime Factors in XSLT or NUnit in C#:
using NUnit.Framework;

[TestFixture]
public class PrimeFactorsTest
{
  [TestCase(new int[0], 1)]
  [TestCase(new int[] { 2 }, 2)]
  [TestCase(new int[] { 3 }, 3)]
  [TestCase(new int[] { 2, 2 }, 4)]
  [TestCase(new int[] { 2, 2, 2 }, 8)]
  [TestCase(new int[] { 3, 3 }, 9)]
  public void TestFactors(int[] expected, int number)
  {
    CollectionAssert.AreEqual(expected, PrimeFactors.Generate(number));
  }

  [Test]
  [Timeout(100)]
  public void TestLarge()
  {
    CollectionAssert.AreEqual(new int[] { int.MaxValue },
                              PrimeFactors.Generate(int.MaxValue));
  }
}
Test your own testing framework
Sometimes, when I need to create my own unit testing framework, e.g. TPUnit for old Turbo Pascal, assert-scm (Scheme R5RS) or ASM Unit for Windows Assembly, I use Prime Factors as test cases:
_prime_factors:
  mov     esi, 0          ; esi = number of factors
  mov     edi, ebx        ; edi = address of factors
  mov     ecx, eax        ; ecx = current number
  mov     ebx, 1          ; ebx = candidate

  jmp .not_diviseable

.loop_over_candidates:
  inc     ebx             ; next candidate

.break_if_candidate_is_larger_than_square:
; if candidate * candidate <= number then try candidate
  mov     eax, ebx
  mul     ebx
  cmp     eax, ecx
  jbe     .try_candidate

; else number is a (large) prime and we are done
  mov     [edi + esi * register_size], ecx
  add     esi, 1
  jmp     .done

.try_candidate:
; if number % candidate == 0 then add candidate
  mov     eax, ecx
  xor     edx, edx
  div     ebx
  cmp     edx, 0          ; remainder is 0
  jne     .not_diviseable

.is_diviseable:
  mov     [edi + esi * register_size], ebx
                          ; store candidate in factors
  add     esi, 1          ; we found a factor
  mov     ecx, eax        ; number is remainder of division
  jmp     .try_candidate  ; try current candidate again

.not_diviseable:
; if number > 1 then try next candidate
  cmp     ecx, 1
  jne     .loop_over_candidates

.done:
; return number of factors
  mov     eax, esi
  ret
This piece of assembly calcultes the prime factors of the number passed in EAX into in the dword array address EBX.

TDD Demo
In 2012 I started practising Prime Factors as kata performance, minimising the number of keys I pressed. I ran it around 50 times. In the end I used the practice to calm down when I was anxious - it was like mediation. I still have to perform it somewhere, adding music and all... I have used it demoing TDD in uncounted presentations - actually around 40: during my university guest lectures, user group meetings and at my clients. Most demos were in Java and some in PHP,
<?php
class PrimeFactors {
  static function generate($n) {
    $factors = [];
    for ($candidate = 2; $candidate <= $n; $candidate += 1) {
      while ($n % $candidate == 0) {
        $factors[]= $candidate;
        $n /= $candidate;
      }
    }
    return $factors;
  }
}
and a single demo of test driving R code,
primeFactors <- function(number) {
  factors <- vector(mode="integer")

  candidate <- 2
  while (candidate <= sqrt(number)) {
    while (number %% candidate == 0) {
      factors <- c(factors, candidate)
      number <- number / candidate
    }
    candidate = candidate + 1
  }

  if (number > 1) {
    factors <- c(factors, number)
  }

  factors
}
It seems, most programming languages look the same. The last sentence is not true for NATURAL, Cobol's cousin, which is ugly. I will not show it here as it would destroy this lovely post.

Conclusion
By writing this post, I learned that I still need to create Prime Factors in the programming languages Go, Kotlin, OpenOffice Basic, Oracle PL/SQL and of course TypeScript - I could - and I will, it is just a matter of time. So Prime Factors - in fact any small enough code kata - is a great tool for exploring, studying and practising programming languages, testing frameworks, IDE tools and Test Driven Development in general. I will leave you with my latest addition to my collection of Prime Factors, using C99. Have fun!
#include <math.h>

typedef struct {
  unsigned char count;
  unsigned int values[31];
} PrimeFactors;

void PrimeFactors_init(PrimeFactors* factors)
{
  (*factors).count = 0;
}

void PrimeFactors_add(PrimeFactors* factors, const unsigned int factor)
{
  int count = (*factors).count;
  (*factors).values[count] = factor;
  (*factors).count = count + 1;
}

void generate(const unsigned int number, PrimeFactors* factors)
{
  PrimeFactors_init(factors);

  unsigned int remaining = number;
  for (unsigned int candidate = 2; candidate <= sqrtl(remaining); candidate += 1) {
    while (remaining % candidate == 0) {
      PrimeFactors_add(factors, candidate);
      remaining /= candidate;
    }
  }

  if (remaining > 1) {
    PrimeFactors_add(factors, remaining);
  }
}