## 14 January 2010

### Prime Factors Kata BASIC

The Prime Factors Kata is a small coding exercise first shown by Uncle Bob in Java several years ago. It has been done in C#, Ruby and probably some other languages. Recently Uncle Bob performed it as katacast in Ruby. (Go watch it! It's really cool. I will wait here.) He was followed by others in C# and Groovy. Anyway I'm sure it has not been done in BASIC. (Since my last post about Scala and BASIC I've felt so "retro".) So here is my Prime Factors Kata done in BASIC V2 (Commodore 64). I couldn't use my Scala BASIC DSL because it's lacking necessary features like `GOSUB`. So I used the Versatile Commodore Emulator (VICE) instead.

The First Test.
`1000 CLR1010 PRINT "test one ";1020 i=11030 GOSUB 90001040 IF pf(0)<>0 THEN PRINT "expected no factors" : STOP1050 PRINT "green"`
Number `i` is the value to be factored and the array `pf` is expected to contain the number of prime factors in `pf(0)` followed by the factors themselves on return of the "generate" function (line 9000).
`test one?UNDEF'D STATEMENT ERROR IN 1030`
Add the function.
`8990 END9000 REM ----- function generate9010 REM in ... i ... number9020 REM out ... pf() ... factors9030 RETURN`
Run the test.
`test one green`
The Second Test.
`1100 CLR1110 PRINT "test two ";1120 i=21130 GOSUB 90001140 IF pf(0)<>1 THEN PRINT "expected 1 factors:";pf(0) : STOP1150 IF pf(1)<>2 THEN PRINT "expected factor 2:";pf(1) : STOP1160 PRINT "green"`
`test one greentest two expected 1 factors: 0BREAK IN 1140`
`9000 REM ----- function generate9030 IF i=1 THEN RETURN9040 pf(0)=19050 pf(1)=29060 RETURN`
Unfortunately there is no such thing as "BUnit". So I have to create testing infrastructure along the way. Keeping the tests green helps during the extraction of an "assert" method (line 100).
`80 GOTO 100090 REM ***** test infrastructure *****100 REM ----- method assert equals(int)110 REM in ... me\$ ... message120 REM in ... ex ... expected130 REM in ... ac ... actual140 IF ex=ac THEN RETURN150 PRINT "red"160 PRINT me\$;" expected";ex;" but was";ac170 STOP180 RETURN...1100 CLR1110 PRINT "test two ";1120 i=21130 GOSUB 90001140 ex=1 : ac=pf(0) : me\$="num factors" : GOSUB 1001150 ex=2 : ac=pf(1) : me\$="1st factor" : GOSUB 1001160 PRINT "green"`
The Third Test.
`1220 i=31230 GOSUB 90001240 ex=1 : ac=pf(0) : me\$="num factors" : GOSUB 1001250 ex=3 : ac=pf(1) : me\$="1st factor" : GOSUB 100`
Run the test.
`test one greentest two greentest three red1st factor expected 3 but was 2BREAK IN 170`
Modify the function.
`9050 pf(1)=i`
Green again. After the third test it's getting boring. The tests should be refactored to be more DRY:
`200 REM ----- method teardown210 PRINT "green"220 RETURN300 REM ----- method setup310 REM in ... me\$ ... test name320 PRINT "test ";me\$;" ";330 RETURN400 REM ----- method assert prime factors410 READ me\$420 GOSUB 300430 READ i440 GOSUB 9000450 READ af460 ex=af : ac=pf(0) : me\$="num factors" : GOSUB 100470 IF af=0 THEN GOTO 520480 FOR j=1 TO af490 READ ex500 ac=pf(j) : me\$=STR\$(j)+". factor" : GOSUB 100510 NEXT520 GOSUB 200530 RETURN990 REM ***** test cases *****1000 DATA "one", 1, 01010 GOSUB 4001100 DATA "two", 2, 1, 21110 GOSUB 4001200 DATA "three", 3, 1, 31210 GOSUB 400`
The Fourth Test.
`1300 DATA "four", 4, 2, 2, 21310 GOSUB 400`
`9000 REM ----- function generate9010 REM in ... i ... number9020 REM out ... pf() ... factors9025 REM local ... nf ... number factors9030 nf=09040 pf(0)=nf9050 IF i=1 THEN RETURN9060 IF INT(i/2)*2=i THEN nf=nf+1 : pf(nf)=2 : i=i/2 : GOTO 90409070 nf=nf+1 : pf(nf)=i : i=1 : GOTO 90409080 RETURN`
Line 9070 is more than needed to get the fourth test green, but it's the first thing that came to my mind.

The Fifth Test.
`1400 DATA "five", 6, 2, 2, 3`
Works as well, no changes needed.

The Sixth Test.
`1500 DATA "six", 8, 3, 2, 2, 2`
Again this still works, I "cheated" a bit.

The Seventh Test.
`1600 DATA "seven", 9, 2, 3, 3`
Now I really need the nested loops.
`9000 REM ----- function generate9010 REM in ... i ... number9020 REM out ... pf() ... factors9030 REM mod ... ca ... pf candidate9040 pf(0)=0 : REM special case9050 IF i=1 THEN RETURN9060 IF INT(i/2)*2<>i THEN GOTO 91109070 pf(0)=pf(0)+19080 pf(pf(0))=29090 i=i/29100 GOTO 90509110 FOR ca=3 TO INT(SQR(i)) STEP 29120 IF i=1 THEN RETURN9130 IF INT(i/ca)*ca<>i THEN GOTO 91809140 pf(0)=pf(0)+19150 pf(pf(0))=ca9160 i=i/ca9170 GOTO 91209180 NEXT9190 RETURN`
This includes already two performance optimisations: Handling the special case 2 up front to be able to use `STEP 2` and skip every second prime factor candidate. Second using the square root of `i` as upper bound for the loop. But there is a bug not covered by the tests. Can you spot it? Yet another refactoring to remove duplicate code of adding prime factors to `pf`.
`9040 pf(0)=0 : ca=2 : REM special case9050 IF i=1 THEN RETURN9060 IF INT(i/ca)*ca=i THEN GOSUB 9200 : GOTO 90509070 FOR ca=3 TO INT(SQR(i)) STEP 29080 IF i=1 THEN RETURN9090 IF INT(i/ca)*ca=i THEN GOSUB 9200 : GOTO 90809100 NEXT9110 RETURN9200 pf(0)=pf(0)+19210 pf(pf(0))=ca9220 i=i/ca9230 RETURN`
I could still get rid of the duplicate lines 9060 and 9090...

Another Test.
`1700 DATA "eight", 10, 2, 2, 5`
This reveals the bug in line 9070. If the value contains different prime factors then the last factor is lost. We need another check in line 9110.
`9100 NEXT9110 IF i>1 THEN ca=i : GOSUB 92009120 RETURN`
END.
One last refactoring to add colours and time measurements to methods "setup" (line 300) and "teardown" (line 200). Two more tests to see the performance of larger (32719) values of `i`. The function even works for `2^31-1` (huge), but takes quite some time, although the loop is already optimised. (Download the source.)