14 January 2010

Prime Factors Kata BASIC

The Prime Factors Kata is a small coding exercise first shown by Uncle Bob in Java several years ago. It has been done in C#, Ruby and probably some other languages. Recently Uncle Bob performed it as katacast in Ruby. (Go watch it! It's really cool. I will wait here.) He was followed by others in C# and Groovy. Anyway I'm sure it has not been done in BASIC. (Since my last post about Scala and BASIC I've felt so "retro".) So here is my Prime Factors Kata done in BASIC V2 (Commodore 64). I couldn't use my Scala BASIC DSL because it's lacking necessary features like GOSUB. So I used the Versatile Commodore Emulator (VICE) instead.

The First Test.
1000 CLR
1010 PRINT "test one ";
1020 i=1
1030 GOSUB 9000
1040 IF pf(0)<>0 THEN PRINT "expected no factors" : STOP
1050 PRINT "green"
Number i is the value to be factored and the array pf is expected to contain the number of prime factors in pf(0) followed by the factors themselves on return of the "generate" function (line 9000).
test one
?UNDEF'D STATEMENT ERROR IN 1030
Add the function.
8990 END
9000 REM ----- function generate
9010 REM in ... i ... number
9020 REM out ... pf() ... factors
9030 RETURN
Run the test.
test one green
The Second Test.
1100 CLR
1110 PRINT "test two ";
1120 i=2
1130 GOSUB 9000
1140 IF pf(0)<>1 THEN PRINT "expected 1 factors:";pf(0) : STOP
1150 IF pf(1)<>2 THEN PRINT "expected factor 2:";pf(1) : STOP
1160 PRINT "green"
test one green
test two expected 1 factors: 0
BREAK IN 1140
9000 REM ----- function generate
9030 IF i=1 THEN RETURN
9040 pf(0)=1
9050 pf(1)=2
9060 RETURN
Unfortunately there is no such thing as "BUnit". So I have to create testing infrastructure along the way. Keeping the tests green helps during the extraction of an "assert" method (line 100).
80 GOTO 1000
90 REM ***** test infrastructure *****
100 REM ----- method assert equals(int)
110 REM in ... me$ ... message
120 REM in ... ex ... expected
130 REM in ... ac ... actual
140 IF ex=ac THEN RETURN
150 PRINT "red"
160 PRINT me$;" expected";ex;" but was";ac
170 STOP
180 RETURN
...
1100 CLR
1110 PRINT "test two ";
1120 i=2
1130 GOSUB 9000
1140 ex=1 : ac=pf(0) : me$="num factors" : GOSUB 100
1150 ex=2 : ac=pf(1) : me$="1st factor" : GOSUB 100
1160 PRINT "green"
The Third Test.
1220 i=3
1230 GOSUB 9000
1240 ex=1 : ac=pf(0) : me$="num factors" : GOSUB 100
1250 ex=3 : ac=pf(1) : me$="1st factor" : GOSUB 100
Run the test.
test one green
test two green
test three red
1st factor expected 3 but was 2
BREAK IN 170
Modify the function.
9050 pf(1)=i
Green again. After the third test it's getting boring. The tests should be refactored to be more DRY:
200 REM ----- method teardown
210 PRINT "green"
220 RETURN
300 REM ----- method setup
310 REM in ... me$ ... test name
320 PRINT "test ";me$;" ";
330 RETURN
400 REM ----- method assert prime factors
410 READ me$
420 GOSUB 300
430 READ i
440 GOSUB 9000
450 READ af
460 ex=af : ac=pf(0) : me$="num factors" : GOSUB 100
470 IF af=0 THEN GOTO 520
480 FOR j=1 TO af
490 READ ex
500 ac=pf(j) : me$=STR$(j)+". factor" : GOSUB 100
510 NEXT
520 GOSUB 200
530 RETURN
990 REM ***** test cases *****
1000 DATA "one", 1, 0
1010 GOSUB 400
1100 DATA "two", 2, 1, 2
1110 GOSUB 400
1200 DATA "three", 3, 1, 3
1210 GOSUB 400
The Fourth Test.
1300 DATA "four", 4, 2, 2, 2
1310 GOSUB 400
9000 REM ----- function generate
9010 REM in ... i ... number
9020 REM out ... pf() ... factors
9025 REM local ... nf ... number factors
9030 nf=0
9040 pf(0)=nf
9050 IF i=1 THEN RETURN
9060 IF INT(i/2)*2=i THEN nf=nf+1 : pf(nf)=2 : i=i/2 : GOTO 9040
9070 nf=nf+1 : pf(nf)=i : i=1 : GOTO 9040
9080 RETURN
Line 9070 is more than needed to get the fourth test green, but it's the first thing that came to my mind.

The Fifth Test.
1400 DATA "five", 6, 2, 2, 3
Works as well, no changes needed.

The Sixth Test.
1500 DATA "six", 8, 3, 2, 2, 2
Again this still works, I "cheated" a bit.

The Seventh Test.
1600 DATA "seven", 9, 2, 3, 3
Now I really need the nested loops.
9000 REM ----- function generate
9010 REM in ... i ... number
9020 REM out ... pf() ... factors
9030 REM mod ... ca ... pf candidate
9040 pf(0)=0 : REM special case
9050 IF i=1 THEN RETURN
9060 IF INT(i/2)*2<>i THEN GOTO 9110
9070 pf(0)=pf(0)+1
9080 pf(pf(0))=2
9090 i=i/2
9100 GOTO 9050
9110 FOR ca=3 TO INT(SQR(i)) STEP 2
9120 IF i=1 THEN RETURN
9130 IF INT(i/ca)*ca<>i THEN GOTO 9180
9140 pf(0)=pf(0)+1
9150 pf(pf(0))=ca
9160 i=i/ca
9170 GOTO 9120
9180 NEXT
9190 RETURN
This includes already two performance optimisations: Handling the special case 2 up front to be able to use STEP 2 and skip every second prime factor candidate. Second using the square root of i as upper bound for the loop. But there is a bug not covered by the tests. Can you spot it? Yet another refactoring to remove duplicate code of adding prime factors to pf.
9040 pf(0)=0 : ca=2 : REM special case
9050 IF i=1 THEN RETURN
9060 IF INT(i/ca)*ca=i THEN GOSUB 9200 : GOTO 9050
9070 FOR ca=3 TO INT(SQR(i)) STEP 2
9080 IF i=1 THEN RETURN
9090 IF INT(i/ca)*ca=i THEN GOSUB 9200 : GOTO 9080
9100 NEXT
9110 RETURN
9200 pf(0)=pf(0)+1
9210 pf(pf(0))=ca
9220 i=i/ca
9230 RETURN
I could still get rid of the duplicate lines 9060 and 9090...

Another Test.
1700 DATA "eight", 10, 2, 2, 5
This reveals the bug in line 9070. If the value contains different prime factors then the last factor is lost. We need another check in line 9110.
9100 NEXT
9110 IF i>1 THEN ca=i : GOSUB 9200
9120 RETURN
END.
One last refactoring to add colours and time measurements to methods "setup" (line 300) and "teardown" (line 200). Two more tests to see the performance of larger (32719) values of i. The function even works for 2^31-1 (huge), but takes quite some time, although the loop is already optimised.Prime Factors Kata tests are all green :-)(Download the source.)

13 January 2010

T-shirts, T-shirts, T-shirts

My Personal Branding
Here it is - finally - the official Code Cop T-shirt. In fact it's not just a shirt, but a whole series of variations: There is a black on blue version that has proper contrast and one with a navy-coloured logo which is a bit more discreet.

Dark Blue Code Cop T-shirt
My favourite one is the cosy version of the blue 'Code Cop' shirt. The large logo on the front and the URL on the back are produced using Flock Print. There is a female version - I just don't know when to stop! They all look great and I had each of them produced and sent to me to inspect.

Reversed Cyan Code Cop T-shirt
More Freaky
Who needs such T-shirts? Well, geeks need cool shirts and Jeff has one, too. If you like freaky shirts as much as I do, then you will probably love my coding related quotes shirts. I like to use (more or less) ingenious quotes to bother my colleagues with either small advice or mild criticism. One of my favourite quotes for crappy but fast developed code is Uncle Bob's "rushingtogetawholebunchofshitcodedandsortofrunning". Wearing this shirt you don't even have to say it out loud. ;-)

rushingtogetawholebunchofshitcodedandsortofrunning
Talking of coding: What's your favourite design pattern? Mine is the Singleton pattern. No, I'm joking, I hate it. Alex Miller hates it too and explained some time ago why it's so hateful. (At my previous working place one third of all classes was dependant on singletons. It was awful.) I can't stand them. I'm getting sick when I see one. Therefore I'm sure that Singletons Are Evil.

Singletons Are Evil T-shirt
(Buy Code Cop shirts)

9 January 2010

Scala DSL for BASIC

Sweet holidays - two weeks without work (but unfortunately without any connection to the internet). So what did I do? I did some code katas in Java and Ruby and some reading. Masterminds of Programming is a fine book. After reading the chapter about BASIC, which was the first language I used in the 80ties, I spent some time thinking about the old days of Commodore 64 when GOTO was still state of the art ;-) Winter Holidays

"How about some fun coding or even performing a kata in BASIC?" Without any connection to the internet gwbasic.exe or any other interpreter was out of reach. But BASIC v2 (aka Commodore BASIC) was quite simple. So I started implementing an interpreter myself. The question was if it is possible to implement BASIC v2 as an internal DSL in Scala. DSLs are quite popular in Scala, even for other languages, e.g. Daniel Spiewak's DSL for calling JRuby from Scala. Let's see how far I got...

Hello World
First I want to compile
10 PRINT "Hello World"
RUN
That shouldn't be difficult using implicit conversions.
class Command(val lineNumber:Int, code: =>Unit) {
   def execute = code
}
class NewStatement(line:Int) {
   def PRINT(msg:String) =
      addToMemory(new Command(line) { println(msg) })
}
implicit def Int2NewStmt(line:Int) = new NewStatement(line)

def RUN ...
Note that there has to be a blank after the line numbers. This is not necessary for BASIC. Method addToMemory adds the Command instance containing the code to run (println(msg)) to a list for later execution by the mixed in RUN method. (In case you are not familiar with Scala, the code that finally gets compiled for line number 10 looks like new NewStatement(10).PRINT("Hello World"). Scala is not strict about dots and braces.) By adding PRINT(num:Int) to NewStatement
10 PRINT 12 + 12
20 PRINT 144/12
work as well. Unfortunately I am not able to implement PRINT's concatenating features, i.e.
40 PRINT "a" ; "b"
50 PRINT "a" , "b"
Commodore 64 because ; is the empty statement in Scala and parameter lists containing , need braces.

My old friend where have you gone?
One of the first examples in the Commodore 64 Manual is
10 ? "Commodore 64"
20 GOTO 10
? is just a shortcut for PRINT. So add
class NewStatement(line:Int) {
   ...
   def ?(msg:String) = PRINT(msg)
   def GOTO(target:Int) =
      addToMemory(new Command(line) { lineCounter = target })
}
The implementation of RUN uses lineCounter to keep track of executing line numbers. When the Command with GOTO's code is executed, it changes the number of the next line to be run. Tada! Now I can GOTO Scala ;-)

Assignment and Evaluation
Let's tackle some basic assignments.
10 X% = 15
20 X = 23.5
30 X$ = "SOME STRING"
40 PRINT X
50 PRINT X$
Using the same approach as before (with some refactoring of addToMemory()):
class NewStatement(line:Int) {
   ...
   def X_=(d:Double) = {
      addToMemory(line) { variables.X = d }
   }
   def X:Double = throw new Error()

   def X$_=(s:String) = {
      addToMemory(line) { variables.X$ = s }
   }
   def X$:String = throw new Error()
}
The empty getters are necessary for Scala to recognise the methods as overridden bean properties and are never called. Although % is a valid Scala method name, it's not possible to suffix method names with it. So integer "variables" like def X%_=(i:Int) can't be defined and line 10 will never compile. (One could escape `X%`, but then the BASIC code would have to look like 10 `X%` = 15.)

For evaluation I need a method without parameters called X, in the scope of the current code, like RUN. The method must not return the value of the variable because that would be too early. It should return code that yields the value of the variable when called (yes that's a function ;-).
def X:Function0[Double] = new Function0[Double] {
   override def apply = variables.X
}
def X$:Function0[String] = new Function0[String] {
   override def apply = variables.X$
}
The container variables holds the values of all variables during "BASIC runtime", i.e. when RUN is called. To use the variables in PRINT, it must accept these functions as arguments as well.
class NewStatement(line:Int) {
   ...
   def PRINT(msg:Function0[String]) = {
      addToMemory(line) { println(msg()) }
   }
}
Basic Instinct Variables
To allow arbitrary variables all variable names have to be predefined for assignment and for evaluation. That's impossible. But fortunately early basic used only two letters for variables. [A-Z] and [A-Z][A-Z0-9] yield 962 valid names for each Double and String variables. So all these getters and setters could be generated. The generated trait VariableAssignment with all setters is mixed into NewStatement and the generated trait VariableEvaluation with all getters is mixed into the main code. (Unfortunately the Scala compiler crashes with StackOverflowError when mixing in the trait with its nearly 4000 methods.) The container variables is better replaced with two maps, one for Double and one for String variables.

Express Yourself
To add two variables, i.e. to add the values of two variable evaluations, I change all literals and Function0s to expressions (read custom function instances). This is a major refactoring but Scala makes it easy for me. Literals are still covered by implicit conversions.
abstract class DoubleExpression {
   def value:Double

   def +(d:DoubleExpression) =
      new DoubleExpression {
         override def value = DoubleExpression.this.value + d.value
      }
   def *(d:DoubleExpression) =
      new DoubleExpression {
         override def value = DoubleExpression.this.value * d.value
      }
}
class LiteralDoubleExpression(val value:Double) extends DoubleExpression

implicit def Double2DoubleExpression(value:Double) =
   new LiteralDoubleExpression(value)
implicit def Int2DoubleExpression(value:Int) =
   new LiteralDoubleExpression(value.toDouble)

trait VariableAssignment {

   def assign(name:String, d:DoubleExpression) = {
      addToMemory(line) { memory.set(name, d.value) }
   }
   def unused = throw new Error("never used")

   def X_=(d:DoubleExpression) = assign("X", d)
   def X:DoubleExpression = unused
   ...
}
class NewStatement(line:Int) extends VariableAssignment {
   ...
   def PRINT(msg:StringExpression) =
      addToMemory(line) { println(msg.value) }
}
trait VariableEvaluation {

   class VariableDoubleExpression(val fromVariable:String)
      extends DoubleExpression {
      override def value = memory.get(fromVariable)
   }
   def X = new VariableDoubleExpression("X")
   ...
}
The code above shows the changes needed for Double. Similar classes and methods are needed to cover StringExpressions. Now let's examine what happens for 10 X = X + 1: The method call X of trait VariableEvaluation returns a VariableDoubleExpression (which will yield the value of "X" when called) on which + is invoked with the implicitly converted new LiteralDoubleExpression(1.0). The resulting DoubleExpression is the argument for the setter method X_= of trait VariableAssignment in NewStatement.

Let it Flow (Control)
Till now the Scala BASIC lacks any flow control, e.g. a conditional GOTO like
40 IF X < 5 THEN 20
class NewStatement(line:Int) {
   ...
   def IF(expr: => Boolean) = {
      class ConsumeThen(expr: => Boolean) {
         def THEN(target:Int) = addToMemory {
            if (expr) ... // goto target
         }
      }
      new ConsumeThen(expr)
   }
}
Method IF consumes the boolean expression (as a function) and returns some instance of a class that only allows the following THEN. Method THEN in turn adds the code to execute the GOTO to the list of Commands. BASIC comparison operators are different from Scala's and I have to implement them in DoubleExpression:
abstract class DoubleExpression {
   ...
   def <(d:DoubleExpression) = value < d.value
   def <>(d:DoubleExpression) = value != d.value
}
The implementation of FOR is a bit more complex mainly because the BASIC "runtime" has to keep track of the current loop, so NEXT statements are executed proper.
10 FOR X = 1 TO 5
20 PRINT "COMMODORE 64"
30 NEXT X
To allow 1.TO(5) I define a LoopRangeExpression that handles the boundaries of the loop:
class LoopRangeExpression(val lower:DoubleExpression) {
   var upper:DoubleExpression = null
   def TO(u:DoubleExpression) = {
      upper = u
      this
   }
}
implicit def Int2LoopRangeExpression(value:Int) =
   new LoopRangeExpression(new LiteralDoubleExpression(value.toDouble))
Scala compiles the loop statement into new NewStatement(10).FOR().(X()) = new LoopRangeExpression(new LiteralDoubleExpression(1.0)).TO(new LiteralDoubleExpression(5.0)). That's why method TO has to return its instance, to match the update method (used for setting array elements). So FOR has to return an object that defines update(variable: VariableDoubleExpression, range: LoopRangeExpression). Inside method update all necessary code to run the loop is put into the BASIC runtime part. Both FOR and NEXT use DoubleExpression to capture the variable names but never evaluate these expressions.
class NewStatement(line:Int) {
   ...
   def FOR = {
      new Object() {
        def update(variable:VariableDoubleExpression,
                   range:LoopRangeExpression) = {
           val usedVariable = variable.fromVariable
           addToMemory {
              ... // set usedVariable to range.lower.value
           }
           addLoop(usedVariable) {
              ... // increment and check condition
           }
        }
      }
   }
   def NEXT(d:VariableDoubleExpression) = addToMemory {
      ... // perform next for name d.fromVariable
   }
}
Functions
Support for functions like INT() or LEN$() is simple to add. I use another trait BasicFunctions, that defines the functions in terms of expressions, e.g.
def INT(d:DoubleExpression) =
   new DoubleExpression {
      override def value = d.value.floor
   }
Time for Refactoring
Although I factored out the traits of getters, setters and functions there is still too much stuff in the main class. Parsing BASIC should be separated from the logic that stores and executes the lines of code, the "runtime". I call this logic the BasicMemory.
Scala DSL for BASIC class hierarchyTrait BasicParser is the main parser, which mixes in VariableEvaluation and BasicFunctions. It contains the NewStatement and all implicit conversions. Trait BasicMemory contains the program code as list of Commands and the logic needed to run them. The parser communicates with the runtime by using a small interface LinesOfCode for adding commands, runtime and memory (read variable) operations. BasicApplication is just for convenience, so BASIC programs may be written as
object Demo extends org.codecop.basic.BasicApplication {

   10 PRINT "Hello World"
   RUN
}
The Whole Beast
The shown code was developed with Scala 2.7.7. Download the full source here. There is more implemented than covered in this post, e.g. INPUT, END or STEP. This little DSL has been an experiment. It's far from complete. It lacks important features like DATA/READ or GOSUB/RETURN. I reckon that supporting DIM or DEFFN would be difficult as well. And there are things that can't be done in Scala, e.g. the shown problem with %. Still if this is interesting to people, I may do a proper release into an OSS project somewhere later.