15 September 2025

Von Neumann Turing Machine

On my search for harder coding challenges, join me implementing Fizz Buzz using a Turing machine where I dive into Turing machines from a practical coding perspective using Java. In the previous part I finally understood Turing machines (TM) and created a Universal Turing Machine (UTM) using a Tape and TuringMachine classes. This is the second part.

Fizz Buzz
I want to implement Fizz Buzz but I have no idea how to start. A simple implementation in Java might be:
for (i = 1; i <= 100; i++) { // line 1
  if (i % 3*5 == 0)
    print("FizzBuzz");
  else if (i % 3 == 0)
    print("Fizz");
  else if (i % 5 == 0)
    print("Buzz");
  else
    print(i);
}
Fizz Buzz is an easy problem, still it needs several programming constructs: Looping or sequences, conditional logic, arithmetic with integers including division for the remainder, strings and some form of input and output. The input is the number of lines required, in the code above hardcoded to 100, and the display showing the result, e.g.
1
2
Fizz
4
Buzz
Fizz
7
8
Fizz
Buzz
11
Fizz
13
14
FizzBuzz
What is the end state when I miss a display? For a TM the input will be the number on the tape and the output will be the list of strings on the tape again.

MM Numbers (licensed CC BY-NC-ND by Michele C)Representing Numbers
I am overwhelmed. When working katas, exercises or new languages, I usually create the code bottom up, making things up on my way. Following that approach, I have to figure out individual operations first, then will I combine them into more powerful programs. For Fizz Buzz I first need to count from one to 100, incrementing by one. The smallest (binary) space to represent the number 100 needs seven bits which are seven cells on the tape containing a '0' or '1' followed by a separator '$', e.g. 100 would be 1100100$ on the tape. (To make my life easier, I assume the set of symbols contains arbitrary characters. After all each character is a sequence on bits anyway.) The separator helps because I do not need to keep track of positions inside the sequence of working with bits during some operations. (This is one of many design options. Another approach would be to store the position of each bit next to the bit on the tape.)

Incrementing Numbers
The first operation I need is INC (increment by 1). Increment, an x86 assembly op-code, adds 1 to an unsigned integer operand. (Now - while writing this - I see why I ended up where I ended up. My first idea for a building block was a CPU assembly operation on a sequence of bits. From there I was just following the given direction...) In case you have not done any assembly, let me explain INC operating on a binary numbers. Let's assume a value, e.g. 78, binary 1001110. The increment adds 1,
01001110
+      1
--------
 1001111
Like adding regular (Arabic base 10) numbers, addition starts most left. The 1 adds to the 0 and we are done. In short, if there is a '0' bit, it becomes a '1' and that's it. So 78 becomes 79. Let's increment it again:
01001111
+      1
--------
 1001110
+     1  ... overflow from the first digit
--------
 1001100
+    1   ... overflow from the second digit and so on
--------
 1010000
which is 80. In short, if there is a '1' bit, it becomes a '0' and INC continues incrementing left with the overflow.

INC on the UTM
Working on the tape of the TM, the algorithm to increment the number right of the read/write head is:
  1. Move the read/write head right until the end of the number (i.e. until the '$') to do the increment.
  2. Work backwards, i.e. move left to increment.
  3. If there is a '0', then write a '1' which finishes increment - and start moving right to the end.
  4. If there is a '1', then write a '0' and continue incrementing left with the overflow bit.
  5. In the end move right until the '$' to leave the read/write head in a consistent state.
The last bit must not overflow. Increment's transition table is

Step state symbolnewStatenewSymboldirection
(1) Inc anyInc_MoveRightAndIncsameR
 Inc_MoveRightAndInc'0'Inc_MoveRightAndIncsameR
 Inc_MoveRightAndInc'1'Inc_MoveRightAndIncsameR
(2) Inc_MoveRightAndInc'$'Inc_IncToTheLeft sameL
(3) Inc_IncToTheLeft '0'Inc_DoneMoveRight '1' none
(4) Inc_IncToTheLeft '1'Inc_IncToTheLeft '0' L
(5) Inc_DoneMoveRight '0'Inc_DoneMoveRight sameR
 Inc_DoneMoveRight '1'Inc_DoneMoveRight sameR
(T) Inc_DoneMoveRight '$'Halt samenone

This table is a standalone test program a.k.a. a Turing machine. Line (T) stops after incrementing a given number. I can easily unit test the code comparing tapes, e.g.
@Test
void incZero() {
  transitions.addIncTo(table);
  createMachineWith("0000000$", Q.Inc, table);
  machine.loop();
  assertTapeEquals("0000001$");
}

@Test
void incOne() {
  transitions.addIncTo(table);
  createMachineWith("0000001$", Q.Inc, table);
  machine.loop();
  assertTapeEquals("0000010$");
}

@Test
void incToMax() {
  transitions.addIncTo(table);
  createMachineWith("0111111$", Q.Inc, table);
  machine.loop();
  assertTapeEquals("1000000$");
}
DEC (Decrement by 1) is the same with '0' and '1' switched. Now I know how to build individual operations.

More Operations
Next operation is DUP (duplicate a number). Things get cumbersome because the state of the current bit, i.e. if DUP is copying '0', '1' or '$' eight places to the right has to be encoded in the states. For example the transition table to duplicate a '0' is

state symbolnewState newSymboldirection
Dup '0'Dup_Move7RightAndWrite0 sameR
Dup_Move7RightAndWrite0anyDup_Move6RightAndWrite0 sameR
Dup_Move6RightAndWrite0anyDup_Move5RightAndWrite0 sameR
Dup_Move5RightAndWrite0anyDup_Move4RightAndWrite0 sameR
Dup_Move4RightAndWrite0anyDup_Move3RightAndWrite0 sameR
Dup_Move3RightAndWrite0anyDup_Move2RightAndWrite0 sameR
Dup_Move2RightAndWrite0anyDup_Move1RightAndWrite0 sameR
Dup_Move1RightAndWrite0anyDup_Write0AndMove7Back sameR
Dup_Write0AndMove7Back anyDup_Move7LeftAndStartAgain'0' none

There are 9 state transitions for each "bit" - to "remember" the bit - and additional 7 states to move left again and start over, giving a total of 34 states for my simplified transition table allowing shortcuts for any symbol etc.

I am still implementing line 1 in the code on top of the page:,Counting from one to 100, incrementing by one. To create a loop, I need to start with one (0000001$), duplicate it, increment the new number, and repeat while it is less than 100 or until it is 100. The instructions LESS and EQUAL are similar to DUP. The current symbol has to be "remembered" in the states, the tape has to move eight places right and the comparison with the symbol there determines the result of the operation, then the tape has to move seven places left again and restart the operation.

Dusty Bottle Neck (licensed CC BY-NC-ND by caramand)Combining Operations
I started with x86 operations and I follow the idea further. I dislike duplicating INC or other states so I create a lookup table - an area on the tape that would be holding the program, think Code segment from x86 architecture before x86-64. The read/write head will have to move between the code and the data (think Data segment) and I need markers for both positions. Let 'P' be the Instruction Pointer (usually called IP) and 'C' be a cursor marking the last position the read/write head had when working with data. Each operation will
  1. Move the instruction pointer to the next instruction by switching the letter of the operation with 'P'.
  2. Move the read/write head right to the cursor 'C'. There replace the cursor with the separator '$'.
  3. Perform the operation. Each operation must end with the read/write head positioned at a separator.
  4. Set the new cursor position.
  5. Move the read/write head back to the instruction pointer
  6. and start over.
Storing the program and the data in the same place, i.e. the tape, makes this a Von Neumann architecture. While the UTM is processing instructions in the left side of the tape, it cannot work with data, which is on the right side of the data and vice versa. The term "von Neumann architecture" refers to any stored-program computer in which an instruction fetch and a data operation cannot occur at the same time (since they share a common bus) says Wikipedia.

INC as Instruction
Increment's extended, full transition table is

Step state symbolnewState newSymboldirection
(1) Ip_SwitchRight 'i'Ip_SwitchLeftInc 'P' L
 Ip_SwitchLeftInc 'P'Code_Inc 'i' none
(2) Code_Inc 'C'Inc '$' R
 Code_Inc anysame sameR
(3) Inc anyInc_MoveRightAndIncsameR
 Inc_MoveRightAndInc'0'Inc_MoveRightAndIncsameR
 Inc_MoveRightAndInc'1'Inc_MoveRightAndIncsameR
 Inc_MoveRightAndInc'$'Inc_IncToTheLeft sameL
 Inc_IncToTheLeft '0'DoneMoveRight '1' none
 Inc_IncToTheLeft '1'Inc_IncToTheLeft '0' L
 DoneMoveRight '0'DoneMoveRight sameR
 DoneMoveRight '1'DoneMoveRight sameR
(4) DoneMoveRight '$'Ip_Restart 'C' none
(6) Ip_Restart 'P'Ip_SwitchRight sameR
(5) Ip_Restart anysame sameL

and its test
@Test
void inc() {
  createMachineWith("PihC0000000$", Q.Ip_Restart, transitions.create());
  machine.loop();
  assertTapeEquals("ihP$0000001C");
}
Letters 'i' and 'h' represent operations INC and HALT. After execution the instruction pointer moved two places to the right and the data cursor moved one number entry to the right.

Other Instructions
The extension of the partial transition tables of DUP, LESS, EQUAL and operations to move the cursor left and right as instructions 'd', '<', '=', 'l' and 'r' is the same. States DoneMoveRight and Ip_Restart are generic and used by most of these instructions. See the full transition table here.

This concludes the second part of my Turing Machine exploration. I have spent too much time formatting transition tables. Now I have a machine on top of a machine. I made the problem easier by adding another level of abstraction (an indirection), an application of the Fundamental theorem of software engineering. ;-)

No comments: