22 March 2012

Prime Factors XSLT

Here is a short conversation we had at Hackergarten Vienna yesterday:

I did the Prime Factors Kata in XSLT. Would you like to review it?
You did WHAT?
You know, this small program. It's calculating the prime factors of a number.
But what? In XSLT? Why XSLT?
I know that it sounds a bit crazy. I had performed the kata in all programming languages which I know, e.g. Java, Ruby and even old BASIC or Turbo Pascal, but not using XSL transformations. So I just had to do it.

Initial Version
Using XSLTunit I created a test case that would apply the template to <value> to calculate the prime factors of the given source.
<xsltu:test id="test-template-factors-one">
  <xsl:variable name="source">
  <value>1</value>
  </xsl:variable>
  <xsl:call-template name="xsltu:assertEqual">
  <xsl:with-param name="id" select="'template factors 1'" />
  <xsl:with-param name="nodes1">
    <xsl:apply-templates select="exsl:node-set($source)/value" />
  </xsl:with-param>
  <xsl:with-param name="nodes2"></xsl:with-param>
  </xsl:call-template>
</xsltu:test>

...

<xsltu:test id="test-template-factors-four">
  <xsl:variable name="source">
  <value>9</value>
  </xsl:variable>
  <xsl:call-template name="xsltu:assertEqual">
  <xsl:with-param name="id" select="'template factors 4'" />
  <xsl:with-param name="nodes1">
    <xsl:apply-templates select="exsl:node-set($source)/value" />
  </xsl:with-param>
  <xsl:with-param name="nodes2">3 3</xsl:with-param>
  </xsl:call-template>
</xsltu:test>
and so on. The result looked like that:
<xsl:template match="value">
  <xsl:call-template name="generate">
  <xsl:with-param name="number" select="number(current())" />
  <xsl:with-param name="candidate" select="number(2)" />
  </xsl:call-template>
</xsl:template>

<xsl:template name="generate">
  <xsl:param name="number" />
  <xsl:param name="candidate" />
  <xsl:choose>
  <!-- one is no prime and does not have any factors -->
  <xsl:when test="$number = 1"></xsl:when>
  <!-- candidate == number, so number is a prime -->
  <xsl:when test="$candidate = $number">
    <xsl:value-of select="$number" />
  </xsl:when>
  <!-- number is factored by the candidate, factor it -->
  <xsl:when test="$number mod $candidate = 0">
    <xsl:value-of select="$candidate" />
    <xsl:text> </xsl:text>
    <xsl:call-template name="generate">
    <xsl:with-param name="number" select="$number div $candidate" />
      <xsl:with-param name="candidate" select="$candidate" />
      </xsl:call-template>
    </xsl:when>
    <!-- try again with the next factor -->
    <xsl:otherwise>
      <xsl:call-template name="generate">
        <xsl:with-param name="number" select="$number" />
        <xsl:with-param name="candidate" select="$candidate + 1" />
      </xsl:call-template>
    </xsl:otherwise>
  </xsl:choose>
</xsl:template>
StackOverflowError
Using javax.xml.transform.TransformerFactory from Java 6 to run the XSLT transformation, the last prime that was processed successfully was 7351. Calling generate with the following prime (7369) caused a StackOverflowError inside Apache Xalan due to the recursion. As there was no need to try candidates larger than the square root of number, I replaced the test for $candidate = $number with $candidate * $candidate > $number. This brought me as far as 54184313.

Reducing the Stack Depth
XSLT is a functional language, there are no loops, the only way to iterate is recursion. So the size of the largest prime to process is limited by the size of the stack. Still half of the recursions are useless because no even number larger than two can be a prime.
<!-- try again with the next factor -->
  <xsl:otherwise>
    <xsl:choose>
      <xsl:when test="$candidate = 2">
        <xsl:call-template name="generate">
          <xsl:with-param name="number" select="$number" />
          <xsl:with-param name="candidate" select="$candidate + 1" />
        </xsl:call-template>
      </xsl:when>
      <xsl:otherwise>
        <xsl:call-template name="generate">
          <xsl:with-param name="number" select="$number" />
          <xsl:with-param name="candidate" select="$candidate + 2" />
        </xsl:call-template>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:otherwise>
This can be improved further by unrolling the (recursive) loop. If number is not factored by candidate + 2 then the next one is candidate + 4 and so on.
<!-- try again with the next factor -->
  <xsl:otherwise>
    <xsl:choose>
      <xsl:when test="$candidate = 2">
        <xsl:call-template name="generate">
          <xsl:with-param name="number" select="$number" />
          <xsl:with-param name="candidate" select="$candidate + 1" />
        </xsl:call-template>
      </xsl:when>
      <xsl:when test="$number mod ($candidate + 2) = 0">
        <xsl:call-template name="generate">
          <xsl:with-param name="number" select="$number" />
          <xsl:with-param name="candidate" select="$candidate + 2" />
        </xsl:call-template>
      </xsl:when>
      <xsl:when test="$number mod ($candidate + 4) = 0">
        <xsl:call-template name="generate">
          <xsl:with-param name="number" select="$number" />
          <xsl:with-param name="candidate" select="$candidate + 4" />
        </xsl:call-template>
      </xsl:when>
      <xsl:otherwise>
        <xsl:call-template name="generate">
          <xsl:with-param name="number" select="$number" />
          <xsl:with-param name="candidate" select="$candidate + 6" />
        </xsl:call-template>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:otherwise>