Project Notes

#435 XSLT CSV Transforms

XSLT transformations for converting CSV data.

Notes

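CSV source transformation is possible with XSLT, but it relies on features introduced after XSLT 1.0: tokenize arrived with XSLT 2.0, while unparsed-text-lines and text value templates (expand-text) need XSLT 3.0. I will be using Saxon as the processor. See LCK#434 Saxon Processor for details.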

This is a simple demonstration of CSV transformation to XML and HTML formats.

The Sample File

See sample.csv:

Col1,Col2,Col3,Col4
a1,a2,a3,a4
v1,v2,v3,V4
X1,X2,X3,X4

Transform CSV to XML

See csv2xml.xsl for the transformation:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="3.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  expand-text="yes">

<xsl:output method="xml" indent="yes"/>

<!-- Parameter -->
<xsl:param name="csv-uri"/>

<xsl:template match="/">
  <rows>
    <xsl:variable name="lines" select="unparsed-text-lines($csv-uri)"/>
    <xsl:variable name="headers" select="tokenize($lines[1], ',')"/>

    <xsl:for-each select="$lines[position() > 1]">
      <row>
        <xsl:variable name="fields" select="tokenize(., ',')"/>

        <xsl:for-each select="1 to count($headers)">
          <xsl:element name="{$headers[current()]}">
            {$fields[current()]}
          </xsl:element>
        </xsl:for-each>
      </row>
    </xsl:for-each>
  </rows>
</xsl:template>

</xsl:stylesheet>

Note that the CSV file is referenced by the parameter csv-uri. The stylesheet's only template matches the document root (/), so the run needs a source XML file; a dummy file null.xml serves that purpose. Running the transformation produces the output file sample.xml:

$ xslt3 -s:null.xml -xsl:csv2xml.xsl csv-uri=sample.csv -o:sample.xml
$ cat sample.xml
<?xml version="1.0" encoding="UTF-8"?>
<rows>
   <row>
      <Col1>
            a1
          </Col1>
      <Col2>
            a2
          </Col2>
      <Col3>
            a3
          </Col3>
      <Col4>
            a4
          </Col4>
   </row>
   <row>
      <Col1>
            v1
          </Col1>
      <Col2>
            v2
          </Col2>
      <Col3>
            v3
          </Col3>
      <Col4>
            V4
          </Col4>
   </row>
   <row>
      <Col1>
            X1
          </Col1>
      <Col2>
            X2
          </Col2>
      <Col3>
            X3
          </Col3>
      <Col4>
            X4
          </Col4>
   </row>
</rows>
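
The blank space around each value in sample.xml comes from the stylesheet itself: the text value template {$fields[current()]} sits on its own line inside xsl:element, so the surrounding line breaks and indentation are copied into the output as part of the text. If compact values are preferred, one option is to keep the element content on a single line. The following is a sketch of that variant rather than a tested replacement, and the file name csv2xml-compact.xsl is hypothetical:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- csv2xml-compact.xsl (hypothetical name): same logic as csv2xml.xsl,
     but the text value template is not surrounded by whitespace -->
<xsl:stylesheet version="3.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  expand-text="yes">

<xsl:output method="xml" indent="yes"/>

<!-- Parameter: URI of the CSV file, as in csv2xml.xsl -->
<xsl:param name="csv-uri"/>

<xsl:template match="/">
  <rows>
    <xsl:variable name="lines" select="unparsed-text-lines($csv-uri)"/>
    <xsl:variable name="headers" select="tokenize($lines[1], ',')"/>

    <xsl:for-each select="$lines[position() > 1]">
      <xsl:variable name="fields" select="tokenize(., ',')"/>
      <row>
        <xsl:for-each select="1 to count($headers)">
          <!-- no line breaks inside the element, so only the value is emitted -->
          <xsl:element name="{$headers[current()]}">{$fields[current()]}</xsl:element>
        </xsl:for-each>
      </row>
    </xsl:for-each>
  </rows>
</xsl:template>

</xsl:stylesheet>
```

With this variant each cell should serialize in the compact form, for example <Col1>a1</Col1>. Wrapping the value in normalize-space() would be another way to get a similar result without changing the layout of the stylesheet.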

Transform CSV to HTML

See csv2html.xsl for the transformation:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="3.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  expand-text="yes">

<xsl:output method="html" indent="yes"/>

<!-- Parameter -->
<xsl:param name="csv-uri"/>

<xsl:template match="/">
  <html>
    <body>
      <h2>CSV Table</h2>

      <xsl:variable name="lines" select="unparsed-text-lines($csv-uri)"/>
      <xsl:variable name="headers" select="tokenize($lines[1], ',')"/>

      <table border="1">
        <tr>
          <xsl:for-each select="$headers">
            <th>{.}</th>
          </xsl:for-each>
        </tr>

        <xsl:for-each select="$lines[position() > 1]">
          <tr>
            <xsl:for-each select="tokenize(., ',')">
              <td>{.}</td>
            </xsl:for-each>
          </tr>
        </xsl:for-each>
      </table>

    </body>
  </html>
</xsl:template>

</xsl:stylesheet>

As before, the CSV file is passed in the csv-uri parameter and the dummy file null.xml stands in as the source document. Running the transformation produces the output file sample.html:

$ xslt3 -s:null.xml -xsl:csv2html.xsl csv-uri=sample.csv -o:sample.html
$ open sample.html

[Image: sample-html]
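
Both transforms above are run against null.xml because their only template matches the document root. XSLT 3.0 also allows a stylesheet to start from a named template, with xsl:initial-template as the default name, in which case no source document is involved. The following is a minimal sketch of that idea, not one of the project files: it simply wraps each data line of the CSV in a row element.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="3.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  expand-text="yes">

<xsl:output method="xml" indent="yes"/>

<!-- Parameter: URI of the CSV file, as in the transforms above -->
<xsl:param name="csv-uri"/>

<!-- entry point selected by name rather than by matching a source document -->
<xsl:template name="xsl:initial-template">
  <rows>
    <!-- skip the header line and emit one row element per data line -->
    <xsl:for-each select="unparsed-text-lines($csv-uri)[position() > 1]">
      <row>{.}</row>
    </xsl:for-each>
  </rows>
</xsl:template>

</xsl:stylesheet>
```

Saxon's Java command line selects this entry point with the -it option in place of -s:. I am assuming the xslt3 command accepts the same option, which I have not verified here.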

About tokenize

These transforms rely heavily on the tokenize function, an XPath function available from XSLT 2.0 onwards.

The test-tokenize-string.xsl transform is a simple demonstration of its use:

<?xml version="1.0" encoding="windows-1252" ?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <!-- simple test of the tokenize function -->
 <xsl:template match="/">
  <xsl:for-each select="tokenize('a,b,c',',')">
   <xsl:element name="field">
    <xsl:value-of select="."/>
   </xsl:element>
  </xsl:for-each>
 </xsl:template>
</xsl:stylesheet>

The result tokenizes the 'a,b,c' input string:

$ xslt3 -s:null.xml -xsl:test-tokenize-string.xsl
<?xml version="1.0" encoding="UTF-8"?><field>a</field><field>b</field><field>c</field>
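
The second argument of tokenize is a regular expression, not a literal separator, so the same call can absorb optional spaces around each comma. A small sketch along the lines of the test above (the input string is made up for illustration):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- sketch: the separator pattern also consumes spaces around each comma -->
  <xsl:template match="/">
    <xsl:for-each select="tokenize('a, b ,c', '\s*,\s*')">
      <field>
        <xsl:value-of select="."/>
      </field>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

This should yield the same three fields a, b and c as the plain ',' separator does for 'a,b,c'. Note that tokenize knows nothing about CSV quoting, so a quoted field that contains a comma is still split in two.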

About analyze-string

The xsl:analyze-string instruction can achieve a similar goal.

The test-analyze-string.xsl transform is a simple demonstration of its use:

<?xml version="1.0" encoding="windows-1252" ?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
  <!-- this demonstrates simple use of the xsl:analyze-string instruction -->
  <xsl:template match="/">
    <xsl:analyze-string select="'a,b,c'" regex=",">
      <!-- the commas themselves are discarded -->
      <xsl:matching-substring/>
      <xsl:non-matching-substring>
        <xsl:element name="field">
          <xsl:value-of select="."/>
        </xsl:element>
      </xsl:non-matching-substring>
    </xsl:analyze-string>
  </xsl:template>
</xsl:stylesheet>

The result tokenizes the 'a,b,c' input string:

$ xslt3 -s:null.xml -xsl:test-analyze-string.xsl
<?xml version="1.0" encoding="UTF-8"?><field>a</field><field>b</field><field>c</field>
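
Where xsl:analyze-string can go further than tokenize is in matching the fields themselves rather than the separators. The sketch below treats a double-quoted run as a single field even when it contains a comma. It is an illustration under simple assumptions (no empty fields, no escaped quotes inside a quoted field), not a complete CSV parser, and the input string is made up:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- sketch: match each field; a double-quoted field may contain commas -->
  <xsl:template match="/">
    <xsl:analyze-string select="'a,&quot;b,c&quot;,d'"
                        regex="&quot;([^&quot;]*)&quot;|([^,]+)">
      <xsl:matching-substring>
        <field>
          <!-- group 1 is set for a quoted field, group 2 for an unquoted one;
               the group that did not match is an empty string -->
          <xsl:value-of select="concat(regex-group(1), regex-group(2))"/>
        </field>
      </xsl:matching-substring>
      <!-- no xsl:non-matching-substring: the separating commas are dropped -->
    </xsl:analyze-string>
  </xsl:template>
</xsl:stylesheet>
```

For this input the transform should produce three field elements holding a, b,c and d, with the quotes removed from the middle value.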

Credits and References

About LCK#435
XSLTCSV

This page is a web-friendly rendering of my project notes shared in the LittleCodingKata GitHub repository.
