CSV

Purpose

A general-purpose comma-separated values (CSV) parser with built-in validation and output templating.

Methods

Binding name: p6.csv


parse

Parses the CSV file specified in the configuration map and calls the given closure with each row processed.

Syntax

void p6.csv.parse(Map configuration, Closure rowNotify)

Parameter: configuration

| Configuration Name | Description |
|---|---|
| uri | Mandatory. A file to open and use as the CSV input source (expressed as a string). |
| separator | Optional. See: http://opencsv.sourceforge.net/apidocs/com/opencsv/CSVParserBuilder.html |
| quoteChar | Optional. See: http://opencsv.sourceforge.net/apidocs/com/opencsv/CSVParserBuilder.html |
| escape | Optional. See: http://opencsv.sourceforge.net/apidocs/com/opencsv/CSVParserBuilder.html |
| skipLines | Optional. See: http://opencsv.sourceforge.net/apidocs/com/opencsv/CSVParserBuilder.html |
| strictQuotes | Optional. See: http://opencsv.sourceforge.net/apidocs/com/opencsv/CSVParserBuilder.html |
| ignoreLeadingWhiteSpace | Optional. See: http://opencsv.sourceforge.net/apidocs/com/opencsv/CSVParserBuilder.html |
| useFirstLineHeaders | Optional. Set to true to read column names from the first line of the CSV. These column names are used as keys in every row Map returned. Otherwise the column names are col0..colN. The default is false. |
| encoding | Optional. The encoding to use when reading the file. The default is UTF-8. |
| validation | Optional. A Map of column numbers to content type names. Type names are: date, byte, short, int, long, double, float, email, url and creditcard. See: http://commons.apache.org/proper/commons-validator/apidocs/org/apache/commons/validator/GenericValidator.html. If a value is null or empty, validation is not applied. If validation is applied and fails, a P6Exception is thrown. |
| funcException | Optional closure. Parameters (row, column, value, message). Called each time the validation of a value in a row fails. Return true to continue processing with the next row. Return false to exit the processor with a P6Exception. |
| funcValidate | Optional closure. Parameters (row, column, value). Called each time a value in a row is validated. Return null or an empty string to signify that the value is valid, or return a message detailing why the validation failed. |

Parameter: rowNotify

A single parameter containing an array of rows is passed to the closure. Each row is a map that uses the column names as keys if useFirstLineHeaders is set to true.
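
The shape of a row map can be illustrated in plain Groovy. This is only a sketch of the key naming described for useFirstLineHeaders, not the parser's implementation; the `headers` and `values` lists are hypothetical sample data.

```groovy
// Hypothetical header and data lines, already split into fields
def headers = ['Name', 'UniqueName']
def values  = ['Alice', 'a1']

// useFirstLineHeaders: true -> keys are taken from the header line
def withHeaders = [headers, values].transpose().collectEntries { k, v -> [(k): v] }

// useFirstLineHeaders: false (the default) -> keys are col0..colN
def withoutHeaders = [:]
values.eachWithIndex { v, i -> withoutHeaders["col${i}".toString()] = v }

println(withHeaders)     // keys: Name, UniqueName
println(withoutHeaders)  // keys: col0, col1
```

In a rowNotify closure the same values would then be read as `row.Name` or `row.col0`, depending on the setting.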

Examples

Simple parse of CSV file with notification per row.

def cnf = [
    skipLines: 1,
    separator: ',',
    useFirstLineHeaders: true,
    uri:"file:./src/test/resources/test1.csv"
]

p6.csv.parse(cnf) { row ->
    println(row)
    // Could validate the row content in here and return false to halt the parse
    true
}

Simple parse of a CSV file with notification per row and validation of columns 1 and 17.

def cnf = [
    skipLines: 1,
    separator: ',',
    useFirstLineHeaders: true,
    uri:"file:./src/test/resources/test1.csv",
    validation: [1: 'int', 17: 'date']
]

p6.csv.parse(cnf) { row ->
    println(row)
    true
}

Parse of a CSV file with notification per row, a custom validation rule and an exception handler.

def cnf = [
    skipLines: 1,
    separator: ',',
    useFirstLineHeaders: true,
    uri:"file:./src/test/resources/test1.csv",
    funcValidate: { row, column, value ->
        if(column == 8 && value == "1946437"){
            "This is not the droid you are looking for! row:" + row
        }
        else { "" }
    },
    funcException: { row, column, value, message ->
        println("Ooops! " + message)
        // true to continue processing
        true
    }
]

p6.csv.parse(cnf) { row ->
    println(row)
    true
}
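
The return-value contract of funcValidate and funcException can be exercised without a CSV file. The loop below is a hypothetical stand-in for the parser, written only to show how the documented return values are interpreted. It assumes that `row` and `column` are passed as indexes, which the source does not confirm.

```groovy
// Custom rule: column 1 must be numeric. '' (or null) means "valid".
def funcValidate = { row, column, value ->
    (column == 1 && !(value ==~ /\d+/)) ? "Column ${column} is not numeric: ${value}".toString() : ''
}

// Collect the messages and ask the parser to carry on
def failures = []
def funcException = { row, column, value, message ->
    failures << message
    true   // true: continue with the next row; false: abort
}

// Hypothetical rows, already split into fields
def rows = [['a', '10'], ['b', 'x'], ['c', '30']]

def aborted = false
for (int r = 0; r < rows.size() && !aborted; r++) {
    for (int c = 0; c < rows[r].size(); c++) {
        def value = rows[r][c]
        def message = funcValidate(r, c, value)
        if (message) {                      // null and '' are both falsy in Groovy
            if (!funcException(r, c, value, message)) {
                aborted = true              // the real parser throws a P6Exception here
            }
            break                           // move on to the next row
        }
    }
}

println(failures)
```

Returning false from funcException in this sketch sets `aborted` and stops the loop, which mirrors the documented "exit the processor" behaviour.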

parseToList

Parses the CSV file specified in the configuration map and returns the processed values as a List of Maps.

Syntax

List p6.csv.parseToList(Map configuration)

Parameter: configuration

| Configuration Name | Description |
|---|---|
| uri | Mandatory. A file to open and use as the CSV input source (expressed as a string). |
| separator | Optional. See: http://opencsv.sourceforge.net/apidocs/com/opencsv/CSVParserBuilder.html |
| quoteChar | Optional. See: http://opencsv.sourceforge.net/apidocs/com/opencsv/CSVParserBuilder.html |
| escape | Optional. See: http://opencsv.sourceforge.net/apidocs/com/opencsv/CSVParserBuilder.html |
| skipLines | Optional. See: http://opencsv.sourceforge.net/apidocs/com/opencsv/CSVParserBuilder.html |
| strictQuotes | Optional. See: http://opencsv.sourceforge.net/apidocs/com/opencsv/CSVParserBuilder.html |
| ignoreLeadingWhiteSpace | Optional. See: http://opencsv.sourceforge.net/apidocs/com/opencsv/CSVParserBuilder.html |
| useFirstLineHeaders | Optional. Set to true to read column names from the first line of the CSV. These column names are used as keys in every row Map returned. Otherwise the column names are col0..colN. The default is false. |
| encoding | Optional. The encoding to use when reading the file. The default is UTF-8. |
| validation | Optional. A Map of column numbers to content type names. Type names are: date, byte, short, int, long, double, float, email, url and creditcard. See: http://commons.apache.org/proper/commons-validator/apidocs/org/apache/commons/validator/GenericValidator.html. If a value is null or empty, validation is not applied. If validation is applied and fails, a P6Exception is thrown. |
| funcException | Optional closure. Parameters (row, column, value, message). Called each time the validation of a value in a row fails. Return true to continue processing with the next row. Return false to exit the processor with a P6Exception. |
| funcValidate | Optional closure. Parameters (row, column, value). Called each time a value in a row is validated. Return null or an empty string to signify that the value is valid, or return a message detailing why the validation failed. |

Example

In-memory parse of a CSV file. Results are returned as a List of Maps.

def cnf = [
    skipLines: 1,
    separator: ',',
    useFirstLineHeaders: true,
    uri:"file:./src/test/resources/test1.csv"
]

// Iterate over the returned List; each entry is a row Map
p6.csv.parseToList(cnf).each { entry ->
    println(entry)
}
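
Because the result is an ordinary List of Maps, the standard Groovy collection methods apply to it. The sketch below uses a hypothetical hand-built list in place of a real parseToList result; the column names `Name` and `City` are invented for illustration.

```groovy
// Hypothetical data shaped like a parseToList result with useFirstLineHeaders: true
def rows = [
    [Name: 'Alice', City: 'Paris'],
    [Name: 'Bob',   City: 'Lyon'],
    [Name: 'Carol', City: 'Paris']
]

def names      = rows.collect { it.Name }            // one column as a List
def byCity     = rows.groupBy { it.City }            // Map of City -> rows
def parisCount = rows.count { it.City == 'Paris' }   // rows matching a condition

println(names)
println(byCity.keySet())
println(parisCount)
```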


parseToXml

Parses the CSV file specified in the configuration map and calls the given docNotify closure with each XML document generated. This method returns the number of XML documents generated.

Syntax

int p6.csv.parseToXml(Map configuration, Closure docNotify)

Parameter: configuration

| Configuration Name | Description |
|---|---|
| uri | Mandatory. A file to open and use as the CSV input source (expressed as a string). |
| groovyTemplate | One of five Groovy template types. Mandatory for parseToXml. See: http://docs.groovy-lang.org/next/html/documentation/template-engines.html |
| separator | Optional. See: http://opencsv.sourceforge.net/apidocs/com/opencsv/CSVParserBuilder.html |
| quoteChar | Optional. See: http://opencsv.sourceforge.net/apidocs/com/opencsv/CSVParserBuilder.html |
| escape | Optional. See: http://opencsv.sourceforge.net/apidocs/com/opencsv/CSVParserBuilder.html |
| skipLines | Optional. See: http://opencsv.sourceforge.net/apidocs/com/opencsv/CSVParserBuilder.html |
| strictQuotes | Optional. See: http://opencsv.sourceforge.net/apidocs/com/opencsv/CSVParserBuilder.html |
| ignoreLeadingWhiteSpace | Optional. See: http://opencsv.sourceforge.net/apidocs/com/opencsv/CSVParserBuilder.html |
| useFirstLineHeaders | Optional. Set to true to read column names from the first line of the CSV. These column names are used as keys in every row Map returned. Otherwise the column names are col0..colN. The default is false. |
| encoding | Optional. The encoding to use when reading the file. The default is UTF-8. |
| validation | Optional. A Map of column numbers to content type names. Type names are: date, byte, short, int, long, double, float, email, url and creditcard. See: http://commons.apache.org/proper/commons-validator/apidocs/org/apache/commons/validator/GenericValidator.html. If a value is null or empty, validation is not applied. If validation is applied and fails, a P6Exception is thrown. |
| funcNext | Optional closure. Parameters (lastrow, currentrow). Used by parseToXml. Return true to generate an XML document, or false to continue accumulating data. |
| funcValidate | Optional closure. Parameters (row, column, value). Called each time a value in a row is validated. Return null or an empty string to signify that the value is valid, or return a message detailing why the validation failed. |
| funcException | Optional closure. Parameters (row, column, value, message). Called each time the validation of a value in a row fails. Return true to continue processing with the next row. Return false to exit the processor with a P6Exception. |

Parameter: docNotify

A single parameter containing an array of rows is passed to the closure. Each row is a map that uses the column names as keys if useFirstLineHeaders is set to true.
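
The document stepping controlled by funcNext can be pictured with a small plain-Groovy loop. This is a sketch under assumptions, not the p6 implementation: it assumes that rows accumulate until funcNext returns true, and that any rows still pending at the end of the file form a final document. The sample rows are hypothetical.

```groovy
// Hypothetical rows, keyed by header names
def rows = [
    [Name: 'Alice', UniqueName: 'a1'],
    [Name: 'Alice', UniqueName: 'a2'],
    [Name: 'Bob',   UniqueName: 'b1']
]

// Start a new document whenever the Name column changes
def funcNext = { lastrow, currentrow ->
    lastrow != null && lastrow.Name != currentrow.Name
}

def documents = []   // each entry is the list of rows for one document
def pending = []     // rows accumulated for the current document
def last = null
rows.each { row ->
    if (funcNext(last, row)) {
        documents << pending
        pending = []
    }
    pending << row
    last = row
}
if (pending) {
    documents << pending   // assumed: leftover rows form the last document
}

println(documents.size())
documents.each { println(it*.UniqueName) }
```

Under these assumptions the three sample rows yield two groups, one per distinct Name.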

Alternative Groovy template engines

Groovy currently provides five template engines. Each engine supports a different template syntax and is suited to a different task:

  • SimpleTemplateEngine
  • StreamingTemplateEngine
  • XmlTemplateEngine
  • GStringTemplateEngine
  • MarkupTemplateEngine

For further details see: http://docs.groovy-lang.org/next/html/documentation/template-engines.html

Here is an example of using the SimpleTemplateEngine.

Example
def tpl2 = '''
     <response>
         <value>
             <addresses>
                 <% rows.eachWithIndex{row,index-> %>
                     <address id="${index}"><uniqueid>${row.UniqueName}</uniqueid><name id="${index}">${row.Name}</name></address>
                 <% } %>
             </addresses>
         </value>
     </response>
'''

def cnf2 = [
    skipLines: 1,
    useFirstLineHeaders: true,
    uri:"file:/Users/user/Documents/temp/test.csv",
    groovyTemplate: new groovy.text.SimpleTemplateEngine().createTemplate(tpl2),
    funcNext: { lastrow, currentrow ->

        if(null != lastrow){
            if(lastrow.Name != currentrow.Name){
                // Change of name so build xml
                return true
            }
        }
        false
    }
]

def docs = p6.csv.parseToXml(cnf2){ gpath ->
    println(groovy.xml.XmlUtil.serialize(gpath))
}
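
A template can be checked on its own before it is passed to parseToXml. The sketch below renders a shortened SimpleTemplateEngine template against hypothetical sample rows. The binding name `rows` is taken from the template above; the sample data and the trimmed template are illustrative only.

```groovy
import groovy.text.SimpleTemplateEngine

// Single-quoted string, so ${...} is left for the template engine to evaluate
def tpl = '''<addresses><% rows.eachWithIndex { row, index -> %><address id="${index}"><name>${row.Name}</name></address><% } %></addresses>'''

// Hypothetical rows standing in for parsed CSV data
def rows = [[Name: 'Alice'], [Name: 'Bob']]

def template = new SimpleTemplateEngine().createTemplate(tpl)
def xml = template.make([rows: rows]).toString()

println(xml)
```

Rendering the template this way surfaces syntax errors in the scriptlets without needing a CSV file.
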
Example

Parse CSV file to XML with document stepping control via the script; notifications per document generated.

def tpl = '''
     <response version-api="2.0" xmlns:gsp="http://groovy.codehaus.org/2005/gsp">
         <value>
             <addresses>
                 <gsp:scriptlet>rows.eachWithIndex{row,index-></gsp:scriptlet>
                     <address id="${index}">
                         <!-- You can use GString expressions -->
                         <uniqueid>${row.UniqueName}</uniqueid>
                         <name id="${index}">
                             <!-- Or you can use expression tags as well -->
                             <gsp:expression>row.Name</gsp:expression>
                         </name>
                     </address>
                 <gsp:scriptlet>}</gsp:scriptlet>
             </addresses>
         </value>
     </response>
'''

def cnf = [
    skipLines: 1,
    useFirstLineHeaders: true,
    uri:"file:./src/test/resources/test1.csv",
    groovyTemplate: new groovy.text.XmlTemplateEngine().createTemplate(tpl),
    funcNext: { lastrow, currentrow ->

        if(null != lastrow){
            if(lastrow.Name != currentrow.Name){
                // Change of name so build xml
                return true
            }
        }
        false
    }
]

// docs holds the number of XML documents generated
def docs = p6.csv.parseToXml(cnf){ gpath ->
    println(groovy.xml.XmlUtil.serialize(gpath))
}