PDF

Purpose

Generate PDFs from HTML or XML, and merge, parse, split, sign and compress PDF files.

Methods

Binding name: p6.pdf


fromHtml

Generates a PDF from an HTML string and writes it to the location specified by targetUri. Metadata can also be added to the PDF.

More information about the arguments

  • html (required) – The HTML content to be converted into a PDF.
  • targetUri (optional, default: null) – The destination URI where the generated PDF is saved. If null, a temporary file is created.
  • metadata (optional, default: null) – A map containing metadata such as title, author, subject and keywords for the generated PDF.

Returns the URI written to.

Syntax

String p6.pdf.fromHtml(String html, String targetUri = null, Map<String, String> metadata = null)

Warning

The HTML must use CSS version 2.1 at most.

The targetUri must point to a local file (e.g. protocol file: only).

Tip

To use a temporary file, set the targetUri parameter to null.

Example

Temporary file

p6.pdf.fromHtml('<div><b>Bold</b> text</div>')

An incorrect HTML string results in an error

p6.pdf.fromHtml('<b>Bold</b> text')

Specify the target

p6.pdf.fromHtml('<div><b>Bold</b> text</div>', 'p6file://${P6_DATA}/path/to.pdf')

Specify the target with metadata

def metadataMap = [
    "Title"   : "ABC",
    "Keywords": "p6",
    "Author"  : "author",
    "Subject" : "Invoice"
]
p6.pdf.fromHtml('<div><b>Bold</b> text</div>', 'p6file://${P6_DATA}/path/to.pdf', metadataMap)

Specify the metadata without target

def metadataMap = [
    "Title"   : "ABC",
    "Keywords": "p6",
    "Author"  : "author",
    "Subject" : "Invoice"
]
p6.pdf.fromHtml('<div><b>Bold</b> text</div>', metadataMap)
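
The second example above is rejected, which suggests that the input has to be a well-formed document with a single root element. The page does not state the exact rule, so the following is only a minimal sketch under that assumption: isWellFormed is a hypothetical helper (not part of p6.pdf) that uses the JDK's XML parser to check a string before it is passed to fromHtml.

```groovy
import javax.xml.parsers.DocumentBuilderFactory
import org.xml.sax.InputSource
import org.xml.sax.SAXException
import org.xml.sax.helpers.DefaultHandler

// Hypothetical helper (not part of p6.pdf): returns true when the string
// parses as well-formed XML with a single root element.
boolean isWellFormed(String html) {
    def builder = DocumentBuilderFactory.newInstance().newDocumentBuilder()
    // DefaultHandler rethrows fatal errors instead of printing them to stderr
    builder.setErrorHandler(new DefaultHandler())
    try {
        builder.parse(new InputSource(new StringReader(html)))
        return true
    } catch (SAXException ignored) {
        return false
    }
}

assert isWellFormed('<div><b>Bold</b> text</div>')
assert !isWellFormed('<b>Bold</b> text')   // text after the root element
```

A plain XML parser does not know HTML entities such as &nbsp; so this check may be stricter than the renderer itself.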


merge

Merges the PDFs listed in sourceUris and writes the result to targetUri. Returns the URI written to.

Syntax

String p6.pdf.merge(List<String> sourceUris[, String targetUri])

Warning

The targetUri must point to a local file (e.g. protocol file: only).

Tip

To use a temporary file, set the targetUri parameter to null.

Example

Temporary file

p6.pdf.merge(['p6file://${P6_DATA}/path/pdf1.pdf', 'p6file://${P6_DATA}/path/pdf2.pdf'])

Specify the target

p6.pdf.merge(['p6file://${P6_DATA}/path/pdf1.pdf', 'p6file://${P6_DATA}/path/pdf2.pdf'], 'p6file://${P6_DATA}/path/to.pdf')


parse

Parses the PDF file specified in the configuration map and calls the given closure with each row processed.

Syntax

void p6.pdf.parse(Map<String, Object> configuration, Closure rowNotify)

Example

def cnf = [
    area0: '402.89,17.24,550.29,64.89',
    area1: '30.6,346.29,195.95,150.07',
    pages: '1,2',
    uri: 'p6file://${P6_DATA}/00140_Facture Alfa.pdf'
]

p6.pdf.parse(cnf) { pageNumber, row ->
    p6.log.debug( pageNumber + ': ' + row )
    if ( pageNumber == 2) false         // Returning false will halt page iteration
    else true
}

Using column boundaries

def cnf = [
    columns0: '0,25.0,71.3,180.53,462.91,504.42,535.45,585.68,643.15,714.6',
    uri: 'p6file://${P6_DATA}/00140_Facture Alfa.pdf'
]

p6.pdf.parse(cnf) { pageNumber, row ->
    p6.log.debug( pageNumber + ': ' + row )
}

parseToList

Parses the PDF file specified in the configuration map returning the processed values as a List of Tuples (pageNumber, row).

Syntax

List<Tuple> p6.pdf.parseToList(Map<String, Object> configuration)

Parameter: configuration

  • password (optional) – Password used to decrypt the PDF.
  • spreadsheetDisabled (optional, default: true) – Disables spreadsheet-style extraction (the mode intended for PDFs with ruling lines separating each cell, such as a PDF of an Excel spreadsheet).
  • areaFail (optional, default: true) – If a configured area does not select any text on a page, a P6Exception is thrown unless this value is false.
  • areaN (optional) – N is a zero-based index. If no areas are given, the whole of each page is used as the bounding area. All defined areas are applied to each specified page. Areas are expressed in points and can be identified using OSX Preview in 'Rectangular Selection' mode. A comma-separated string is required: '{top},{left},{width},{height}'
  • columnsN (optional) – N is a zero-based index. A comma-separated list of X coordinates of column boundaries.
  • uri (mandatory) – The URI of the source PDF file to parse.
  • pages (optional) – A comma-separated list of page numbers. If not specified, all pages in the source file are processed.
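
The area string is easy to get wrong when typed by hand. A minimal sketch that builds it from named values in the documented order: areaString is a hypothetical helper, the coordinates are reused from the parse example above, and only the key names (areaN, pages, uri) come from this page.

```groovy
// Hypothetical helper: formats one area in the documented order
// '{top},{left},{width},{height}' (values in points).
String areaString(Map a) {
    [a.top, a.left, a.width, a.height].join(',')
}

// Coordinates reused from the parse example above
def areas = [
    [top: 402.89, left: 17.24, width: 550.29, height: 64.89],
    [top: 30.6, left: 346.29, width: 195.95, height: 150.07]
]

def cnf = [uri: 'p6file://${P6_DATA}/00140_Facture Alfa.pdf', pages: '1,2']
// Adds area0, area1, ... using the zero-based index required for areaN keys
areas.eachWithIndex { a, i -> cnf["area${i}".toString()] = areaString(a) }

assert cnf.area0 == '402.89,17.24,550.29,64.89'
assert cnf.area1 == '30.6,346.29,195.95,150.07'
```

On the platform, the resulting map would be passed to p6.pdf.parse or p6.pdf.parseToList as in the examples on this page.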

Example

def cnf = [
    area0: '402.89,17.24,550.29,64.89',
    areaFail: false,
    uri: 'p6file://${P6_DATA}/00140_Facture Alfa.pdf'
]

def lstTuples = p6.pdf.parseToList(cnf)

lstTuples.each { tup ->
    p6.log.debug( tup.get(0) + ": " + tup.get(1) )
}
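
Since parseToList returns (pageNumber, row) pairs, grouping the rows per page is a natural next step. The sketch below uses invented stand-in data; rowsByPage is a hypothetical helper, and showing rows as plain strings is an assumption, since this page does not describe the row type.

```groovy
// Stand-in for the result of p6.pdf.parseToList(cnf); invented sample data.
// Plain lists are used in place of Tuple objects: both support get(index).
def lstTuples = [
    [1, 'Invoice 00140'],
    [1, 'Total 120.00'],
    [2, 'Terms and conditions']
]

// Hypothetical helper: groups the rows by page number, keeping their order
Map rowsByPage(List tuples) {
    def grouped = tuples.groupBy { it.get(0) }
    grouped.collectEntries { page, tups -> [(page): tups.collect { it.get(1) }] }
}

def byPage = rowsByPage(lstTuples)
assert byPage[1] == ['Invoice 00140', 'Total 120.00']
assert byPage[2] == ['Terms and conditions']
```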

split

Copies a range of pages from a source PDF file to a destination PDF file.

Syntax

void p6.pdf.split(Map<String, Object> configuration)

Parameter: configuration

  • password (optional) – Password used to decrypt the PDF.
  • keepAnnotations (optional, default: false) – true to retain any annotations in the destination.
  • startPage (mandatory) – A one-based number specifying the first page to copy to the new destination.
  • endPage (mandatory) – A one-based number specifying the last page to copy to the new destination (all pages in between are copied too).
  • sourceUri (mandatory) – The URI of the source PDF file.
  • destinationUri (mandatory) – The URI of the destination PDF file. The destination is always overwritten.

Example

def cnf = [
    startPage: 3,
    endPage: 4,
    sourceUri: 'p6file://${P6_DATA}/00140_Facture Alfa.pdf',
    destinationUri: 'file:/tmp/page4.pdf'
]

p6.pdf.split(cnf)
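
Because split copies one contiguous page range per call, cutting a document into fixed-size chunks takes one configuration per range. The sketch below only builds the configuration maps: chunkConfigs, the page count and the destination naming are assumptions made for illustration, and this page does not describe a way to obtain the page count.

```groovy
// Hypothetical helper: one split configuration per chunk of `size` pages.
// startPage and endPage are one-based and inclusive, as documented above.
List<Map> chunkConfigs(String sourceUri, String destPrefix, int totalPages, int size) {
    def configs = []
    for (int start = 1; start <= totalPages; start += size) {
        int end = Math.min(start + size - 1, totalPages)
        configs << [
            startPage     : start,
            endPage       : end,
            sourceUri     : sourceUri,
            destinationUri: destPrefix + '_' + start + '-' + end + '.pdf'
        ]
    }
    configs
}

// A 5-page document (assumed) cut into chunks of 2 pages
def configs = chunkConfigs('p6file://${P6_DATA}/source.pdf', 'file:/tmp/part', 5, 2)
assert configs*.startPage == [1, 3, 5]
assert configs*.endPage == [2, 4, 5]
assert configs[2].destinationUri == 'file:/tmp/part_5-5.pdf'

// On the platform, each map would then be passed to the documented call:
// configs.each { p6.pdf.split(it) }
```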

sign

Signs the PDF file specified in the configuration map and writes the result to the targetUri. Returns the URI written to.

Syntax

String p6.pdf.sign(Map<String, Object> configuration)

Warning

The targetUri must point to a local file (e.g. protocol file: only).

Tip

To use a temporary file, set the targetUri parameter to null.

Parameter: configuration

  • keyStoreUri (mandatory) – The URI of the KeyStore file (PKCS12).
  • keyStorePassword (optional) – Password used to open the KeyStore.
  • keyStoreAlias (optional) – Alias to use in the KeyStore. By default, the first one is used.
  • uri (mandatory) – The URI of the source PDF file to sign.
  • password (optional) – Password used to decrypt the PDF.
  • tsa (optional) – URL of the TSA server used to timestamp the signed file.
  • reason (optional) – The signature reason.
  • targetUri (optional) – The URI of the target signed PDF file.

Example

def cnf = [
    keyStoreUri: 'file://${P6_DATA}/keystore.p12',
    keyStorePassword: '123456',

    uri: 'file://${P6_DATA}/source.pdf',
    reason: 'Signed on Platform6',
    targetUri: 'file://${P6_DATA}/signed.pdf'
]

p6.log.debug "Signed PDF path: " + p6.pdf.sign(cnf)

Tip

You can generate a p12 file for your tests using the command line:

openssl req -x509 -newkey rsa:1024 -keyout key.pem -out cert.pem -days 365
openssl pkcs12 -export -out keyStore.p12 -inkey key.pem -in cert.pem -name test
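
Before calling sign, it can help to check that the keys marked as mandatory above are present. A minimal sketch: missingKeys is a hypothetical helper rather than part of p6.pdf, and how the platform itself reacts to a missing key is not described on this page.

```groovy
// Hypothetical helper: names of the mandatory keys that are missing or empty.
List<String> missingKeys(Map cnf, List<String> mandatory) {
    mandatory.findAll { !cnf[it] }
}

// Keys marked (mandatory) in the sign configuration above
def signMandatory = ['keyStoreUri', 'uri']

// The source 'uri' is deliberately left out to show the check
def cnf = [keyStoreUri: 'file://${P6_DATA}/keystore.p12', reason: 'Signed on Platform6']
assert missingKeys(cnf, signMandatory) == ['uri']
```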

fromXml

Generates a PDF from the given XML and XSLT, each passed either as a string or as a file path.

Returns the URI written to.

Optional params:

  • targetUri – The destination URI where the generated PDF is saved. By default, a temporary file is created.
  • metadata – A map containing metadata such as title, author, subject and keywords for the generated PDF.

The targetUri must point to a local file (e.g. protocol file: only).

Syntax

String p6.pdf.fromXml xml schema xsl 

Input XML

<?xml version="1.0" encoding="UTF-8"?>
<employees>
    <employee>
        <name>Alice</name>
        <role>Developer</role>
    </employee>
</employees>

Input XSL

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:template match="/">
        <html>
            <body>
                <h2>Employee Details</h2>
                <table border="1">
                    <tr>
                        <th>Name</th>
                        <th>Role</th>
                    </tr>
                    <xsl:for-each select="employees/employee">
                        <tr>
                            <td><xsl:value-of select="name"/></td>
                            <td><xsl:value-of select="role"/></td>
                        </tr>
                    </xsl:for-each>
                </table>
            </body>
        </html>
    </xsl:template>
</xsl:stylesheet>

Example

Temporary file

p6.pdf.fromXml xml schema xsl

Specify the target

p6.pdf.fromXml xml schema xsl, {targetUri 'p6file://${P6_DATA}/path/to.pdf'}

Specify the target with metadata

def metadataMap = [
    "Title"   : "ABC",
    "Keywords": "p6",
    "Author"  : "author",
    "Subject" : "Invoice"
]
p6.pdf.fromXml xml schema xsl, {targetUri 'p6file://${P6_DATA}/to.pdf'; metadata metadataMap}

Specify the metadata without target

def metadataMap = [
    "Title"   : "ABC",
    "Keywords": "p6",
    "Author"  : "author",
    "Subject" : "Invoice"
]
p6.pdf.fromXml xml schema xsl, {metadata metadataMap}

Specify xml and xsl using filepath

p6.pdf.fromXml 'p6file://${P6_DATA}/xml.xml' schema 'p6file://${P6_DATA}/xslt.xml'

compress

Since 6.10.11

Compresses a PDF file using one or more of the following approaches:

  • Image compression
  • Font compression
  • Image manipulation (quality, dpi, greyscale)
  • Remove annotations

It returns statistics about the compression:

  • source.path (String) – The path of the source file.
  • source.size (Long) – The size of the source file in bytes.
  • source.size.pretty (String) – The size of the source file in human-readable form (e.g. 1.2 MB).
  • compression.success (Boolean) – True if the compression is successful.

Extra parameters are returned if the compression is successful:

  • compression.level (String) – Percentage of compression of the PDF.
  • compression.duration (String) – Compression duration.
  • target.path (String) – The path of the target file.
  • target.size (Long) – The size of the target file in bytes.
  • target.size.pretty (String) – The size of the target file in human-readable form (e.g. 1.2 MB).
  • annotation.compression.enable (Boolean) – True if PDF annotations are removed.
  • font.compression.enable (Boolean) – True if PDF fonts are compressed.
  • image.compression.enable (Boolean) – True if PDF images are compressed.
  • image.compression.parameters (String) – The compression parameters used for PDF images.
  • image.count (Int) – Number of images found in the PDF.
  • image.cached (Boolean) – True if the duplicate images should be cached.
  • image.source.size (Long) – Sum of the sizes of all the images.
  • image.target.size (Long) – Sum of the sizes of all the compressed images.
  • image.compression.level (String) – Percentage of compression of the PDF images.

Syntax

Map<String, Object> p6.pdf.compress '/path/to/source.pdf'
Map<String, Object> p6.pdf.compress '/path/to/source.pdf', {
    destination null
    threshold null
    silent false
    replace true
    compressFonts true
    compressImages true
    quality 0.5f 
    dpi 72
    greyscale false
    removeAnnotations false
}

Note

The values shown above are the defaults, and all parameters are optional.

  • destination (String) - if not set or empty, a temporary file will be created
  • threshold (Long) - The maximum size in bytes allowed for the compressed file. If the compressed file is larger than the threshold, a P6Exception is thrown (unless silent mode is used)
  • silent (Boolean) - false by default. If set to true, no exception is thrown when the result of the compression does not match the expectations
  • replace (Boolean) - ignored if a destination is specified. Otherwise, the source file is replaced by the compressed one

Warning

The compression DSL method can throw a P6Exception if an unexpected error occurs during the compression process.

Example

// The compressed file replaces the source file
p6.pdf.compress '/path/to/source.pdf'

// The compressed file is saved to a temporary file
p6.pdf.compress '/path/to/source.pdf', { destination '' }
p6.pdf.compress '/path/to/source.pdf', { replace false }

// The compressed file is saved to the destination file path
p6.pdf.compress '/path/to/source.pdf', { destination '/path/to/target.pdf' }

// Compression options
p6.pdf.compress '/path/to/source.pdf', { quality 0.1f; dpi 36; greyscale true }
p6.pdf.compress '/path/to/source.pdf', { removeAnnotations true; compressFonts false }

// Compression with threshold
try {
    p6.pdf.compress '/path/to/source.pdf', { threshold 1000000 }
} catch (P6Exception e) {
    p6.log.error "Compression failed: ${e.message}"
}
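
The returned statistics can be inspected to decide what to do next. The sketch below works on an invented stand-in map: the key names follow the lists above, but the values, the summarize helper, and the assumption that the map is flat with dotted keys are illustrative only.

```groovy
// Hypothetical helper: one-line summary of a compress result map.
String summarize(Map stats) {
    if (!stats['compression.success']) {
        return "Compression failed for ${stats['source.path']}"
    }
    "${stats['source.size.pretty']} -> ${stats['target.size.pretty']} (${stats['compression.level']})"
}

// Invented sample values; on the platform the map would come from
// p6.pdf.compress '/path/to/source.pdf'
def stats = [
    'source.path'        : '/path/to/source.pdf',
    'source.size.pretty' : '1.2 MB',
    'compression.success': true,
    'compression.level'  : '50%',
    'target.size.pretty' : '600 KB'
]

assert summarize(stats) == '1.2 MB -> 600 KB (50%)'
assert summarize(['source.path': '/a.pdf', 'compression.success': false]) == 'Compression failed for /a.pdf'
```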