Saad Bin Kader :: HTML to PDF Generation

An overview of HTML to PDF Generation in Java

OCTOBER 11, 2014

#WORK

PDFBox: Apache PDFBox is a Java library for working with PDF documents. It lets you create new PDF documents and extract data from existing ones. This library is well suited to manipulating existing PDF files. See Example

FOP: With Apache FOP, an XSLT stylesheet first converts the XML (or XHTML) into XSL-FO. FOP then reads the XSL-FO document and formats it into a PDF document. See Example
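
To make the first half of that pipeline concrete, here is a minimal sketch of the XSLT step using only the JDK's javax.xml.transform API. The stylesheet, the class name XhtmlToFoSketch, and the method toXslFo are made up for illustration, and the input is assumed to be a tiny, namespace-free XHTML document containing only paragraphs. A real stylesheet would need to cover far more of XHTML.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class XhtmlToFoSketch {

    // Hypothetical stylesheet: maps <html><body><p> to a single A4 page of fo:block elements.
    // Assumes the input elements are in no namespace.
    static final String STYLESHEET =
        "<xsl:stylesheet version='1.0'"
      + "    xmlns:xsl='http://www.w3.org/1999/XSL/Transform'"
      + "    xmlns:fo='http://www.w3.org/1999/XSL/Format'>"
      + "  <xsl:template match='/html'>"
      + "    <fo:root>"
      + "      <fo:layout-master-set>"
      + "        <fo:simple-page-master master-name='page'"
      + "            page-height='29.7cm' page-width='21cm' margin='2cm'>"
      + "          <fo:region-body/>"
      + "        </fo:simple-page-master>"
      + "      </fo:layout-master-set>"
      + "      <fo:page-sequence master-reference='page'>"
      + "        <fo:flow flow-name='xsl-region-body'>"
      + "          <xsl:apply-templates select='body/p'/>"
      + "        </fo:flow>"
      + "      </fo:page-sequence>"
      + "    </fo:root>"
      + "  </xsl:template>"
      + "  <xsl:template match='p'>"
      + "    <fo:block space-after='6pt'><xsl:value-of select='.'/></fo:block>"
      + "  </xsl:template>"
      + "</xsl:stylesheet>";

    // Applies the stylesheet to the XHTML string and returns the XSL-FO markup as a string.
    static String toXslFo(String xhtml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xhtml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalArgumentException("Could not transform input to XSL-FO", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(toXslFo("<html><body><p>Hello</p><p>World</p></body></html>"));
    }
}
```

In a full pipeline the XSL-FO would normally not be serialized to a string. Instead, the transformer's result would be handed to FOP, which does the PDF formatting. That part needs the FOP jar and is left out here.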

PDFClown: PDFClown offers features similar to those of PDFBox. See Documentation

PDFjet: PDFjet provides basic PDF creation features, but it failed to convert an HTML document into a PDF effectively. See Details

JPod: JPod provides functionality to read a document and verify it against the PDF specification. It can also create new PDFs and apply incremental updates. See Details

Flying Saucer & iText: Flying Saucer uses iText to render PDFs from HTML. Some favorable features: strong support for the CSS 2.1 specification, including extensions for better paged-media support; good performance; support for XHTML, including forms.
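
Because Flying Saucer works from XHTML, the markup has to be well-formed XML before it reaches the renderer. The sketch below uses only the JDK to parse an XHTML string into a W3C DOM document and to reject malformed input early. The class name XhtmlDomSketch and the method parseXhtml are made up for illustration. The hand-off to Flying Saucer appears only as a comment, since it assumes the Flying Saucer jar is on the classpath.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class XhtmlDomSketch {

    // Parses the markup as XML; throws IllegalArgumentException if it is not well-formed.
    static Document parseXhtml(String xhtml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            return builder.parse(new InputSource(new StringReader(xhtml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Input is not well-formed XHTML", e);
        }
    }

    public static void main(String[] args) {
        Document doc = parseXhtml(
            "<html><head><title>Demo</title></head><body><p>Hello</p></body></html>");
        System.out.println("Root element: " + doc.getDocumentElement().getNodeName());

        // Assumed hand-off, not executed here (requires the Flying Saucer library):
        //   ITextRenderer renderer = new ITextRenderer();
        //   renderer.setDocument(doc, null);
        //   renderer.layout();
        //   renderer.createPDF(outputStream);
    }
}
```

Typical browser-style HTML with unclosed tags fails this check, so it would need to be cleaned up into XHTML before rendering.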

PD4ML: PD4ML is similar to Flying Saucer, with an added set of taglibs. It provides a richer API, as expected from a paid library.

Prince XML: Judging by its promised features, this seemed by far the most impressive HTML to PDF conversion library. However, the need for a dedicated server running the provided jar makes it a less convincing choice.

Summary: ITextRenderer (Flying Saucer) still seems the best option for rendering HTML to PDF documents, considering its popularity and available support. However, iText and Prince XML (paid version) are also worthy of consideration.