An overview of HTML to PDF Generation in Java
OCTOBER 11, 2014
PDFBox: Apache PDFBox is a Java library for working with PDF documents. It lets you create new PDF documents and extract data from existing ones. It is well suited to manipulating existing PDF files.
See Example
FOP: With Apache FOP, an XSLT stylesheet first converts the XML (or XHTML) into XSL-FO. FOP then reads the XSL-FO document and formats it into a PDF document.
See Example
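The first stage of that pipeline needs nothing beyond the JDK, so it can be sketched on its own. The example below is only an illustration under stated assumptions: the class name `XhtmlToFoSketch`, the tiny stylesheet, and the sample markup are all hypothetical, and the input is assumed to carry no XML namespace. Handing the resulting XSL-FO to FOP for the actual PDF rendering is deliberately left out.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper covering stage one only (XML -> XSL-FO).
// Rendering the XSL-FO to PDF with FOP is a separate step, not shown here.
final class XhtmlToFoSketch {

    // A deliberately tiny stylesheet: one A4 page master, and every <p>
    // in the input becomes an fo:block. Assumes un-namespaced input.
    static final String STYLESHEET =
          "<xsl:stylesheet version='1.0'"
        + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'"
        + " xmlns:fo='http://www.w3.org/1999/XSL/Format'>"
        + "<xsl:output method='xml' omit-xml-declaration='yes'/>"
        + "<xsl:template match='/'>"
        + "<fo:root>"
        + "<fo:layout-master-set>"
        + "<fo:simple-page-master master-name='page'"
        + " page-height='29.7cm' page-width='21cm' margin='2cm'>"
        + "<fo:region-body/>"
        + "</fo:simple-page-master>"
        + "</fo:layout-master-set>"
        + "<fo:page-sequence master-reference='page'>"
        + "<fo:flow flow-name='xsl-region-body'>"
        + "<xsl:apply-templates select='//p'/>"
        + "</fo:flow>"
        + "</fo:page-sequence>"
        + "</fo:root>"
        + "</xsl:template>"
        + "<xsl:template match='p'>"
        + "<fo:block><xsl:value-of select='.'/></fo:block>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    // Applies the stylesheet to the given XML and returns the XSL-FO text.
    static String toXslFo(String xml) throws TransformerException {
        Transformer transformer = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
        StringWriter out = new StringWriter();
        transformer.transform(
            new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws TransformerException {
        String xhtml = "<html><body><p>Hello</p><p>World</p></body></html>";
        System.out.println(toXslFo(xhtml));
    }
}
```

A real stylesheet would have to map headings, tables, lists, and inline styles as well, which is where most of the effort in this approach tends to go.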
PDFClown: PDFClown offers features similar to those of PDFBox.
See Documentation
PDFjet: PDFjet provides basic PDF creation features, but it failed to convert an HTML document into a PDF effectively.
See Details
JPod: JPod provides functionality to read and verify the document against the PDF specification. It can create new PDFs and do incremental updates.
See Details
Flying Saucer & iText: Flying Saucer uses iText to render PDFs from HTML. Some favorable features:
Strong support for the CSS 2.1 specification, including extensions for better paged-media support.
Good performance.
Support for XHTML including forms.
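Because Flying Saucer consumes XHTML rather than loose HTML, markup generally has to be well-formed XML before it is handed to the renderer. A minimal pre-check using only the JDK parser might look like the sketch below. The class and method names are hypothetical, the rendering call itself is not shown, and input that declares a DOCTYPE may cause the parser to try to fetch the DTD, which this sketch does not handle.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical pre-check: is the markup well-formed XML?
final class XhtmlPrecheckSketch {

    static boolean isWellFormed(String markup) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors without logging to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(markup)));
            return true;
        } catch (SAXException e) {
            // Parse failure: the markup is not well-formed XML.
            return false;
        } catch (ParserConfigurationException | IOException e) {
            // Environment problem rather than a problem with the markup.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Closed tags parse; a bare <br> as written in loose HTML does not.
        System.out.println(isWellFormed("<html><body><p>ok<br/></p></body></html>"));
        System.out.println(isWellFormed("<html><body><p>ok<br></p></body></html>"));
    }
}
```

Markup that fails this check would need to be cleaned up into XHTML first, a step outside the scope of this overview.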
PD4ML: PD4ML is similar to Flying Saucer with an added set of taglibs. It provides a richer API, as expected from a paid library.
Prince XML: Judging by its promised features, this seemed by far the most impressive HTML to PDF conversion library. However, the need for a dedicated server running the provided jar makes it unconvincing.
Summary: iTextRenderer (Flying Saucer) still seems the best option for rendering HTML to PDF documents, considering its popularity and the support available. However, iText and Prince XML (paid version) are worthy of consideration.