PDFBox

Java, PDF

In my experience there are two well-known libraries for generating PDFs in Java: iText and FOP.

FOP is an XSL-FO implementation. The only time I really use FO is to convert DocBook files to PDF. FO doesn’t really suit the sort of page generation that I’m interested in.

I’ve used iText a lot in the past, but as time rolls on I find that I’m more and more disappointed in it. Don’t get me wrong, it works. My issue is that it’s really badly put together. If iText doesn’t do something the way you want it to, you need to write the raw PDF yourself using PdfContentByte. There seems to be very little thought put into the layers of abstraction. By the time I’ve done the things I needed to do, most of my code consists of calls to a PdfContentByte instance. It’s close to writing PDF by hand. I get some page and XObject management, but that’s about it.
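To give a feel for what “writing PDF by hand” means, here is a minimal sketch that builds the text of a PDF content stream from the standard text operators (BT, Tf, Td, Tj, ET). The RawContentStream class and its method names are hypothetical and made up for illustration. This is not iText code, and the font resource name F1 is an assumption.

```java
// Hypothetical helper, not part of iText or PDFBox: it only assembles the
// text of a PDF content stream from standard PDF text operators.
final class RawContentStream {
    private final StringBuilder out = new StringBuilder();

    // BT and ET open and close a text object.
    RawContentStream beginText() { out.append("BT\n"); return this; }
    RawContentStream endText() { out.append("ET\n"); return this; }

    // Tf selects a font resource by name, plus a size in points.
    RawContentStream setFont(String resourceName, int size) {
        out.append('/').append(resourceName).append(' ').append(size).append(" Tf\n");
        return this;
    }

    // Td moves the text position by an offset from the start of the current line.
    RawContentStream moveTo(int x, int y) {
        out.append(x).append(' ').append(y).append(" Td\n");
        return this;
    }

    // Tj shows a literal string; backslashes and parentheses are escaped.
    RawContentStream showText(String text) {
        String escaped = text.replace("\\", "\\\\")
                             .replace("(", "\\(")
                             .replace(")", "\\)");
        out.append('(').append(escaped).append(") Tj\n");
        return this;
    }

    String build() { return out.toString(); }

    public static void main(String[] args) {
        String stream = new RawContentStream()
            .beginText()
            .setFont("F1", 12)      // "F1" is an assumed font resource name
            .moveTo(72, 720)
            .showText("Hello (PDF)")
            .endText()
            .build();
        System.out.print(stream);
    }
}
```

Every piece of layout ends up as a sequence of operator calls like these, which is why code written at this level reads more like the PDF file format than like a description of a page.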

So I was considering writing my own library, and then I came across PDFBox. PDFBox isn’t as all-encompassing as iText, but it does have some really nice layers of separation. It layers PDF “objects” like page and font upon a series of low-level objects like dictionary, number and array. This is precisely what is on my whiteboard at home. It also has explicit support for reading and extracting information from PDF files (it even comes with a Lucene indexer). I’ve walked the source and done a couple of small tests and it looks pretty good.
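The layering idea can be sketched roughly as follows. All of the class names here (PdfObject, PdfNumber, PdfName, PdfArray, PdfDictionary, Page) are hypothetical and written for this illustration; they are not PDFBox’s actual classes. It is only a sketch of the general shape: a high-level page that is a thin wrapper over a low-level dictionary.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical classes illustrating the layering; not PDFBox's own types.

// Low level: primitive objects that know how to serialise themselves.
interface PdfObject { String toPdf(); }

final class PdfNumber implements PdfObject {
    private final int value;
    PdfNumber(int value) { this.value = value; }
    public String toPdf() { return Integer.toString(value); }
}

final class PdfName implements PdfObject {
    private final String name;
    PdfName(String name) { this.name = name; }
    public String toPdf() { return "/" + name; }
}

final class PdfArray implements PdfObject {
    private final List<PdfObject> items = new ArrayList<>();
    PdfArray add(PdfObject item) { items.add(item); return this; }
    public String toPdf() {
        return items.stream().map(PdfObject::toPdf)
                    .collect(Collectors.joining(" ", "[", "]"));
    }
}

final class PdfDictionary implements PdfObject {
    // LinkedHashMap keeps insertion order so the output is predictable.
    private final Map<String, PdfObject> entries = new LinkedHashMap<>();
    PdfDictionary put(String key, PdfObject value) { entries.put(key, value); return this; }
    public String toPdf() {
        return entries.entrySet().stream()
                      .map(e -> "/" + e.getKey() + " " + e.getValue().toPdf())
                      .collect(Collectors.joining(" ", "<< ", " >>"));
    }
}

// High level: a page expressed in terms of the low-level objects.
final class Page {
    private final PdfDictionary dict =
        new PdfDictionary().put("Type", new PdfName("Page"));

    Page mediaBox(int width, int height) {
        dict.put("MediaBox", new PdfArray()
            .add(new PdfNumber(0)).add(new PdfNumber(0))
            .add(new PdfNumber(width)).add(new PdfNumber(height)));
        return this;
    }

    // The escape hatch: callers can still reach the underlying dictionary.
    PdfDictionary dictionary() { return dict; }
}

final class LayeringDemo {
    public static void main(String[] args) {
        Page page = new Page().mediaBox(612, 792); // US Letter size in points
        System.out.println(page.dictionary().toPdf());
        // prints: << /Type /Page /MediaBox [0 0 612 792] >>
    }
}
```

The point of the design is the escape hatch: when the high-level Page doesn’t offer what you need, you drop down one layer to the dictionary instead of falling all the way to raw content-stream calls.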
