Mark Stephens

• https://www.idrsolutions.com Mark founded the company and has worked with Java and PDF since 1997. The original creator of the core code, he is also a NetBeans enthusiast who enjoys speaking at conferences and reading. He holds an Athletics Blue and an MA in Mediaeval History from St. Andrews University.

203 Stories by Mark Stephens

How to identify a PDF file

The best way to identify a PDF file is to scan the first line of the file. In theory the first line of a...

December 13, 2012 49 sec read
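
The first-line check described in the teaser above can be sketched in Java. This is a minimal illustration, not the article's own code: it assumes the strict case where the file begins with the `%PDF-` version marker at byte 0, and the class and method names (`PdfHeaderCheck`, `startsWithPdfMarker`, `looksLikePdf`) are hypothetical. Files with stray bytes before the marker would be rejected by this sketch.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, for illustration only.
final class PdfHeaderCheck {

    // A PDF version header such as "%PDF-1.7" starts with these five bytes.
    private static final byte[] MARKER = "%PDF-".getBytes(StandardCharsets.US_ASCII);

    /** Returns true if the given bytes begin with the "%PDF-" marker. */
    static boolean startsWithPdfMarker(byte[] head) {
        if (head.length < MARKER.length) {
            return false;
        }
        for (int i = 0; i < MARKER.length; i++) {
            if (head[i] != MARKER[i]) {
                return false;
            }
        }
        return true;
    }

    /** Reads only the first few bytes of the file and checks them (assumes Java 11+). */
    static boolean looksLikePdf(Path file) throws IOException {
        try (InputStream in = Files.newInputStream(file)) {
            return startsWithPdfMarker(in.readNBytes(MARKER.length));
        }
    }
}
```

Reading just the leading bytes keeps the check cheap even for very large files; a more tolerant version would need to search further into the file for the marker.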

Multiple trailers in a PDF file

TL;DR: Multiple trailers allow for Incremental Updates in PDFs. New changes (data/objects) are appended to the end of the file, avoiding a full rewrite. Each trailer...

November 14, 2012 2 min read
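
Because each incremental update appends a new section that ends with its own `%%EOF` marker, a rough way to see whether a file has been updated incrementally is to count those markers. The sketch below is a heuristic under that assumption, not the article's method: `RevisionCounter` and `countEofMarkers` are hypothetical names, and the count can be inflated if the byte sequence happens to occur inside stream data.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, for illustration only.
final class RevisionCounter {

    // Each appended section of an incrementally updated PDF ends with this marker.
    private static final byte[] EOF_MARKER = "%%EOF".getBytes(StandardCharsets.US_ASCII);

    /** Counts non-overlapping occurrences of "%%EOF" in the raw file bytes. */
    static int countEofMarkers(byte[] pdf) {
        int count = 0;
        int i = 0;
        while (i <= pdf.length - EOF_MARKER.length) {
            if (matchesAt(pdf, i)) {
                count++;
                i += EOF_MARKER.length; // skip past the marker just found
            } else {
                i++;
            }
        }
        return count;
    }

    /** Returns true if the marker bytes appear in data starting at offset. */
    private static boolean matchesAt(byte[] data, int offset) {
        for (int j = 0; j < EOF_MARKER.length; j++) {
            if (data[offset + j] != EOF_MARKER[j]) {
                return false;
            }
        }
        return true;
    }
}
```

A count greater than one suggests the file carries more than one trailer; a proper reader would instead follow the `startxref` offsets and the links between trailers.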

Table order in OTF fonts

As part of our TrueType to OpenType font conversion (we need this for PDF to HTML5 conversion to ensure fonts display on all browsers),...

August 24, 2012 49 sec read

How to extract Structured text from PDF files in Java (Tutorial)

TL;DR: PDFs use complex binary/compressed data that standard text editors can’t read. To inspect the internal structure, use JPedal (for debugging content streams), RUPS...

June 28, 2012 2 min read

Avoid transparency when printing in Java

Java has a print mechanism called Java Print Services. In most cases this works brilliantly, but beware the use of transparency in anything you...

May 22, 2012 1 min read

How are Embedded CMAP tables defined in a PDF File?

Every glyph inside a PDF file can have a display value and a different extraction value. This is useful because often you need to...

May 18, 2012 2 min read