Mark Stephens

  https://www.idrsolutions.com

Mark founded the company and has worked with Java and PDF since 1997. The original creator of the core code, he is also a NetBeans enthusiast who enjoys speaking at conferences and reading. He holds an Athletics Blue and an MA in Mediaeval History from St. Andrews University.

203 Stories by Mark Stephens

How to identify a PDF file

The best way to identify a PDF file is to scan the first line of the file. In theory the first line of a...
49 sec read

Multiple trailers in a PDF file

TL;DR: Multiple trailers allow for incremental updates in PDFs. New changes (data/objects) are appended to the end of the file, avoiding a full rewrite. Each trailer...
2 min read

Table order in OTF fonts

As part of our TrueType to OpenType font conversion (we need this for PDF to HTML5 conversion to ensure fonts display on all browsers),...
49 sec read

How to extract Structured text from PDF files in Java (Tutorial)

TL;DR: PDFs use complex binary/compressed data that standard text editors can’t read. To inspect the internal structure, use JPedal (for debugging content streams), RUPS...
2 min read

Avoid transparency when printing in Java

Java has a print mechanism called Java Print Services. In most cases this works brilliantly, but beware the use of transparency in anything you...
1 min read

How are Embedded CMAP tables defined in a PDF File?

Every glyph inside a PDF file can have a display value and a different extraction value. This is useful because often you need to...
2 min read