It’s safe to say that if someone designed fonts from scratch today they’d be very different on the inside. As with many technologies, the formats have evolved to allow for backwards compatibility and new requirements in ways which make them a bit, well, weird.
What’s weird with fonts on the web?
Modern web fonts contain multiple sets of values for positioning text, which were added to the format over time. This is partly due to OpenType unifying TrueType and Type 1, and partly due to Microsoft deciding to add two more sets of values when jumping on board with TrueType in the early 90s.
The result is that web fonts have 10-12 values which may or may not affect their vertical positioning depending on your platform and browser of choice. There’s no clearly defined way of using these values to position text – and especially once you add CSS’s text positioning to the mix, it’s a recipe for confusion.
Even Google, who did a full survey of browsers when creating their set of free web fonts, doesn’t have definite answers about which values affect which browsers.
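To make the problem concrete, here is a minimal sketch of how three of those sets of values can give three different line heights for the same font. The field names mirror the hhea and OS/2 tables in the OpenType format, but the numbers are made up for illustration, and which set a given browser actually reads is exactly the part nobody can state with certainty, so treat the three keys as possibilities rather than a browser-by-browser rule.

```python
from dataclasses import dataclass


@dataclass
class VerticalMetrics:
    """Vertical metric fields from the hhea and OS/2 tables, in font units."""
    units_per_em: int
    # hhea table
    hhea_ascender: int
    hhea_descender: int  # negative when below the baseline
    hhea_line_gap: int
    # OS/2 table, "typo" set
    typo_ascender: int
    typo_descender: int  # negative when below the baseline
    typo_line_gap: int
    # OS/2 table, "win" set
    win_ascent: int
    win_descent: int     # positive when below the baseline


def line_heights_px(m, font_size_px):
    """Return the line height implied by each set of values, in pixels."""
    def to_px(font_units):
        return font_units * font_size_px / m.units_per_em

    return {
        "hhea": to_px(m.hhea_ascender - m.hhea_descender + m.hhea_line_gap),
        "typo": to_px(m.typo_ascender - m.typo_descender + m.typo_line_gap),
        "win": to_px(m.win_ascent + m.win_descent),
    }


# Hypothetical values, chosen only to show the three sets disagreeing.
example = VerticalMetrics(
    units_per_em=1000,
    hhea_ascender=900, hhea_descender=-250, hhea_line_gap=0,
    typo_ascender=750, typo_descender=-250, typo_line_gap=200,
    win_ascent=950, win_descent=300,
)

for name, height in line_heights_px(example, 16).items():
    print(f"{name}: {height:.1f}px")
```

With these example numbers the same 16px font comes out at three different heights depending on which set is used, which is why a font whose sets disagree can shift up or down between platforms.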
Why is putting PDF fonts on the web particularly hard?
There are two key reasons why you can’t trust any of these values when they’re found in fonts in PDF files.
- There’s a good chance that the font has been subsetted since its original values were calculated, meaning many of the original characters are now missing and the values should be different.
- PDF has its own way of positioning text, which largely ignores the values in the font. As a result, there’s a good chance the values are simply wrong or set to zero.
So you have to (and we do!) completely regenerate these values when converting the font. We can compute some of these from the shapes of the glyphs in the font, but unfortunately some are supposed to be set by the font’s original designer.
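As a rough illustration of the part that can be computed, the sketch below derives the vertical extent of a font from the outline points of its glyphs, and shows how that extent changes once the font is subsetted. It is not the code from our converter: the glyph data, the `vertical_extent` and `subset` helpers and the coordinates are all hypothetical, and real outlines would come from the font's glyph data rather than a hand-written dictionary.

```python
def vertical_extent(glyphs):
    """Return (y_min, y_max) in font units over every outline point.

    `glyphs` maps a glyph name to a list of (x, y) outline points.
    Off-curve control points are included, so the result is a safe
    outer bound rather than the exact extent of the drawn curves.
    """
    ys = [y for points in glyphs.values() for _, y in points]
    if not ys:
        return 0, 0  # no outlines at all, e.g. a font containing only a space
    return min(ys), max(ys)


def subset(glyphs, keep):
    """Keep only the named glyphs, as a subsetter would."""
    return {name: pts for name, pts in glyphs.items() if name in keep}


# Hypothetical outlines, simplified to rectangles for the example.
FULL_FONT = {
    "x": [(0, 0), (500, 0), (500, 480), (0, 480)],
    "h": [(0, 0), (500, 0), (500, 730), (0, 730)],
    "g": [(0, -210), (500, -210), (500, 480), (0, 480)],
    "Aring": [(0, 0), (600, 0), (600, 930), (0, 930)],
}

full = vertical_extent(FULL_FONT)                       # (-210, 930)
embedded = vertical_extent(subset(FULL_FONT, {"x", "h"}))  # (0, 730)

print("full font:", full)
print("subset:   ", embedded)
```

The subset has lost both its tallest glyph and its only descender, so any ascent or descent value stored before subsetting no longer matches the glyphs that are actually present. Values such as the line gap cannot be recovered this way at all, since they reflect a choice by the designer rather than anything in the outlines.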
At IDR Solutions we’re currently hard at work improving this in our PDF to HTML5 converter. We’re rewriting the way we calculate the glyph metrics that can be derived from the font, and finding ways to make the designer-set values work on the web. You should expect to see ongoing improvements from now until February.