Sam Howard Sam is a developer at IDRsolutions who specialises in font rendering and conversion. He's also enjoyed working with SVG, Java 3D, Java FX and Swing.

Font Conversion for PDF2HTML – dotsection


We recently released support for converting Type1C fonts (the PDF name for fonts stored in the Compact Font Format, or CFF) to OpenType for use within our PDF to HTML converter. I thought it would make an interesting blog article because it gives an insight into the world of fonts and also highlights some of the continuing issues with font compatibility in different browsers.

OpenType fonts consist of a set of tables, each containing different data about the font. Luckily for us, OpenType also extends two earlier formats, TrueType and CFF. This means that you can create an OpenType font either by reusing a number of tables from an existing TrueType font or by including a CFF font as a table, then adding a number of other required tables.
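To make the table structure a little more concrete, here is a minimal Python sketch that reads the table directory at the start of an OpenType file and reports which outline format it carries. It is an illustration only, not our converter's code; the function names `list_tables` and `outline_flavour` are made up for this example.

```python
import struct


def list_tables(font: bytes):
    """Return (sfnt_version, {tag: (offset, length)}) from the table directory."""
    # Header: 4-byte version tag, uint16 table count, then three uint16
    # search fields that this sketch skips (12 bytes in total).
    version, num_tables = struct.unpack(">4sH", font[:6])
    tables = {}
    for i in range(num_tables):
        start = 12 + 16 * i
        # Each record: 4-byte tag, checksum, offset, length (all big-endian).
        tag, _checksum, offset, length = struct.unpack(
            ">4sLLL", font[start:start + 16])
        tables[tag.decode("latin-1")] = (offset, length)
    return version, tables


def outline_flavour(font: bytes) -> str:
    """Report whether the glyph outlines live in a 'CFF ' or a 'glyf' table."""
    _version, tables = list_tables(font)
    if "CFF " in tables:
        return "CFF"
    if "glyf" in tables:
        return "TrueType"
    return "unknown"
```

A font built around a CFF table normally starts with the version tag `OTTO`, while one built from TrueType tables starts with the bytes `00 01 00 00`.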

This means a large part of the new font, including the glyph outlines, can be generated simply by copying the binary straight out of the PDF – great, less work for us! Or so you’d assume…

Well, most of the time, yes. Unfortunately, though, it’s not always that simple.

CFF glyph outlines are made up of a series of instructions for drawing the glyph. Early versions of the specification included an instruction called ‘dotsection’, which marked the start of a new section of a glyph, such as the dot on an i. It has no effect on the outline, and parsers simply skipped over it. It was later deprecated in the specification, and the vast majority of parsers continue to ignore it.
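As a rough illustration of what scanning for the instruction involves, the Python sketch below walks a single Type 2 charstring (the instruction stream for one glyph) and records where dotsection, encoded as the two bytes 12 0, appears as an operator. A plain byte search is not enough, because the same two bytes can also turn up inside an encoded number or a hint mask. The function name `find_dotsection` is hypothetical, and the sketch assumes that subroutines called from the charstring do not declare stem hints of their own; a real scanner would have to follow those calls.

```python
def find_dotsection(cs: bytes) -> list:
    """Return the byte offsets of dotsection (12 0) operators in a charstring."""
    hits = []
    i = 0
    depth = 0   # operands pushed since the stack was last cleared
    stems = 0   # stem hints declared so far (sets the hint mask length)
    while i < len(cs):
        b0 = cs[i]
        if b0 == 28:                 # 16-bit integer operand
            i += 3
            depth += 1
        elif b0 == 255:              # 16.16 fixed-point operand
            i += 5
            depth += 1
        elif 247 <= b0 <= 254:       # two-byte integer operand
            i += 2
            depth += 1
        elif b0 >= 32:               # one-byte integer operand
            i += 1
            depth += 1
        elif b0 == 12:               # escape: two-byte operator
            if i + 1 < len(cs) and cs[i + 1] == 0:
                hits.append(i)
            i += 2
            depth = 0                # simplification: treat as clearing the stack
        elif b0 in (1, 3, 18, 23):   # hstem, vstem, hstemhm, vstemhm
            stems += depth // 2      # two operands per stem (odd one is the width)
            depth = 0
            i += 1
        elif b0 in (19, 20):         # hintmask, cntrmask
            stems += depth // 2      # leftover operands are an implicit vstem
            depth = 0
            i += 1 + (stems + 7) // 8   # skip the operator and its mask bytes
        elif b0 in (10, 29):         # callsubr, callgsubr: pops the subr number
            depth = max(0, depth - 1)
            i += 1
        elif b0 == 11:               # return: leaves the stack untouched
            i += 1
        else:                        # remaining one-byte operators clear the stack
            depth = 0
            i += 1
    return hits
```

For example, `find_dotsection(bytes([139, 189, 1, 12, 0, 139, 139, 21, 12, 0, 14]))` finds the operator at offsets 3 and 8, while `bytes([28, 12, 0, 14])` yields no matches because there the bytes 12 0 are part of the number 3072.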

Unfortunately, Google Chrome isn’t one of them! Chrome has a bit of code called OTS (the OpenType Sanitiser) which goes through OpenType fonts Chrome is trying to use and checks them for potential problems which could cause the font engine being used (which varies by platform) to crash. When it finds a problem, it quite often fixes it, but in the case of dotsection commands it simply rejects the font outright.

Because of this, the only ways to ensure a font is accepted by Chrome are to rewrite the CFF data completely from scratch, or to go through the CFF data and strip out the dotsection commands, keeping track of a large number of offsets and updating them accordingly. We chose the latter option, and now every CFF font we convert is quickly scanned for potential issues, and just as quickly fixed if any rogue commands are found.
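The offset bookkeeping comes from the way CFF stores its data. Glyph programs sit in an INDEX structure: a count, an offset size, a list of 1-based offsets, and then the data itself. Shortening any entry changes every offset after it, and changes the position of everything stored after the INDEX. The Python sketch below shows the general idea under simplifying assumptions; it is not our production code. The names `read_index`, `write_index` and `rewrite_index` are invented for this example, and `transform` stands in for whatever function strips dotsection from a single charstring.

```python
def read_index(buf: bytes, pos: int):
    """Parse a CFF INDEX at pos; return (items, position just past the INDEX)."""
    count = int.from_bytes(buf[pos:pos + 2], "big")
    if count == 0:
        return [], pos + 2           # an empty INDEX is just the 2-byte count
    off_size = buf[pos + 2]
    table = pos + 3
    offsets = [
        int.from_bytes(buf[table + k * off_size:table + (k + 1) * off_size], "big")
        for k in range(count + 1)
    ]
    # Offsets are 1-based, relative to the byte before the data area.
    base = table + (count + 1) * off_size - 1
    items = [buf[base + offsets[k]:base + offsets[k + 1]] for k in range(count)]
    return items, base + offsets[-1]


def write_index(items) -> bytes:
    """Serialise items as a CFF INDEX with freshly computed offsets."""
    if not items:
        return b"\x00\x00"
    offsets = [1]
    for item in items:
        offsets.append(offsets[-1] + len(item))
    # Smallest offset size (in bytes) that can hold the final offset.
    off_size = (offsets[-1].bit_length() + 7) // 8
    out = len(items).to_bytes(2, "big") + bytes([off_size])
    out += b"".join(o.to_bytes(off_size, "big") for o in offsets)
    return out + b"".join(items)


def rewrite_index(buf: bytes, pos: int, transform):
    """Apply transform to every entry of the INDEX at pos.

    Returns (new_buffer, delta), where delta is the change in INDEX size.
    """
    items, end = read_index(buf, pos)
    new_index = write_index([transform(item) for item in items])
    delta = len(new_index) - (end - pos)
    return buf[:pos] + new_index + buf[end:], delta
```

The returned `delta` is the important part: any absolute offset recorded elsewhere in the font that points past the rewritten INDEX (for example the Top DICT entries for the charset or Private DICT, when those follow the charstrings) has to be adjusted by that amount, and re-encoding those values can itself change sizes further.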

All of this could easily be avoided if browsers offered proper backwards compatibility with fonts, but until that happens strange quirks like this are sure to keep popping up. Do you think browsers should be strict (like Chrome) or just try to make things work (like Adobe Acrobat)?

This post is part of our “Fonts Articles Index”, a series of articles in which we explore fonts.



IDRsolutions Ltd 2022. All rights reserved.