There are two main font technologies used in PDF font files: PostScript Type 1 and TrueType. There is also a 'merged' format, OpenType, which borrows features from both. Each has its own advantages, and the differences between them reflect the politics and strategy of the last 20 years as much as any technical advances.
Both Type 1 and TrueType, however, are very good at displaying European-style text (e.g. French, English, German), which needs only a limited number of characters. They are less suited to languages such as Chinese or Japanese, which need thousands. This is where CID fonts come in: they are extensions of these font technologies that provide better support for these languages. CIDFontType0 extends Type 1 (PostScript), while CIDFontType2 extends TrueType.
The main features that CID fonts add are the ability to use 16-bit values (so up to 65,536 separate glyphs rather than 256) and much more sophisticated and flexible Unicode settings for text extraction. Predefined CMaps (or custom ones embedded by the user) allow text extraction to return appropriate values.
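To make the 16-bit point concrete, here is a minimal Python sketch. It assumes the font uses the predefined Identity-H CMap, in which every pair of bytes in a string is read as one big-endian code and that code is used directly as the CID. The function name `decode_identity_h` and the tiny `to_unicode` table are made up for illustration; a real extractor would build that table by parsing the font's ToUnicode CMap.

```python
def decode_identity_h(data: bytes) -> list[int]:
    """Split a PDF string into CIDs, assuming Identity-H (2 bytes per code)."""
    if len(data) % 2 != 0:
        raise ValueError("Identity-H strings must contain an even number of bytes")
    # Each code is two bytes, high-order byte first.
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]


# Hypothetical lookup table standing in for a parsed ToUnicode CMap.
to_unicode = {0x0041: "A", 0x4E2D: "\u4e2d"}

cids = decode_identity_h(b"\x00\x41\x4e\x2d")
# Codes with no entry fall back to the Unicode replacement character.
text = "".join(to_unicode.get(cid, "\ufffd") for cid in cids)
```

A single-byte font would have treated those four bytes as four separate characters; here they are two codes, each of which can address any of the 65,536 possible values.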
Encoding is far more elaborate for CID fonts: the CIDSystemInfo key identifies one of a number of preset character collections for common languages (e.g. Korean, Japanese, Chinese), and the CIDToGIDMap entry in CIDFontType2 fonts allows custom control over which glyph is used for each CID.
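As a rough sketch of how these two pieces fit together: CIDSystemInfo is a small dictionary with Registry, Ordering and Supplement entries, and a CIDToGIDMap stream stores one 2-byte glyph index per CID, high-order byte first, so the entry for CID c sits at bytes 2c and 2c+1. The Python names below (`CIDSystemInfo`, `cid_to_gid`) and the sample values are illustrative assumptions, not any particular library's API, and returning glyph 0 for a CID beyond the end of the map is also an assumption of this sketch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CIDSystemInfo:
    registry: str    # issuer of the character collection, e.g. "Adobe"
    ordering: str    # name of the collection, e.g. "Japan1"
    supplement: int  # supplement number of the collection

    def collection(self) -> str:
        return f"{self.registry}-{self.ordering}"


def cid_to_gid(cid_to_gid_map: Optional[bytes], cid: int) -> int:
    """Return the glyph index for a CID in a CIDFontType2 font.

    None stands for the /Identity value, where the CID is used as the glyph index.
    """
    if cid_to_gid_map is None:
        return cid
    offset = 2 * cid
    if offset + 2 > len(cid_to_gid_map):
        return 0  # assumption: fall back to glyph 0 (.notdef) when out of range
    return int.from_bytes(cid_to_gid_map[offset:offset + 2], "big")


# Sample values for illustration only.
japanese = CIDSystemInfo("Adobe", "Japan1", 6)
sample_map = b"\x00\x00\x00\x05\x01\x02"  # CID 0 -> 0, CID 1 -> 5, CID 2 -> 258
```

With `sample_map`, `cid_to_gid(sample_map, 2)` reads bytes 4 and 5 and yields glyph 258, while `cid_to_gid(None, 2)` simply returns 2.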
CID fonts are also better at handling text that does not flow from left to right. There is even a vertical writing mode.
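A small sketch of what vertical writing means in practice, assuming the Identity-H and Identity-V predefined CMaps and the default vertical metrics given in the PDF specification (DW2 = [880 -1000], where the second number is the vertical displacement in thousandths of the font size). The helper names are hypothetical, and the calculation ignores word spacing and per-glyph W2 overrides.

```python
# Default DW2 from the PDF specification: (position vector y, displacement w1).
DEFAULT_DW2 = (880, -1000)


def cmap_name_for(vertical: bool) -> str:
    """Pick the identity CMap whose writing mode matches the text direction."""
    return "Identity-V" if vertical else "Identity-H"


def vertical_advance(font_size: float,
                     w1: float = DEFAULT_DW2[1],
                     char_spacing: float = 0.0) -> float:
    """Distance the pen moves after one glyph in vertical mode.

    A negative result means the next glyph is placed further down the page.
    """
    return (w1 / 1000.0) * font_size + char_spacing


advance = vertical_advance(12.0)  # 12pt text with the default metrics
```

With the defaults, each glyph in 12pt text moves the pen 12 units down rather than to the right, which is all that is needed to stack characters in a column.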
Adding these features to the technically tried-and-tested Type 1 and TrueType font technologies offers a very elegant way to display Chinese and Japanese glyphs, which is another reason for the PDF file format's popularity.