Text refers to the font/text and/or is styled with some of the text formatting properties.

Text

How to extract Structured text from PDF files in…

Developers hoping to extract content from PDF documents whilst maintaining the structure of the text should follow this tutorial. Some (but not all) PDF...
Mark Stephens
1 min read

What does the ActualText dictionary tag do?

Text is defined in the PDF file format as a display value (normally what you see onscreen) and an extraction value. It is useful...
Mark Stephens
29 sec read

What are subsetted fonts in PDF files?

What are subsetted fonts? Subsetted fonts are fonts that include only certain values. If you look at the fonts on your computer you will...
Mark Stephens
2 min read

Problems caused by Arial font in PDF files

The Arial Font: Arial is a very popular font with a contemporary sans serif design. The font family is also on the font families...
Mark Stephens
1 min read

What are CID fonts?

Understanding CID Fonts in PDFs: CID fonts play a crucial role in supporting Asian and multi-byte character sets within PDF files. This article explains...
Mark Stephens
1 min read

How is text stored in a PDF file?

Text is defined in PDF files by a Font object and a set of TJ commands. So you will see something like this in...
Mark Stephens
55 sec read