Search results for: marked content

What is Marked Content?

Marked Content is the way that semantic structure can be preserved inside PDF files. In this article, I will explain...
Zain
2 min read

How to find out if a PDF file has…

Because it turned out that people wanted to make PDF files accessible and extract content from PDF documents (and not just view them), Adobe...
Mark Stephens
51 sec read

How to convert PDF files to ePUB

Not all PDFs are created equal. Some go beyond simple visual layouts and include internal tags that describe the document’s structure. These are known...
Jacob Collins
1 min read

How to extract text from a PDF as JSON

Some PDF files can be “tagged”, which means they contain information about the structure of the file. This structure is embedded as metadata within...
Jacob Collins
1 min read

How to extract Structured text from PDF files in…

Developers hoping to extract content from PDF documents whilst maintaining the structure of the text should follow this tutorial. Some (but not all) PDF...
Mark Stephens
1 min read

What does the ActualText dictionary tag do?

Text is defined in the PDF file format as having a display value (normally what you see onscreen) and an extraction value. It is useful...
Mark Stephens
29 sec read