Mark Stephens

Mark founded the company and has worked with Java and PDF since 1997. The original creator of the core code, he is also a NetBeans enthusiast who enjoys speaking at conferences and reading. He holds an Athletics Blue and an MA in Mediaeval History from St. Andrews University.

What is a Linearized PDF?

Linearization is a special way to organize a PDF file so that it can start displaying while it is still being downloaded.

What are Linearized PDF files?

In general, PDF is an elegant and well-designed format. A PDF consists of many PDF objects, which are used to build the pages. The byte offset of each object in the file is recorded in an index called the cross-reference table.

So only the cross-reference table needs to be loaded when the file is opened. It can then be used to find and load just the objects required to display a page.

The whole file does not need to be read, only the table. The location of the table is always stored at the end of the file, so it is easy to find. This also makes the file simple to modify: new information and a new table are just appended.

However, if the file is read via the web, it typically arrives as a stream of bytes. This means the pointer to the cross-reference table (which is at the end of the file) cannot be read until the whole PDF file has been transferred. This can take some time with large files.
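The index of object locations described above is known in PDF terms as the cross-reference table, and its byte offset is written after a startxref keyword near the end of the file. The following is a minimal sketch of that lookup in plain Java, not JPedal's implementation. The class name, method names and the 1024-byte tail size are assumptions made for illustration.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;

final class StartXrefFinder {

    // How many trailing bytes to inspect (an assumption that suits typical files).
    private static final int TAIL_SIZE = 1024;

    /** Returns the offset written after the last "startxref" keyword, or -1 if none is found. */
    static long parseStartXref(byte[] tail) {
        // ISO-8859-1 maps every byte to exactly one char, so binary data decodes safely.
        String text = new String(tail, StandardCharsets.ISO_8859_1);
        int keyword = text.lastIndexOf("startxref");
        if (keyword < 0) {
            return -1;
        }
        int i = keyword + "startxref".length();
        // Skip the line break (or other whitespace) between the keyword and the number.
        while (i < text.length() && Character.isWhitespace(text.charAt(i))) {
            i++;
        }
        int digitsStart = i;
        while (i < text.length() && text.charAt(i) >= '0' && text.charAt(i) <= '9') {
            i++;
        }
        return i > digitsStart ? Long.parseLong(text.substring(digitsStart, i)) : -1;
    }

    /** Reads only the tail of the file, which is why random access is needed. */
    static long readStartXref(String path) throws IOException {
        try (RandomAccessFile file = new RandomAccessFile(path, "r")) {
            int size = (int) Math.min(TAIL_SIZE, file.length());
            byte[] tail = new byte[size];
            file.seek(file.length() - size);
            file.readFully(tail);
            return parseStartXref(tail);
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java StartXrefFinder <file.pdf>");
            return;
        }
        System.out.println("Cross-reference data starts at byte " + readStartXref(args[0]));
    }
}
```

The seek to the end of the file is the step that a sequential byte stream cannot perform until every byte has arrived.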

How to create a Linearized PDF

So Adobe created a new way to lay out the PDF, called Linearized PDF. The file format is still the same, but there is a special entry (the linearization parameter dictionary) at the start of the file. All the objects needed to display the first page, together with a small cross-reference table that indexes them, are also stored at the START of the file.

As soon as this data has been read, the first page can be displayed, while the rest of the file is downloaded. This makes the whole thing seem much faster and gives the user something to look at almost immediately even on huge files.
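To make the layout above more concrete, here is a small sketch that reads values from the linearization parameter dictionary, the entry at the start of a linearized file. In the PDF specification, /L holds the total file length, /E the offset of the end of the first page, and /N the number of pages. The class name, the regular-expression approach and the sample header values are assumptions for illustration; a real parser would tokenize the PDF syntax properly.

```java
import java.util.OptionalLong;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class LinearizationParams {

    /** Finds "/key <integer>" in the header text and returns the integer, if present. */
    static OptionalLong readIntegerEntry(String header, String key) {
        // Whitespace must follow the key, so "/L" does not match inside "/Linearized".
        Pattern pattern = Pattern.compile("/" + Pattern.quote(key) + "\\s+(\\d+)");
        Matcher matcher = pattern.matcher(header);
        return matcher.find()
                ? OptionalLong.of(Long.parseLong(matcher.group(1)))
                : OptionalLong.empty();
    }

    public static void main(String[] args) {
        // Hypothetical header of a linearized file; the numbers are made up.
        String header = "%PDF-1.7\n1 0 obj\n"
                + "<< /Linearized 1 /L 52340 /H [ 548 160 ] /O 4 /E 9120 /N 12 /T 52100 >>\n"
                + "endobj\n";

        OptionalLong fileLength = readIntegerEntry(header, "L");
        OptionalLong firstPageEnd = readIntegerEntry(header, "E");
        OptionalLong pageCount = readIntegerEntry(header, "N");

        if (fileLength.isPresent() && firstPageEnd.isPresent() && pageCount.isPresent()) {
            System.out.println("Pages: " + pageCount.getAsLong());
            System.out.println("Bytes needed for first page: " + firstPageEnd.getAsLong()
                    + " of " + fileLength.getAsLong());
        }
    }
}
```

The gap between /E and /L is what a viewer can keep downloading in the background after the first page is already on screen.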

What is the difference between a linearized PDF and a non-linearized PDF?

A linearized PDF arranges its internal components in an ordered, page-by-page layout. This improves the web viewing experience because the pages a user wants can be shown as quickly as possible.

In a non-linearized PDF, by contrast, objects may be scattered across the entire file, so a viewer that receives it as a stream has to download the whole file before it can display anything.

Feature | Linearized PDF | Non-Linearized PDF
--- | --- | ---
Page Access (Web) | First page can be viewed immediately, even before full file download | Must download entire file before any page is viewable
Organization | Data arranged page-by-page with first-page content and index at file start | Objects and page data scattered throughout the file
Web Streaming Support | Optimized for streaming and progressive rendering | Not optimized; requires full file for access
Use Case | Ideal for large PDFs shared or viewed online | Better suited for small, local, or offline files
User Experience | Faster initial load, better performance on slow networks | Slower loading, especially for large documents

How to check if a PDF is linearized?

In Adobe Acrobat and Adobe Reader, the simplest way to see whether a PDF is linearized is to look at the Document Properties. If the file is a linearized PDF, the item Fast Web View will display ‘Yes’.

In the JPedal PDF Viewer, we have added a similar indicator to the Document Properties. If the file is linearized, the word linearized appears in the General section after the PDF version.

In JPedal you can also check programmatically whether a file is linearized by testing whether the Linearized object exists. If it does, the file is a linearized PDF.
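The same idea can be sketched without any PDF library. The example below is not the JPedal API; it is a plain-Java illustration that relies on the rule in the PDF specification that the linearization parameter dictionary must sit within the first 1024 bytes of the file. The class and method names are assumptions.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;

final class LinearizedCheck {

    // The linearization dictionary must appear within this many leading bytes.
    private static final int HEAD_SIZE = 1024;

    /** Returns true if the /Linearized key appears in the first 1024 bytes. */
    static boolean looksLinearized(byte[] head) {
        int length = Math.min(head.length, HEAD_SIZE);
        // ISO-8859-1 maps every byte to exactly one char, so binary data decodes safely.
        String text = new String(head, 0, length, StandardCharsets.ISO_8859_1);
        return text.contains("/Linearized");
    }

    /** Reads only the start of the file and applies the check above. */
    static boolean fileLooksLinearized(String path) throws IOException {
        try (RandomAccessFile file = new RandomAccessFile(path, "r")) {
            byte[] head = new byte[(int) Math.min(HEAD_SIZE, file.length())];
            file.readFully(head);
            return looksLinearized(head);
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java LinearizedCheck <file.pdf>");
            return;
        }
        System.out.println(fileLooksLinearized(args[0]) ? "Linearized" : "Not linearized");
    }
}
```

This is only a quick heuristic. A file that was linearized and later modified by appending data still contains the dictionary, so a stricter check would also compare the dictionary's /L value with the actual file length.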

 

Why are Linearized PDFs important?

In a nutshell, a Linearized PDF is a way of organizing a PDF file so that if it is going to be accessed over the Internet it will appear to load much faster. And it does this very well!

FAQs

Q: Can linearized PDFs be created or optimized using free tools?

A: Yes, several free and open-source tools can linearize PDF files for fast web viewing, for example Ghostscript (the FastWebView option of its pdfwrite device) and qpdf (its --linearize option).

Q: Does linearizing a PDF affect the file size or quality?

A: Linearization reorganizes the file structure but does not compress the content or reduce its quality. The file size stays roughly the same (it may grow slightly because of the extra first-page index), and the document's appearance is unchanged.

Q: Are linearized PDFs compatible with all PDF viewers and browsers?

A: Most modern PDF viewers and browsers support linearized PDFs, but some older or specialized viewers might not fully utilize the optimized streaming capabilities.



The JPedal PDF library allows you to solve these problems in Java

