Do I have to download the whole PDF if I view it across the Internet?

When you open a PDF file over the Internet (using a URL), it can take some time before anything appears. This is down to the way a PDF file is designed: it consists of lots of PDF objects (which describe the pages) and a cross-reference table that records where each object sits in the file.

This makes a PDF very fast to open from a file system. The PDF viewer reads the table (at the end of the PDF file) and then loads just the objects required for any page using random access, because a file system lets you read any bytes in a file without having to start at the beginning. A plain URL stream does not work like this: the bytes arrive in order from the start, and you cannot just skip to the end. So to read the table at the end of the file, you first need to download the whole file.
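To make the difference concrete, here is a minimal sketch in Python. It is only an illustration, not how any particular viewer is implemented: the function names and the FAKE_PDF bytes are made up for this example, and FAKE_PDF is not a valid PDF (it just ends with the `startxref` keyword that tells a viewer where the table starts). Both functions find the same offset, but they count very different numbers of bytes read.

```python
import io

TAIL_SIZE = 1024  # assumption for this sketch: the trailer fits in the last 1 KB

# Hypothetical stand-in for a PDF: a header, 100,000 filler bytes, then a tail.
FAKE_PDF = b"%PDF-1.4\n" + b"0" * 100_000 + b"\nstartxref\n9\n%%EOF\n"


def parse_startxref(tail: bytes) -> int:
    """Return the number that follows the last 'startxref' keyword."""
    idx = tail.rfind(b"startxref")
    if idx == -1:
        raise ValueError("no startxref keyword found")
    return int(tail[idx + len(b"startxref"):].split()[0].decode("ascii"))


def find_table_random_access(f):
    """File-system style: jump straight to the end and read only the tail."""
    f.seek(0, io.SEEK_END)
    size = f.tell()
    tail_len = min(size, TAIL_SIZE)
    f.seek(size - tail_len)
    tail = f.read(tail_len)
    return parse_startxref(tail), len(tail)


def find_table_sequential(stream, chunk_size=4096):
    """Stream style: no seeking, so every byte is read before the tail is seen."""
    tail = b""
    bytes_read = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        bytes_read += len(chunk)
        tail = (tail + chunk)[-TAIL_SIZE:]  # keep only the most recent bytes
    return parse_startxref(tail), bytes_read


offset_a, read_a = find_table_random_access(io.BytesIO(FAKE_PDF))
offset_b, read_b = find_table_sequential(io.BytesIO(FAKE_PDF))
print(f"random access: offset {offset_a} after reading {read_a} bytes")
print(f"sequential:    offset {offset_b} after reading {read_b} bytes")
```

The random access version touches only the last kilobyte, while the sequential version has to consume the entire file to learn the same thing.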

However, you can create PDF files so that they store a table and all the objects for the first page at the start of the file. This is known as Linearized PDF (Acrobat calls it Fast Web View). It means the first page can be displayed much sooner: the viewer can show the PDF before it is fully downloaded and access the other pages as soon as they are available.
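If you want a quick way to guess whether a file was saved this way, one rough check is to look for the `/Linearized` key near the start of the file, since a linearized PDF begins with a small dictionary containing it. The sketch below assumes that this dictionary sits within the first 1024 bytes; the function name and the sample bytes (including all the numbers in them) are invented for illustration, and a real tool should parse the file properly rather than rely on this heuristic.

```python
def looks_linearized(head: bytes) -> bool:
    """Heuristic only: a PDF header plus a /Linearized key in the first 1024 bytes."""
    window = head[:1024]
    return b"%PDF-" in window and b"/Linearized" in window


# Hypothetical opening bytes of a linearized file (values are made up).
linearized_head = (
    b"%PDF-1.7\n"
    b"1 0 obj\n"
    b"<< /Linearized 1 /L 54321 /O 4 /E 2000 /N 3 /T 53000 /H [ 500 120 ] >>\n"
    b"endobj\n"
)

# Hypothetical opening bytes of an ordinary (non-linearized) file.
plain_head = b"%PDF-1.7\n1 0 obj\n<< /Type /Catalog /Pages 2 0 R >>\nendobj\n"

print(looks_linearized(linearized_head))
print(looks_linearized(plain_head))
```

Because only the first kilobyte is inspected, this check can be run on the first bytes to arrive from a URL, long before the rest of the file has been downloaded.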

So the answer depends on how the PDF file was made. If it is linearized, you can start viewing it much sooner. Otherwise you will have to download the whole file first, because the table needed to find everything else is stored at the end of the file.

Do you have any questions about the PDF file format? If you would like us to try and answer them in a blog post, contact us.

This post is part of our “Understanding the PDF File Format” series. In each article, we discuss a PDF feature, bug, gotcha or tip. If you wish to learn more about PDF, we have 13 years' worth of PDF knowledge and tips, so click here to visit our series index!

Mark Stephens

System Architect and Lead Developer at IDRSolutions
Mark Stephens has been working with Java and PDF since 1999 and has diversified into HTML5, SVG and JavaFX. He also enjoys speaking at conferences and has been a Speaker at user groups, Business of Software, Seybold and JavaOne conferences. He has a very dry sense of humor and an MA in Medieval History for which he has not yet found a practical use.