Many of the issues you see in PDF files come from the interaction of different parts. Here is a really good example I came across while debugging a file…
The CropBox is usually the visible page area within a larger MediaBox. It is often used to hide things like printer's crop marks, and it is generally the part of the page you see in a PDF viewer. So what happens if the CropBox is larger than the MediaBox? Does the PDF even work?
Here is the raw Page object from an example PDF file I have been looking at.
3 0 obj<</CropBox[0 0 595.22 842] /Parent 2 0 R
/B[347 0 R] /Contents 4 0 R /Rotate 0 /BleedBox[0 0 595.22 842] /ArtBox[0 0 595.22 842] /MediaBox[56.6929 56.6929 476.22 651.969] /TrimBox[0 0 595.22 842] /Resources<< /Font<> /ProcSet[/PDF/Text/ImageB/ImageC/ImageI] /Properties<> /ExtGState<>>>/Type/Page>>
As you can see, the MediaBox is actually inside the CropBox. Do we:
1. Use the CropBox value and display a ‘margin’ around the actual page data.
2. Use the smaller MediaBox as the CropBox value.
3. Throw an error.
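As usual, our guide is how Acrobat behaves – the correct answer is 2. This also matches the PDF specification, which says that a crop, bleed, trim or art box extending beyond the MediaBox is effectively reduced to its intersection with the MediaBox. Did you guess correctly?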
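To make option 2 concrete, here is a minimal Python sketch of clipping a CropBox to the MediaBox. It is only an illustration, not how Acrobat or any particular library implements it. The names `Box`, `find_box` and `effective_crop_box` are made up for this example, the regex lookup is a stand-in for a real PDF parser, and falling back to the MediaBox when the two boxes do not overlap at all is my own assumption.

```python
import re
from typing import NamedTuple, Optional


class Box(NamedTuple):
    """A page box as lower-left and upper-right corners (hypothetical helper type)."""
    llx: float
    lly: float
    urx: float
    ury: float


def find_box(page_dict: str, name: str) -> Optional[Box]:
    """Pull a four-number box such as /MediaBox[...] out of raw dictionary text.

    A regex is enough for this flat example; a real parser would also need to
    handle indirect references and values inherited from parent Pages nodes.
    """
    match = re.search(r"/" + re.escape(name) + r"\s*\[([^\]]*)\]", page_dict)
    if match is None:
        return None
    x0, y0, x1, y1 = (float(n) for n in match.group(1).split())
    # Normalise so the first corner is lower-left, whatever order was written.
    return Box(min(x0, x1), min(y0, y1), max(x0, x1), max(y0, y1))


def effective_crop_box(media: Box, crop: Optional[Box]) -> Box:
    """Clip the CropBox to the MediaBox; a missing CropBox defaults to the MediaBox."""
    if crop is None:
        return media
    clipped = Box(
        max(media.llx, crop.llx),
        max(media.lly, crop.lly),
        min(media.urx, crop.urx),
        min(media.ury, crop.ury),
    )
    # Assumption: if the boxes do not overlap at all, fall back to the MediaBox.
    if clipped.llx >= clipped.urx or clipped.lly >= clipped.ury:
        return media
    return clipped


# Trimmed copy of the page dictionary shown above (only the keys we need).
PAGE = (
    "<</CropBox[0 0 595.22 842] /Parent 2 0 R /Contents 4 0 R /Rotate 0 "
    "/MediaBox[56.6929 56.6929 476.22 651.969] /Type/Page>>"
)

media = find_box(PAGE, "MediaBox")
crop = find_box(PAGE, "CropBox")
visible = effective_crop_box(media, crop)
print(visible)
```

For the example page the clipped box comes out identical to the MediaBox, because the MediaBox sits entirely inside the oversized CropBox. Using an intersection rather than a simple "pick the smaller box" rule also covers the case where the two boxes only partly overlap.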
This post is part of our “Understanding the PDF File Format” series. In each article, we discuss a PDF feature, bug, gotcha or tip.