Mark Stephens Mark has been working with Java and PDF since 1999 and is a big NetBeans fan. He enjoys speaking at conferences. He has an MA in Medieval History and a passion for reading.

Interesting PDF bugs – PDF text is really a tiny image with a big SMask


Today’s article is a variation on an old favourite I have written about before. I have been investigating a PDF file in which some text did not display correctly in our Java PDF viewer…

So I opened up the PDF file (which claimed to be created by Microsoft Publisher 2010) and found the raw commands relating to the text. It turned out that it was not actually text but a tiny 2×2 pixel dot applied to a much larger SMask.

Most of the time, an SMask is used to add a clip to an image. The trick in this case was to put the inverted image on the stencil and then draw it onto a solid image. It is a slightly ‘odd’ way to do things, but the PDF spec does not disallow it, so we need to handle it correctly without slowing down or breaking the generic code that handles the more usual way of applying an SMask.

Our generic code did not allow for this case. The fix is to spot this use case and substitute a new image the same size as the stencil before applying the stencil to it.
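To make the idea concrete, here is a minimal Java sketch of that kind of substitution. It is not our viewer's actual code: the class name `SMaskSketch` and the method `applySoftMask` are hypothetical, and it assumes the mask has already been decoded into an 8-bit, single-band grey `BufferedImage` in which 255 means fully opaque. The tiny base image is stretched to the mask's dimensions with nearest-neighbour sampling, and each mask sample becomes the alpha value of the corresponding output pixel.

```java
import java.awt.image.BufferedImage;

// Illustrative only: names and structure are assumptions, not a real library API.
final class SMaskSketch {

    /**
     * Builds a new ARGB image the same size as the mask. Colour comes from the
     * (possibly tiny) base image, alpha comes from the mask.
     * Assumes the mask is an 8-bit grey image whose band 0 holds opacity (0..255).
     */
    static BufferedImage applySoftMask(BufferedImage base, BufferedImage mask) {
        int w = mask.getWidth();
        int h = mask.getHeight();
        BufferedImage out = new BufferedImage(w, h, BufferedImage.TYPE_INT_ARGB);

        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                // Nearest-neighbour lookup stretches the base to the mask size.
                int sx = x * base.getWidth() / w;
                int sy = y * base.getHeight() / h;
                int rgb = base.getRGB(sx, sy) & 0x00FFFFFF;

                // Read the raw mask sample rather than a colour-converted value.
                int alpha = mask.getRaster().getSample(x, y, 0) & 0xFF;

                out.setRGB(x, y, (alpha << 24) | rgb);
            }
        }
        return out;
    }
}
```

With a 2×2 base image filled with a single colour and a much larger mask holding the glyph shapes, the result is an image the size of the mask in which only the glyph pixels are visible. A real implementation would also need to spot when this path is required, so that the usual case is not slowed down.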

This fixes the ‘PDF bug’ nicely. But it is a reminder that you can never take anything for granted in PDF files if you want to avoid ‘PDF bugs’.

This post is part of our “Understanding the PDF File Format” series. In each article, we discuss a PDF feature, bug, gotcha or tip. If you wish to learn more about PDF, we have 13 years’ worth of PDF knowledge and tips in our series index.

IDRsolutions develop a Java PDF Viewer and SDK, an Adobe forms to HTML5 forms converter, a PDF to HTML5 converter and a Java ImageIO replacement. On the blog our team post anything interesting they learn about.


IDRsolutions Ltd 2019. All rights reserved.