I have been looking at a customer PDF file that seems to ‘hang’ on one page. It turns out to show up the performance limits of Java's Shape code.
The page in question contains a couple of very high-resolution vector graphics – the command stream is 252 million characters long! That seems a bit excessive, but it opens quickly in Acrobat. Most of the stream consists of clipping commands and shapes that draw the shading on a picture of a giraffe – it does look really cool if you zoom in to 6400% in Acrobat! We handle shapes and clipping in Java using the Shape and Area classes.
When we profiled the page, we found that it contained 9000 complex clips and that the ‘hanging’ was actually time spent executing the Shape code. In particular, the Area methods equals() and intersect() are VERY slow on complex shapes. The performance hit is down to the complexity of the shapes, not the number of shapes.
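To get a feel for this, here is a minimal sketch (not our profiling code) that counts the path segments in a shape as a rough measure of complexity and times Area.intersect() on progressively more complex shapes. The class name ClipComplexity, the zig-zag test shape and the sizes used are all made up for illustration; the timings you see will depend on your machine and JVM.

```java
import java.awt.Shape;
import java.awt.geom.AffineTransform;
import java.awt.geom.Area;
import java.awt.geom.Path2D;
import java.awt.geom.PathIterator;

// Hypothetical helper class, for illustration only.
class ClipComplexity {

    /** Counts the path segments in a shape as a rough measure of its complexity. */
    public static int countSegments(Shape shape) {
        int count = 0;
        for (PathIterator it = shape.getPathIterator(null); !it.isDone(); it.next()) {
            count++;
        }
        return count;
    }

    /** Builds a comb-shaped polygon with the given number of teeth (synthetic test data). */
    public static Path2D zigZag(int teeth) {
        Path2D.Double path = new Path2D.Double();
        path.moveTo(0, 0);
        for (int i = 0; i < teeth; i++) {
            path.lineTo(i + 0.5, 10); // tip of the tooth
            path.lineTo(i + 1, 0);    // back down to the baseline
        }
        path.lineTo(teeth, -1);       // close the polygon underneath the teeth
        path.lineTo(0, -1);
        path.closePath();
        return path;
    }

    /** Times building two Areas and intersecting them, in nanoseconds. */
    public static long timeIntersectNanos(Shape a, Shape b) {
        long start = System.nanoTime();
        Area clip = new Area(a);
        clip.intersect(new Area(b));
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        // Shift the second shape slightly so the two outlines cross many times.
        AffineTransform shift = AffineTransform.getTranslateInstance(0.25, 0);
        for (int teeth : new int[] {100, 1000, 4000}) {
            Shape a = zigZag(teeth);
            Shape b = shift.createTransformedShape(a);
            long nanos = timeIntersectNanos(a, b);
            System.out.printf("%d segments: %.1f ms%n", countSegments(a), nanos / 1e6);
        }
    }
}
```

Each run intersects just one pair of shapes, so any growth in the time comes from the complexity of the shapes rather than from how many there are.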
So the fix is to use the outline of a shape, rather than the shape itself, whenever the shape is too complex. It is a compromise, and it would be nice to see Oracle improve the performance of these classes…
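As a rough sketch of that kind of fallback (not our actual implementation), the code below swaps a clip shape for a simpler stand-in once it has more than a set number of path segments. Two things here are assumptions: the stand-in is the shape's bounding rectangle from getBounds2D(), which is only one possible reading of ‘outline’, and the MAX_SEGMENTS threshold of 1000 is an arbitrary figure. The class and method names are hypothetical too.

```java
import java.awt.Shape;
import java.awt.geom.Area;
import java.awt.geom.PathIterator;

// Hypothetical helper class, for illustration only.
class ClipSimplifier {

    /** Assumed cut-off: shapes with more segments than this are simplified. */
    static final int MAX_SEGMENTS = 1000;

    /** Returns true as soon as the shape is found to have more than limit segments. */
    public static boolean isTooComplex(Shape shape, int limit) {
        int count = 0;
        for (PathIterator it = shape.getPathIterator(null); !it.isDone(); it.next()) {
            if (++count > limit) {
                return true; // stop early, no need to walk the whole path
            }
        }
        return false;
    }

    /** Converts a clip shape to an Area, using its bounding rectangle if it is too complex. */
    public static Area toClipArea(Shape shape) {
        if (isTooComplex(shape, MAX_SEGMENTS)) {
            // Assumption: the bounding box stands in for the 'outline' of the shape.
            return new Area(shape.getBounds2D());
        }
        return new Area(shape);
    }

    /** Intersects the current clip with the next clip shape, leaving the input untouched. */
    public static Area intersectClip(Area current, Shape next) {
        Area result = new Area(current);
        result.intersect(toClipArea(next));
        return result;
    }
}
```

The trade-off is that the simplified clip lets through more than the original would, so some drawing may show outside the true clip. In exchange, intersect() only ever has to work with simple shapes.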
IDRsolutions develop a Java PDF library, a PDF forms to HTML5 converter, a PDF to HTML5 or SVG converter and a Java Image Library that doubles as an ImageIO replacement. On the blog, our team post about anything interesting they learn.