I came across an interesting issue with PDF text fields while debugging a file this week. We were sent a 2-page document created with iText that contained some text fields. Our viewer displayed the text fields on both pages with identical values, but they appear different in Acrobat. Obviously Acrobat is always right (even when it disagrees with the PDF specification), so we dug deeper to see what was going on.
With PDF AcroForms, form objects can share a common Parent object and inherit values from it. So if a text field does not have a text value of its own, it can inherit its Parent’s value. This is really useful because it avoids having to repeat common values.
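As a rough sketch of that lookup (not our viewer's actual code), the inheritance can be modelled with plain Python dictionaries standing in for PDF field dictionaries. The keys `Parent`, `V` (value) and `FT` (field type) mirror the PDF entry names; the helper name `inherited` and the sample values are made up for illustration.

```python
def inherited(field, key):
    """Return field[key], walking up the Parent chain until the key is found."""
    node = field
    seen = set()  # guard against a malformed file whose Parent chain loops
    while node is not None and id(node) not in seen:
        if key in node:
            return node[key]
        seen.add(id(node))
        node = node.get("Parent")
    return None  # no field in the chain defines the key


# Hypothetical data: one parent holding the value, one child per page.
parent = {"FT": "Tx", "V": "Shared value"}
page1_field = {"Parent": parent}
page2_field = {"Parent": parent}
own_value_field = {"Parent": parent, "V": "Own value"}

print(inherited(page1_field, "V"))
print(inherited(page2_field, "V"))
print(inherited(own_value_field, "V"))
```

Both children resolve to the parent's value, while a field with its own `V` keeps it, because the walk stops at the first dictionary that defines the key.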
In this PDF, the text fields on both pages shared the same Parent and, because they had no text values of their own, we were inheriting the value from the Parent. So our viewer displayed the same text value on both pages. However, form objects can also have an Appearance Stream, which defines how the form object is displayed. This is what accounts for the different appearance.
It turns out that it is “allowed” to have 2 form fields with different Appearance Streams sharing a single Parent that defines the text value for the field. They both had the same text value, but their appearance was different.
Either the appearance overrides the text value in read-only text fields, or what is set on the child is more important in defining the display of the form. Either way, in this example the appearance streams take precedence over the text value of the form object.
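Here is a minimal sketch of that precedence, again using plain dictionaries rather than a real PDF library. It is heavily simplified: a real Appearance Stream is a content stream of drawing operators, whereas here the normal appearance (`AP` / `N`) is reduced to the string it would paint. The function names and sample data are hypothetical, and the rule reflects what we observed in this one file rather than a documented Acrobat algorithm.

```python
def inherited(field, key):
    """Return field[key], walking up the Parent chain until the key is found."""
    node = field
    seen = set()  # guard against a looping Parent chain
    while node is not None and id(node) not in seen:
        if key in node:
            return node[key]
        seen.add(id(node))
        node = node.get("Parent")
    return None


def text_to_display(widget):
    """Prefer the widget's own normal appearance; otherwise use the inherited value."""
    appearance = widget.get("AP", {}).get("N")
    if appearance is not None:
        return appearance
    return inherited(widget, "V")


def value_matches_display(widget):
    """True when software reading the value would get what the user sees."""
    return text_to_display(widget) == inherited(widget, "V")


# Hypothetical data modelled on the file described above.
parent = {"FT": "Tx", "V": "Shared value"}
page1_field = {"Parent": parent, "AP": {"N": "Text drawn on page 1"}}
page2_field = {"Parent": parent, "AP": {"N": "Text drawn on page 2"}}
plain_field = {"Parent": parent}  # no appearance stream of its own

for field in (page1_field, page2_field, plain_field):
    print(text_to_display(field), value_matches_display(field))
```

A check like `value_matches_display` is one way to flag fields where the stored value and the visible text have drifted apart.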
It is not an ideal way to work, because any software reading the text value of the field will not get the value that the user sees. For reading text values, the file is essentially broken. But our viewer now displays it as Adobe would (which is all most users care about at the end of the day).
So that is another mystery solved for me, and yet another way to interpret the spec. Have you come across any interesting and mysterious PDF files where things are not as they should be?