The PDF file format is a very ‘flexible’ file format. You can put returns into the middle of most objects. There is a certain PDF creation tool which believes lines should never exceed 80 characters and so inserts a break if the line is too long…
So when parsing a PDF file you need to be very ‘flexible’ too. The following examples are string objects containing escaped octal character sequences and some returns. Some of the returns need to be ignored and some are legitimate parts of the data. So in which cases should we use the value and in which should we ignore it?
I have added (13) to show the exact byte (a carriage return).
2 0 obj << /Title (\376\377\000U\000m\000l\000a\000u\000t\000e\000:\000\344\000,\000 \000\304\(13)\000,\000 \000\366\000,\000 \000\326\000,\000 \000\374\000,\000 \000\334\(13))
/V (\376\377\000N\000\260\000 \000i\000d\000e\000n\000t\000i\000f\000i\000a\000n\000t\000 \0009\0009\000\(13)9\0009\0009\0009\0009\000X)
/V (\376\377\000O\000b\000j\000e\000t\000 \000:\000 \000A\000t\000t\000e\000s\000t\000a\000t\000i\000o\000\(13)
n\000 \000d\000e\000 \000p\000a\000i\000e\000m\000e\000n\000t\000 \000d\000\351\000l\000i\000v\000r\000\(13)
\351\000e\000 \000p\000a\000r\000 \000p\000o\000l\000e\000-\000e\000m\000p\000l\000o\000i\000.\000f\000\(13)
Over to you???
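One way to approach the question is to follow the literal string rules in the PDF specification (ISO 32000-1, section 7.3.4.2): a backslash followed by an end-of-line marker is a line continuation and both are dropped, an end-of-line marker without a backslash is data and is read as a single line feed (byte 10), and `\ddd` is an octal escape of one to three digits. The Python below is only a minimal sketch of that reading, not how any particular PDF library does it. The function name `decode_literal_string` is made up for illustration, and it assumes you have already extracted the bytes between the outer parentheses.

```python
def decode_literal_string(body: bytes) -> bytes:
    """Decode the bytes found between the outer ( ) of a PDF literal string.

    Hypothetical helper: a sketch of the escape rules, not a full PDF parser.
    """
    # Single-character escapes and the byte each one stands for.
    escapes = {
        ord("n"): 0x0A, ord("r"): 0x0D, ord("t"): 0x09,
        ord("b"): 0x08, ord("f"): 0x0C,
        ord("("): 0x28, ord(")"): 0x29, ord("\\"): 0x5C,
    }
    out = bytearray()
    i, n = 0, len(body)
    while i < n:
        c = body[i]
        if c == 0x5C:  # backslash starts an escape
            i += 1
            if i >= n:
                break  # trailing backslash with nothing after it
            c = body[i]
            if c in escapes:
                out.append(escapes[c])
                i += 1
            elif 0x30 <= c <= 0x37:
                # Octal escape: read up to three octal digits.
                value, digits = 0, 0
                while digits < 3 and i < n and 0x30 <= body[i] <= 0x37:
                    value = value * 8 + (body[i] - 0x30)
                    i += 1
                    digits += 1
                out.append(value & 0xFF)  # overflow above one byte is dropped
            elif c == 0x0D:
                # Backslash + CR (or CR LF): line continuation, emit nothing.
                i += 1
                if i < n and body[i] == 0x0A:
                    i += 1
            elif c == 0x0A:
                # Backslash + LF: line continuation, emit nothing.
                i += 1
            else:
                # Unknown escape: drop the backslash, keep the character.
                out.append(c)
                i += 1
        elif c == 0x0D:
            # Bare CR or CR LF is real data, normalised to one LF.
            out.append(0x0A)
            i += 1
            if i < n and body[i] == 0x0A:
                i += 1
        else:
            out.append(c)  # ordinary byte, including a bare LF
            i += 1
    return bytes(out)


# A shortened version of the second example: BOM, then "999" split by \(13).
sample = b"\\376\\377\\0009\\000\\\r9\\0009"
decoded = decode_literal_string(sample)
print(decoded[2:].decode("utf-16-be"))  # prints 999
```

Under this reading, every \(13) in the examples above sits directly after a backslash, so it would be treated as a line continuation and ignored. Files in the wild do not always follow the specification, so treat this as a starting point rather than the final answer.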