The actual XML structure
Let us take a simple example to start building the parser from scratch:
<?xml version="1.0" encoding="UTF-8"?>
<!-- this is a comment defining the class of subforms -->
<subform id='class1' name="class">
<field name="student" id="s112"></field>
<field name="student" id="s113"></field>
<subform id='class2' name="class">
<field name="student" id="k200"></field>
Now I will examine each section in turn with some suggestions.
1. Processing instructions:
In the above example, the XML declaration (which carries the version number) and the template designer entry are processing instructions. They start with <? and end with ?>. Either ignore them while parsing or delete them before you proceed.
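A minimal sketch of the "delete" option, assuming the whole document is held in a string. The function name stripProcessingInstructions is hypothetical, and the regular expression does not account for a "<?" sequence that happens to sit inside a comment or a CDATA section.

```javascript
// Hypothetical helper: remove every <? ... ?> block from the XML text.
function stripProcessingInstructions(xml) {
  // Non-greedy match from "<?" to the nearest "?>", including line breaks.
  return xml.replace(/<\?[\s\S]*?\?>/g, "");
}

const sample = '<?xml version="1.0" encoding="UTF-8"?><subform id="class1"></subform>';
console.log(stripProcessingInstructions(sample));
```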
2. Comments:
Do not treat comments as XML nodes; ignore them. Comments start with <!-- and end with -->.
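One way to ignore comments while scanning, rather than deleting them up front, is to jump over them whenever the scanner meets one. The sketch below assumes a hand-written scanner that tracks a character index; skipComment is a hypothetical helper name.

```javascript
// Hypothetical helper: if a comment starts at `pos`, return the index just
// past its closing "-->"; otherwise return `pos` unchanged.
function skipComment(xml, pos) {
  if (!xml.startsWith("<!--", pos)) {
    return pos;
  }
  const end = xml.indexOf("-->", pos + 4);
  if (end === -1) {
    throw new Error("Unterminated comment starting at index " + pos);
  }
  return end + 3;
}
```

The scanner would call this helper before trying to read a tag, so comment text never reaches the node-building code.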
3. CDATA section:
Some XML files may contain CDATA sections and DOCTYPE definitions. You can skip them unless you need to validate the files.
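A rough sketch of skipping both constructs with regular expressions. The name stripCdataAndDoctype is hypothetical. It assumes a DOCTYPE internal subset, if present, contains no "]" inside quoted strings. It also discards the CDATA text entirely, which is only appropriate when that content is not needed later.

```javascript
// Hypothetical helper: drop CDATA sections and the DOCTYPE declaration.
function stripCdataAndDoctype(xml) {
  return xml
    // CDATA sections: from "<![CDATA[" to the nearest "]]>".
    .replace(/<!\[CDATA\[[\s\S]*?\]\]>/g, "")
    // DOCTYPE, with an optional internal subset in square brackets.
    .replace(/<!DOCTYPE[^\[>]*(?:\[[\s\S]*?\])?\s*>/g, "");
}
```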
4. Empty Nodes:
Some XFA files contain empty nodes, with or without attributes, such as script in the above example.
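An empty node can appear as a self-closing tag or as a start tag followed directly by its end tag. The sketch below covers the self-closing form and assumes the scanner hands over one complete tag as a string. The name readStartTag and the shape of the returned object are hypothetical.

```javascript
// Hypothetical helper: split a start tag into its name, raw attribute text,
// and a flag saying whether it is self-closing (for example <script/>).
// Returns null for anything that is not a start tag, such as </field>.
function readStartTag(tag) {
  const match = /^<([^\s\/>]+)([\s\S]*?)(\/?)>$/.exec(tag);
  if (match === null) {
    return null;
  }
  return {
    name: match[1],
    attributeText: match[2].trim(),
    selfClosing: match[3] === "/",
  };
}
```

A self-closing tag yields a node with no children and needs no matching end tag. The explicit form, such as the field elements in the example, simply produces a node whose child list stays empty.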
5. White space between nodes:
If you are viewing the XML in pretty-printed format, you may end up with white space (tabs, line breaks and spaces) between two nodes. Use a regular expression to remove it.
6. Handling attributes:
Attributes are separated by spaces, and each attribute name is separated from its value by an "=" sign.
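The white space and attribute rules can be sketched as two small helpers. Both names are hypothetical. The sketch assumes attribute values are always quoted with either single or double quotes, as in the example, and it does not decode entities. Note that removing white space between tags also removes white-space-only text, which matters if a document relies on it.

```javascript
// Hypothetical helper: drop tabs, line breaks and spaces between two tags.
function removeInterNodeWhitespace(xml) {
  return xml.replace(/>\s+</g, "><");
}

// Hypothetical helper: turn raw attribute text into a plain object.
function parseAttributes(attributeText) {
  const attributes = {};
  // name, optional spaces, "=", optional spaces, then a quoted value.
  const pattern = /([^\s=]+)\s*=\s*(?:"([^"]*)"|'([^']*)')/g;
  let match;
  while ((match = pattern.exec(attributeText)) !== null) {
    attributes[match[1]] = match[2] !== undefined ? match[2] : match[3];
  }
  return attributes;
}

const attrs = parseAttributes("id='class1' name=\"class\"");
console.log(attrs.id, attrs.name); // class1 class
```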
Unlike a W3C DOM parser, an ECMAScript parser uses object and array notation to access child elements.
For example, to access the second student of the class1 subform in the root element:
1. In W3C DOM:
2. In ECMAScript:
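As a rough illustration of the two styles, the sketch below uses a hand-built object tree. It assumes children are stored under their name attribute and that root holds the class1 subform under the key "class". These are assumptions for the example, not confirmed details of the parser. The W3C DOM call is shown only as a comment because it needs a DOM implementation to run.

```javascript
// Hypothetical object tree, as an ECMAScript-style parser might build it.
const root = {
  class: {
    id: "class1",
    student: [{ id: "s112" }, { id: "s113" }],
  },
};

// W3C DOM style (needs a DOM document, shown for comparison only):
//   doc.getElementsByTagName("subform")[0].getElementsByTagName("field")[1]

// ECMAScript style: plain property access plus an array index.
const secondStudent = root.class.student[1];
console.log(secondStudent.id); // s113
```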
However, the “draw” child of the root element should be accessed as root.draw, without array notation. So an object property has to be defined as an array if the parent has more than one child with the same name; otherwise it has to be treated as a single object.
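The single-object-or-array rule can be sketched as a small helper that the parser calls each time it attaches a child. The name addChild and the sample ids are hypothetical.

```javascript
// Hypothetical helper: attach `child` to `parent` under `name`.
// The first child is stored as a plain object; a second child with the same
// name converts the property into an array; later children are appended.
function addChild(parent, name, child) {
  if (!Object.prototype.hasOwnProperty.call(parent, name)) {
    parent[name] = child;
  } else if (Array.isArray(parent[name])) {
    parent[name].push(child);
  } else {
    parent[name] = [parent[name], child];
  }
}

const form = {};
addChild(form, "draw", { id: "d1" });
addChild(form, "student", { id: "s112" });
addChild(form, "student", { id: "s113" });

console.log(form.draw.id);       // d1
console.log(form.student[1].id); // s113
```

The trade-off is that the shape of a property depends on how many siblings share a name, so calling code has to know whether to expect an object or an array.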
Learning more about ECMA
You can find more information on events, attributes and methods in the LiveCycle® Designer ES Scripting Reference. This reference is divided into three sections: methods, objects and properties. We need to implement these properties and methods in our XML parser to support XFA events.
For the moment however, I have chosen to go with a Java solution for our Java PDF Viewer because I have found performance issues with my current approach when using Nashorn. I will try to document this (and possible solutions) in a later article.
This post is part of our “XFA Articles Index”. In these articles, we aim to help you understand XFA.