The actual XML Structure
Let us take a simple example to start building the parser from scratch:
<?xml version="1.0" encoding="UTF-8"?>
<!-- this is comment on defining the class of subforms -->
<subform id='class1' name="class">
<field name="student" id="s112"></field>
<field name="student" id="s113"></field>
</subform>
<subform id='class2' name="class">
<field name="student" id="k200"></field>
</subform>
Now I will examine each section in turn with some suggestions.
1. Processing instructions:
In the above example, the XML declaration (with the version number) and the template designer entry can both be treated as processing instructions. They start with <? and end with ?>. Either ignore them or delete them before proceeding.
2. Comments:
Do not treat comments as XML nodes; ignore them. Comments start with <!-- and end with -->.
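The two steps above can be sketched with a pair of regular expressions. This is a minimal sketch, not a definitive implementation: the function name stripInstructionsAndComments is hypothetical, and it assumes the standard XML delimiters (<? ... ?> and <!-- ... -->) and that neither construct appears inside a CDATA section.

```javascript
// Hypothetical helper: removes processing instructions and comments
// from an XML string before any node parsing takes place.
function stripInstructionsAndComments(xml) {
  return xml
    // <? ... ?> covers the XML declaration and other processing instructions
    .replace(/<\?[\s\S]*?\?>/g, "")
    // <!-- ... --> covers comments, including multi-line ones
    .replace(/<!--[\s\S]*?-->/g, "");
}

const sample =
  '<?xml version="1.0" encoding="UTF-8"?>' +
  "<!-- a comment -->" +
  '<subform id="class1"></subform>';

console.log(stripInstructionsAndComments(sample));
// prints: <subform id="class1"></subform>
```

Deleting these spans up front keeps the later node-matching code simpler, at the cost of one extra pass over the input string.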
3. CDATA section:
Some XML files may contain CDATA sections and DOCTYPE definitions. You can skip them unless you need to validate the files.
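Skipping these sections might look like the sketch below. The function name stripCDataAndDoctype is hypothetical. The DOCTYPE pattern is an assumption that only covers simple declarations, optionally with an internal subset in square brackets. Note that it throws away the CDATA text entirely, which follows the suggestion above but would lose any content stored there.

```javascript
// Hypothetical helper: drops CDATA sections and DOCTYPE declarations.
function stripCDataAndDoctype(xml) {
  return xml
    // <![CDATA[ ... ]]> including its text content
    .replace(/<!\[CDATA\[[\s\S]*?\]\]>/g, "")
    // <!DOCTYPE ...> with an optional [ ... ] internal subset (simple cases only)
    .replace(/<!DOCTYPE[^>\[]*(?:\[[\s\S]*?\])?\s*>/gi, "");
}

const withExtras =
  '<!DOCTYPE note SYSTEM "note.dtd"><note><![CDATA[a < b]]></note>';

console.log(stripCDataAndDoctype(withExtras));
// prints: <note></note>
```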
4. Empty Nodes:
Some XFA files contain empty nodes, with or without attributes, such as script in the above example.
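One way to recognise empty nodes is to classify every tag while scanning. The sketch below is an illustration only: classifyTags and its "open" / "close" / "empty" labels are hypothetical names, and it assumes comments and processing instructions have already been removed and that attribute values do not contain ">".

```javascript
// Hypothetical tokenizer step: label each tag as "open", "close" or "empty".
function classifyTags(xml) {
  // group 1: leading "/" of a closing tag, group 2: tag name,
  // group 3: raw attribute text, group 4: trailing "/" of a self-closing tag
  const tagPattern = /<(\/?)([A-Za-z_][\w:.-]*)([^<>]*?)(\/?)>/g;
  const tags = [];
  let match;
  while ((match = tagPattern.exec(xml)) !== null) {
    const [, closingSlash, name, , selfClosingSlash] = match;
    let kind = "open";
    if (closingSlash) {
      kind = "close";
    } else if (selfClosingSlash) {
      kind = "empty"; // e.g. <script/> or <script contentType="x"/>
    }
    tags.push({ name, kind });
  }
  return tags;
}

console.log(classifyTags('<subform><script contentType="x"/><draw/></subform>'));
```

An element written as an open tag immediately followed by its close tag, such as <field></field>, is also empty; with this approach it simply ends up as a node with no children.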
5. Whitespace between nodes:
If you are viewing XML in pretty-printed format, you may end up with whitespace (tabs, line breaks and spaces) between two nodes. Use a regular expression to remove it.
6. Handling attributes:
Attributes are separated by spaces, and each attribute name is separated from its value by an "=" sign.
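Whitespace removal and attribute handling can both be sketched with regular expressions. The helper names collapseInterNodeWhitespace and parseAttributes are hypothetical, and the attribute pattern assumes values are always quoted with either single or double quotes.

```javascript
// Remove whitespace-only runs between one tag's ">" and the next tag's "<".
// Note: this also drops text content that consists only of whitespace.
function collapseInterNodeWhitespace(xml) {
  return xml.replace(/>\s+</g, "><");
}

// Turn the attribute part of a start tag into a plain object.
function parseAttributes(attributeText) {
  const attributes = {};
  // name = "value" or name = 'value'
  const pattern = /([A-Za-z_][\w:.-]*)\s*=\s*(?:"([^"]*)"|'([^']*)')/g;
  let match;
  while ((match = pattern.exec(attributeText)) !== null) {
    // match[2] is set for double-quoted values, match[3] for single-quoted ones
    attributes[match[1]] = match[2] !== undefined ? match[2] : match[3];
  }
  return attributes;
}

console.log(collapseInterNodeWhitespace("<a>\n\t<b></b>\n</a>"));
// prints: <a><b></b></a>

console.log(parseAttributes(" id='class1' name=\"class\""));
// prints: { id: 'class1', name: 'class' }
```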
Unlike a W3C DOM parser, the ECMAScript parser uses object and array notation to access child elements.
For example, to access the second student of the class1 subform in the root element:
1. In W3C DOM:
2. In ECMAScript:
However, the "draw" child of the root element should be accessed as root.draw, without array notation. So an object property has to be defined as an array if the parent has more than one child with the same name; otherwise it has to be treated as a single object.
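The single-object-or-array rule could be implemented along the lines below. This is a sketch under assumptions: addChild is a hypothetical helper, the property key is taken from each node's name (here "class", "student" and "draw"), and the node objects are built by hand rather than by a real parser.

```javascript
// Hypothetical helper: attach `child` to `parent` under `key`.
function addChild(parent, key, child) {
  if (!(key in parent)) {
    parent[key] = child;                // first child with this name: single object
  } else if (Array.isArray(parent[key])) {
    parent[key].push(child);            // third or later child: append
  } else {
    parent[key] = [parent[key], child]; // second child: promote to an array
  }
}

// Hand-built tree mirroring the example XML (the draw node is illustrative).
const root = {};
const class1 = { id: "class1" };
addChild(class1, "student", { id: "s112" });
addChild(class1, "student", { id: "s113" });
const class2 = { id: "class2" };
addChild(class2, "student", { id: "k200" });
addChild(root, "class", class1);
addChild(root, "class", class2);
addChild(root, "draw", { id: "d1" });

console.log(root.class[0].student[1].id); // prints: s113
console.log(root.class[1].student.id);    // prints: k200 (single child, no array)
console.log(root.draw.id);                // prints: d1
```

The trade-off of this notation is that calling code has to know, or check with Array.isArray, whether a given property holds one node or several.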
Learning more about ECMA
You can find more information on events, attributes and methods in the LiveCycle® Designer ES Scripting Reference. This reference is divided into three sections: methods, objects and properties. We need to implement these properties and methods in our XML parser to support XFA events.
For the moment, however, I have chosen to go with a Java solution for our Java PDF Viewer because I have found performance issues with my current approach when using Nashorn. I will try to document this (and possible solutions) in a later article.
This post is part of our “XFA Articles Index”. In these articles, we aim to help you understand XFA.