TL;DR
XFA is a deprecated Adobe technology that uses XML structures to handle PDF form data. While it relies on a template for layout and datasets for values, its removal from the official PDF specification means it is often converted to HTML for modern compatibility.
In this article I aim to provide a general introduction to XFA forms (the XML Forms Architecture) and how their data is stored in a PDF file.
What is an XFA form?
XFA forms are a (now deprecated) technology introduced into the PDF specification by Adobe. Unlike the original AcroForms, data is stored inside separate XML structures within the PDF file.
XFA has now been removed from the PDF file format, so while you may still need to work with existing XFA documents, you should use AcroForms for new documents and workflows.
Locating XFA Data in a PDF
How do you find XFA forms in a PDF?
In the PDF structure, the AcroForm tag (a dictionary referenced from the document catalog) serves as the primary container for interactive form data. Within this tag, you will find the Fields entry, which defines the standard collection of individual form elements (like text boxes and buttons).
Additionally, the AcroForm tag may contain an XFA (XML Forms Architecture) entry, which provides an XML-based method for handling dynamic form layouts and data.
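As a rough sketch of that lookup, the snippet below models the relevant PDF objects as plain Python dictionaries rather than using a real PDF library, so the `catalog` layout and the `find_xfa` helper are assumptions for illustration only. It also reflects one detail from the PDF specification: the XFA entry is either a single stream or an array that alternates packet names and streams.

```python
from typing import Optional


def find_xfa(catalog: dict) -> Optional[dict]:
    """Return XFA packets as {name: bytes}, or None if the PDF has no XFA."""
    acro_form = catalog.get("/AcroForm")
    if acro_form is None or "/XFA" not in acro_form:
        return None  # no interactive form at all, or an AcroForm-only document

    xfa = acro_form["/XFA"]
    if isinstance(xfa, bytes):
        # Single-stream variant: the whole XML document lives in one stream.
        return {"xdp": xfa}

    # Array variant: [name1, stream1, name2, stream2, ...]
    return dict(zip(xfa[0::2], xfa[1::2]))


# Hypothetical, already-decoded catalog for a PDF that carries XFA packets.
catalog = {
    "/AcroForm": {
        "/Fields": [],
        "/XFA": [
            "preamble", b'<xdp:xdp xmlns:xdp="http://ns.adobe.com/xdp/">',
            "template", b"<template/>",
            "datasets", b"<datasets/>",
            "postamble", b"</xdp:xdp>",
        ],
    }
}

packets = find_xfa(catalog)
print(list(packets))                              # packet names in file order
print(find_xfa({"/AcroForm": {"/Fields": []}}))   # AcroForm without XFA
```

With a real PDF library you would resolve indirect references and decode the streams first; the lookup order (catalog, then AcroForm, then XFA) stays the same.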
What if there is no XFA entry?
Don’t worry if you cannot find one; that just means you do not have any XFA forms in that PDF document.
If you do find one, then you have found the XFA forms inside your PDF. Congratulations, that is the easier bit. Now have a look at what is inside it…
XFA Structure and Components
What XML elements does an XFA form contain?
It is a set of XML documents:
- preamble (which can be ignored)
- postamble (which can also be ignored)
- config (which is defined to hold some permissions, and may do so in newer versions, but we have found that ignoring it makes little practical difference)
- template (the most useful document of them all; it details everything about the appearance of the XFA fields)
- datasets (which holds values for the fields defined in the template document; not all fields will have values defined here)
- there may be other documents defined here, but so far we have not found any examples that alter the appearance or use of the XFA form, though I expect some exist.
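To make the list above concrete, the sketch below joins a set of packets in order into one XML document and lists what it contains, using Python's standard `xml.etree.ElementTree`. The packet contents are a hand-written, heavily trimmed sample (not taken from a real form), and `packet_names` is a hypothetical helper. In this sample the preamble and postamble are just the opening and closing tags of the wrapper element, which is typically the case and is why they can be ignored once the document is parsed.

```python
import xml.etree.ElementTree as ET

# Hand-written sample packets, in the order they would appear in the PDF.
packets = [
    ("preamble", '<xdp:xdp xmlns:xdp="http://ns.adobe.com/xdp/">'),
    ("config", '<config xmlns="http://www.xfa.org/schema/xci/3.0/"/>'),
    ("template", '<template xmlns="http://www.xfa.org/schema/xfa-template/3.0/"/>'),
    ("datasets", '<xfa:datasets xmlns:xfa="http://www.xfa.org/schema/xfa-data/1.0/"/>'),
    ("postamble", "</xdp:xdp>"),
]


def packet_names(packets):
    """Join the packets into one XML document and list its child elements."""
    root = ET.fromstring("".join(text for _, text in packets))
    # ElementTree reports tags as "{namespace}local"; keep only the local part.
    return [child.tag.rsplit("}", 1)[-1] for child in root]


print(packet_names(packets))
```

Only config, template and datasets show up as child elements; the preamble and postamble have been absorbed into the wrapper.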
Understanding the Template Document
As you look through the ‘template’ XML document, you will see objects nested within objects. Many kinds of object can be used, including PageSets (which define the page dimensions) and Buttons (which define a button field).
The order and nesting in which you read the document are vital to assigning the dimensions, values, attributes, actions and so on to the correct fields.
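A minimal sketch of why the nesting matters: the code below walks a small hand-written template and records each field under its full path, along with its position and size. The element and attribute names (`subform`, `field`, `ui`, `button`, `x`, `y`, `w`, `h`) follow the XFA template schema as I understand it, but the sample itself and the `collect_fields` helper are illustrative assumptions, and real templates contain far more than this.

```python
import xml.etree.ElementTree as ET

# Hand-written sample; a real template is far larger.
TEMPLATE = """
<template xmlns="http://www.xfa.org/schema/xfa-template/3.0/">
  <subform name="form1">
    <subform name="applicant">
      <field name="firstName" x="10mm" y="20mm" w="60mm" h="8mm"/>
      <field name="submit" x="10mm" y="40mm" w="30mm" h="8mm">
        <ui><button/></ui>
      </field>
    </subform>
  </subform>
</template>
"""

NS = "{http://www.xfa.org/schema/xfa-template/3.0/}"


def collect_fields(node, path=()):
    """Recursively map 'parent.child' paths to each field's attributes."""
    fields = {}
    for child in node:
        name = child.get("name")
        child_path = path + (name,) if name else path
        if child.tag == NS + "field":
            fields[".".join(child_path)] = {
                "box": tuple(child.get(k) for k in ("x", "y", "w", "h")),
                "is_button": child.find(f"{NS}ui/{NS}button") is not None,
            }
        elif child.tag == NS + "subform":
            # Descend, carrying the parent path so names stay unambiguous.
            fields.update(collect_fields(child, child_path))
    return fields


fields = collect_fields(ET.fromstring(TEMPLATE))
for path, info in fields.items():
    print(path, info)
```

Keying each field by its full path means that two fields with the same name in different subforms do not overwrite each other.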
XFA vs AcroForms
The table below summarises the main differences between XFA and AcroForms:
| Feature | AcroForms | XFA Forms (XML Forms Architecture) |
|---|---|---|
| Data Format | Uses standard PDF objects and key-value pairs. | Uses separate XML structures stored within the PDF. |
| Current Status | The standard for new documents and workflows. | Deprecated; removed from the official PDF specification. |
| Structure | Defined within the Fields entry of the AcroForm tag. | Defined within an XFA entry inside the AcroForm tag. |
| Layout Method | Static form elements. | Dynamic layouts based on a Template document. |
| Data Handling | Values are generally stored within the field objects. | Values are stored in a specific Datasets XML document. |
| Modern Usage | Native support in almost all PDF readers. | Often needs conversion to HTML for modern compatibility. |
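To illustrate the Data Handling row, here is a small sketch that reads values out of a datasets document. The XML is a hand-written sample and `read_values` is a hypothetical helper; it simply flattens element names into dotted paths, which is a simplification of how XFA actually binds data to template fields.

```python
import xml.etree.ElementTree as ET

# Hand-written sample datasets packet.
DATASETS = """
<xfa:datasets xmlns:xfa="http://www.xfa.org/schema/xfa-data/1.0/">
  <xfa:data>
    <form1>
      <applicant>
        <firstName>Ada</firstName>
      </applicant>
    </form1>
  </xfa:data>
</xfa:datasets>
"""

XFA_DATA = "{http://www.xfa.org/schema/xfa-data/1.0/}data"


def read_values(node, path=()):
    """Flatten leaf elements into {'a.b.c': text}; empty leaves are skipped."""
    values = {}
    for child in node:
        child_path = path + (child.tag,)
        if len(child):  # has children of its own, so keep descending
            values.update(read_values(child, child_path))
        elif child.text and child.text.strip():
            values[".".join(child_path)] = child.text.strip()
    return values


data = ET.fromstring(DATASETS).find(XFA_DATA)
values = read_values(data)
print(values)

# Not every template field has a value here, so look values up defensively.
print(values.get("form1.applicant.submit", "<no value>"))
```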
Modern Solutions and Conversion
Is it possible to convert to HTML?
Yes. Our FormVu software converts XFA forms into standalone HTML forms which look and feel exactly like the original XFA document. FormVu is the best tool for filling PDF forms in HTML.
FAQs
Q: Why was XFA removed from the PDF specification?
A: XFA was a proprietary Adobe technology that lacked universal support. Because it was too complex for most third-party viewers and mobile browsers to render, the industry shifted back to the more compatible AcroForms standard.
Q: Can I open XFA forms in web browsers like Chrome?
A: Most browsers cannot render XFA’s dynamic XML, often showing a “Please wait…” error. Users usually need Adobe Acrobat or a conversion tool to view these forms properly.
Q: Are XFA forms less secure than AcroForms?
A: XFA supports complex scripting and external XML data injections, which creates a larger attack surface for malicious scripts compared to the simpler, more static AcroForms.
FormVu allows you to:
- Use interactive PDF forms in the web browser
- Integrate fillable PDF forms into web apps
- Parse PDF forms as HTML5
What is FormVu?
FormVu is a commercial SDK for converting PDF Form files into standalone HTML with interactive form components.
Why use FormVu?
FormVu allows you to integrate PDF forms into your web application effortlessly while retaining all their interaction and functionality.
What licenses are available?
We have three licenses available:
- Cloud, for form conversion using the shared IDRsolutions cloud server
- Self-hosted server, for your own cloud or on-premise servers
- Enterprise, for more demanding requirements
How to use FormVu?
Want to learn more about FormVu and how to use it? We have plenty of tutorials and guides to help you.
