This tutorial shows you how to extract data from PDF AcroForms in a few simple steps using the JPedal Java PDF library. JPedal is a Java PDF library for developers.
JPedal includes extensive support for interactive forms and components. It converts them into Java object representations and also gives access to the form names and the GUI representations. The data can be retrieved with a single call, either for one page or for the whole document.
How to Extract PDF Form Data in Java
- Add JPedal to your class or module path (download the trial jar)
- Create a File handle, InputStream or URL pointing to the PDF file
- Include a password if the file is password protected
- Open the PDF file
- Select the data type required
- Close the PDF file
Here is the Java code to extract PDF form data:
import java.io.File;
// PdfFormUtilities and ReturnValues are JPedal classes; import them from the JPedal jar

File file = new File("/path/to/document.pdf");
PdfFormUtilities extract = new PdfFormUtilities(file);
// uncomment if the file is password protected
//extract.setPassword("password");
if (extract.openPDFFile()) {
    // all form names in the document
    Object[] names = extract.getFormComponentsFromDocument(null, ReturnValues.FORM_NAMES);
    // all forms in the document called Mabel, as Java objects
    Object[] formsByName = extract.getFormComponentsFromDocument("Mabel", ReturnValues.FORMOBJECTS_FROM_NAME);
    // the form with PDF reference 25 0 R, as a Java object
    Object[] formsByRef = extract.getFormComponentsFromDocument("25 0 R", ReturnValues.FORMOBJECTS_FROM_REF);
    // all Swing versions of the form objects
    Object[] swingComponents = extract.getFormComponentsFromDocument(null, ReturnValues.GUI_FORMS_FROM_NAME);

    // all form names on page 5
    Object[] pageNames = extract.getFormComponentsFromPage(null, ReturnValues.FORM_NAMES, 5);
    // all forms called Mabel on page 5, as Java objects
    Object[] pageFormsByName = extract.getFormComponentsFromPage("Mabel", ReturnValues.FORMOBJECTS_FROM_NAME, 5);
    // the form with PDF reference 25 0 R on page 5, as a Java object
    Object[] pageFormsByRef = extract.getFormComponentsFromPage("25 0 R", ReturnValues.FORMOBJECTS_FROM_REF, 5);
    // all Swing versions of the form objects on page 5
    Object[] pageSwingComponents = extract.getFormComponentsFromPage(null, ReturnValues.GUI_FORMS_FROM_NAME, 5);
}
// release the file once extraction is finished
extract.closePDFfile();
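
The calls above return untyped Object[] arrays, so it can help to tidy the results before using them. The sketch below is one possible way to turn the FORM_NAMES result into a sorted list of unique names. It is plain Java and not part of JPedal: the FormNameLists class and its toNameList method are hypothetical names, and it assumes that each non-null element's toString() gives the form name and that the array itself may be null when nothing is found.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Hypothetical helper for illustration only; not part of the JPedal API.
final class FormNameLists {

    private FormNameLists() {
    }

    /**
     * Converts a raw result array into a sorted list of unique names.
     * Assumption: each non-null element's toString() yields the form name.
     */
    static List<String> toNameList(Object[] raw) {
        // TreeSet removes duplicates and keeps natural String ordering
        TreeSet<String> unique = new TreeSet<>();
        if (raw != null) {
            for (Object item : raw) {
                if (item != null) {
                    unique.add(item.toString());
                }
            }
        }
        return new ArrayList<>(unique);
    }

    public static void main(String[] args) {
        // Stand-in data for the array returned by the FORM_NAMES call
        Object[] names = {"Mabel", null, "Alice", "Mabel"};
        System.out.println(toNameList(names)); // prints [Alice, Mabel]
    }
}
```

In the tutorial code, the names array from getFormComponentsFromDocument could be passed to toNameList in place of the stand-in data, provided the assumption about its contents holds for your file.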