To read PDF files in Java, you will need a Java PDF Reader to help you. Java does not handle PDF files natively, and decoding the raw data in a PDF file is very complex. The contents of the PDF generally have to be parsed before anything meaningful can be extracted from them.
The good news is that Java has a wide range of commercial and Open Source PDF Readers available to choose from. Which is right for you will depend on what you want to do, along with your budget, support needs, speed requirements, etc.
What do you want to do with the PDF file?
Which Java PDF Reader you use is going to depend on what exactly you want to do.
- If you want to read the raw PDF objects and edit them, you will need a tool such as iText.
- If you are creating PDF documents, iText or FOP are worth investigating.
- If you want to access the Adobe libraries from Java, Datalogics offers some solutions for this.
- If you are trying to debug a broken PDF to understand what the issues are, you will need a PDF inspector to view the PDF objects and tree (personally I use a mixture of Rups and our own Inspector in JPedal). There are lots of Open Source and Commercial tools which can do this. Many people still use a text editor to look at the raw PDF structure.
- If you want to print, extract content, or rasterize the PDF pages as images, you should consider PDFBox, JPedal or one of the other commercial Java libraries out there.
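As a rough illustration of what looking at the raw PDF structure involves, here is a minimal sketch that uses only the standard Java library. It checks two markers that a well-formed PDF file carries: the `%PDF-x.y` header near the start and the `%%EOF` marker near the end. The class name `PdfQuickCheck` and its method names are made up for this example, and searching the first and last 1024 bytes is a leniency assumption rather than a strict rule. It is a quick sanity check for a suspect file, not a parser, and it says nothing about whether the objects inside the file are valid.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper for this article: a sanity check, not a PDF parser.
public class PdfQuickCheck {

    // Assumption: look for the markers within this many bytes of each end of the file.
    private static final int WINDOW = 1024;

    /** Returns the version from the "%PDF-x.y" header (e.g. "1.7"), or null if none is found. */
    static String headerVersion(byte[] data) {
        // ISO-8859-1 maps every byte to exactly one char, so binary data decodes safely.
        String head = new String(data, 0, Math.min(data.length, WINDOW), StandardCharsets.ISO_8859_1);
        int marker = head.indexOf("%PDF-");
        if (marker < 0) {
            return null;
        }
        int start = marker + "%PDF-".length();
        int end = start;
        // Collect the digits and dots that follow the marker.
        while (end < head.length()) {
            char c = head.charAt(end);
            if ((c >= '0' && c <= '9') || c == '.') {
                end++;
            } else {
                break;
            }
        }
        return end > start ? head.substring(start, end) : null;
    }

    /** Returns true if "%%EOF" appears near the end of the data. */
    static boolean hasEofMarker(byte[] data) {
        int from = Math.max(0, data.length - WINDOW);
        String tail = new String(data, from, data.length - from, StandardCharsets.ISO_8859_1);
        return tail.contains("%%EOF");
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("Usage: java PdfQuickCheck <file.pdf>");
            return;
        }
        // Reads the whole file into memory, which is fine for a sketch but not for huge files.
        byte[] data = Files.readAllBytes(Paths.get(args[0]));
        String version = headerVersion(data);
        System.out.println("Header version: " + (version == null ? "not found" : version));
        System.out.println("EOF marker present: " + hasEofMarker(data));
    }
}
```

If either marker is missing, the file is probably truncated or is not a PDF at all. Anything beyond that, such as reading the cross-reference table or decoding streams, is where one of the libraries or inspectors listed above becomes necessary.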
What can you do with JPedal?
JPedal is our commercial Java PDF Reader. It allows you to:
- Convert PDF documents to images (including HEIC).
- Extract images, text and metadata.
- Search text.
- Add a PDF Viewer to Java applications.
- Print PDF files.
- Access and edit Form and Annotation data.
- Inspect a PDF file.