To work with PDF files in Java, you will need a Java PDF library to help you. PDF files are not handled natively by Java, and decoding the raw data in a PDF file is very complex. The contents of the PDF generally have to be parsed before you can extract anything meaningful from them.
The good news is that Java has a wide range of commercial and open source PDF libraries to choose from. They provide a range of Java PDF APIs that allow you to work with PDF documents directly from your Java code. Which one is right for you will depend on what you want to do, along with your budget, support needs, speed requirements, etc.
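To give a feel for how little plain Java offers on its own, here is a minimal sketch that uses only the standard library to read the version declared in a PDF header (a PDF file begins with a comment such as `%PDF-1.7`). The class name `PdfHeaderCheck` and the method `declaredVersion` are made up for this illustration, and searching the first 1024 bytes is a tolerance chosen for the sketch. Anything beyond this, such as pages, text or images, is where a library earns its keep.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of any PDF library.
class PdfHeaderCheck {

    // Matches the header comment, for example "%PDF-1.7" or "%PDF-2.0".
    private static final Pattern HEADER = Pattern.compile("%PDF-(\\d\\.\\d)");

    /** Returns the version declared in the header, if one appears in the first 1024 bytes. */
    static Optional<String> declaredVersion(byte[] data) {
        int length = Math.min(data.length, 1024);
        // ISO-8859-1 maps every byte to exactly one char, so binary data cannot break decoding.
        String start = new String(data, 0, length, StandardCharsets.ISO_8859_1);
        Matcher matcher = HEADER.matcher(start);
        return matcher.find() ? Optional.of(matcher.group(1)) : Optional.empty();
    }

    public static void main(String[] args) throws IOException {
        // Read the file named on the command line, or fall back to a tiny built-in sample.
        byte[] data = args.length > 0
                ? Files.readAllBytes(Paths.get(args[0]))
                : "%PDF-1.7\n".getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(declaredVersion(data)
                .map(version -> "Declared PDF version: " + version)
                .orElse("No PDF header found"));
    }
}
```

Note that this only reports what the header claims. It does not validate the file, and a document can declare a later version elsewhere in its structure.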
What do you want to do with the PDF file?
Which Java PDF library you use is going to depend on exactly what you want to do. All the examples below offer a Java PDF API.
- If you want to read the raw PDF objects and edit them, you will need a tool such as iText.
- If you are creating PDF documents, iText or FOP are worth investigating.
- If you want to access the Adobe libraries from Java, Datalogics offers some solutions for this.
- If you are trying to debug a broken PDF to understand what the issues are, you will need a PDF inspector to view the PDF objects and tree (personally I use a mixture of RUPS and our own Inspector in JPedal). There are lots of open source and commercial tools which can do this. Many people still use a text editor to look at the raw PDF structure.
- If you want to print, extract content or rasterize the PDF pages as images, you should consider PDFBox, JPedal or one of the other commercial Java libraries out there.
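The "open it in a text editor" approach to debugging mentioned above can be roughly approximated in plain Java. The sketch below scans the raw bytes for indirect object headers of the form `<number> <generation> obj` and reports where each one starts. It is a crude aid under stated assumptions, not a parser: objects packed into compressed object streams will not show up, and binary stream data can occasionally produce false matches. The class `RawPdfObjectScan` and the method `objectHeaders` are hypothetical names for this illustration.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; a real inspector parses the cross-reference data.
class RawPdfObjectScan {

    // Indirect objects are introduced as "<object number> <generation number> obj".
    private static final Pattern OBJECT_START = Pattern.compile("(\\d+)\\s+(\\d+)\\s+obj\\b");

    /** Lists every object header found in the raw bytes, with its byte offset. */
    static List<String> objectHeaders(byte[] data) {
        // ISO-8859-1 keeps one char per byte, so match positions equal byte offsets.
        String raw = new String(data, StandardCharsets.ISO_8859_1);
        List<String> found = new ArrayList<>();
        Matcher matcher = OBJECT_START.matcher(raw);
        while (matcher.find()) {
            found.add(matcher.group(1) + " " + matcher.group(2) + " obj @ byte " + matcher.start());
        }
        return found;
    }

    public static void main(String[] args) throws IOException {
        // Scan the file named on the command line, or a tiny hand-written fragment.
        String sample = "%PDF-1.4\n1 0 obj\n<< /Type /Catalog >>\nendobj\n"
                + "2 0 obj\n<< /Type /Pages >>\nendobj\n";
        byte[] data = args.length > 0
                ? Files.readAllBytes(Paths.get(args[0]))
                : sample.getBytes(StandardCharsets.ISO_8859_1);
        objectHeaders(data).forEach(System.out::println);
    }
}
```

For anything beyond a quick look, a dedicated inspector remains the better tool, since it resolves references between objects and decodes compressed streams for you.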
What can you do with JPedal?
JPedal is our commercial Java PDF library. It provides a Java PDF API allowing you to:
- Convert PDF documents to images (including HEIC).
- Extract images, text and metadata.
- Search text.
- Add a PDF Viewer to Java applications.
- Print PDF files.
- Access and edit Form and Annotation data.
- Inspect a PDF file.
Our software libraries allow you to:
- Convert PDF files to HTML
- Use PDF Forms in a web browser
- Convert PDF Documents to an image
- Work with PDF Documents in Java
- Read and write HEIC and other Image formats in Java