How to extract images from a PDF file in Java (Tutorial)

Java does not support PDF files directly, so you will need an external Java PDF library. This tutorial shows you how to extract images from a PDF file in six simple steps using the JPedal Java PDF library. JPedal is a commercial Java PDF library for developers.

Extracting Images from a PDF using Java

Add JPedal to your class or module path (download the trial jar).
Create a File handle, InputStream or URL pointing to the PDF file.
Include a password if the file is password protected.
Open the PDF file.
Iterate over the images on each page.
Close the PDF file.
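Before handing a File, InputStream or URL to the library, it can help to confirm that the source is readable and really is a PDF. The sketch below is not part of JPedal: PdfSourceCheck and looksLikePdf are hypothetical names, and the code only uses the JDK (Java 11 or later for readNBytes). It checks that the data starts with the "%PDF-" marker that opens a PDF file.

```java
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.util.Arrays;

/** Hypothetical pre-check helper (not a JPedal class). */
final class PdfSourceCheck {

    // A PDF file begins with the ASCII marker "%PDF-" followed by a version number
    private static final byte[] PDF_MAGIC = "%PDF-".getBytes(StandardCharsets.US_ASCII);

    private PdfSourceCheck() {
    }

    /** Reads the first bytes of the stream and compares them with the PDF marker. */
    static boolean looksLikePdf(InputStream in) throws IOException {
        byte[] header = in.readNBytes(PDF_MAGIC.length); // may return fewer bytes at end of stream
        return Arrays.equals(header, PDF_MAGIC);
    }

    /** Returns false for missing files and directories instead of throwing. */
    static boolean looksLikePdf(File file) throws IOException {
        if (!file.isFile()) {
            return false;
        }
        try (InputStream in = Files.newInputStream(file.toPath())) {
            return looksLikePdf(in);
        }
    }

    /** Opens the URL, checks the marker and closes the connection stream again. */
    static boolean looksLikePdf(URL url) throws IOException {
        try (InputStream in = url.openStream()) {
            return looksLikePdf(in);
        }
    }
}
```

Note that the InputStream variant consumes the first bytes of the stream, so open a fresh stream afterwards if the library is going to read from an InputStream as well.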

The Java code to extract images from a PDF document:

// Requires java.io.File and java.awt.image.BufferedImage in addition to the JPedal classes
File file = new File("/path/to/document.pdf");
ExtractImages extract = new ExtractImages(file);
//extract.setPassword("password"); // uncomment if the file is password protected

if (extract.openPDFFile()) {
    int pageCount = extract.getPageCount();
    // Pages are numbered from 1
    for (int page = 1; page <= pageCount; page++) {
        int imagesOnPageCount = extract.getImageCount(page);
        // Images on a page are indexed from 0
        for (int image = 0; image < imagesOnPageCount; image++) {
            BufferedImage img = extract.getImage(page, image, true);
        }
    }
}

extract.closePDFfile();
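The loop above obtains each image as a BufferedImage but does not do anything with it. One common next step is to write each image to disk. The following is a minimal sketch using only the JDK's ImageIO. ExtractedImageWriter, fileNameFor and save are hypothetical names that are not part of JPedal, and the PNG format and file name pattern are assumptions chosen for illustration.

```java
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import java.util.Locale;
import javax.imageio.ImageIO;

/** Hypothetical helper (not a JPedal class) for writing extracted images to disk. */
final class ExtractedImageWriter {

    private ExtractedImageWriter() {
    }

    /** Builds a zero-padded name from the page number and image index (assumed pattern). */
    static String fileNameFor(int page, int image) {
        return String.format(Locale.ROOT, "page-%04d-image-%03d.png", page, image);
    }

    /** Writes one image as PNG into outputDir and returns the file that was written. */
    static File save(BufferedImage img, File outputDir, int page, int image) throws IOException {
        if (img == null) {
            throw new IllegalArgumentException("img must not be null");
        }
        // Create the output directory on first use
        if (!outputDir.isDirectory() && !outputDir.mkdirs()) {
            throw new IOException("Could not create directory " + outputDir);
        }
        File target = new File(outputDir, fileNameFor(page, image));
        // ImageIO.write returns false when no suitable writer is found
        if (!ImageIO.write(img, "png", target)) {
            throw new IOException("No PNG writer available for " + target);
        }
        return target;
    }
}
```

Inside the inner loop of the extraction code, a call such as ExtractedImageWriter.save(img, new File("output"), page, image) would then write one file per image. The zero-padded names keep the files in page order when sorted alphabetically.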

Related tutorials

If you are looking to use JPedal to extract images from PDF files, we recommend you start with the related JPedal tutorials.

The JPedal PDF library allows you to solve these problems in Java

//Convenience static method (see class for additional options)
ExtractClippedImages.writeAllClippedImagesToDir("inputFileOrDirectory", "outputDir", "outputImageFormat", new String[] {"imageHeightAsFloat", "subDirectoryForHeight"});

final PdfManipulator pdf = new PdfManipulator();
pdf.loadDocument(new File("inputFile.pdf"));
pdf.addPage(1, PaperSize.A4_LANDSCAPE);
pdf.addText(1, "Hello World", 10, 10, BaseFont.HelveticaBold, 12, 1, 0.3f, 0.2f);
pdf.addImage(1, new BufferedImage(100, 100, BufferedImage.TYPE_INT_ARGB), new float[] {0, 0, 100, 100});
pdf.rotatePage(1, 90);
pdf.apply();
pdf.writeDocument(new File("outputFile.pdf"));

Viewer viewer = new Viewer();
viewer.setupViewer();
viewer.executeCommand(ViewerCommands.OPENFILE, "pdfFile.pdf");

//Convenience static method (see class for additional options)
ExtractTextAsWordList.writeAllWordlistsToDir("inputFileOrDirectory", "outputDir", -1);

PdfMerge.mergeFiles(new File("inputFile1.pdf"), new File("inputFile2.pdf"), new File("outputFile.pdf"));

PdfManipulator.splitInHalf(new File("inputFile.pdf"), new File("outputFolder"), pageToSplitAt);

PrintPdfPages print = new PrintPdfPages("C:/pdfs/mypdf.pdf");

if (print.openPDFFile()) {
    print.printAllPages("Printer Name");
}

//Convenience static method (see class for additional options)
ArrayList resultsForPages = FindTextInRectangle.findTextOnAllPages("/path/to/file.pdf", "textToFind");

java -jar jpedal.jar --inspect "inputFile.pdf"

PdfSigner.signPdf(
        "inputFile.pdf",
        "outputFile.pdf",
        "keystorePassword",
        "keystoreFile.p12",
        "signerName",
        "signerLocation",
        "signingReason",
        ACCESS_PERMISSION.P1
);

What is JPedal?

JPedal is a commercial Java PDF Library that makes it easy for Java developers to work with PDF Documents in Java.

Why use JPedal?

JPedal makes it much easier to work with PDF files from Java. Because we have been actively developing our Java PDF Toolkit for over 20 years, it copes with many of the problematic PDF files found in practice.

What licenses are available?

We have 2 licenses available:
'Server' for on-premises and cloud servers, and 'OEM' for use in a named end-user application. Both are one-time fees with optional support renewal after 12 months.

How to use JPedal?

Want to learn more about JPedal and how to use it? We have plenty of tutorials and guides to help you.
