How to extract clipped images from a PDF file in Java (Tutorial)

This tutorial shows you how to extract clipped images from a PDF file in a few simple steps using the JPedal PDF library, a Java PDF library for developers. Clipped images are the raw images stored in the PDF with the edits defined in the file applied to them, such as cropping, flipping and resizing.

How to Extract clipped images from PDF files?

Add JPedal to your class or module path (download the trial jar)
Create a File handle, InputStream, or URL pointing to the PDF file
Include a password if the file is password protected
Open the PDF file
Iterate over the images on each page
Close the PDF file

and the Java code to extract clipped images…

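import java.awt.image.BufferedImage;
import java.io.File;
// Also import ExtractClippedImages from the JPedal jar (see the JPedal Javadoc for its package)

File file = new File("/path/to/document.pdf");

ExtractClippedImages extract = new ExtractClippedImages(file);

// Uncomment if the PDF file is password protected
//extract.setPassword("password");

if (extract.openPDFFile()) {
    int pageCount = extract.getPageCount();

    // Pages are numbered from 1
    for (int page = 1; page <= pageCount; page++) {
        int imagesOnPageCount = extract.getImageCount(page);

        // Images on a page are indexed from 0
        for (int image = 0; image < imagesOnPageCount; image++) {
            BufferedImage img = extract.getClippedImage(page, image, true);
        }
    }
}

extract.closePDFfile();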
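
The loop above obtains each clipped image as a BufferedImage but does not store it anywhere. As a minimal sketch, assuming you want one PNG per image, the standard javax.imageio.ImageIO class can write it to disk. The class name ClippedImageWriter, the save method and the page-N-image-M.png naming scheme are illustrative assumptions and not part of JPedal.

```java
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import javax.imageio.ImageIO;

// Hypothetical helper (not part of JPedal): writes an extracted image to disk as PNG.
class ClippedImageWriter {

    /**
     * Writes img into outputDir as page-<page>-image-<index>.png and returns the file.
     * The naming scheme is an assumption chosen for this example.
     */
    static File save(BufferedImage img, File outputDir, int page, int index) throws IOException {
        // Create the output directory if it does not exist yet
        if (!outputDir.isDirectory() && !outputDir.mkdirs()) {
            throw new IOException("Cannot create directory " + outputDir);
        }
        File target = new File(outputDir, "page-" + page + "-image-" + index + ".png");
        // ImageIO.write returns false when no suitable writer is found
        if (!ImageIO.write(img, "png", target)) {
            throw new IOException("No PNG writer available for " + target);
        }
        return target;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the BufferedImage returned by extract.getClippedImage(...)
        BufferedImage img = new BufferedImage(40, 30, BufferedImage.TYPE_INT_ARGB);
        File dir = new File(System.getProperty("java.io.tmpdir"), "clipped-images");
        File out = save(img, dir, 1, 0);
        System.out.println("Wrote " + out.getAbsolutePath());
    }
}
```

Inside the inner loop of the tutorial code, a call such as ClippedImageWriter.save(img, outputDir, page, image) would then store each image as it is extracted. PNG is used here because it keeps any transparency the image contains.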

Why use a third-party library to handle PDF files?

PDF files are a very complex binary/text hybrid data structure. The image data, color information, clipping and scaling details are all stored separately in a compressed format and need to be extracted and combined together.
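
To make the idea of combining image data with clipping and scaling concrete, here is a small sketch using only the standard Java2D API. It is not how JPedal works internally and it does not read a PDF. It simply takes a raw image, a clip shape and a scale factor (all hypothetical inputs for this example) and produces the kind of result a clipped image represents. The class name ClipDemo and the method applyClipAndScale are made up for illustration.

```java
import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.Rectangle;
import java.awt.Shape;
import java.awt.image.BufferedImage;

// Illustration only: combines a raw image with a clip and a scale using Java2D.
class ClipDemo {

    /**
     * Draws raw scaled by the given factor, keeping only the pixels inside clip.
     * The clip is given in the coordinate space of the raw image.
     * Pixels outside the clip stay fully transparent.
     */
    static BufferedImage applyClipAndScale(BufferedImage raw, Shape clip, double scale) {
        int w = (int) Math.round(raw.getWidth() * scale);
        int h = (int) Math.round(raw.getHeight() * scale);
        BufferedImage out = new BufferedImage(w, h, BufferedImage.TYPE_INT_ARGB);
        Graphics2D g = out.createGraphics();
        try {
            g.scale(scale, scale);       // scaling is applied first...
            g.setClip(clip);             // ...so the clip is scaled along with the image
            g.drawImage(raw, 0, 0, null);
        } finally {
            g.dispose();
        }
        return out;
    }

    public static void main(String[] args) {
        // A plain red 10x10 image stands in for the raw image data
        BufferedImage raw = new BufferedImage(10, 10, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = raw.createGraphics();
        g.setColor(Color.RED);
        g.fillRect(0, 0, 10, 10);
        g.dispose();

        // Keep the left half of the image and double its size
        BufferedImage clipped = applyClipAndScale(raw, new Rectangle(0, 0, 5, 10), 2.0);
        System.out.println(clipped.getWidth() + "x" + clipped.getHeight());
    }
}
```

In a real PDF the clip can be an arbitrary path and the transform can include rotation and flipping, and the colour data may need converting first, which is why a library is normally used for this work.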

A third-party library handles all of this for you automatically. In this example, we will use our JPedal PDF library, which provides an easy-to-use Java PDF API so you can work with PDF files easily in Java.

Extract clipped images from a PDF file with JPedal

If you are looking to use JPedal to extract clipped images from PDF files, we recommend you start with these tutorials:-

The JPedal PDF library allows you to solve these problems in Java

//Convenience static method (see class for additional options)
ExtractClippedImages.writeAllClippedImagesToDir("inputFileOrDirectory", "outputDir", "outputImageFormat", new String[] {"imageHeightAsFloat", "subDirectoryForHeight"});

final PdfManipulator pdf = new PdfManipulator();
pdf.loadDocument(new File("inputFile.pdf"));
pdf.addPage(1, PaperSize.A4_LANDSCAPE);
pdf.addText(1, "Hello World", 10, 10, BaseFont.HelveticaBold, 12, 1, 0.3f, 0.2f);
pdf.addImage(1, new BufferedImage(100, 100, BufferedImage.TYPE_INT_ARGB), new float[] {0, 0, 100, 100});
pdf.rotatePage(1, 90);
pdf.apply();
pdf.writeDocument(new File("outputFile.pdf"));

Viewer viewer = new Viewer();
viewer.setupViewer();
viewer.executeCommand(ViewerCommands.OPENFILE, "pdfFile.pdf");

//Convenience static method (see class for additional options)
ExtractTextAsWordList.writeAllWordlistsToDir("inputFileOrDirectory", "outputDir", -1);

PdfMerge.mergeFiles(new File("inputFile1.pdf"), new File("inputFile2.pdf"), new File("outputFile.pdf"));

PdfManipulator.splitInHalf(new File("inputFile.pdf"), new File("outputFolder"), pageToSplitAt);

PrintPdfPages print = new PrintPdfPages("C:/pdfs/mypdf.pdf");

if (print.openPDFFile()) {
    print.printAllPages("Printer Name");
}

//Convenience static method (see class for additional options)
ArrayList resultsForPages = FindTextInRectangle.findTextOnAllPages("/path/to/file.pdf", "textToFind");

java -jar jpedal.jar --inspect "inputFile.pdf"

PdfSigner.signPdf(
        "inputFile.pdf",
        "outputFile.pdf",
        "keystorePassword",
        "keystoreFile.p12",
        "signerName",
        "signerLocation",
        "signingReason",
        ACCESS_PERMISSION.P1
);
