Rob Rob is a multi-language developer. In his spare time, he enjoys riding his motorcycle and playing guitar in his band.

How to run Java applications from other programming languages

Here at IDRSolutions, we are always looking for new and innovative ways to make our products better. If you’ve been keeping up to date with our blog, you’ll have seen our goals for 2017, including lists of cool features we would like to see added to our products JPedal, JDeli and BuildVu in the future (although, much to my dismay, I was told BuildVu won’t be making me tea or coffee or telling jokes while it converts my files anytime soon).

A more reasonable feature, though, would be to make our products available to developers who aren’t using Java, enabling them to interact with our software regardless of programming language. There are two ways we could do this:

Using Java in other languages:

The first idea that came to mind was to use a sort of wrapper that acts as a ‘bridge’ between your chosen language and Java. For example, let’s look at using our PDF to HTML5 converter alongside my favourite language, Python, using Py4J.

According to their homepage, “Py4J enables Python programs running in a Python interpreter to dynamically access Java objects in a Java Virtual Machine”. This is cool because it means that all I have to do is write a small API for the converter in Java and provide some details on how each function works; then anyone can write a Python script to interact with it through Py4J.

So – first of all, let’s take our PDF to HTML5 converter. Using the examples on the Py4J homepage and our BuildVu JavaDocs as a reference, I set up a very basic converter object that takes a PDF file from a given location and outputs the converted HTML5 wherever the user pleases:

import java.io.File;
import org.jpedal.examples.html.PDFtoHTML5Converter;
import org.jpedal.exception.PdfException;
import org.jpedal.render.output.ContentOptions;
import org.jpedal.render.output.html.HTMLConversionOptions;
import py4j.GatewayServer;

public class Pdf2Html {

    public void convertToHTML5(String input, String output) {
        //setup converter options
        HTMLConversionOptions conversionOptions = new HTMLConversionOptions();
        conversionOptions.setDisableComments(true);
        ContentOptions contentOptions = new ContentOptions();

        File pdfFile = new File(input);
        File outputDir = new File(output);

        PDFtoHTML5Converter converter = new PDFtoHTML5Converter(pdfFile, outputDir, conversionOptions, contentOptions);

        try {
            converter.convert();
        } catch (PdfException e) {
            e.printStackTrace();
        }
    }

   
    public static void main(String[] args) {
        // Expose a Pdf2Html instance as the Py4J entry point and start listening for Python requests
        GatewayServer server = new GatewayServer(new Pdf2Html());
        server.start();
    }

}

So how does Py4J work here? Without getting too technical, it allows Python programs to communicate with the JVM through a local network socket. Looking at the two lines of Py4J-specific code in the main method, it’s actually pretty simple: we set up a new GatewayServer around an instance of the object we want to access, then start it so that it listens for any requests from Python and deals with them accordingly. Next we compile the class, run it, and it’s good to go.
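Because the bridge is just a local socket, a quick way to confirm the Java side is up is to check whether anything is listening on the gateway port. The sketch below uses only the Python standard library. It assumes the GatewayServer was started on Py4J's default port, 25333, and the function name gateway_is_listening is made up for this example.

```python
import socket

# Assumption: the GatewayServer is using Py4J's default port.
DEFAULT_GATEWAY_PORT = 25333


def gateway_is_listening(host="127.0.0.1", port=DEFAULT_GATEWAY_PORT, timeout=1.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # Open and immediately close a connection; we only care whether it succeeds.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
```

If this returns False, the Java class probably isn't running yet (or was started on a different port), so it's worth checking that before blaming the Python script.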

The Python side of things was also really easy to set up. All we need to do is set up a Gateway to access the JVM our converter is running on, find the instance of our converter, then call the method we want. Here’s my script:

from py4j.java_gateway import JavaGateway
gateway = JavaGateway() #connect to JVM
converter = gateway.entry_point #get instance of converter
converter.convertToHTML5("/path/to/file.pdf", "/path/to/output/dir") #call convertToHTML5 method (replace both placeholder paths with your own)

I ran the Python script, and bam! My PDF file had been converted to HTML. All it required was a couple of lines of code to make my converter Python-compatible.
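For anything beyond a one-off script, it may be worth wrapping that call in a small helper so that a missing PDF is caught on the Python side before the request ever reaches the JVM. This is only a sketch: convert_pdf and connect are names invented here, and connect assumes the Pdf2Html gateway above is already running with its default settings.

```python
import os


def connect():
    """Return the Pdf2Html entry point from a running GatewayServer."""
    # Imported lazily so convert_pdf can be used (and tested) without Py4J installed.
    from py4j.java_gateway import JavaGateway
    return JavaGateway().entry_point


def convert_pdf(converter, pdf_path, output_dir):
    """Validate the paths, then hand them to the Java converter."""
    if not os.path.isfile(pdf_path):
        raise FileNotFoundError(f"No such PDF: {pdf_path}")
    os.makedirs(output_dir, exist_ok=True)
    # The JVM resolves relative paths against its own working directory,
    # so send absolute paths to avoid surprises.
    absolute_output = os.path.abspath(output_dir)
    converter.convertToHTML5(os.path.abspath(pdf_path), absolute_output)
    return absolute_output
```

With the gateway running, usage would look something like convert_pdf(connect(), "example.pdf", "out"), where both paths are placeholders.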

Running an application server

Another way would be to provide an option to set up an application server that runs our converter. The server could be communicated with using a RESTful service (which involves HTTP requests and some JSON), so all you need is a language that can take advantage of that. The best bit about this approach is that it’s language-independent: there’s no need to set up and configure different frameworks like Py4J for each individual language. Setting up an application server would take a bit more time than writing ~6 lines of code, but I’d say that’s a fair trade-off.
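To make the idea a bit more concrete, here is a minimal sketch of what the receiving end of such a service could look like, written in Python with only the standard library so it's easy to try out. Everything in it is an assumption for illustration: the /convert path, the input and output JSON fields and the make_handler helper are invented here, and the convert callable stands in for whatever actually invokes the converter.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def make_handler(convert):
    """Build a request handler that passes JSON requests to convert(input, output)."""

    class ConvertHandler(BaseHTTPRequestHandler):
        def do_POST(self):
            if self.path != "/convert":  # hypothetical endpoint
                self._reply(404, {"error": "unknown endpoint"})
                return
            try:
                length = int(self.headers.get("Content-Length", 0))
                body = json.loads(self.rfile.read(length))
                pdf, out = body["input"], body["output"]
            except (ValueError, KeyError, TypeError):
                self._reply(400, {"error": "expected JSON with 'input' and 'output'"})
                return
            convert(pdf, out)  # stand-in for the real conversion call
            self._reply(200, {"status": "converted", "output": out})

        def _reply(self, status, payload):
            data = json.dumps(payload).encode("utf-8")
            self.send_response(status)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(data)))
            self.end_headers()
            self.wfile.write(data)

        def log_message(self, format, *args):
            pass  # keep the example quiet

    return ConvertHandler
```

Serving it would be something like HTTPServer(("127.0.0.1", 8080), make_handler(my_convert)).serve_forever(), where the port and my_convert are placeholders. A real service would also want authentication, file uploads and proper error reporting, none of which are shown here.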

Edit (14th March 2018):
If you’re interested in setting up our converter on your own application server, check out our BuildVu Microservice Example, our fancy new open source project that allows you to run BuildVu as an online service that you interact with via a REST API (no Java coding needed!).

So which languages?

Of course, the Py4J example from the first method only covers Python. Every other language would need its own ‘bridge’, and a separate API written for each one. For example:

Ruby – JRuby
.NET – jni4net
Python – Py4J or Jython

On the other hand, as mentioned above, the second method is language-independent. All that’s needed is to write a class that deals with sending requests to the application server, and you’re good to go.
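As a rough idea of how small that class can be, here is a sketch of a Python client built on the standard library. The server address, the /convert endpoint and the JSON field names are placeholders invented for this example rather than the API of any real service, so adjust them to whatever your server actually exposes.

```python
import http.client
import json


class ConverterClient:
    """Tiny client for a hypothetical JSON-over-HTTP conversion service."""

    def __init__(self, host="localhost", port=8080, timeout=30):
        self.host = host
        self.port = port
        self.timeout = timeout

    def convert(self, pdf_path, output_dir):
        """POST a conversion request and return the decoded JSON reply."""
        payload = json.dumps({"input": pdf_path, "output": output_dir})
        conn = http.client.HTTPConnection(self.host, self.port, timeout=self.timeout)
        try:
            conn.request("POST", "/convert", body=payload,
                         headers={"Content-Type": "application/json"})
            response = conn.getresponse()
            body = response.read()
            if response.status != 200:
                raise RuntimeError(f"Conversion failed with HTTP {response.status}")
            return json.loads(body)
        finally:
            conn.close()
```

The same dozen or so lines translate almost directly into any language with an HTTP library, which is really the whole appeal of this approach.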

Would you be interested in using something like this? Maybe you’re already doing something similar yourselves. Leave a comment below – we’d love to know your trade secrets if you’ve had any experience working with tools like this.


