Tag Archives: pdf to html5 conversion

What size is 100% scaling in PDF?

As with much of the PDF specification, what determines the physical size of your PDF is complicated but also very powerful, and it is done that way for a very good reason: so that your PDF looks exactly the same regardless of the platform or device you are using.

Device Space is the space in which the content of the PDF ultimately ends up. This could be on a range of devices, for example a monitor or a printer. The problem with outputting to these devices is that one device’s coordinate system can vary greatly from another’s, which means that an image you see on your monitor as 8 inches tall could end up 1 inch tall when output through a printer.

The way that PDF solves this issue is to define the PDF in “User Space”. This is device-independent, and allows the PDF reader to adjust the CTM (Current Transformation Matrix) so that the PDF content appears the same on all devices.

An illustration from the PDF specification showing the complexity and range of spaces used to display PDF content
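To make the user space to device space idea concrete, here is a minimal Java sketch. It assumes the default PDF user space unit of 1/72 inch and a simple raster device whose origin is at the top left; the class and method names are made up for illustration, and a real PDF reader's CTM also has to handle things like page rotation and crop boxes.

```java
import java.awt.geom.AffineTransform;
import java.awt.geom.Point2D;

final class UserSpaceToDevice {

    /** Default PDF user space: 72 units per inch. */
    static final double USER_UNITS_PER_INCH = 72.0;

    /**
     * Builds a matrix mapping user space (origin bottom-left, y up) to a
     * raster device (origin top-left, y down) at the given pixels per inch.
     */
    static AffineTransform toDevice(double pageHeightPts, double devicePpi) {
        double s = devicePpi / USER_UNITS_PER_INCH;
        // x' = s * x, y' = s * (pageHeight - y)
        return new AffineTransform(s, 0, 0, -s, 0, s * pageHeightPts);
    }

    /** Maps one user space point through the matrix. */
    static Point2D map(AffineTransform ctm, double x, double y) {
        return ctm.transform(new Point2D.Double(x, y), null);
    }

    public static void main(String[] args) {
        double pageHeight = 792; // US Letter height in points
        // A point one inch from the left edge and one inch from the top edge.
        for (double ppi : new double[] {72, 110, 300}) {
            Point2D p = map(toDevice(pageHeight, ppi), 72, 720);
            System.out.printf("%.0f ppi -> (%.1f, %.1f)%n", ppi, p.getX(), p.getY());
        }
    }
}
```

The same user space point lands a different number of pixels from the corner on each device, but always one physical inch from the left and top edges, which is the whole point of keeping the PDF itself device-independent.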

Something that the reader controls is the pixels per inch. For some time, Adobe’s reader has used 110 pixels/inch as the default. The reader also allows you to alter the setting – changing to something large will result in the PDF appearing very large, and changing to a low figure will result in the PDF appearing very small. Note that the zoom (or scaling) value has not changed. So 100% scaling could be very large or very small depending on what the pixels per inch setting is.

In our PDF to HTML5 converter we have followed Adobe’s lead, so setting the scaling to 1 will result in the same physical (on-screen) PDF size as Adobe Reader at 100% scaling. Interestingly, this means that internally we multiply the scaling value by 1.528 (110/72) so that we output at the correct size.
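The arithmetic behind that multiplier can be sketched in a few lines of Java. This is only an illustration of the 110/72 relationship described above, not the converter's actual code; the class and method names are invented for the example.

```java
final class ScalingMath {

    /** PDF page sizes are expressed in points, 72 to the inch. */
    static final double POINTS_PER_INCH = 72.0;

    /** Pixels per inch assumed for 100% scaling, as described in the post. */
    static final double ASSUMED_PPI = 110.0;

    /** Converts a length in points to whole pixels at the given scaling. */
    static long toPixels(double points, double scaling) {
        return Math.round(points * ASSUMED_PPI / POINTS_PER_INCH * scaling);
    }

    public static void main(String[] args) {
        // US Letter is 612 x 792 points (8.5 x 11 inches).
        System.out.println(toPixels(612, 1.0) + " x " + toPixels(792, 1.0));
        System.out.println(toPixels(612, 2.0) + " x " + toPixels(792, 2.0));
    }
}
```

With these assumptions a US Letter page at scaling 1 comes out at 935 x 1210 pixels, and doubling the scaling doubles both dimensions.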

It is also worth mentioning that our converter allows you to provide a pixel width or height for it to match. This can be done by setting org.jpedal.pdf2html.scaling to fitwidth100 or fitheight100, which results in a width or height (depending on which you set) of 100px.
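As a rough illustration of what a fit-to-width setting implies, the Java sketch below works out the scaling value that would make a page come out a given number of pixels wide. It assumes the 110 pixels/inch figure discussed above and assumes the option is read as a Java system property; how the converter actually computes this internally is not shown in this post, so treat the helper as hypothetical.

```java
final class FitWidthMath {

    static final double POINTS_PER_INCH = 72.0;
    static final double ASSUMED_PPI = 110.0; // figure quoted in the post

    /** Scaling that would make a page of the given width targetPx wide. */
    static double scalingForWidth(double pageWidthPts, double targetPx) {
        double widthAtScalingOne = pageWidthPts * ASSUMED_PPI / POINTS_PER_INCH;
        return targetPx / widthAtScalingOne;
    }

    public static void main(String[] args) {
        // Standard JDK call; the option name and value are the ones quoted above.
        // Assumption: the converter picks the option up as a system property.
        System.setProperty("org.jpedal.pdf2html.scaling", "fitwidth100");

        double scaling = scalingForWidth(612, 100); // US Letter width, 100px target
        double heightPx = 792 * ASSUMED_PPI / POINTS_PER_INCH * scaling;
        System.out.printf("scaling = %.4f, height = %.1f px%n", scaling, heightPx);
    }
}
```

The height follows from the same scaling value, so the page keeps its aspect ratio.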

If you’re a first-time reader, or simply want to be notified when we post new articles and updates, you can keep up to date by social media (Twitter, Facebook and Google+) or the  Blog RSS.


Free PDF Guides updated – PDF to HTML5 in 5 Easy Steps for Non-Technical users

Trying to convert PDFs to HTML5 can be very confusing…

So we have put a lot of work into building a version which does not require lots of technical knowledge. In recent months we have tried to increase the accessibility of our PDF to HTML5 converter by allowing users to access it online through a simple web-based interface, which is ideal for low usage. For those who like to fine-tune things, there are quite a few customization options available.

To make it as easy as possible to use, we have also written some short, simple, jargon-free guides to help you make sense of the options available and guide you through using the converter. You can download and even print out our simple illustrated guide. It comes with step-by-step instructions for using the web-based online PDF converter to help get you started.

We have been updating our other guides as well, and over the next few weeks we will be releasing some new ones. If you have any suggestions or feel there is something we should cover, please let us know. Your feedback helps us to keep improving our guides and to help others.

What would you like to see us cover?


Creating an Android PDF Viewer

If you have looked at our conversion pages recently you may have noticed our very snazzy logos demonstrating which formats you can currently convert your PDF files to using our cloud service. On this logo you may have noticed the ever-so-familiar Android logo lurking at the end. You may also recall a previous blog article about what our converted pages look like on the default Android browser and how good they looked. Since then we have done a lot of work trying to optimise our output for mobile viewing in preparation for…

The addition of a PDF to Android App to our PDF to HTML5/SVG converter software.

Unlike most PDF viewers on the Android platform, our conversion software utilises HTML5 as its display format. This makes its source code relatively simple, as it displays converted documents within an embedded WebView with some Android-specific code on top. We’ve spoken at length before, in several other blog posts, about why you should consider HTML5 for PDF conversion.

This feature is suited to publishers who want to publish their documents or magazines as stand-alone Android applications. You control what is displayed, only the document you provide is used, and we give you the generated source code to modify to fit your use case. This gives you control over your content and what you let your users view, as well as the potential to publish to the Google Play store.

We also give you the option to compile the generated source code into an Android .apk immediately for you to install on your devices or emulators and try out yourself without touching the source code!

The basic features of this first release include:

  • A navigation bar that displays thumbnails of each page in the document, lets you navigate to them easily, and can be hidden if desired.
  • The ability to swipe left and right to go back and forward.
  • Text search using the default Android search functionality.
  • Customisable icons and application name.
  • The ability to localize the application using a custom XML file.

And those are all without modifying the actual source code of the Android PDF Viewer that’s generated for you! The source code itself is well documented so for those more technically inclined or those wanting to rework parts of it you can reach in and change how it works.

Running the PDF2Android converter is as simple as running our other converters since it utilizes our main PDF2HTML5 software. The files generated are in the correct source tree for an Android application so if you have in house developers they should be able to pick it up and start working with it quickly. And as a consequence of being part of our main conversion software it will be under constant improvement, with any bugs fixed in our main library reflected in the output of the converter.

Currently we are working hard to add the option to use this generator to our cloud converter page so that you can try it out yourselves without downloading the PDF2HTML5 trial.

A few example screenshots of a basic generated app can be seen below. The first two are from a Nexus tablet and the third is from an Android phone:


This is the app in portrait mode on a Nexus tablet


This is an image of the app showing off the default search functionality


This is it running on an Android phone

We have tested on many different versions of Android to ensure it works on as many as possible and will be constantly trying to extend our testing, so it should work on virtually all modern devices (Android 2.3.3+) on the market, tablets and phones alike.

The Future

In future iterations we hope to add many more options for customisation so that you will not need to touch the generated source code, making it easier to create applications for your files.

One such idea is that, using the generated framework, you could place your generated HTML5 files on your own web server and have the app download them whenever you update them. This would give you a native Android app and also the ability to display the converted files on your own website using your own CMS, catering for both mobile and desktop users.

We also hope to reduce the need for non-technical users to edit the source code, so feel free to suggest any features that you can think of!


Why your API needs web services

A few days ago an interesting query was raised by one of our customers.

They asked whether it was possible for us to write our PDF2HTML5 Java API in .NET languages or PHP.

According to my colleague Suda, and as many of you may be aware, integrating a Java API into .NET frameworks takes a considerable amount of time.

To answer this query, we decided to provide a cloud-based PDF to HTML5 solution in our products, using web services to meet our clients’ demands.

This encouraged us to move to JAX-WS plugins in our online conversion tool.
As a result, the conversion process has become much simpler than we imagined, and our tool can now be used from other programming languages and platforms.

// JAX-WS web method implementation (body elided)
public byte[] convert(
        @WebParam(name = "email") String email,
        @WebParam(name = "password") String password,
        @WebParam(name = "fileName") String fileName,
        @WebParam(name = "dataByteArray") byte[] byteDataArray,
        @WebParam(name = "conversionType") String conversionType,
        @WebParam(name = "conversionParams") String[] conversionParams,
        @WebParam(name = "xmlParamsByteArray") byte[] xmlParamsByteArray,
        @WebParam(name = "isDebugMode") boolean isDebugMode) {
    ……………………………………………………….
}

Since this method runs in a stateless session, we stopped worrying about multiple client requests and security threats.

The wsdl for this service can be found at:

http://glassfish.idrsolutions.com:8282/HTML_Page_Extraction/IDRConversionService?wsdl

Feel free to try this out (It is free to use at the moment).

An example of the usage:
email: clouduser
password: pdf
fileName: {your PDF file name, e.g. myfile.pdf}
byteDataArray: {your PDF file data in bytes}
conversionType: html5
conversionParams: null (if you would like additional styling, you can pass our JVM options as a string array; see http://www.idrsolutions.com/html5-jvm-options/)
xmlParamsByteArray: null
isDebugMode: false (set it to true if your conversion returns null data; you will then receive the bytes of a string of error messages describing why your conversion failed)

The method returns a byte array of a zipped file, which contains the converted HTML data and its directory hierarchy.
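Once you have that byte array back, unpacking it needs nothing beyond the JDK. Here is a minimal sketch, assuming the result is an ordinary zip archive; the class name, the helper methods and the stand-in archive (used so the example runs without calling the service) are all made up for illustration. It needs Java 9 or later for readAllBytes.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

final class ZipResultReader {

    /** Reads every file entry of an in-memory zip into a path -> content map. */
    static Map<String, byte[]> unzip(byte[] zipped) {
        Map<String, byte[]> files = new LinkedHashMap<>();
        try (ZipInputStream in = new ZipInputStream(new ByteArrayInputStream(zipped))) {
            ZipEntry entry;
            while ((entry = in.getNextEntry()) != null) {
                if (!entry.isDirectory()) {
                    // readAllBytes stops at the end of the current entry.
                    files.put(entry.getName(), in.readAllBytes());
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return files;
    }

    /** Builds a tiny stand-in archive; a real call would use the service's result. */
    static byte[] sampleZip() {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ZipOutputStream out = new ZipOutputStream(buffer)) {
            out.putNextEntry(new ZipEntry("myfile/index.html"));
            out.write("<html></html>".getBytes(StandardCharsets.UTF_8));
            out.closeEntry();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) {
        unzip(sampleZip()).forEach((name, data) ->
                System.out.println(name + " (" + data.length + " bytes)"));
    }
}
```

Keeping the entries in a map keyed by path preserves the directory hierarchy, so you can write the files to disk or serve them from memory as you prefer.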


PDF and PDF2HTML5: The Best of Both Worlds for Teachers

In a change from the usual blog posts, today we have a guest blog post from Dr. Gary D. Theilman, an Associate Professor of Pharmacy and Practice at the University of Mississippi School of Pharmacy. Many thanks to Gary for taking the time to share his experiences with HTML5.

For several years, we have required our students to upload their written assignments in PDF format. Requiring assignments in PDF has solved several problems:

  • It has cut down on arguments about what “a one page limit” means. When using DOCX or another word processor format, what appears as “one page” on the student’s computer sometimes displays as two pages on the faculty’s computer. Even limiting length of assignments by word count does not really help as word count calculations differ depending on which word processor is used.
  • PDF’s reputation for being “unchangeable” (even though it really isn’t) reduces student claims that “my assignment was somehow altered after I uploaded it”.
  • We need a fairly quick turn-around time for grading and we wanted faculty to be able to bring the assignments up in a web browser and grade them online. There are a number of ways of displaying PDFs in a web browser.

The major problem we’ve had with PDF-based assignments is having faculty add written comments to the page while grading. While PDF does support annotations, we wanted the written comments and metadata to be stored in our own database rather than in the PDF file itself. This led us to a work-around where we had the students include line numbers on their PDF documents. When leaving comments in the web-based grading program, faculty would refer to the line numbers on the PDF so that students could find the relevant text.


We looked at a number of different options for annotating directly on the assignment itself without the crutch of referring to line numbers. The Open Knowledge Foundation has developed an open-source Javascript library called Annotator which allows the user to use the mouse to “highlight” text in a webpage and write comments in a pop-up text widget. The comments are then stored in a database on a website’s backend. The problem is that Annotator is designed to work with text in webpages, not in PDFs.

Enter PDF2HTML5.

We rewrote our online grading program such that when students upload their PDF assignments, the document is converted into formatted HTML text so that it can be used with Annotator. While PDF2HTML5 does preserve the original formatting of the student’s assignment, we also provide a link to the original PDF so that the grader (and the student) can see whether mistakes are artifacts of the conversion (which they usually are not) or if the student actually made the mistake on the original PDF.


The grading process we use is similar to that performed by the website [paper grader]. While that site is public, our grading software is only used within our school. Also, [paper grader] allows uploading of assignments in a variety of formats (.doc, .docx, .html, .odt, .rtf, .sxw, .txt). The notable exception to the formats supported is PDF. PDF2HTML5 provides the bridge that allows us to add that functionality to our own website.

Converting PDFs to HTML has also allowed us to make use of a number of different Javascript DOM manipulation functions which would be difficult to implement using PDF.

  • Using jQuery.ScrollTo we can link drop-down boxes with DIVs containing assignment section headers. The document will automatically scroll to the section that the grader is reviewing.


  • We can use jQuery’s :contains() Selector to seek out and highlight certain text within a DIV (such as reference numbers) to associate citations with linked documents.


When the students have their graded assignments “returned”, they go to a webpage where they can see their scores and faculty comments using a “read-only” adaptation of Annotator. Again, they are provided a way to download their original PDF so that they can be confident that the text conversion by PDF2HTML5 accurately reflects their original assignment.


Using PDF2HTML5 has allowed us to preserve the advantages of PDF while also enabling us to use the tools available through Annotator and jQuery.

Is there something you’d like to blog about connected to Java, HTML5, SVG, JavaFX or PDF files? Any tips, tricks or recommendations? Contact us and we would be happy to feature you in our new ‘Guest Blogger’ series.
