Tag Archives: HTML

Embedded base64 images, html and svg differences

HTML and SVG files do not have to reference external files should you wish to include an image in your file. Instead you can include them as embedded base64 images.

Base64 encoding is used when binary data has to be transferred or stored through a channel designed to handle text. The binary data is divided into groups of 6 bits, and each group, which can take one of 64 values, is mapped to a character from an index of 64 characters common to most encodings. Because three bytes (24 bits) map onto exactly four characters, input whose length is not a multiple of three bytes has "=" padding added to the end of the encoded data. This inflates the size of the data by around one third, as every 6 bits of input become an 8-bit character.
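
To make the size arithmetic concrete, here is a minimal sketch using the standard java.util.Base64 class. The class name Base64SizeDemo and its helper methods are made up for illustration; only the Base64 encoder calls come from the JDK.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

class Base64SizeDemo {
    /** Encodes raw bytes as a single-line Base64 string (no line breaks). */
    static String encode(byte[] data) {
        return Base64.getEncoder().encodeToString(data);
    }

    /** Length of the padded Base64 text for n input bytes: 4 characters per 3-byte group. */
    static int encodedLength(int n) {
        return 4 * ((n + 2) / 3);
    }

    public static void main(String[] args) {
        // One, two and three input bytes: the first two need padding, the third does not.
        for (String s : new String[] {"M", "Ma", "Man"}) {
            byte[] raw = s.getBytes(StandardCharsets.US_ASCII);
            System.out.println(raw.length + " byte(s) -> " + encode(raw));
        }
        // 300 bytes of input become 400 characters of output, a one-third increase.
        System.out.println(encodedLength(300));
    }
}
```

Running main prints "TQ==", "TWE=" and "TWFu" for the three inputs, followed by 400.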

Both HTML and SVG can contain embedded base64 image data. Despite this, I have encountered some differences in how base64 encoded data is handled within HTML and SVG.

SVG can handle new lines in the middle of embedded base64 images and will ignore them, allowing the data to be split across multiple lines to make it easier to read.

HTML cannot handle new lines as SVG does. Should the embedded base64 image contain new lines, the image will not appear, as the binary data being read would be incorrect.

These differences may not seem too great, but they are important. Several ways of converting images into base64 strings can leave new lines in the string, and these will cause issues in some cases. For this reason I find it best to remove any new line characters from embedded base64 images: doing so does not break anything, and it makes the image usable in both HTML and SVG.


So, whenever you use embedded base64 images, ensure any new lines are removed from the image data. This will make the string reusable regardless of where you plan on using it.
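
As a rough sketch of that clean-up step in Java: the JDK's MIME encoder (Base64.getMimeEncoder) is one example of a converter that inserts line breaks, here after every 76 characters. The class DataUriDemo and its two methods are hypothetical helpers written for this example.

```java
import java.util.Base64;

class DataUriDemo {
    /** Removes every whitespace character (including CR and LF) from Base64 text. */
    static String stripWhitespace(String base64) {
        return base64.replaceAll("\\s+", "");
    }

    /** Builds a data URI from a MIME type and Base64 text that may contain line breaks. */
    static String toDataUri(String mimeType, String base64) {
        return "data:" + mimeType + ";base64," + stripWhitespace(base64);
    }

    public static void main(String[] args) {
        byte[] imageBytes = new byte[100]; // stand-in for real image data
        // The MIME encoder wraps its output, so this string contains "\r\n".
        String wrapped = Base64.getMimeEncoder().encodeToString(imageBytes);
        System.out.println(wrapped.contains("\r\n"));
        // After stripping, the string is safe to place in an HTML or SVG attribute.
        System.out.println(toDataUri("image/png", wrapped));
    }
}
```

If you control the encoding step, Base64.getEncoder() produces a single-line string in the first place, so no stripping is needed.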

Testing generated HTML on multiple platforms with XAMPP

As any web developer knows, all browsers have their quirks. Here at IDR we aim to support as many browsers as possible on as many platforms as possible. The result is that we are constantly testing files in Chrome and Firefox on PC and Mac; Safari on Mac, iPhone and iPad; Internet Explorer; and Android’s browser.

It turns out the fastest way of testing all of these is to turn your development PC into a web server. Web servers have a bit of a reputation for being complex to set up, but a number of pre-configured packages are available. One of these, XAMPP, is cross-platform and regularly updated, which suits our needs.

Once installed using the instructions provided, you can either place the files to test directly into your xampp/htdocs folder, or set up an alias for another directory using this handy guide. Assuming your firewall is properly configured (unblock port 80!) your files should now be visible over your local network, ready to test on any number of devices.

I’m sure a few people are wondering whether a web server is really necessary for this task. People might suggest using shared folders or Dropbox instead.

Personally, I’ve always found shared folders, particularly on Windows, needlessly complex and unreliable. They also don’t help on Android or iOS. You only need to set up a web server on your main development machine and that’s it – it won’t care about what operating systems you’re using elsewhere on the network.

Dropbox results in duplicate data, and although it can transfer files directly over LAN it still synchronizes to their servers, using up your space allowance. Since the results of our tests are over a gigabyte in size, it seems sensible to leave them in one place but allow access, which Dropbox cannot do.

Truthfully, I don’t think a web server is necessary for cross platform testing, but I do think it’s better. Packages like XAMPP and WampServer have made it quick and easy to do. It also lets you play with technologies like AJAX which won’t work without it. Perhaps a better question than whether it’s necessary is why not?

If you’re a first-time reader, or simply want to be notified when we post new articles and updates, you can keep up to date by social media (Twitter, Facebook and Google+) or the Blog RSS.

Cleaning up our HTML5 Output – Text Spacing adjustment without JavaScript

If you are familiar with the PDF file format specification, you will know just how powerful the text handling capabilities are. A range of parameters (9 in total) can be set on the Text State giving very fine control of text display. Here is a list of those parameters:

[Image: Text State parameters]

In addition to the above parameters, it’s also possible to adjust kerning, providing individual glyph positioning in a very concise way:

[Image: kerning example]

As PDF is an end display file format, it has no concept of “justify this line of text”. In PDF, a justified line of text is the result of carefully setting the Text State and Kerning for that line. This removes the potential for applications to have differing definitions of what “justify this line of text” means, so regardless of where or how you are viewing the PDF, the line of text will appear exactly as intended.
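
To illustrate how these values combine, the PDF specification gives the horizontal displacement after each glyph as tx = ((w0 - Tj/1000) x Tfs + Tc + Tw) x Th, where w0 is the glyph width in text space, Tj is the kerning adjustment in thousandths of a unit, Tfs is the font size, Tc and Tw are the character and word spacing, and Th is the horizontal scaling as a fraction. The sketch below simply evaluates that formula; the class TextAdvance and its parameter names are my own and are not taken from any library.

```java
class TextAdvance {
    /**
     * Horizontal displacement after drawing one glyph.
     *
     * @param w0          glyph width in text space (font width / 1000)
     * @param tj          kerning adjustment in thousandths of a text space unit
     * @param fontSize    Tfs
     * @param charSpacing Tc, added after every glyph
     * @param wordSpacing Tw, added only after the space character
     * @param horizScale  Th as a fraction (1.0 means 100%)
     * @param isSpace     whether the glyph is the single-byte space (code 32)
     */
    static double advance(double w0, double tj, double fontSize,
                          double charSpacing, double wordSpacing,
                          double horizScale, boolean isSpace) {
        double tw = isSpace ? wordSpacing : 0.0;
        return ((w0 - tj / 1000.0) * fontSize + charSpacing + tw) * horizScale;
    }

    public static void main(String[] args) {
        // A glyph half an em wide at 12pt with no extra spacing advances 6 units.
        System.out.println(advance(0.5, 0, 12, 0, 0, 1.0, false));
    }
}
```

A justified line is then just a sequence of glyphs whose Tc, Tw and Tj values have been chosen so that the advances add up to the line width.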

Unfortunately, if you are familiar with the HTML spec, you will know just how powerful the text handling capabilities are not. In addition, HTML is not known for rigid guidelines that all applications follow, so there is no guarantee that your document will appear exactly as intended however you view it.

If you are aiming to exactly replicate a PDF in HTML where the PDF has complex custom spacing, you are in for a hard time. This is why the vast majority of the applications that claim to convert PDF to HTML are actually fooling you by rasterizing the text to image, and providing invisible text on top that is positioned somewhere near to where it needs to be.

We offer a text mode that allows you to do this too, but we are also proud to offer a text mode that outputs real text (with converted PDF fonts) whilst still maintaining a very close representation of the original PDF. We do have an option that positions each glyph individually as a PDF would, but unfortunately HTML is quite a bit more verbose than PDF, resulting in impractically large HTML files.

Previously we have used some slightly hacky JavaScript to adjust the text’s spacing in order to compensate for HTML’s shortcomings, but this is not a popular solution as it increases file complexity and makes it more difficult to integrate our converted files. So there are various workarounds for some cases, but not an elegant global solution.

In Friday’s release, we are pleased to say that we have found a workaround, and no longer will our converted files require JavaScript to be executed when viewing. Our solution is now CSS based and offers many advantages over the previous solution.

1. It provides a more accurate representation of the PDF
2. It’s instant – no waiting around while the JavaScript updates the spacing
3. Our files now no longer require any JavaScript to be executed
4. It even works in IE6!

This update will be available in Friday’s release.
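
This post does not describe the CSS technique itself, so the following is only an illustrative sketch of what a CSS-based spacing adjustment could look like, not our converter's actual output. It assumes the natural (unadjusted) width of the text and the target width from the PDF are already known in pixels, and spreads the difference evenly using the standard CSS letter-spacing property. The class SpacingCss and its methods are hypothetical.

```java
import java.util.Locale;

class SpacingCss {
    /** Extra space per character so that text of naturalWidth fills targetWidth. */
    static double letterSpacing(double naturalWidth, double targetWidth, int charCount) {
        if (charCount <= 0) {
            return 0.0;
        }
        return (targetWidth - naturalWidth) / charCount;
    }

    /** Wraps text in a span carrying the computed letter-spacing (text is not HTML-escaped here). */
    static String styledSpan(String text, double naturalWidth, double targetWidth) {
        int chars = text.codePointCount(0, text.length());
        double spacing = letterSpacing(naturalWidth, targetWidth, chars);
        return String.format(Locale.ROOT,
                "<span style=\"letter-spacing:%.3fpx\">%s</span>", spacing, text);
    }

    public static void main(String[] args) {
        // Five characters, 100px wide naturally, 110px wide in the PDF: 2px per character.
        System.out.println(styledSpan("Hello", 100, 110));
    }
}
```

Because the spacing is computed at conversion time and written into the markup, nothing has to run in the browser when the page is viewed.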

Good and bad news on rollouts…

The good news is that we have been upgrading our servers this week with some much more powerful kit. We are seeing an ever increasing load on our GlassFish converter and we need the capacity to make sure this meets the growing usage customers are making of it.

The bad news is that this has delayed our big HTML/SVG update. Changing a box can still be problematic – we need to make some changes to the Internet nameserver values so that the new box is used. These are the values which the Internet uses to tell which physical box is hosting which site. The change can take time to travel round the entire Internet (and for some time some people will see the new box while others will still see the old). This will potentially affect Friday and the weekend, so we have moved the rollout to next week. Our apologies, but it will be worth the wait!

How to access external HTML resources in the GlassFish server

An alternate document root (docroot) allows for a web application to serve requests for certain resources from outside its own docroot.

This is a really handy feature when two or more of your applications need to access resources in the same directory (if you are running a single Java EE application you can usually use the WEB-INF directory instead). More usage information can be found on the GlassFish documentation page.

However, if you access an HTML file that is not part of an application and resides outside it, the file is delivered as a single piece of content: neither the CSS nor the images come along with it, so you end up seeing only a text version of the HTML file.

We encountered this situation when we needed to access PDF files and the converted HTML, CSS, and image elements that reside outside the application scope, so I thought of using a file server class in such exceptional cases.

    import java.io.BufferedInputStream;
    import java.io.File;
    import java.io.FileInputStream;
    import java.io.IOException;
    import java.io.InputStream;
    import javax.servlet.ServletException;
    import javax.servlet.ServletOutputStream;
    import javax.servlet.annotation.WebServlet;
    import javax.servlet.http.HttpServlet;
    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpServletResponse;

    @WebServlet(name = "FileServer", urlPatterns = {"/output/*"})
    public class FileServer extends HttpServlet {
        @Override
        protected void doGet(HttpServletRequest request, HttpServletResponse response)
                throws ServletException, IOException {

            // Map the request URI (minus the context path) onto the external docroot
            String uri = request.getRequestURI();
            String relativePath = uri.substring(request.getContextPath().length());
            File docroot = new File("../docroot");
            File file = new File(docroot, relativePath);

            // Refuse anything that resolves outside the docroot (e.g. "../" in the URI)
            // or that is not a regular file
            if (!file.getCanonicalPath().startsWith(docroot.getCanonicalPath() + File.separator)
                    || !file.isFile()) {
                response.sendError(HttpServletResponse.SC_NOT_FOUND);
                return;
            }

            // Set the content type explicitly (useful in IE browsers); the request's
            // own content type is null for a GET, so ask the container instead
            String name = file.getName();
            String contentType;
            if (name.endsWith(".css")) {
                contentType = "text/css";
            } else if (name.endsWith(".js")) {
                contentType = "text/javascript";
            } else {
                contentType = getServletContext().getMimeType(name);
            }
            response.setContentType(contentType != null ? contentType : "application/octet-stream");
            response.setContentLength((int) file.length());

            // Read from the file and write to the ServletOutputStream;
            // try-with-resources closes the input stream
            try (InputStream in = new BufferedInputStream(new FileInputStream(file))) {
                ServletOutputStream out = response.getOutputStream();
                byte[] buffer = new byte[8192];
                int read;
                while ((read = in.read(buffer)) != -1) {
                    out.write(buffer, 0, read);
                }
            }
        }
    }
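
The content type matters because browsers may refuse to apply a stylesheet or run a script that arrives with the wrong type. As a small standalone sketch of the idea, the helper below maps a file name to a content type by its extension. The class ContentTypes and the particular extensions listed are assumptions chosen for illustration; in a real servlet the container's own MIME mappings would normally be preferred.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

class ContentTypes {
    private static final String DEFAULT_TYPE = "application/octet-stream";
    private static final Map<String, String> TYPES = new HashMap<>();

    static {
        // Types likely to appear alongside a converted HTML page
        TYPES.put("html", "text/html");
        TYPES.put("css", "text/css");
        TYPES.put("js", "text/javascript");
        TYPES.put("png", "image/png");
        TYPES.put("jpg", "image/jpeg");
        TYPES.put("svg", "image/svg+xml");
        TYPES.put("pdf", "application/pdf");
    }

    /** Returns the content type for a file name, based on its (case-insensitive) extension. */
    static String forFileName(String fileName) {
        int dot = fileName.lastIndexOf('.');
        if (dot < 0 || dot == fileName.length() - 1) {
            return DEFAULT_TYPE;
        }
        String extension = fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        return TYPES.getOrDefault(extension, DEFAULT_TYPE);
    }

    public static void main(String[] args) {
        System.out.println(forFileName("page/styles.css"));
        System.out.println(forFileName("README"));
    }
}
```

Unknown or missing extensions fall back to application/octet-stream, which browsers treat as a download instead of guessing.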
