Chika Okereke
Chika is a Java developer. When not experimenting with the new features of Java, he is a keen basketball player (he is the tall guy you might see at Devoxx).

CCITT encoding in PDF files – G31D CCITT data overview

In this second article on CCITT encoding, I am going to start explaining how 1D decoding works. Just to make life complicated, this mode goes by several names. I will refer to Group 3 One-Dimensional as G31D; it has also been called 1D CCITT in our office (why complicate things, eh?).

A PDF data stream encoded in this mode is one of the easier CCITT data structures to decode. First, here are some key terms that make it easier to understand how G31D works.

  • Pixel run – A block of consecutive pixels that are all the same colour. Each pixel is usually 1 bit: 1 for black and 0 for white.

  • Scan line – One row of pixels running from one side of the page to the other.

  • Code words – The bit patterns that carry the encoded data. Each code word is of a particular kind, e.g. make-up or terminating.

  • Run length – The number of consecutive white or black pixels in a run to be decoded/encoded.

  • End of line (EOL) – A unique 12-bit code word (eleven 0 bits followed by a 1) used to determine the start and end of a scan line.

  • Return to control (RTC) – Six EOL code words occurring consecutively, which usually mark the end of the data. EOL and RTC will become clearer in later posts.
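
To make these terms a little more concrete, here is a minimal Java sketch. It is not the decoder itself, and the class and method names (`G31DTerms`, `runLengths`, `readBits`, `isEol`, `isRtc`) are made up for illustration. It assumes bits are packed most-significant-bit first and that no fill bits or byte alignment appear before an EOL, which real streams may not guarantee. It also assumes the usual G31D convention that every scan line starts with a white run (of length 0 if the first pixel is black).

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative helpers for the terms above; all names are hypothetical. */
class G31DTerms {

    /** The 12-bit EOL code word: eleven 0 bits followed by a single 1 bit. */
    static final int EOL = 0b000000000001;

    /**
     * Splits one scan line into its run lengths, white run first.
     * Pixels use 1 for black and 0 for white, as in the list above.
     */
    static List<Integer> runLengths(int[] scanLine) {
        List<Integer> runs = new ArrayList<>();
        int colour = 0;  // assumption: a line always starts with a white run
        int length = 0;
        for (int pixel : scanLine) {
            if (pixel == colour) {
                length++;
            } else {
                runs.add(length);  // close the current pixel run
                colour = pixel;
                length = 1;
            }
        }
        runs.add(length);
        return runs;
    }

    /** Reads 'count' bits, most significant bit first, starting at bitOffset. */
    static int readBits(byte[] data, int bitOffset, int count) {
        int value = 0;
        for (int i = 0; i < count; i++) {
            int pos = bitOffset + i;
            int bit = (data[pos >> 3] >> (7 - (pos & 7))) & 1;
            value = (value << 1) | bit;
        }
        return value;
    }

    /** True if the 12 bits at bitOffset form an EOL code word. */
    static boolean isEol(byte[] data, int bitOffset) {
        return bitOffset + 12 <= data.length * 8
                && readBits(data, bitOffset, 12) == EOL;
    }

    /** True if six EOL code words occur back to back at bitOffset (RTC). */
    static boolean isRtc(byte[] data, int bitOffset) {
        for (int i = 0; i < 6; i++) {
            if (!isEol(data, bitOffset + 12 * i)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Three white pixels, two black, three white.
        System.out.println(runLengths(new int[] {0, 0, 0, 1, 1, 0, 0, 0})); // [3, 2, 3]

        // Six EOL code words packed into nine bytes.
        byte[] rtc = {0x00, 0x10, 0x01, 0x00, 0x10, 0x01, 0x00, 0x10, 0x01};
        System.out.println(isRtc(rtc, 0)); // true
    }
}
```

Note that `runLengths` only measures the runs. In a real G31D stream each run length is then written as make-up and terminating code words, which this sketch does not attempt.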

That is quite a lot of jargon, so in my next article I will explain how it all fits together and how we read all this data. Any questions so far?

This post is part of our “Understanding the PDF File Format” series. In each article, we discuss a PDF feature, bug, gotcha or tip.

IDRsolutions Ltd 2020. All rights reserved.