What is CCITT data?
CCITT (Group 3 and Group 4 fax) encoding is used to compress black and white image data, where each pixel is a single bit. Using Huffman-coded run lengths, the data is squeezed into a much smaller compressed stream.
CCITT is also a compression format used in the TIFF file format. By adding some additional bytes to your raw CCITT data, and saving it in a file ending .tif, you can create a TIFF Image from raw CCITT data. My example is written in Java (but it should be easy to recode in any language). It will take the raw data and add the required bytes.
CCITT data in PDF files
CCITT is used as a compression format in PDF files for images in XObjects. You can manually extract the CCITT data and the DecodeParms dictionary values (K, BlackIs1, Columns, etc.) from PDF files if you want to reuse the images. If you have extracted the CCITT data from a PDF, the result may differ from the image as it appears in the PDF. Remember that this is the raw image, and the PDF may invert, colour or clip it when it is drawn.
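As a rough illustration of how those dictionary values end up in a TIFF, the sketch below maps K and BlackIs1 to the Compression (259) and PhotometricInterpretation (262) values written by the code later in this post. The class and method names are hypothetical and the mapping simply mirrors this post's code; it is not taken from any library.

```java
// Hypothetical helper: mirrors the K / BlackIs1 handling used in this post.
class CcittParams {

    /** TIFF Compression (tag 259) value chosen from the PDF K parameter. */
    static int tiffCompression(int k) {
        if (k == 0) {
            return 3; // K = 0: the post writes compression 3
        } else if (k > 0) {
            return 2; // K > 0: the post writes compression 2
        }
        return 4;     // K < 0: the post writes compression 4 (Group 4)
    }

    /** TIFF PhotometricInterpretation (tag 262) value chosen from BlackIs1. */
    static int photometric(boolean blackIs1) {
        return blackIs1 ? 1 : 0;
    }

    public static void main(String[] args) {
        // K = -1 and BlackIs1 absent (false) is a common combination in PDF files
        System.out.println(tiffCompression(-1) + " " + photometric(false));
    }
}
```

If a viewer rejects the result for a particular K value, this mapping is the first thing worth checking against the TIFF specification.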
How to convert CCITT to a Tiff
- Get the CCITT parameters
- Create a metadata header
- Append the raw CCITT data
and the Java code to write TIFF…
/**
* default values (these may be set in a PDF DecodeParms dictionary)
**/
boolean isBlack = false; //flag to show if default is black/white
int k = 0;
w = -1;
/**
* build the image
**/
ByteArrayOutputStream bos=new ByteArrayOutputStream();
/**
* tiff header (id, version, offset)
**/
String[] headerValues={"4d","4d","00","2a","00","00","00","08"};
for(int i=0;i<headerValues.length;i++)
bos.write(Integer.parseInt(headerValues[i],16));
int tagCount=9; //appears to be minimum needed
//writeWord and writeTag are convenience methods to add the values as bytes to the stream
/**
* IFD (Image File Directory)
**/
writeWord(String.valueOf(tagCount),bos); //num of entries
writeTag("256", "04", "01", String.valueOf(w), bos); /**width*/
writeTag("257", "04", "01", String.valueOf(h), bos); /**length*/
/**bitsPerSample 258 - b&w 1 bit image*/
writeTag("258", "03", "01", "00010000h", bos);
if (k == 0){
writeTag("259", "03", "01", "00030000h", bos); //compression
}else if (k > 0)
writeTag("259", "03", "01", "00020000h", bos); //compression
else if (k < 0)
writeTag("259", "03", "01", "00040000h", bos); //compression
//photometricInterpretation
if(!isBlack)
writeTag("262", "03", "01", "00000000h", bos);
else
writeTag("262", "03", "01", "00010000h", bos);
//stripOffsets -start of data after tables
writeTag("273", "04", "1","122", bos);
//samplesPerPixel
writeTag("277", "03", "01", "00010000h", bos);
//rowsPerStrip – uses height
writeTag("278", "04", "01", String.valueOf(h), bos);
//stripByteCount – 1 strip so all data
writeTag("279", "04", "1", String.valueOf(data.length),bos);
// write next IFD offset zero as no other table
writeDWord("0",bos);
/**
* write the CCITT image data at the end
**/
try{
bos.write(data);
bos.close();
} catch (IOException e) {
LogWriter.writeLog("[PDF] Tiff exception "+e);
}
/**save data as image */
try {
FileOutputStream fos=new FileOutputStream(fileName);
fos.write(bos.toByteArray());
fos.close();
} catch (Error err) {
LogWriter.writeLog("[PDF] Tiff error "+err);
} catch (Exception e1) {
LogWriter.writeLog("[PDF] Tiff exception "+e1);
}
}
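The excerpt above leans on writeWord, writeTag and writeDWord, which are not shown at this point. As a self-contained illustration of the same three steps (parameters, header, raw data), here is a minimal sketch using java.nio.ByteBuffer. The class CcittTiffSketch and its methods putTag and wrap are hypothetical and not part of JPedal; the tag list and the 122-byte data offset follow the code above.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper, not part of JPedal: wraps raw CCITT bytes in a one-strip TIFF.
class CcittTiffSketch {
    static final int SHORT = 3; // TIFF field type: 16-bit unsigned
    static final int LONG = 4;  // TIFF field type: 32-bit unsigned

    /** Writes one 12-byte IFD entry holding a single value. */
    static void putTag(ByteBuffer buf, int tag, int type, int value) {
        buf.putShort((short) tag);
        buf.putShort((short) type);
        buf.putInt(1);                   // count: one value per tag
        if (type == SHORT) {
            buf.putShort((short) value); // a SHORT occupies the first two bytes
            buf.putShort((short) 0);     // the other two bytes are padding
        } else {
            buf.putInt(value);
        }
    }

    static byte[] wrap(byte[] ccitt, int w, int h, int compression, int photometric) {
        final int tagCount = 9;
        // header (8) + entry count (2) + entries (9 * 12) + next-IFD offset (4) = 122
        final int dataOffset = 8 + 2 + tagCount * 12 + 4;
        ByteBuffer buf = ByteBuffer.allocate(dataOffset + ccitt.length)
                                   .order(ByteOrder.BIG_ENDIAN);
        buf.put((byte) 'M').put((byte) 'M'); // "MM" marks big-endian byte order
        buf.putShort((short) 42);            // TIFF version
        buf.putInt(8);                       // offset of the first IFD
        buf.putShort((short) tagCount);      // number of IFD entries
        putTag(buf, 256, LONG, w);                // width
        putTag(buf, 257, LONG, h);                // length
        putTag(buf, 258, SHORT, 1);               // bitsPerSample
        putTag(buf, 259, SHORT, compression);     // compression
        putTag(buf, 262, SHORT, photometric);     // photometricInterpretation
        putTag(buf, 273, LONG, dataOffset);       // stripOffsets
        putTag(buf, 277, SHORT, 1);               // samplesPerPixel
        putTag(buf, 278, LONG, h);                // rowsPerStrip
        putTag(buf, 279, LONG, ccitt.length);     // stripByteCount
        buf.putInt(0);                       // no further IFD
        buf.put(ccitt);                      // raw CCITT data goes last
        return buf.array();
    }

    public static void main(String[] args) {
        byte[] tiff = wrap(new byte[] {1, 2, 3}, 16, 2, 4, 0);
        System.out.println(tiff.length); // 122 header bytes + 3 data bytes = 125
    }
}
```

The returned array can then be written to a file ending in .tif, for example with java.nio.file.Files.write. Whether a given viewer accepts the file still depends on the CCITT data itself.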
Can you publish the code for writeWord and writeTag?
Here is the whole class (we also use it to decode CCITT with JAI)
/**
* ===========================================
* Java Pdf Extraction Decoding Access Library
* ===========================================
*
* Project Info: http://www.jpedal.org
* (C) Copyright 1997-2008, IDRsolutions and Contributors.
*
* This file is part of JPedal
*
@LICENSE@
*
* ---------------
* TiffDecoder.java
* ---------------
*/
package org.jpedal.io;
import java.awt.image.DataBuffer;
import java.awt.image.DataBufferByte;
import java.awt.image.Raster;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.Map;
import org.jpedal.objects.raw.PdfDictionary;
import org.jpedal.objects.raw.PdfObject;
import org.jpedal.utils.LogWriter;
/**
* converts CCITT stream into either an image or bytestream
*
* Many thanks to Brian Burkhalter for all his help
*/
public class TiffDecoder {
private byte[] bytes;
/**
* called with values from PDF
* Map contains values from PDF as stream pair
*/
public TiffDecoder(int w, int h,Map values,byte[] data){
//return value
bytes=null;
/**
* get values from stream
*/
//flag to show if default is black or white
boolean isBlack = false;
//int columns = 1728; //in PDF spec
int k = 0;
//boolean isByteAligned=false; //in PDF spec
//get k (type of encoding)
String value = (String) values.get("K");
if (value != null)
k = Integer.parseInt(value);
/**
//get flag for white/black as default
value = (String) values.get("EncodedByteAlign");
if (value != null)
isByteAligned = Boolean.valueOf(value).booleanValue();*/
//get flag for white/black as default
value = (String) values.get("BlackIs1");
if (value != null){
isBlack = Boolean.valueOf(value).booleanValue();
}
/**not used but in Map from PDF
value = (String) values.get("Rows");
if (value != null)
rows = Integer.parseInt(value);
value = (String) values.get("Columns");
if (value != null)
columns= Integer.parseInt(value);*/
buildImage(w, h, data, isBlack, k);
}
public TiffDecoder(int w, int h,PdfObject DecodeParms,byte[] data){
//Map values=new HashMap();
//return value
bytes=null;
/**
* get values from stream
*/
//flag to show if default is black or white
boolean isBlack = false;
//int columns = 1728; //in PDF spec
int k = 0;
//boolean isByteAligned=false; //in PDF spec
if(DecodeParms!=null){
//get k (type of encoding)
k = DecodeParms.getInt(PdfDictionary.K);
int columnsSet = DecodeParms.getInt(PdfDictionary.Columns);
if(columnsSet!=-1)
w=columnsSet;
//get flag for white/black as default
isBlack=DecodeParms.getBoolean(PdfDictionary.BlackIs1);
}
/**not used but in Map from PDF
value = (String) values.get("Rows");
if (value != null)
rows = Integer.parseInt(value);
value = (String) values.get("Columns");
if (value != null)
columns= Integer.parseInt(value);*/
buildImage(w, h, data, isBlack, k);
}
/**
* convenience method to add a header to a CCITT data block so it can be viewed as a TIFF
* @param w
* @param h
* @param DecodeParms
* @param data
* @param fileName
*/
public static void saveAsTIFF(int w, int h,PdfObject DecodeParms,byte[] data, String fileName){
/**
* get values from stream
*/
boolean isBlack = false; //flag to show if default is black or white
int k = 0;
if(DecodeParms!=null){
//get k (type of encoding)
k = DecodeParms.getInt(PdfDictionary.K);
int columnsSet = DecodeParms.getInt(PdfDictionary.Columns);
if(columnsSet!=-1)
w=columnsSet;
//get flag for white/black as default
isBlack=DecodeParms.getBoolean(PdfDictionary.BlackIs1);
}
/**
* build the image
*/
ByteArrayOutputStream bos=new ByteArrayOutputStream();
/**
* tiff header (id, version, offset)
* */
final String[] headerValues={"4d","4d","00","2a","00","00","00","08"};
for(int i=0;i<headerValues.length;i++)
bos.write(Integer.parseInt(headerValues[i],16));
int tagCount=9; //appears to be minimum needed
/**
* IFD (Image File Directory)
*/
writeWord(String.valueOf(tagCount),bos); //num of entries
writeTag("256", "04", "01", String.valueOf(w), bos); /**width 256 */
writeTag("257", "04", "01", String.valueOf(h), bos); /**length 257 */
writeTag("258", "03", "01", "00010000h", bos); /**bitsPerSample 258 - b&w 1 bit image */
if (k == 0)
writeTag("259", "03", "01", "00030000h", bos); /**compression 259 */
else if (k > 0)
writeTag("259", "03", "01", "00020000h", bos); /**compression 259 */
else if (k < 0)
writeTag("259", "03", "01", "00040000h", bos); /**compression 259 */
if(!isBlack)
writeTag("262", "03", "01", "00000000h", bos); /**photometricInterpretation 262 */
else
writeTag("262", "03", "01", "00010000h", bos); /**photometricInterpretation 262 */
writeTag("273", "04", "1","122", bos); /**stripOffsets 273 -start of data after tables */
writeTag("277", "03", "01", "00010000h", bos); /**samplesPerPixel 277 */
writeTag("278", "04", "01", String.valueOf(h), bos); /**rowsPerStrip 278 - uses height */
writeTag("279", "04", "1", String.valueOf(data.length),bos); /**stripByteCount 279 - 1 strip so all data */
writeDWord("0",bos); /** write next IFD offset zero as no other table*/
/**
* write the CCITT image data at the end
*/
try{
bos.write(data);
bos.close();
} catch (IOException e) {
if(LogWriter.isOutput())
LogWriter.writeLog("[PDF] Tiff exception "+e);
}
/**save image */
try {
java.io.FileOutputStream fos=new java.io.FileOutputStream(fileName);
fos.write(bos.toByteArray());
fos.close();
} catch (Error err) {
if(LogWriter.isOutput())
LogWriter.writeLog("[PDF] Tiff error "+err);
} catch (Exception e1) {
if(LogWriter.isOutput())
LogWriter.writeLog("[PDF] Tiff exception "+e1);
}
}
private void buildImage(int w, int h, byte[] data, boolean isBlack, int k) {
/**
* build the image
*/
ByteArrayOutputStream bos=new ByteArrayOutputStream();
/**
* tiff header (id, version, offset)
* */
final String[] headerValues={"4d","4d","00","2a","00","00","00","08"};
for(int i=0;i<headerValues.length;i++)
bos.write(Integer.parseInt(headerValues[i],16));
int tagCount=9; //appears to be minimum needed
/**
* IFD (Image File Directory)
*/
writeWord(String.valueOf(tagCount),bos); //num of entries
writeTag("256", "04", "01", String.valueOf(w), bos); /**width 256 */
writeTag("257", "04", "01", String.valueOf(h), bos); /**length 257 */
writeTag("258", "03", "01", "00010000h", bos); /**bitsPerSample 258 - b&w 1 bit image */
if (k == 0)
writeTag("259", "03", "01", "00030000h", bos); /**compression 259 */
else if (k > 0)
writeTag("259", "03", "01", "00020000h", bos); /**compression 259 */
else if (k < 0)
writeTag("259", "03", "01", "00040000h", bos); /**compression 259 */
if(!isBlack)
writeTag("262", "03", "01", "00000000h", bos); /**photometricInterpretation 262 */
else
writeTag("262", "03", "01", "00010000h", bos); /**photometricInterpretation 262 */
writeTag("273", "04", "1","122", bos); /**stripOffsets 273 -start of data after tables */
writeTag("277", "03", "01", "00010000h", bos); /**samplesPerPixel 277 */
writeTag("278", "04", "01", String.valueOf(h), bos); /**rowsPerStrip 278 - uses height */
writeTag("279", "04", "1", String.valueOf(data.length),bos); /**stripByteCount 279 - 1 strip so all data */
writeDWord("0",bos); /** write next IFD offset zero as no other table*/
/**
* write the CCITT image data at the end
*/
try{
bos.write(data);
bos.close();
} catch (IOException e) {
if(LogWriter.isOutput())
LogWriter.writeLog("[PDF] Tiff exception "+e);
}
/**setup image */
try {
/**write out to debug*
System.out.println("mac_"+data.length+".tiff");
java.io.FileOutputStream fos=new java.io.FileOutputStream("mac_"+data.length+".tiff");
fos.write(bos.toByteArray());
fos.close();
/***/
JAIHelper.confirmJAIOnClasspath();
com.sun.media.jai.codec.ByteArraySeekableStream fss=new com.sun.media.jai.codec.ByteArraySeekableStream(bos.toByteArray());//.wrapInputStream(bis,true);
javax.media.jai.RenderedOp op = (javax.media.jai.JAI.create("stream",fss));
Raster raster=op.getData();
//Raster raster = img2.getData();
DataBuffer db = raster.getDataBuffer();
DataBufferByte dbb = (DataBufferByte) db;
bytes=dbb.getData();
if(!isBlack){ //invert if needed
int bcount=bytes.length;
for(int i=0;i<bcount;i++)
bytes[i]=(byte)(255-(bytes[i] & 255)); //assumed loop body: the posted listing is cut off after "for(int i=0;i"
}
} catch (Exception e) { //assumed handler, mirroring saveAsTIFF
if(LogWriter.isOutput())
LogWriter.writeLog("[PDF] Tiff exception "+e);
}
}
//NOTE: the posted listing is cut off from the loop above to the last line of writeWord;
//the start of writeWord below is assumed from the way it is called
/**write word (2 bytes to stream) */
private static void writeWord(String i, ByteArrayOutputStream bos) {
int value=Integer.parseInt(i);
bos.write((value>>8)); //high byte
bos.write(value & 0xFF); //low byte
}
/**write Dword (4 bytes to stream) */
private static void writeDWord(String i, ByteArrayOutputStream bos) {
int value=0;
//allow decimal,octal or hex
if(i.endsWith(“h”))
value=Integer.parseInt(i.substring(0,i.length()-1),16);
else if(i.endsWith(“o”))
value=Integer.parseInt(i.substring(0,i.length()-1),8);
else
value=Integer.parseInt(i);
bos.write((value>>24) & 0xff); //high byte
bos.write((value>>16) & 0xff);
bos.write((value>>8) & 0xff);
bos.write(value & 0xFF); //low byte
}
/**write a tag to stream*/
private static void writeTag(String TagId, String dataType, String DataCount, String DataOffset, ByteArrayOutputStream bos) {
writeWord(TagId,bos);
writeWord(dataType,bos);
writeDWord(DataCount,bos);
writeDWord(DataOffset,bos);
}
}
Mark,
Thanks for the post. I noticed that, because of the big-endian encoding, you had to write the tag values in hex, i.e. a value of 1 becomes 00010000h, etc. I changed the methods to write little-endian in case someone is interested.
/**
*
* @param data
* @param w
* @param h
* @param encodedType
* @param isBlack
* @param fileName
*/
private static void saveAsTIFF(byte[] data, int w, int h, int encodedType, Boolean isBlack, String fileName) throws IOException {
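String[] headerValues = {"49", "49","2a","00","08","00","00","00"};
/* build the image
**/
ByteArrayOutputStream bos = new ByteArrayOutputStream();
/**
* tiff header (id, version, offset)
**/
for(int i = 0; i < headerValues.length; i++)
bos.write(Integer.parseInt(headerValues[i], 16));
//NOTE: the middle of this listing was lost when it was posted; the lines marked "assumed"
//are filled in from the big-endian version above and the C++ port below
writeWord(9, bos); //num of entries (assumed)
writeTag(256, 4, 1, w, bos); //width (assumed)
writeTag(257, 4, 1, h, bos); //length (assumed)
writeTag(258, 3, 1, 1, bos); //bitsPerSample (assumed)
if (encodedType == 0)
writeTag(259, 3, 1, 3, bos); //compression (assumed)
else if (encodedType > 0)
writeTag(259, 3, 1, 2, bos); //compression
else if (encodedType < 0)
writeTag(259, 3, 1, 4, bos); //compression (assumed)
//photometricInterpretation (assumed)
if (!isBlack)
writeTag(262, 3, 1, 0, bos);
else
writeTag(262, 3, 1, 1, bos);
writeTag(273, 4, 1, 122, bos); //stripOffsets (assumed)
writeTag(277, 3, 1, 1, bos); //samplesPerPixel (assumed)
writeTag(278, 4, 1, h, bos); //rowsPerStrip (assumed)
writeTag(279, 4, 1, data.length, bos); //stripByteCount (assumed)
writeDWord(0, bos); //next IFD offset (assumed)
bos.write(data); //CCITT data at the end (assumed)
java.io.FileOutputStream fos = new java.io.FileOutputStream(fileName); //(assumed)
fos.write(bos.toByteArray());
fos.close();
}
/**
* write word (2 bytes to stream), method start assumed
*
* */
private static void writeWord(int value, ByteArrayOutputStream bos) {
bos.write(value & 0xff); //low byte
bos.write((value >> 8)); //high byte
}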
/**
* write Dword (4 bytes to stream)
*
* */
private static void writeDWord(int value, ByteArrayOutputStream bos) {
bos.write(value & 0xff); //low byte
bos.write((value >> 8) & 0xff);
bos.write((value >> 16) & 0xff);
bos.write((value >> 24) & 0xff); //high byte
}
/**
* write a tag to stream
*
* */
private static void writeTag(int TagId, int dataType, int DataCount, int DataOffset, ByteArrayOutputStream bos) {
writeWord(TagId, bos);
writeWord(dataType, bos);
writeDWord(DataCount, bos);
writeDWord(DataOffset, bos);
}
Thanks for the alternative suggestion.
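To see why the big-endian listing passes values such as 00010000h while the little-endian one can pass plain integers, it helps to look at the 4-byte value field of a tag. A SHORT value sits in the first two bytes of that field, so writing the field as one 32-bit number gives different results in the two byte orders. The small check below is only an illustration; the class name ShortFieldDemo and the method fieldAsInt are made up for it.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical demo: how a SHORT value looks when its 4-byte field is read as one int.
class ShortFieldDemo {

    /** Stores a SHORT in the first two bytes of a 4-byte field and reads the field back as an int. */
    static int fieldAsInt(int shortValue, ByteOrder order) {
        ByteBuffer buf = ByteBuffer.allocate(4).order(order);
        buf.putShort((short) shortValue); // the SHORT itself
        buf.putShort((short) 0);          // two bytes of padding
        return buf.getInt(0);
    }

    public static void main(String[] args) {
        System.out.printf("%08x%n", fieldAsInt(1, ByteOrder.BIG_ENDIAN));    // prints 00010000
        System.out.printf("%08x%n", fieldAsInt(1, ByteOrder.LITTLE_ENDIAN)); // prints 00000001
    }
}
```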
Hi guys, Did either of you ever get this to work? I am trying to implement it via C++, using the little endian style. It looks like it is working (sizes are right, file is “complete” in that all data is there) but I keep getting an indication that the .tif file is not valid. Am I missing something? Note, I have a routine (GetFilterParams) that does what it says. Here is the code.
long PDFHelper::sDecodeCCITT(char* sInput, int nK, int nCols, int nHeight,
int nSize, bool bIsBlack)
{
char* sTIFF;
int nComma, i, j;
sTIFF = new char [TIFF_HEADER_SIZE];
ZeroMemory(sTIFF, TIFF_HEADER_SIZE);
CString sHeaderValues = "73,73,42,00,08,00,00,00";
nComma = sHeaderValues.Find(',');
i = 0;
while (nComma != -1)
{
sTIFF[i] = atoi(sHeaderValues.Mid(nComma-2, nComma));
nComma = sHeaderValues.Find(',', nComma +1);
i++;
}
sTIFF[i] = atoi(sHeaderValues.Mid(sHeaderValues.GetLength() - 2));
i++;
i = writeWord(9 , i, sTIFF);// num of directory entries
// nID, nType, nDataCnt, nOffset
i = writeTag(256, 4, 1, nCols, i, sTIFF);// width
i = writeTag(257, 4, 1, nHeight, i, sTIFF);// height = length = scan lines = rows
/**BitsPerSample 258 – b&w 1 bit image*/
i = writeTag(258, 3, 1, 1, i, sTIFF);
if (nK == 0)
{
i = writeTag(259, 3, 1, 3, i, sTIFF);//compression
}
else if (nK > 0)
{
i = writeTag(259, 3, 1, 2, i, sTIFF);//compression
}
else if (nK < 0)
{
i = writeTag(259, 3, 1, 4, i, sTIFF);//compression
}
//photometricInterpretation
if(!bIsBlack)
{
i = writeTag(262, 3, 1, 0, i, sTIFF);
}
else
{
i = writeTag(262, 3, 1, 1, i, sTIFF);
}
//stripOffsets -start of data after tables
i = writeTag(273, 4, 1, 122, i, sTIFF);
//samplesPerPixel
i = writeTag(277, 3, 1, 1, i, sTIFF);
//rowsPerStrip – uses height
i = writeTag(278, 4, 1, nHeight, i, sTIFF);
//stripByteCount – 1 strip so all data
i = writeTag(279, 4, 1, nSize, i, sTIFF);
// write next IFD offset zero as no other table
i = writeDWord(0, i, sTIFF);
// Copy sTmp data to sTIFF
for (j = 0; j < i; j++)
{
m_sStream[j] = sTIFF[j];
}
// write the CCITT image data at the end
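for (j = i; j
// [the rest of sDecodeCCITT and the start of writeWord are missing from the comment as posted]
sStream[i] = nID >>8;//high byte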
i++;
return(i);
}
int PDFHelper::writeTag(int nID, int nType, int nDataCnt,
int nOffset, int i, char* sStream)
{
//writeTag(256, 04, 01, nCols, i, sTIFF);// width
i = writeWord(nID, i, sStream);
i = writeWord(nType, i, sStream);
i = writeDWord(nDataCnt, i, sStream);
i = writeDWord(nOffset, i, sStream);
return(i);
}
int PDFHelper::writeDWord(int nID, int i, char* sStream)
{
sStream[i] = nID & 0xFF;//low byte
i++;
sStream[i] = nID >>8;
i++;
sStream[i] = nID >16;
i++;
sStream[i] = nID >>24;//high byte
i++;
return(i);
}
No. We already have a Java implementation which works.
Hi Mark Stephens,
>No. We already have a Java implementation which works.
Could you share your code with me?
I followed your code above, but there are formatting issues in the listing, so it produces errors.
Thank you.
This code is just putting a header onto the Tiff data and asking Java to decode it. Java has some issues with some Tiff Data.
There are some improvements to Java support for Tiffs in Java 9 and we also now offer our own commercial image library with much better Tiff support (https://www.idrsolutions.com/jdeli).