Part 4: Hello World PDF
Back when dinosaurs roamed the earth I talked about the different objects that are used to form a PDF file. One type I mentioned was the stream object. Stream objects contain all the instructions describing what a PDF page is going to look like. By the end of this article we are going to be able to make a Hello World PDF. To do that I’m going to need a stream object so I can put some text in a PDF document.
If you open up any old PDF in a text editor, the majority of the text you will see is contained in stream objects. Their format is slightly different from the other objects: a stream starts with a dictionary, which must have a /Length entry saying how long the stream is in bytes. The length covers everything between the keywords stream and endstream, not counting the end-of-line marker right after stream or the final end-of-line characters before endstream if the stream has them. Normally when you open a PDF the stuff in the stream is compressed. You can tell what kind of compression was used from the /Filter key in the stream’s dictionary.
10 0 obj<</Length 40 /Filter /FlateDecode>>
stream
...bunch of compressed stuff...
endstream
endobj
If you went to the trouble of uncompressing this stuff you would find a list of instructions. These instructions are the commands that create all the content on a PDF page. Here are the contents of the stream uncompressed:
BT
/F1 24 Tf
175 720 Td
(Hello World!)Tj
ET
BT means Begin Text and ET means End Text. The stuff in between sets the font, the position and what it’s going to say. The instructions are Tf (set the font and size), Td (move to a position) and Tj (show the text). Note how the values that these instructions need are written first.
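To make the "values first, instruction last" idea concrete, here is a rough Python sketch. It compresses and uncompresses the stream with the standard zlib module (the format /FlateDecode uses), then groups each operator with the operands written before it. The `read_operations` helper and its `TOKEN` pattern are my own illustration, not part of any PDF library, and they are deliberately simplified: they ignore nested parentheses, escapes, arrays, hex strings and inline images.

```python
import re
import zlib

# The content stream from the example, as raw bytes.
content = b"BT\n/F1 24 Tf\n175 720 Td\n(Hello World!)Tj\nET"

# /FlateDecode data is zlib-compressed, so zlib can round-trip it.
compressed = zlib.compress(content)
restored = zlib.decompress(compressed)

# Simplified token pattern (an assumption for this sketch): a literal string
# without nested parentheses or escapes, a /Name, a number, or an operator.
TOKEN = re.compile(r"\([^()]*\)|/\w+|-?\d+(?:\.\d+)?|[A-Za-z*]+")


def read_operations(data):
    """Group each operator with the operands written before it."""
    operations, operands = [], []
    for token in TOKEN.findall(data.decode("latin-1")):
        if token[0].isalpha():
            # Operators are bare keywords; they consume the pending operands.
            operations.append((token, operands))
            operands = []
        else:
            # Names, numbers and strings are operands waiting for an operator.
            operands.append(token)
    return operations


for operator, operands in read_operations(restored):
    print(operator, operands)
```

Run as written, this prints one line per instruction, for example `Tf ['/F1', '24']` and `Tj ['(Hello World!)']`, with BT and ET taking no operands.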
Before I add that to my PDF document we have to sort out that reference to /F1. Inside a stream you can’t reference objects the way you do outside one (i.e. 10 0 R). Instead you have to map the name /F1 to a font object and make that mapping available through the page’s /Resources dictionary.
3 0 obj<</Type /Page /Parent 2 0 R /Resources 4 0 R /MediaBox [0 0 500 800] /Contents 7 0 R>>
endobj
4 0 obj<</Font 5 0 R>>
endobj
5 0 obj<</F1 6 0 R>>
endobj
6 0 obj<</Type /Font /Subtype /Type1 /BaseFont /Helvetica>>
endobj
7 0 obj<</Length 40>>
stream
BT
/F1 24 Tf
.....
endstream
endobj
So we are making use of a /Page object. The page’s /Contents entry points to a stream object that prints our text. The stream needs to know which object /F1 points to, and it finds that out through the page’s /Resources entry.
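If the chain of references is hard to follow, this small Python sketch models the objects from the listing above as plain dictionaries and walks from the page to the font behind /F1. The `Ref` class and the `resolve` and `lookup_font` helpers are hypothetical names made up for this illustration; a real PDF reader does much more work than this.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ref:
    """An indirect reference such as '6 0 R' (object number, generation)."""
    num: int
    gen: int = 0


# Objects 3 to 6 from the listing, modelled as Python dicts (illustrative only).
objects = {
    3: {"Type": "Page", "Parent": Ref(2), "Resources": Ref(4),
        "MediaBox": [0, 0, 500, 800], "Contents": Ref(7)},
    4: {"Font": Ref(5)},
    5: {"F1": Ref(6)},
    6: {"Type": "Font", "Subtype": "Type1", "BaseFont": "Helvetica"},
}


def resolve(value):
    """Follow indirect references until a direct object is reached."""
    while isinstance(value, Ref):
        value = objects[value.num]
    return value


def lookup_font(page_num, name):
    """Find the font a content stream means when it writes e.g. /F1."""
    page = resolve(Ref(page_num))
    resources = resolve(page["Resources"])
    fonts = resolve(resources["Font"])
    return resolve(fonts[name])


print(lookup_font(3, "F1")["BaseFont"])  # Helvetica
```

The lookup goes page, then /Resources, then /Font, then the name itself, which is why the stream can get away with writing only /F1.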
Anyway, put it all together and you get, possibly, a world first: how to make a “Hello World” PDF document!
%PDF-2.0
1 0 obj <</Type /Catalog /Pages 2 0 R>>
endobj
2 0 obj <</Type /Pages /Kids [3 0 R] /Count 1>>
endobj
3 0 obj<</Type /Page /Parent 2 0 R /Resources 4 0 R /MediaBox [0 0 500 800] /Contents 6 0 R>>
endobj
4 0 obj<</Font <</F1 5 0 R>>>>
endobj
5 0 obj<</Type /Font /Subtype /Type1 /BaseFont /Helvetica>>
endobj
6 0 obj
<</Length 44>>
stream
BT /F1 24 Tf 175 720 Td (Hello World!)Tj ET
endstream
endobj
xref
0 7
0000000000 65535 f
0000000009 00000 n
0000000056 00000 n
0000000111 00000 n
0000000212 00000 n
0000000250 00000 n
0000000317 00000 n
trailer <</Size 7/Root 1 0 R>>
startxref
406
%%EOF
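The fiddly part of writing a file like this by hand is the xref table, because every entry is the byte offset of an object and startxref is the byte offset of the xref keyword itself. As a rough sketch, the Python below assembles the same six objects and works those numbers out as it goes. The `build_pdf` function is my own hypothetical helper, it assumes plain ASCII text and LF line endings, and the offsets it produces only match a hand-written file if the bytes are laid out identically.

```python
def build_pdf(text="Hello World!"):
    """Return the bytes of a one-page PDF that shows `text` (ASCII only)."""
    # Backslashes and parentheses must be escaped inside a PDF literal string.
    escaped = text.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")
    content = b"BT /F1 24 Tf 175 720 Td (%s)Tj ET" % escaped.encode("ascii")

    # Object bodies in object-number order, starting at object 1.
    bodies = [
        b"<</Type /Catalog /Pages 2 0 R>>",
        b"<</Type /Pages /Kids [3 0 R] /Count 1>>",
        b"<</Type /Page /Parent 2 0 R /Resources 4 0 R"
        b" /MediaBox [0 0 500 800] /Contents 6 0 R>>",
        b"<</Font <</F1 5 0 R>>>>",
        b"<</Type /Font /Subtype /Type1 /BaseFont /Helvetica>>",
        # /Length counts only the stream data, not the surrounding newlines.
        b"<</Length %d>>\nstream\n%s\nendstream" % (len(content), content),
    ]

    out = bytearray(b"%PDF-2.0\n")
    offsets = []
    for num, body in enumerate(bodies, start=1):
        offsets.append(len(out))  # byte offset where "N 0 obj" starts
        out += b"%d 0 obj\n%s\nendobj\n" % (num, body)

    xref_pos = len(out)  # byte offset of the "xref" keyword
    size = len(bodies) + 1  # plus the special free entry for object 0
    out += b"xref\n0 %d\n" % size
    out += b"0000000000 65535 f \n"  # every entry is exactly 20 bytes
    for offset in offsets:
        out += b"%010d 00000 n \n" % offset
    out += b"trailer <</Size %d /Root 1 0 R>>\n" % size
    out += b"startxref\n%d\n%%%%EOF\n" % xref_pos
    return bytes(out)
```

Writing the returned bytes to a file opened in binary mode (`"wb"`) should give something a PDF viewer can open. Binary mode matters because any newline translation would shift the offsets.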