Mark Stephens Mark has been working with Java and PDF since 1999 and is a big NetBeans fan. He enjoys speaking at conferences. He has an MA in Medieval History and a passion for reading.

What are Hyperlinks in PDF files?



Hyperlinks are clickable links to external resources. They appear on web pages and in other documents and let you go to a web page or download a file. PDF files can include hyperlinks, which are stored as Annotations. Here are two examples of the raw data from inside a PDF file:

12 0 obj<</Subtype/Link/Rect[ 205.39 637.21 320.54 651.01] /BS<</W 0>>/F 4/A<</Type/Action/S/URI/URI(http://www.yahoo.com/) >>>>endobj

13 0 obj<</Subtype/Link/Rect[ 201.4 609.61 304.55 623.41] /BS<</W 0>>/F 4/A<</Type/Action/S/URI/URI(http://www.cnn.com/) >>>>endobj

First of all, the Link is identified by its subtype (/Link). The Rect defines the area of the page it applies to (using standard PDF co-ordinates). The A entry holds the action: here its type is /URI and the URI string is the address to open. Clicking inside the Rect will cause the link to open.
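To make the role of the Rect concrete, here is a minimal sketch of how a click could be tested against it. It assumes the click position has already been converted into PDF co-ordinates (by default the origin is at the bottom-left of the page and y increases upwards). The class and method names are hypothetical and are not part of JPedal or any other library.

```java
class LinkHitTest {

    /**
     * Returns true if the point (x, y) lies inside rect, where rect holds
     * the four values of a /Rect entry: two diagonally opposite corners.
     */
    static boolean contains(float[] rect, float x, float y) {
        // Normalise the corners rather than assuming lower-left/upper-right order
        float left = Math.min(rect[0], rect[2]);
        float right = Math.max(rect[0], rect[2]);
        float bottom = Math.min(rect[1], rect[3]);
        float top = Math.max(rect[1], rect[3]);

        return x >= left && x <= right && y >= bottom && y <= top;
    }

    public static void main(String[] args) {
        // Rect taken from object 12 in the raw data above
        float[] yahooRect = {205.39f, 637.21f, 320.54f, 651.01f};

        System.out.println(contains(yahooRect, 250f, 640f)); // true: inside the box
        System.out.println(contains(yahooRect, 250f, 620f)); // false: below the box
    }
}
```

A real viewer would also need to map screen pixels to page co-ordinates (allowing for zoom, rotation and crop box) before a test like this is meaningful.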

Because hyperlinks are Annotations, it is relatively easy to edit or delete them using a tool like iText. You just need to know the object reference numbers. So here is some code to extract the PDF object details from a PDF file.

How to view Hyperlink Data in JPedal

PdfDecoder decodePdf = new PdfDecoder(false);
decodePdf.openPdfFile(file_name);

/**
 * form code here
 */

//work through the annotations on each page (page numbers start at 1)
for (int ii = 1; ii < decodePdf.getPageCount() + 1; ii++) {
    PdfArrayIterator annotListForPage = decodePdf.getFormRenderer().getAnnotsOnPage(ii);

    if (annotListForPage != null && annotListForPage.getTokenCount() > 0) { //can have empty lists

        while (annotListForPage.hasMoreTokens()) {

            //get ID of annot which has already been decoded and get actual object
            String annotKey = annotListForPage.getNextValueAsString(true);

            Object rawObj = decodePdf.getFormRenderer().getCompData().getRawForm(annotKey);
            if (rawObj == null) {
                //no match found
                System.out.println("no match on " + annotKey);
            } else {
                //each PDF annot object: extract data from it
                FormObject annotObj = (FormObject) rawObj;

                int subtype = annotObj.getParameterConstant(PdfDictionary.Subtype);

                if (subtype == PdfDictionary.Link) {
                    System.out.println("link object is " + annotKey);
                    float[] coords = annotObj.getFloatArray(PdfDictionary.Rect);
                    System.out.println("Rect= " + coords[0] + " " + coords[1] + " " + coords[2] + " " + coords[3]);

                    //URI text is in the A subobject
                    PdfObject aData = annotObj.getDictionary(PdfDictionary.A);
                    if (aData != null && aData.getNameAsConstant(PdfDictionary.S) == PdfDictionary.URI) {
                        String text = aData.getTextStreamValue(PdfDictionary.URI);
                        System.out.println("text=" + text);
                    }
                }
            }
        }
    }
}
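For comparison, the same three details (object number, Rect and URI) can be pulled straight out of the raw object text shown earlier. The following is only a rough sketch using plain Java regular expressions, with hypothetical class and method names. It assumes the object is available as uncompressed text and that the URI string contains no escaped brackets, which will not hold for every PDF file, so a proper parser such as the one above is the safer route.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RawLinkParser {

    // Object number and generation at the start, e.g. "12 0 obj"
    private static final Pattern OBJ = Pattern.compile("^(\\d+)\\s+(\\d+)\\s+obj");
    // Annotation subtype, e.g. "/Subtype/Link"
    private static final Pattern LINK = Pattern.compile("/Subtype\\s*/Link\\b");
    // Four numbers inside "/Rect[ ... ]"
    private static final Pattern RECT = Pattern.compile(
            "/Rect\\s*\\[\\s*([-\\d.]+)\\s+([-\\d.]+)\\s+([-\\d.]+)\\s+([-\\d.]+)\\s*\\]");
    // The string value of the /URI key, e.g. "/URI(http://...)"
    private static final Pattern URI_VALUE = Pattern.compile("/URI\\s*\\(([^)]*)\\)");

    /** Returns the object number, or -1 if the text does not start with "n g obj". */
    static int objectNumber(String raw) {
        Matcher m = OBJ.matcher(raw);
        return m.find() ? Integer.parseInt(m.group(1)) : -1;
    }

    static boolean isLink(String raw) {
        return LINK.matcher(raw).find();
    }

    /** Returns the four Rect values, or null if no Rect is found. */
    static float[] rect(String raw) {
        Matcher m = RECT.matcher(raw);
        if (!m.find()) {
            return null;
        }
        float[] values = new float[4];
        for (int i = 0; i < 4; i++) {
            values[i] = Float.parseFloat(m.group(i + 1));
        }
        return values;
    }

    /** Returns the URI string, or null if the object has no URI value. */
    static String uri(String raw) {
        Matcher m = URI_VALUE.matcher(raw);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String raw = "12 0 obj<</Subtype/Link/Rect[ 205.39 637.21 320.54 651.01] /BS<</W 0>>/F 4"
                + "/A<</Type/Action/S/URI/URI(http://www.yahoo.com/) >>>>endobj";

        if (isLink(raw)) {
            float[] r = rect(raw);
            System.out.println("link object is " + objectNumber(raw));
            System.out.println("Rect= " + r[0] + " " + r[1] + " " + r[2] + " " + r[3]);
            System.out.println("text=" + uri(raw));
        }
    }
}
```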




2 Replies to “What are Hyperlinks in PDF files?”

  1. Hi Mark,

    I have an issue with a URI action included in a PDF file to redirect to a website. The URI is opened as soon as the PDF is opened. When I look at the PDF in a PDF exposure tool, I see the URI action formed as below.
    “””
    <>
    “””
    My first question: when I open this PDF in Chrome or any other browser, the URI is opened in the same tab. Is there a way to make the URI open in a new tab? How can that be done?

    My second question: when we open PDF files containing a URI action in PDF viewer tools, a security alert asks whether we want to allow the website or block it. I don't want this security alert. How can we avoid it? [Can we programmatically add the external website to the trusted list so that it doesn't ask? I am working in Python. Or is there any other way?]

    Thanks in advance. Can you suggest something asap?
