Hyperlinks are external, clickable links which appear on web pages and other documents and allow you to go to web pages or download files. PDF files can include hyperlinks, and each one is stored as an Annotation. Here are 2 examples of the raw data from inside a PDF file:
12 0 obj<</Subtype/Link/Rect[ 205.39 637.21 320.54 651.01] /BS<</W 0>>/F 4/A<</Type/Action/S/URI/URI(http://www.yahoo.com/) >>>>endobj
13 0 obj<</Subtype/Link/Rect[ 201.4 609.61 304.55 623.41] /BS<</W 0>>/F 4/A<</Type/Action/S/URI/URI(http://www.cnn.com/) >>>>endobj
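First of all, the Link is identified by its subtype (/Link). The Rect defines the area of the page which it applies to (using standard PDF co-ordinates). Clicking on this box will cause the link to open. The /A entry holds the action to perform: here it is a URI action (/S/URI), and its /URI value is the address which gets opened.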
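To make the structure above concrete, here is a minimal sketch in plain Java that pulls the subtype, Rect and URI out of one of the raw objects shown earlier. The class name RawLinkAnnotation and its methods are made up for illustration, and the regular expressions only cope with simple, uncompressed objects like these two samples. A real PDF needs a proper parser (objects can be compressed, and strings can contain escaped brackets), so treat this as a way to read the data by eye rather than a robust extractor.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only: reads the fields of a simple,
// uncompressed Link annotation from its raw text.
public class RawLinkAnnotation {

    // The four numbers inside /Rect[ x1 y1 x2 y2 ]
    private static final Pattern RECT = Pattern.compile(
        "/Rect\\s*\\[\\s*([-\\d.]+)\\s+([-\\d.]+)\\s+([-\\d.]+)\\s+([-\\d.]+)\\s*\\]");

    // The literal string after the /URI key, as in /S/URI/URI(http://...)
    // Assumes the address contains no ')' characters.
    private static final Pattern URI = Pattern.compile("/URI\\s*\\(([^)]*)\\)");

    // True if the raw object declares itself as a Link annotation
    public static boolean isLink(String raw) {
        return raw.contains("/Subtype/Link");
    }

    // Returns the Rect values in the order they appear, or null if not found
    public static float[] rect(String raw) {
        Matcher m = RECT.matcher(raw);
        if (!m.find()) {
            return null;
        }
        float[] values = new float[4];
        for (int i = 0; i < 4; i++) {
            values[i] = Float.parseFloat(m.group(i + 1));
        }
        return values;
    }

    // Returns the address held in the URI action, or null if not found
    public static String uri(String raw) {
        Matcher m = URI.matcher(raw);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String raw = "12 0 obj<</Subtype/Link/Rect[ 205.39 637.21 320.54 651.01] /BS<</W 0>>/F 4"
            + "/A<</Type/Action/S/URI/URI(http://www.yahoo.com/) >>>>endobj";

        System.out.println("is link = " + isLink(raw));
        float[] r = rect(raw);
        System.out.println("Rect    = " + r[0] + " " + r[1] + " " + r[2] + " " + r[3]);
        System.out.println("URI     = " + uri(raw));
    }
}
```

The URI pattern deliberately requires an opening bracket after /URI, so it skips the /S/URI name (which only says what kind of action this is) and matches the second /URI, which carries the address.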
Because Hyperlinks are Annotations, it is relatively easy to edit or delete them using a tool like iText. You just need to know the object reference numbers. So here is some code to extract the PDF object details from a PDF file.
How to view Hyperlink Data in JPedal
PdfDecoder decodePdf = new PdfDecoder(false);
decodePdf.openPdfFile(file_name);

/**
 * form code here
 */
//new list we can parse (PDF page numbers start at 1)
for(int ii=1;ii<decodePdf.getPageCount()+1;ii++){

    PdfArrayIterator annotListForPage = decodePdf.getFormRenderer().getAnnotsOnPage(ii);

    if(annotListForPage!=null && annotListForPage.getTokenCount()>0){ //can have empty lists

        while(annotListForPage.hasMoreTokens()){

            //get ID of annot which has already been decoded and get actual object
            String annotKey=annotListForPage.getNextValueAsString(true);
            Object rawObj=decodePdf.getFormRenderer().getCompData().getRawForm(annotKey);

            if(rawObj==null){
                //no match found
                System.out.println("no match on "+annotKey);
            }else{
                //each PDF annot object - extract data from it
                FormObject annotObj=(FormObject)rawObj;
                int subtype=annotObj.getParameterConstant(PdfDictionary.Subtype);

                //only interested in Link annotations
                if(subtype==PdfDictionary.Link){
                    System.out.println("link object is "+annotKey);

                    //area of the page the link covers
                    float[] coords=annotObj.getFloatArray(PdfDictionary.Rect);
                    System.out.println("Rect= "+coords[0]+" "+coords[1]+" "+coords[2]+" "+coords[3]);

                    //text in A subobject (only read it for URI actions)
                    PdfObject aData=annotObj.getDictionary(PdfDictionary.A);
                    if(aData!=null && aData.getNameAsConstant(PdfDictionary.S)==PdfDictionary.URI){
                        String text=aData.getTextStreamValue(PdfDictionary.URI);
                        System.out.println("text="+text);
                    }
                }
            }
        }
    }
}
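Once you have the Rect values, you can check whether a point on the page falls inside the link, which is what a viewer has to do when the user clicks. Below is a small sketch in plain Java. The class name LinkHitTest and its contains method are hypothetical and not part of JPedal. It assumes the point is already in the same PDF page co-ordinates as the Rect, and it takes the min and max of each pair because a PDF rectangle is defined by any two diagonally opposite corners.

```java
// Hypothetical helper for illustration only: tests whether a point lies
// inside a link's Rect, given as four floats as in the listing above.
public class LinkHitTest {

    // rect holds two diagonally opposite corners: x1 y1 x2 y2
    public static boolean contains(float[] rect, float x, float y) {
        // normalise, as the corners may be given in either order
        float minX = Math.min(rect[0], rect[2]);
        float maxX = Math.max(rect[0], rect[2]);
        float minY = Math.min(rect[1], rect[3]);
        float maxY = Math.max(rect[1], rect[3]);

        return x >= minX && x <= maxX && y >= minY && y <= maxY;
    }

    public static void main(String[] args) {
        // Rect taken from the first raw example (object 12)
        float[] coords = {205.39f, 637.21f, 320.54f, 651.01f};

        System.out.println("inside  = " + contains(coords, 250f, 640f));
        System.out.println("outside = " + contains(coords, 100f, 100f));
    }
}
```

A screen click would first need converting from screen pixels to PDF co-ordinates (allowing for scaling, rotation and the PDF origin being at the bottom left), which is left out here.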