As security is an ever-present issue, we thought we would highlight some of the security features offered by the PDF file format that you can enable to control or limit which actions users can perform. This blog post covers the types of action you can restrict and the different authentication techniques you can use to increase the security of your PDF.
What can be controlled?
Before we dive into the different security options available, it is useful to look at which actions you can control. You can limit who can:
- View the content
- Add/modify text annotations
- Fill in interactive form fields (including signature fields)
- Print the document
- Add comments
- Modify the content
- Extract content including graphics and text
It is worth noting that you cannot stop people from saving a PDF.
Even before encryption, the PDF file structure provides a small degree of obscurity, because much of the content is stored in binary (often compressed) streams rather than as plain text. You can open a PDF in a text editor, but the result is difficult to follow unless you understand how PDF files are structured internally. This should not be mistaken for real protection, as any PDF tool can read the content.
On top of that, you can encrypt your files to help prevent unauthorized access. Encrypting a PDF file means that the majority of the strings and streams that make up your content are encrypted. There are some exceptions: for example, the string values in the Encrypt dictionary are not encrypted, because a reader needs to read them in order to decrypt the file.
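Because the structural parts of a PDF stay readable, you can get a rough idea of whether a file is encrypted just by looking at its raw bytes. The following is a minimal Python sketch, not a real parser: the function name `inspect_pdf_bytes` and the sample data are our own, and it assumes the header sits at the very start of the file and that an `/Encrypt` key only appears in the trailer.

```python
import re


def inspect_pdf_bytes(data: bytes) -> dict:
    """Heuristic look at raw PDF bytes (a sketch, not a full parser)."""
    # The header normally looks like "%PDF-1.7" at the start of the file.
    header = re.match(rb"%PDF-(\d\.\d)", data)
    return {
        "version": header.group(1).decode("ascii") if header else None,
        # \b stops this from matching the longer key /EncryptMetadata.
        "has_encrypt_entry": re.search(rb"/Encrypt\b", data) is not None,
    }


# Hypothetical, heavily simplified file contents for illustration only.
sample = (
    b"%PDF-1.7\n"
    b"1 0 obj\n<< /Type /Catalog >>\nendobj\n"
    b"trailer\n<< /Root 1 0 R /Encrypt 5 0 R >>\n%%EOF\n"
)
info = inspect_pdf_bytes(sample)
```

A real PDF library reads the trailer properly rather than searching the bytes, so treat this only as a quick check.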
How is the data encrypted? PDFs use a security handler, a software module that implements various aspects of the encryption process. It also enforces the permissions you set to control access to, and actions on, the encrypted document.
The security handler looks at the values set in the encryption dictionary, including the additional entries defined for the standard security handler, to ascertain which user access permissions to allow. Some of the values it reads include:
- Filter – The name of the preferred security handler to use for decryption (it has to be the one used to encrypt the document, otherwise the file cannot be opened)
- V – An optional code that specifies the algorithm to use for encrypting and decrypting the document
- R – The revision of the standard security handler to use
- O – Stores a 32-byte string based on both the owner and user passwords. It is used to compute the encryption key and to validate the owner password
- U – Stores a 32-byte string based on just the user password. It is used to determine whether to prompt the user for a password and, if so, whether a valid user or owner password was entered
- P – A set of flags specifying which actions are allowed when the document is opened with user access
- EncryptMetadata – An optional boolean, meaningful only when V is 4 or 5, that indicates whether the document-level metadata stream is encrypted
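To make the P entry more concrete, here is a minimal Python sketch of how its flags could be decoded. The bit positions follow the user access permissions table in the PDF 1.7 specification for the standard security handler (revision 3 or later). The function name `decode_permissions` and the permission labels are our own and are not taken from any library.

```python
# Bit positions are 1-based, counted from the low-order bit, as in the spec.
PERMISSION_BITS = {
    3: "print",
    4: "modify_contents",
    5: "extract_content",
    6: "annotate_and_fill_forms",
    9: "fill_forms",
    10: "extract_for_accessibility",
    11: "assemble_document",
    12: "print_high_quality",
}


def decode_permissions(p: int) -> dict:
    """Turn the signed 32-bit /P value into a {label: allowed} mapping."""
    unsigned = p & 0xFFFFFFFF  # /P is stored as a signed integer
    return {
        name: bool(unsigned & (1 << (bit - 1)))
        for bit, name in PERMISSION_BITS.items()
    }


nothing_allowed = decode_permissions(-3904)
print_only = decode_permissions(-1852)
```

With these values, every entry in `nothing_allowed` is `False`, while `print_only` allows only `print` and `print_high_quality`. Keep in mind that these flags are honoured by the viewing software, so they rely on the viewer respecting them.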
There are different methods you can use to let certain groups or individuals authenticate themselves, both as the sender of a document and as a recipient:
- User Password Protection – you can use a user password to simply restrict viewing of the PDF. This is the simplest method and just requires the password to be shared beforehand. The downside is that it is also the most vulnerable: the protection is only as strong as the password and the encryption behind it, and some tools attempt to recover or bypass weak passwords so the file can be viewed anyway. That is why strong encryption is recommended for sensitive data, as it adds an extra barrier against unauthorized access.
- Owner Password Protection – an owner password can be set to control permissions. You cannot add or modify permissions without this password.
- Digital Signatures – these authenticate both the document and the sender, and protect the document's integrity by ensuring it hasn't been modified.
- Certificates – if you know in advance which people or groups will be allowed access to the PDF, you could also use certificates. These encrypt documents with a different mechanism from a password and are more flexible, as you can provide multiple certificates with different permissions for multiple users and groups. Only recipients whose certificates were included in advance can open PDFs with this security feature enabled.
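To illustrate how a digital signature can detect modification, here is a minimal Python sketch of the digest step only. It assumes a signature that covers the file in byte ranges given as offset/length pairs (as the ByteRange entry of a PDF signature does), leaving out the bytes where the signature itself is stored. The function name, the sample data and the choice of SHA-256 are our own assumptions; a real signature also wraps the digest in a signed structure tied to the signer's certificate, which is not shown here.

```python
import hashlib
from typing import Sequence


def digest_byte_ranges(data: bytes, byte_range: Sequence[int]) -> str:
    """Hash the parts of the file listed as [offset, length, offset, length, ...]."""
    h = hashlib.sha256()
    for offset, length in zip(byte_range[0::2], byte_range[1::2]):
        h.update(data[offset:offset + length])
    return h.hexdigest()


# Hypothetical stand-in for a signed file: the middle part holds the signature.
original = b"HEADER<SIGNATURE>TRAILER"
covered = [0, 6, 17, 7]  # "HEADER" and "TRAILER", skipping "<SIGNATURE>"

signed_digest = digest_byte_ranges(original, covered)

# Changing any covered byte gives a different digest, so the edit is detected.
tampered = b"HEADER<SIGNATURE>TRAILEX"
tampered_digest = digest_byte_ranges(tampered, covered)
```

Comparing `signed_digest` with `tampered_digest` shows the mismatch a validating reader would report.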
That concludes our brief overview of the security features available for a PDF. For a more extensive list, you can view the last publicly available version of the PDF Specification.