iText tells me a PDF is corrupt
Submitted by Anonymous Newbie on Thu, 03/24/2011 - 10:13
I'm using PdfStamper to add a watermark to an existing PDF document. This works fine, except for a couple of specific PDF documents: iText's PdfReader says it can't read them. However, I can open these files in Adobe Reader without any problem.
Content © 2010 1T3XT BVBA

Other means of detection
Submitted by MarkStorer on Thu, 03/24/2011 - 17:56.
If you have Acrobat Pro, you can use the "Report PDF Syntax Issues" preflight profile.
In Acrobat Pro X, you have to do some digging to find it:
(hey! Keyboard shortcut! ctrl+shift+x, then skip down to steps 5 & 6)
Whew. You can also right-click on the toolbar and add "preflight" to your toolbar so you don't have to go digging for it next time. I did (along with the AcroForm editor button).
In previous versions of Acrobat, it was much easier to find. Advanced->Preflight. Done. This is where I found the keyboard shortcut, and it worked in Acrobat Pro X. Joy!
The good news: This syntax checker is pretty good. It catches Lots Of Things, some subtle, other glaring.
The bad news: PDFs that had to be fixed just to open sometimes fail to be analyzed at all. This has been reported to Adobe via Leonard Rosenthol, their PDF Developer Relations guy. He knows his stuff and gets results. In the past, bugs I've reported to him have been fixed within a point release or two. Have patience.
Adobe Reader probably shouldn't open those files
Submitted by Bruno Lowagie on Thu, 03/24/2011 - 10:19.
Please read this Editorial at Planet PDF:
Some time ago, Adobe made the decision to automatically repair these malformed PDF files in Adobe Reader and Adobe Acrobat when they encountered them in the wild. Usually, users don't even know that a PDF has been repaired. It happens silently in the background as the PDF is opened. Often, the only clue that the PDF has been repaired is that the Save button is enabled. [...]
Many times when a 3rd party PDF vendor points out to the user that the reason that a particular PDF fails to process correctly in their product is that it is a malformed PDF. To which the user usually replies, "But it opens in Adobe Reader." Well, yes, that is true, but should it be?
iText can, to some extent, apply corrections to a malformed PDF. You can check whether a document had to be repaired with the isRebuilt() method of PdfReader. There are cases where iText can't "rebuild" the document. For instance: you can't manipulate a PDF in append mode when it's malformed.
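To get a feel for the kind of damage that forces a viewer or library to rebuild a file, here is a rough sketch in plain Java (no iText involved). It checks one thing only: whether the number after the last startxref keyword really points at a cross-reference section. This is not what iText or Adobe Reader do internally; it is just an illustration of one common defect, and the class and method names (XrefSanityCheck, startxrefLooksValid) are made up for this example.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of iText: a very rough structural check.
final class XrefSanityCheck {

    /**
     * Returns true if the file starts with "%PDF-" and the offset that follows
     * the last "startxref" keyword points at either an "xref" table or an
     * "N G obj" header (which is where a cross-reference stream would start).
     */
    static boolean startxrefLooksValid(byte[] pdf) {
        // ISO-8859-1 maps each byte to exactly one char,
        // so string indexes are the same as byte offsets.
        String s = new String(pdf, StandardCharsets.ISO_8859_1);

        // Strict check: viewers are often more tolerant about leading junk.
        if (!s.startsWith("%PDF-")) {
            return false;
        }

        int keyword = s.lastIndexOf("startxref");
        if (keyword < 0) {
            return false;
        }

        // Skip whitespace after the keyword, then read the decimal offset.
        int i = keyword + "startxref".length();
        while (i < s.length() && Character.isWhitespace(s.charAt(i))) {
            i++;
        }
        int digitsStart = i;
        while (i < s.length() && Character.isDigit(s.charAt(i))) {
            i++;
        }
        if (i == digitsStart) {
            return false; // no number after startxref
        }

        long offset;
        try {
            offset = Long.parseLong(s.substring(digitsStart, i));
        } catch (NumberFormatException e) {
            return false; // absurdly long number
        }
        if (offset >= s.length()) {
            return false; // points past the end of the file
        }

        // Look at a short window starting at the claimed offset.
        int from = (int) offset;
        String target = s.substring(from, Math.min(s.length(), from + 32));
        return target.startsWith("xref")
                || target.matches("(?s)\\d+\\s+\\d+\\s+obj.*");
    }
}
```

To try it on a real file, something like XrefSanityCheck.startxrefLooksValid(java.nio.file.Files.readAllBytes(java.nio.file.Paths.get("input.pdf"))) should do, where input.pdf is a placeholder. A false result means the file has this particular defect, which a tolerant viewer may silently repair. A true result does not prove the file is healthy; for that you still want a real syntax checker such as the preflight profile mentioned above.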