jPDFText is a Java library for processing PDF documents and extracting their words and textual content.
jPDFText does not require any additional third-party software or drivers.
Here are some key features of "jPDFText":
· Load PDF documents from files, network drives, URLs or input streams
· Extract text
· Extract words as a vector of Strings
· Written entirely in Java - allows your application to remain platform independent
Limitations:
· A maximum of 10 pages per document is processed, and some words are replaced with 'DEMO'.
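As a rough illustration of working with words returned as a vector of Strings, the sketch below counts word frequencies and skips the 'DEMO' placeholder mentioned in the limitations. It does not call jPDFText itself: the class name WordVectorSketch, the method countWords, and the sample word list are all hypothetical stand-ins for whatever the library actually returns.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Vector;

public class WordVectorSketch {

    // Placeholder text that replaces some words, per the limitation listed above.
    // Assumption: it appears as a standalone word spelled exactly "DEMO".
    static final String DEMO_MARKER = "DEMO";

    // Counts case-insensitive word frequencies in first-seen order,
    // skipping nulls, empty strings, and the demo placeholder.
    // Note: a genuine word spelled "DEMO" in the document is skipped as well.
    static Map<String, Integer> countWords(List<String> words) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String word : words) {
            if (word == null || word.isEmpty() || DEMO_MARKER.equals(word)) {
                continue;
            }
            counts.merge(word.toLowerCase(Locale.ROOT), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Hypothetical stand-in for a word vector extracted from a PDF.
        Vector<String> words =
                new Vector<>(Arrays.asList("PDF", "text", "DEMO", "pdf", "words"));
        System.out.println(countWords(words)); // prints {pdf=2, text=1, words=1}
    }
}
```

Because Vector implements List, the same method would accept any other list of extracted words without changes.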
What's New in This Release:
New Features:
Time Stamp Server for Digital Signatures:
· Timestamps are now supported in digital signatures.
· JavaScript is disabled by default. See the JavaScriptSettings class to enable JavaScript for all documents or per document.
· Our knowledge base describes which JavaScript events and objects are supported in Qoppa's PDF library.
Enhancements and Fixes:
· Libraries have been tested with the latest version of the Bouncy Castle libraries (used for encryption).
· Handle bookmarks that contain only 2 items in the destination: pageref and FitH, with no "top" value.
· Fix a null pointer exception when importing an FDF file created from a PDF annotated with Adobe XI under specific circumstances (after adding a review status and then deleting an annotation).
· For barcode fields whose type is not supported by Qoppa's PDF engine, the message "barcode type not found" is shown. In previous versions, the message was written to the file when saving the document.