Integration tests: Expect an exception for old Word documents but still run text extraction for them. Also run HPSFPropertiesExtractor where possible
* Add text-extraction verification to the integration tests via a new abstract base FileHandler
* Fix a NullPointerException triggered by some documents when running against the test data
* Add support for extracting text from the directory entries WORKBOOK and BOOK to support some old/strangely formatted XLS files.
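
A minimal sketch of the idea behind the abstract base handler, assuming a much simplified shape: loading may fail for files that are known to be unsupported, but text extraction is still run and checked. The names `FileHandlerSketch`, `handleFile`, `extractText` and `run` are hypothetical and use only the JDK; the real handlers work on POI document and extractor classes, which are not shown here.

```java
import java.util.Set;

/** Hypothetical, simplified base class; not the actual POI handler. */
abstract class FileHandlerSketch {
    /** Loads and exercises the document; may throw for unsupported formats. */
    protected abstract void handleFile(String fileName) throws Exception;

    /** Extracts the plain text of the document. */
    protected abstract String extractText(String fileName) throws Exception;

    /**
     * Runs both steps. An exception from handleFile is tolerated only for
     * files listed in expectedFailures; text extraction runs in either case.
     */
    public final String run(String fileName, Set<String> expectedFailures) throws Exception {
        try {
            handleFile(fileName);
        } catch (Exception e) {
            if (!expectedFailures.contains(fileName)) {
                throw e; // unexpected failure: report it
            }
            // expected failure (e.g. a very old format): continue with extraction
        }
        String text = extractText(fileName);
        if (text == null) {
            throw new IllegalStateException("No text extracted from " + fileName);
        }
        return text;
    }
}
```

A format-specific handler would subclass this and implement the two abstract methods; the shared `run` method keeps the extraction check in one place.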
git-svn-id: https://svn.apache.org/repos/asf/poi/trunk@1662652 13f79535-47bb-0310-9956-ffa450edef68
Using Charset.forName() for known encodings makes catching UnsupportedEncodingException obsolete
Unify UTF-16LE conversion in StringUtil
Bug fix for RecordInputStream.readFully in combination with continuing records
Bug fix for integration tests: fix pathname for handler/exclude lookup on Windows
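
The Charset change can be illustrated with the following sketch. The class `Utf16LeSketch` and its method names are hypothetical and are not the StringUtil API. The JDK calls used, `Charset.forName`, `String.getBytes(Charset)` and `new String(byte[], int, int, Charset)`, declare no checked exception, unlike their counterparts that take a charset name as a String.

```java
import java.nio.charset.Charset;

/** Hypothetical helper showing Charset-based UTF-16LE conversion. */
final class Utf16LeSketch {
    // UTF-16LE is one of the charsets every Java platform must support,
    // so this lookup cannot fail at runtime.
    private static final Charset UTF16LE = Charset.forName("UTF-16LE");

    private Utf16LeSketch() {}

    /** Decodes charCount UTF-16LE code units (two bytes each) starting at offset. */
    static String decode(byte[] data, int offset, int charCount) {
        return new String(data, offset, charCount * 2, UTF16LE);
    }

    /** Encodes a string as UTF-16LE without a byte-order mark. */
    static byte[] encode(String s) {
        return s.getBytes(UTF16LE);
    }
}
```

With the String-named variants, every call site needs a try/catch for an exception that cannot occur for a standard charset; resolving the Charset once removes that code.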
git-svn-id: https://svn.apache.org/repos/asf/poi/trunk@1648032 13f79535-47bb-0310-9956-ffa450edef68
Add a test suite which performs integration/stress tests that load and handle all stored test files in various ways.
It works by using a handler for each type of file, which performs various operations on the files, e.g. loading,
iterating over content, modifying, ... This will flag changes which break handling of the available test files and
thus provides another layer of regression testing, which hopefully prevents some failures from making it into
releases.
It is runnable via a new Ant target 'test-integration' and has also been added to the Jenkins target.
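
The per-file-type handler lookup described above could be sketched as follows. This is an illustration under assumptions, not the actual test suite: `HandlerLookupSketch`, its nested `Handler` interface and the extension-based mapping are hypothetical. Path separators are normalized so that a lookup behaves the same on Windows and Unix.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical registry mapping file extensions to handlers. */
final class HandlerLookupSketch {
    /** Hypothetical handler type: one implementation per file format. */
    interface Handler {
        void handle(String path) throws Exception;
    }

    private final Map<String, Handler> byExtension = new HashMap<String, Handler>();

    /** Registers a handler for an extension such as ".xls". */
    void register(String extension, Handler handler) {
        byExtension.put(extension.toLowerCase(Locale.ROOT), handler);
    }

    /** Converts backslashes to forward slashes. */
    static String normalize(String path) {
        return path.replace('\\', '/');
    }

    /** Returns the handler for the file's extension, or null if none is registered. */
    Handler find(String path) {
        String normalized = normalize(path);
        String name = normalized.substring(normalized.lastIndexOf('/') + 1);
        int dot = name.lastIndexOf('.');
        if (dot < 0) {
            return null; // no extension, so no handler
        }
        return byExtension.get(name.substring(dot).toLowerCase(Locale.ROOT));
    }
}
```

A test runner would walk the test-data directory, call `find` for each file and invoke the returned handler, so adding a new file type only requires registering one more handler.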
git-svn-id: https://svn.apache.org/repos/asf/poi/trunk@1647885 13f79535-47bb-0310-9956-ffa450edef68