* Add text-extraction verification to integration-tests via a new abstract base FileHandler
* Fix NullPointerException found in some documents when running against the test-data
* Add support for extracting text from Dir-Entries WORKBOOK and BOOK to support some old/strangely formatted XLS files.
git-svn-id: https://svn.apache.org/repos/asf/poi/trunk@1662652 13f79535-47bb-0310-9956-ffa450edef68
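A minimal sketch of the directory-entry fallback described in the commit above, assuming the entry names of an OLE2 container are available as a plain set of strings. The class and method names are hypothetical and not taken from the POI sources, and treating "Workbook" as the first candidate is an assumption of this sketch.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.Set;

// Hypothetical helper for illustration only; not POI's actual implementation.
final class WorkbookEntrySketch {
    // Candidate entry names, tried in order. "WORKBOOK" and "BOOK" are the
    // names mentioned in the commit message; "Workbook" first is an assumption.
    static final List<String> CANDIDATES = Arrays.asList("Workbook", "WORKBOOK", "BOOK");

    private WorkbookEntrySketch() {}

    // Returns the first candidate name present in the given directory listing,
    // or an empty Optional if the container holds no workbook stream.
    static Optional<String> findWorkbookEntry(Set<String> entryNames) {
        for (String candidate : CANDIDATES) {
            if (entryNames.contains(candidate)) {
                return Optional.of(candidate);
            }
        }
        return Optional.empty();
    }
}
```

A text extractor could call such a lookup before opening the stream, so that old files using an upper-case or short entry name are handled instead of being rejected.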
Using Charset.forName() for known encodings makes catching UnsupportedEncodingException obsolete
Unify UTF-16LE conversion to StringUtil
BugFix for RecordInputStream.readFully in combination with continuing records
BugFix for integration tests - fix pathname for handler/exclude lookup on Windows
git-svn-id: https://svn.apache.org/repos/asf/poi/trunk@1648032 13f79535-47bb-0310-9956-ffa450edef68
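The charset change in the commit above can be illustrated roughly as follows. This is a sketch under assumptions: the helper class and its method names are hypothetical and do not reproduce POI's StringUtil. It only shows why resolving the Charset once removes the need to catch a checked exception on every conversion.

```java
import java.nio.charset.Charset;

// Hypothetical helper for illustration only; not POI's actual StringUtil.
final class Utf16LeSketch {
    // UTF-16LE is one of the charsets every Java platform is required to
    // support, so this lookup is not expected to fail at class-load time.
    private static final Charset UTF16LE = Charset.forName("UTF-16LE");

    private Utf16LeSketch() {}

    // String.getBytes(Charset) declares no checked exception, unlike
    // String.getBytes(String), which throws UnsupportedEncodingException.
    static byte[] toUtf16Le(String text) {
        return text.getBytes(UTF16LE);
    }

    // Decodes charCount UTF-16 code units (two bytes each) starting at offset.
    static String fromUtf16Le(byte[] data, int offset, int charCount) {
        return new String(data, offset, charCount * 2, UTF16LE);
    }
}
```

Keeping both directions of the conversion in one place means callers no longer repeat the encoding name as a string literal.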
Add a test-suite that performs integration/stress tests by loading and processing all stored test files in various ways.
It uses one handler per file type; each handler performs various operations on the files, e.g. loading,
iterating over content, modifying, ... This exposes changes that break handling of the available test-files and
thus provides another layer of regression testing, which should keep some failures from making it into
releases.
It is runnable via a new ant-target 'test-integration' and has also been added to the jenkins-target.
git-svn-id: https://svn.apache.org/repos/asf/poi/trunk@1647885 13f79535-47bb-0310-9956-ffa450edef68
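The handler-per-file-type approach described in the commit above might look roughly like this. It is a sketch under assumptions: the interface, the registry class and all method names are hypothetical and are not copied from the POI test sources. Files are passed as byte arrays here only to keep the example self-contained.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical handler contract: one implementation per file type.
interface SketchFileHandler {
    void handleFile(InputStream stream) throws IOException;
}

// Hypothetical registry that picks a handler by file extension.
final class HandlerRegistrySketch {
    private final Map<String, SketchFileHandler> handlers = new HashMap<>();

    void register(String extension, SketchFileHandler handler) {
        handlers.put(extension.toLowerCase(Locale.ROOT), handler);
    }

    // Returns the lower-cased extension including the dot, or "" if none.
    static String extensionOf(String path) {
        int separator = Math.max(path.lastIndexOf('/'), path.lastIndexOf('\\'));
        int dot = path.lastIndexOf('.');
        return dot > separator ? path.substring(dot).toLowerCase(Locale.ROOT) : "";
    }

    // Runs the matching handler; returns false when no handler is registered.
    boolean run(String path, byte[] content) {
        SketchFileHandler handler = handlers.get(extensionOf(path));
        if (handler == null) {
            return false;
        }
        try (InputStream in = new ByteArrayInputStream(content)) {
            handler.handleFile(in);
        } catch (IOException e) {
            // Surface handler failures so a test run reports the broken file.
            throw new UncheckedIOException("Failed handling " + path, e);
        }
        return true;
    }
}
```

A test driver would walk the test-data directory, call run for each file, and fail on any exception, so a change that breaks loading of an existing sample file shows up immediately.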
Apply the copy2license.pl script (with a tiny modification to allow for more whitespace than it expects in the POI header) to all files. ant jar succeeds, and the svn diff has been verified by eye. More files remain to be done; this is the first pass.