Do not return null for POITextExtractor.getMetadataTextExtractor() for old Excel files
To adhere to the JavaDoc of the POITextExtractor interface which does not document a
possible null return.
git-svn-id: https://svn.apache.org/repos/asf/poi/trunk@1889205 13f79535-47bb-0310-9956-ffa450edef68
Pictures can now be edited by calling HSLFPictureData#setData(byte[]). The byte[] should contain the image data as an image viewer might read it.
To enable this functionality, a tighter coupling between the EscherBSERecords of the slideshow and the HSLFPictureData was required. This ensures that changes in image data size are accurately recorded in the records.
In the course of coupling the records and the HSLFPictureData, various scenarios arose where a mapping of records to pictures was non-trivial. Accordingly, the HSLFSlideShowImpl#matchPicturesAndRecords(...) function was added to perform a more sophisticated matching pass. This function is heavily exercised by org.apache.poi.hslf.usermodel.TestBugs.testFile[5] and PPTX2PNG.render[2], as well as the new TestPictures#testSlideshowWithIncorrectOffsets().
Closes #225
git-svn-id: https://svn.apache.org/repos/asf/poi/trunk@1887017 13f79535-47bb-0310-9956-ffa450edef68
Bug 64132: Fix regression introduced by changing factories
We need to return EscherBlipRecord for all types in the defined range of recordId,
not UnknownEscherRecord.
git-svn-id: https://svn.apache.org/repos/asf/poi/trunk@1885092 13f79535-47bb-0310-9956-ffa450edef68
ooxml-schema: trigger loading stress-documents as part of the normal unit-tests
The integration-tests are not executed for determining the parts of the schema
to include in the "lite" package and so we need to have a normal unit-test as well.
Add more documents which drag in some more parts from the ooxml-schema
git-svn-id: https://svn.apache.org/repos/asf/poi/trunk@1885057 13f79535-47bb-0310-9956-ffa450edef68
Bug 64450: Allow to parse a file where the relationship-id is an empty string
Also improve exception reporting by including more information and not
"hiding" the actual exception-cause in a log-statement.
git-svn-id: https://svn.apache.org/repos/asf/poi/trunk@1885056 13f79535-47bb-0310-9956-ffa450edef68
Bug 64750: Do not use CTDataValidations.getCount(), instead only rely on getDataValidationArray
Field "count" seems to be optional according
to the schema, so by using only the array
Apache POI can read data validations from
workbooks that do not populate the
"count" field properly.
git-svn-id: https://svn.apache.org/repos/asf/poi/trunk@1885011 13f79535-47bb-0310-9956-ffa450edef68
Improve the speed of the styles-optimiser when there are many styles in a Workbook
Profiling showed that the call to isUserDefined() took most of the time,
so we now fetch this only once for each style and remember the result and
use it in the inner loop.
The attached sample-file took aprox. 1m30s on a decent machine, now it is down
to around 2s.
git-svn-id: https://svn.apache.org/repos/asf/poi/trunk@1884917 13f79535-47bb-0310-9956-ffa450edef68
Bug 64322: Optimize performance of reading ole2 files
Remember channel-size to no re-read it for every read-access,
but reset it if we extend the size of the file
profiling indicates Channel.size() sometimes has similar runtime
overhead as position() or read()!
git-svn-id: https://svn.apache.org/repos/asf/poi/trunk@1877816 13f79535-47bb-0310-9956-ffa450edef68