From: Nick Burch
Date: Sun, 31 Aug 2008 17:24:10 +0000 (+0000)
Subject: HPBF docs update
X-Git-Tag: REL_3_2_FINAL~110
X-Git-Url: https://source.dussan.org/?a=commitdiff_plain;h=b944df7e56ea7de59aaf4373712a241872caf0e9;p=poi.git

HPBF docs update

git-svn-id: https://svn.apache.org/repos/asf/poi/trunk@690739 13f79535-47bb-0310-9956-ffa450edef68
---

diff --git a/src/documentation/content/xdocs/hpbf/file-format.xml b/src/documentation/content/xdocs/hpbf/file-format.xml
index e08ebbac04..ed01858288 100644
--- a/src/documentation/content/xdocs/hpbf/file-format.xml
+++ b/src/documentation/content/xdocs/hpbf/file-format.xml
@@ -171,6 +171,27 @@ PL 62 1a 00 00 48 00 00 00 // PL from: 1a62 (6754), len: 48 (72)
 think that the second 4 bytes of text describes the format of data block at the offset. The format of the text block is easy, but we're still trying to figure out the others.

+ +
Structure of TEXT bit +

This is very simple. All the text for the document is + stored in a single bit of the Quill CONTENTS. The text + is stored as little endian 16 bit unicode strings.

+
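As a minimal sketch of the TEXT bit description above: if the bit really is one run of little endian 16 bit unicode, decoding it comes down to a single UTF-16LE decode. The function name `decode_text_bit` is hypothetical and is not part of POI; locating the TEXT bit inside the Quill CONTENTS stream is assumed to have been done already.

```python
def decode_text_bit(data: bytes) -> str:
    """Decode the raw bytes of a TEXT bit into a string.

    Assumption: the whole bit is little endian 16 bit unicode
    (UTF-16LE) with no header or trailer bytes around it.
    """
    return data.decode("utf-16-le")


# Illustrative bytes only, not taken from a real Publisher file.
sample = "Hello, Publisher".encode("utf-16-le")
text = decode_text_bit(sample)
```

Real files may carry control characters or padding inside the text, which this sketch passes through untouched.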
+
Structure of PLC bit +

The first four bytes seem to hold the count of the + entries in the bit, and the second four bytes seem to hold + the type. There is then some pre-data, and then data for + each of the entries, the exact format dependent on the type.

+

Type 0 has four 2-byte unsigned ints, then a pair of 2-byte + unsigned ints for each entry.

+

Type 4 has four 2-byte unsigned ints, then a pair of 4-byte + unsigned ints for each entry.

+

Type 8 has seven 2-byte unsigned ints, then a pair of 4-byte + unsigned ints for each entry.

+

Type 12 holds hyperlinks, and is very much more complex. + See org.apache.poi.hpbf.model.qcbits.QCPLCBit + for our best guess as to how the contents match up.

+
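The PLC bit layout described above for types 0, 4 and 8 can be sketched roughly as follows. This is an illustration under stated assumptions, not the POI implementation: the count and type are read as 4-byte little endian unsigned ints, the pre-data is taken to start immediately after those 8 bytes, and all values are assumed little endian. The names `PLC_LAYOUTS` and `parse_plc_bit` are hypothetical. Type 12 (hyperlinks) is deliberately left out; see org.apache.poi.hpbf.model.qcbits.QCPLCBit for that.

```python
import struct

# Per type: (number of 2-byte pre-data ints, struct code for each entry field).
# "H" is a 2-byte unsigned int, "I" is a 4-byte unsigned int.
PLC_LAYOUTS = {
    0: (4, "H"),
    4: (4, "I"),
    8: (7, "I"),
}


def parse_plc_bit(data: bytes):
    """Split a PLC bit into (type, pre_data, entries).

    Assumptions: little endian throughout, and the pre-data
    directly follows the 8-byte count/type header.
    """
    count, plc_type = struct.unpack_from("<II", data, 0)
    if plc_type not in PLC_LAYOUTS:
        raise ValueError("unsupported PLC type: %d" % plc_type)

    n_pre, field = PLC_LAYOUTS[plc_type]
    offset = 8
    pre_data = struct.unpack_from("<%dH" % n_pre, data, offset)
    offset += 2 * n_pre

    # Each entry is a pair of unsigned ints of the width given by the type.
    pair = struct.Struct("<" + field * 2)
    entries = []
    for _ in range(count):
        entries.append(pair.unpack_from(data, offset))
        offset += pair.size
    return plc_type, pre_data, entries


# Illustrative type 0 bit with two entries (made-up values).
sample = struct.pack("<II4H", 2, 0, 1, 2, 3, 4) + struct.pack("<4H", 10, 20, 30, 40)
parsed = parse_plc_bit(sample)
```

Keeping the per-type differences in a small table means a newly understood type only needs one more row, as long as it follows the same header, pre-data, entries shape.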
diff --git a/src/documentation/content/xdocs/hpbf/index.xml b/src/documentation/content/xdocs/hpbf/index.xml
index 01f49f061f..84d6948fd4 100755
--- a/src/documentation/content/xdocs/hpbf/index.xml
+++ b/src/documentation/content/xdocs/hpbf/index.xml
@@ -41,7 +41,7 @@ lots of offsets to other parts of the file.

 Our initial aim is to provide a text extractor for the format (now done), and be able to extract hyperlinks from within
- the document (not yet supported). Additional low level
+ the document (partly supported). Additional low level
 code to process the file format may follow, if there is demand and developer interest warrant it.

At this time, there is no usermodel api or similar.