author | Glen Stampoultzis <glens@apache.org> | 2003-04-24 00:53:41 +0000 |
---|---|---|
committer | Glen Stampoultzis <glens@apache.org> | 2003-04-24 00:53:41 +0000 |
commit | 2d641e1f74a3118502974885e5f6577c2758d8a3 (patch) | |
tree | 7e3eabf25a0b09b509e0a98ef18183190d81c37b /src/documentation/content/xdocs/poifs | |
parent | 3f0a48aa9cafdef785eaecf7f7ee369d58441eb5 (diff) | |
Merged from BUILD_BRANCH. Note: There is one problem. The HDF testcases are failing for me which prevents the full build from running. Committers, please feel free to tweak the build on your own now.
git-svn-id: https://svn.apache.org/repos/asf/jakarta/poi/trunk@353067 13f79535-47bb-0310-9956-ffa450edef68
Diffstat (limited to 'src/documentation/content/xdocs/poifs')
-rw-r--r-- | src/documentation/content/xdocs/poifs/book.xml | 17 | ||||
-rw-r--r-- | src/documentation/content/xdocs/poifs/fileformat.xml | 676 | ||||
-rw-r--r-- | src/documentation/content/xdocs/poifs/how-to.xml | 354 | ||||
-rw-r--r-- | src/documentation/content/xdocs/poifs/html/POIFSDesignDocument.html | 1279 | ||||
-rw-r--r-- | src/documentation/content/xdocs/poifs/index.xml | 40 | ||||
-rw-r--r-- | src/documentation/content/xdocs/poifs/usecases.xml | 635 |
6 files changed, 3001 insertions, 0 deletions
diff --git a/src/documentation/content/xdocs/poifs/book.xml b/src/documentation/content/xdocs/poifs/book.xml new file mode 100644 index 0000000000..9ee98e5fd0 --- /dev/null +++ b/src/documentation/content/xdocs/poifs/book.xml @@ -0,0 +1,17 @@ +<?xml version="1.0"?> +<!DOCTYPE book PUBLIC "-//APACHE//DTD Cocoon Documentation Book V1.0//EN" "../dtd/book-cocoon-v10.dtd"> + +<book software="Poi Project" + title="PoiFS" + copyright="@year@ Poi Project"> + + <menu label="Navigation"> + <menu-item label="Main" href="../index.html"/> + <menu-item label="How To" href="how-to.html"/> + <menu-item label="File System Documentation" href="fileformat.html"/> + <menu-item label="Use Cases" href="usecases.html"/> + </menu> + +</book> + + diff --git a/src/documentation/content/xdocs/poifs/fileformat.xml b/src/documentation/content/xdocs/poifs/fileformat.xml new file mode 100644 index 0000000000..315ca2c11d --- /dev/null +++ b/src/documentation/content/xdocs/poifs/fileformat.xml @@ -0,0 +1,676 @@ +<?xml version="1.0" encoding="UTF-8"?> +<!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V1.1//EN" "../dtd/document-v11.dtd"> +<document> + <header> + <title>POIFS File System Internals</title> + <authors> + <person email="mjohnson@apache.org" name="Marc Johnson" id="MJ"/> + </authors> + </header> + <body> + <section><title>POIFS File System Internals</title> + <section><title>Introduction</title> + <p>POIFS file systems are essentially normal files stored on a + Java-compatible platform's native file system. They are + typically identified by names ending in a four character + extension noting what type of data they contain. For + example, a file ending in ".xls" would likely + contain spreadsheet data, and a file ending in + ".doc" would probably contain a word processing + document. POIFS file systems are called "file + system", because they contain multiple embedded files + in a manner similar to traditional file systems. 
Along + functional lines, it would be more accurate to call these + POIFS archives. For the remainder of this document it is + referred to as a file system in order to avoid confusion + with the "files" it contains.</p> + <p>POIFS file systems are compatible with those document + formats used by a well-known software company's popular + office productivity suite and programs outputting + compatible data. Because the POIFS file system does not + provide compression, encryption or any other worthwhile + feature, it's not a good choice unless you require + interoperability with these programs.</p> + <p>The POIFS file system does not encode the documents + themselves. For example, if you had a word processor file + with the extension ".doc", you would actually + have a POIFS file system with a document file archived + inside of that file system.</p> + </section> + <section><title>Document Conventions</title> + <p>This document utilizes the numeric types as described by + the Java Language Specification, which can be found at + <link href="http://java.sun.com">http://java.sun.com</link>. In + short:</p> + <ul> + <li>A <em>byte</em> is an 8 bit signed integer ranging from + -128 to 127.</li> + <li>A <em>short</em> is a 16 bit signed integer ranging from + -32768 to 32767</li> + <li>An <em>int</em> is a 32 bit signed integer ranging from + -2147483648 to 2147483647</li> + <li>A <em>long</em> is a 64 bit signed integer ranging from + -9.22E18 to 9.22E18.</li> + </ul> + <p>The Java Language Specification spells out a number of + other types that are not referred to by this document.</p> + <p>Where this document makes references to "endian + conversion" it is referring to the byte order of + stored numbers. Numbers in "little-endian order" + are stored with the <em>least</em> significant byte first. 
In + order to properly read a short, for example, you'd read two + bytes and then shift the second byte 8 bits to the left + before performing an <code>or</code> operation to it + against the first byte. The following code illustrates this + method:</p> + <source> +public int getShort (byte[] rec) +{ + return ((rec[1] << 8) | (rec[0] & 0x00ff)); +}</source> + </section> + <section><title>File System Walkthrough</title> + <p>This is a walkthrough of a POIFS file system and how it is + put together. It is not intended to give a concise + description but to give a "big picture" of the + general structure and how it's interpreted.</p> + <p>A POIFS file system begins with a header. This header + identifies locations in the file by function and provides a + sanity check identifying a file as a POIFS file system.</p> + <p>The first 64 bits of the header compose a <em>magic number + identifier.</em> This identifier tells the client software + that this is indeed a POIFS file system and that it should + be treated as such. This is a "sanity check" to + make sure this is a POIFS file system and not some other + format. The header also contains an <em>array of block + numbers</em>. These block numbers refer to blocks in the + file. When these blocks are read together they form the + <em>Block Allocation Table</em>. The header also contains a + pointer to the first element in the <em>property table</em>, + also known as the <em>root element</em>, and a pointer to the + <em>small Block Allocation Table (SBAT)</em>.</p> + <p>The <em>block allocation table</em> or <em>BAT</em>, along with + the <em>property table</em>, specify which blocks in the file + system belong to which files. After the header block, the + file system is divided into identically sized blocks of + data, numbered from 0 to however many blocks there are in + the file system. For each file in the file system, its + entry in the property table includes the index of the first + block in the array of blocks. 
Each block's index into the + array of blocks is also its index into the BAT, and the + integer value stored at that index in the BAT gives the + index of the next block in the array (and thus the index of + the next BAT value). A special value is stored in the BAT + to indicate "end of file".</p> + <p>The <em>property table</em> is essentially the directory + storage for the file system. It consists of the name of the + file or directory, its <em>start block</em> in both the file + system and <em>BAT</em>, and its actual size. The first + property in the property table is the <em>root + element</em>. It has two purposes: to be a directory entry + (the root of the directory tree, to be specific), and to + hold the start block for the <em>small block data</em>.</p> + <p>Small block data is a special file that contains the data + for small files (less than 4K bytes). It subdivides its + blocks into smaller blocks and there is a special small + block allocation table that, like the main BAT for larger + files, is used to map a small file to its small blocks.</p> + </section> + <section><title>Header Block</title> + <p>The POIFS file system begins with a <em>header + block</em>. The first 64 bits of the header form a long + <em>file type id</em> or <em>magic number identifier</em> of + <code>0xE11AB1A1E011CFD0L</code>. This is basically a + sanity check. If this isn't the first thing in the header + (and consequently the file system) then this is not a + POIFS file system and should be read with some other + library.</p> + <p>It's important to know the most important parts of the + header. These are discussed in the rest of this + section.</p> + <section><title>BATs</title> + <p>At offset <em>0x2C</em> is an int specifying the number + of elements in the <em>BAT array</em>. The array at + <em>0x4C</em> is an array of ints. 
This array contains the + indices of every block in the Block Allocation + Table.</p> + </section> + <section><title>XBATs</title> + <p>Very large POIFS archives may have more blocks than can + be addressed by the BAT blocks enumerated in the header + block. How large? Well, the BAT array in the header can + contain up to 109 BAT block indices; each BAT block + references up to 128 blocks, and each block is 512 + bytes, so we're talking about 109 * 128 * 512 = + 6.8MB. That's a pretty respectable document! But, you + could have much more data than that, and in today's + world of cheap gigabyte drives, why not? So, the BAT + may be extended in that event. The integer value at + offset <em>0x44</em> of the header is the index of the + first <em>extended BAT (XBAT) block</em>. At offset + <em>0x48</em> of the header, there is an int value that + specifies how many XBAT blocks there are. The XBAT + blocks begin at the specified index into the array of + blocks making up the POIFS file system, and continue in + sequence for the specified count of XBAT blocks.</p> + <p>Each XBAT block contains the indices of up to 128 BAT + blocks, so the document size can be expanded by another + 8MB for each XBAT block. The BAT blocks indexed by an + XBAT block are appended to the end of the list of BAT + blocks enumerated in the header block. Thus the BAT + blocks enumerated in the header block are BAT blocks 0 + through 108, the BAT blocks enumerated in the first + XBAT block are BAT blocks 109 through 236, the BAT + blocks enumerated in the second XBAT block are BAT + blocks 237 through 364, and so on.</p> + <p>Through the use of XBAT blocks, the limit on the + overall document size is that imposed by the 4-byte + block indices; if the indices are unsigned ints, the + maximum file size is 2 terabytes, 1 terabyte if the + indices are treated as signed ints. 
Either way, I have + yet to see a disk drive large enough to accommodate + such a file on the shelves at the local office supply + stores.</p> + </section> + <section><title>SBATs</title> + <p>If a file contained in a POIFS archive is smaller than + 4096 bytes, it is stored in small blocks. Small blocks + are 64 bytes in length and are contained within big + blocks, up to 8 to a big block. As the main BAT is used + to navigate the array of big blocks, so the <em>small + block allocation table</em> is used to navigate the + array of small blocks. The SBAT's start block index is + found at offset <em>0x3C</em> of the header block, and + remaining blocks constituting the SBAT are found by + walking the main BAT as if it were an ordinary file in + the POIFS file system (this process is described + below).</p> + </section> + <section><title>Property Table Start Index</title> + <p>An integer at address <em>0x30</em> specifies the start + index of the property table. This integer is specified + as a <em>"block index"</em>. The Property Table + is stored, as is almost everything in a POIFS file + system, in big blocks and walked via the BAT. The + Property Table is described below.</p> + </section> + </section> + <section><title>Property Table</title> + <p>The property table is essentially nothing more than the + directory system. Properties are 128 byte records + contained within the 512 byte blocks. The first property + is always the Root Entry. The following applies to + individual properties within a property table:</p> + <ul> + <li>At offset <em>0x00</em> in the property is the + "<em>name</em>". This is stored as an + uncompressed 16 bit unicode string. In short every + other byte corresponds to an "ASCII" + character. The size of this string is stored at offset + <em>0x40</em> (<em>string size</em>) as a short.</li> + <li>At offset <em>0x42</em> is the <em>property type</em> + (byte). 
The type is 1 for directory, 2 for file or 5 + for the Root Entry.</li> + <li>At offset <em>0x43</em> is the <em>node color</em> + (byte). The color is either 1, (black), or 0, + (red). Properties are apparently meant to be arranged + in a red-black binary tree, subject to the following + rules: + <ol> + <li>The root of the tree is always black</li> + <li>Two consecutive nodes cannot both be red</li> + <li>A property is less than another property if its + name length is less than the other property's name + length</li> + <li>If two properties have the same name length, the + sort order is determined by the sort order of the + properties' names.</li> + </ol></li> + <li>At offset <em>0x44</em> is the index (int) of the + <em>previous property</em>.</li> + <li>At offset <em>0x48</em> is the index (int) of the + <em>next property</em>.</li> + <li>At offset <em>0x4C</em> is the index (int) of the + <em>first directory entry</em>. This is used by + directory entries.</li> + <li>At offset <em>0x74</em> is an integer giving the + <em>start block</em> for the file described by this + property. This index corresponds to an index in the + array of indices that is the Block Allocation Table + (or the Small Block Allocation Table) as well as the + index of the first block in the file. This is used by + files and the root entry.</li> + <li>At offset <em>0x78</em> is an integer giving the total + <em>actual size</em> of the file pointed at by this + property. If the file size is less than 4096, the file + is stored in small blocks and the SBAT is used to walk + the small blocks making up the file. If the file size + is 4096 or larger, the file is stored in big blocks + and the main BAT is used to walk the big blocks making + up the file. 
The exception to this rule is the <em>Root + Entry</em>, which, regardless of its size, is + <em>always</em> stored in big blocks and the main BAT is + used to walk the big blocks making up this special + file.</li> + </ul> + </section> + <section><title>Root Entry</title> + <p>The <em>Root Entry</em> in the <em>Property Table</em> + contains the information necessary to read and write + small files, which are files less than 4096 bytes + long. The start block field of the Root Entry is the + start index of the <em>Small Block Array</em>, which is + read like any other file in the POIFS file system. Since + the SBAT cannot be used without the Small Block Array, + the Root Entry MUST be read or written using the <em>Block + Allocation Table</em>. The blocks making up the Small + Block Array are divided into 64-byte small blocks, up to + the size indicated in the Root Entry (which should always + be a multiple of 64).</p> + </section> + <section><title>Walking the Nodes of the Property Table</title> + <p>The individual properties form a directory tree, with the + <em>Root Entry</em> as the directory tree's root, as shown + in the accompanying drawing. Note the numbers in + parentheses in each node; they represent the node's index + in the array of properties. The <em>NEXT_PROP</em>, + <em>PREVIOUS_PROP</em>, and <em>CHILD_PROP</em> fields hold + these indices, and are used to navigate the tree.</p> + <p><img alt="property set" src="images/PropertySet.jpg" /></p> + <p>Each directory entry (i.e., a property whose type is + <em>directory</em> or <em>root entry</em>) uses its + <em>CHILD_PROP</em> field to point to one of its + subordinate (child) properties. It doesn't seem to matter + which of its children it points to. Thus in the previous + drawing, the Root Entry's CHILD_PROP field may contain 1, + 4, or the index of one of its other children. 
Similarly, + the directory node (index 1) may have, in its CHILD_PROP + field, 2, 3, or the index of one of its other + children.</p> + <p>The children of a given directory property point to each + other in a similar fashion by using their + <em>NEXT_PROP</em> and <em>PREVIOUS_PROP</em> fields.</p> + <p>Unused <em>NEXT_PROP</em>, <em>PREVIOUS_PROP</em>, and + <em>CHILD_PROP</em> fields contain the marker value of + -1. All file properties have a value of -1 for their + CHILD_PROP fields for example.</p> + </section> + <section><title>Block Allocation Table</title> + <p>The <em>BAT blocks</em> are pointed at by the bat array + contained in the header and supplemented, if necessary, + by the <em>XBAT blocks</em>. These blocks form a large + table of integers. These integers are block numbers. The + <em>Block Allocation Table</em> holds chains of integers. + These chains are terminated with -2. The elements in + these chains refer to blocks in the files. The starting + block of a file is NOT specified in the BAT. It is + specified by the <em>property</em> for a given file. The + elements in this BAT are both the block number (within + the file minus the header) <em>and</em> the number of the + next BAT element in the chain. This can be thought of as + a linked list of blocks. 
The BAT array contains the links + from one block to the next, including the end of chain + marker.</p> + <p>Here's an example: Let's assume that the BAT begins as + follows:</p> + <p><code>BAT[ 0 ] = 2</code></p> + <p><code>BAT[ 1 ] = 5</code></p> + <p><code>BAT[ 2 ] = 3</code></p> + <p><code>BAT[ 3 ] = 4</code></p> + <p><code>BAT[ 4 ] = 6</code></p> + <p><code>BAT[ 5 ] = -2</code></p> + <p><code>BAT[ 6 ] = 7</code></p> + <p><code>BAT[ 7 ] = -2</code></p> + <p><code>...</code></p> + <p>Now, if we have a file whose Property Table entry says it + begins with index 0, we walk the BAT array and see that + the file consists of blocks 0 (because the start block is + 0), 2 (because BAT[ 0 ] is 2), 3 (BAT[ 2 ] is 3), 4 (BAT[ + 3 ] is 4), 6 (BAT[ 4 ] is 6), and 7 (BAT[ 6 ] is 7). It + ends at block 7 because BAT[ 7 ] is -2, which is the end + of chain marker.</p> + <p>Similarly, a file beginning at index 1 consists of + blocks 1 and 5.</p> + <p>Other special numbers in a BAT array are:</p> + <ul> + <li>-1, which indicates an unused block</li> + <li>-3, which indicates a "special" block, such + as a block used to make up the Small Block Array, the + Property Table, the main BAT, or the SBAT</li> + </ul> + </section> + <section><title>File System Structures</title> + <p>The following outlines the basic file system structures.</p> + <section><title>Header (block 1) -- 512 (0x200) bytes</title> + <table> + <tr> + <td><em>Field</em></td> + <td><em>Description</em></td> + <td><em>Offset</em></td> + <td><em>Length</em></td> + <td><em>Default value or const</em></td> + </tr> + <tr> + <td>FILETYPE</td> + <td>Magic number identifying this as a POIFS file + system.</td> + <td>0x0000</td> + <td>Long</td> + <td>0xE11AB1A1E011CFD0</td> + </tr> + <tr> + <td>UK1</td> + <td>Unknown constant</td> + <td>0x0008</td> + <td>Integer</td> + <td>0</td> + </tr> + <tr> + <td>UK2</td> + <td>Unknown Constant</td> + <td>0x000C</td> + <td>Integer</td> + <td>0</td> + </tr> + <tr> + <td>UK3</td> + 
<td>Unknown Constant</td> + <td>0x0014</td> + <td>Integer</td> + <td>0</td> + </tr> + <tr> + <td>UK4</td> + <td>Unknown Constant (revision?)</td> + <td>0x0018</td> + <td>Short</td> + <td>0x003B</td> + </tr> + <tr> + <td>UK5</td> + <td>Unknown Constant (version?)</td> + <td>0x001A</td> + <td>Short</td> + <td>0x0003</td> + </tr> + <tr> + <td>UK6</td> + <td>Unknown Constant</td> + <td>0x001C</td> + <td>Short</td> + <td>-2</td> + </tr> + <tr> + <td>LOG_2_BIG_BLOCK_SIZE</td> + <td>Log, base 2, of the big block size</td> + <td>0x001E</td> + <td>Short</td> + <td>9 (2 ^ 9 = 512 bytes)</td> + </tr> + <tr> + <td>LOG_2_SMALL_BLOCK_SIZE</td> + <td>Log, base 2, of the small block size</td> + <td>0x0020</td> + <td>Integer</td> + <td>6 (2 ^ 6 = 64 bytes)</td> + </tr> + <tr> + <td>UK7</td> + <td>Unknown Constant</td> + <td>0x0024</td> + <td>Integer</td> + <td>0</td> + </tr> + <tr> + <td>UK8</td> + <td>Unknown Constant</td> + <td>0x0028</td> + <td>Integer</td> + <td>0</td> + </tr> + <tr> + <td>BAT_COUNT</td> + <td>Number of elements in the BAT array</td> + <td>0x002C</td> + <td>Integer</td> + <td>required</td> + </tr> + <tr> + <td>PROPERTIES_START</td> + <td>Block index of the first block of the property + table</td> + <td>0x0030</td> + <td>Integer</td> + <td>required</td> + </tr> + <tr> + <td>UK9</td> + <td>Unknown Constant</td> + <td>0x0034</td> + <td>Integer</td> + <td>0</td> + </tr> + <tr> + <td>UK10</td> + <td>Unknown Constant</td> + <td>0x0038</td> + <td>Integer</td> + <td>0x00001000</td> + </tr> + <tr> + <td>SBAT_START</td> + <td>Block index of first big block containing the small + block allocation table (SBAT)</td> + <td>0x003C</td> + <td>Integer</td> + <td>-2</td> + </tr> + <tr> + <td>SBAT_Block_Count</td> + <td>Number of big blocks holding the SBAT</td> + <td>0x0040</td> + <td>Integer</td> + <td>1</td> + </tr> + <tr> + <td>XBAT_START</td> + <td>Block index of the first block in the Extended Block + Allocation Table (XBAT)</td> + <td>0x0044</td> + <td>Integer</td> + 
<td>-2</td> + </tr> + <tr> + <td>XBAT_COUNT</td> + <td>Number of elements in the Extended Block Allocation + Table (to be added to the BAT)</td> + <td>0x0048</td> + <td>Integer</td> + <td>0</td> + </tr> + <tr> + <td>BAT_ARRAY</td> + <td>Array of block indices constituting the Block + Allocation Table (BAT)</td> + <td>0x004C, 0x0050, 0x0054 ... 0x01FC</td> + <td>Integer[]</td> + <td>-1 for unused elements, at least first element must + be filled.</td> + </tr> + <tr> + <td>N/A</td> + <td>Header block data not otherwise described in this + table</td> + <td>N/A</td> + <td>N/A</td> + <td>-1</td> + </tr> + </table> + </section> + <section> + <title>Block Allocation Table Block -- 512 (0x200) bytes</title> + <table> + <tr> + <td> + <em>Field</em> + </td> + <td> + <em>Description</em> + </td> + <td> + <em>Offset</em> + </td> + <td> + <em>Length</em> + </td> + <td> + <em>Default value or const</em> + </td> + </tr> + <tr> + <td>BAT_ELEMENT</td> + <td>Any given element in the BAT block</td> + <td>0x0000, 0x0004, 0x0008, ... 0x01FC</td> + <td>Integer</td> + <td> + -1 = unused<br/> + -2 = end of chain<br/> + -3 = special (e.g., BAT block)<br/> + All other values point to the next element in the + chain and the next index of a block composing the + file. 
+ </td> + </tr> + </table> + </section> + <section><title>Property Block -- 512 (0x200) byte block</title> + <table> + <tr> + <td><em>Field</em></td> + <td><em>Description</em></td> + <td><em>Offset</em></td> + <td><em>Length</em></td> + <td><em>Default value or const</em></td> + </tr> + <tr> + <td>Properties[]</td> + <td>This block contains the properties.</td> + <td>0x0000, 0x0080, 0x0100, 0x0180</td> + <td>128 bytes</td> + <td>All unused space is set to -1.</td> + </tr> + </table> + </section> + <section><title>Property -- 128 (0x80) byte block</title> + <table> + <tr> + <td><em>Field</em></td> + <td><em>Description</em></td> + <td><em>Offset</em></td> + <td><em>Length</em></td> + <td><em>Default value or const</em></td> + </tr> + <tr> + <td>NAME</td> + <td>A unicode null-terminated uncompressed 16bit string + (lose the high bytes) containing the name of the + property.</td> + <td>0x00, 0x02, 0x04, ... 0x3E</td> + <td>Short[]</td> + <td>0x0000 for unused elements, field required, 32 + element (0x40 byte) max</td> + </tr> + <tr> + <td>NAME_SIZE</td> + <td>Number of characters in the NAME field</td> + <td>0x40</td> + <td>Short</td> + <td>Required</td> + </tr> + <tr> + <td>PROPERTY_TYPE</td> + <td>Property type (directory, file, or root)</td> + <td>0x42</td> + <td>Byte</td> + <td>1 (directory), 2 (file), or 5 (root entry)</td> + </tr> + <tr> + <td>NODE_COLOR</td> + <td>Node color</td> + <td>0x43</td> + <td>Byte</td> + <td>0 (red) or 1 (black)</td> + </tr> + <tr> + <td>PREVIOUS_PROP</td> + <td>Previous property index</td> + <td>0x44</td> + <td>Integer</td> + <td>-1</td> + </tr> + <tr> + <td>NEXT_PROP</td> + <td>Next property index</td> + <td>0x48</td> + <td>Integer</td> + <td>-1</td> + </tr> + <tr> + <td>CHILD_PROP</td> + <td>First child property index</td> + <td>0x4c</td> + <td>Integer</td> + <td>-1</td> + </tr> + <tr> + <td>SECONDS_1</td> + <td>Seconds component of the created timestamp?</td> + <td>0x64</td> + <td>Integer</td> + <td>0</td> + </tr> + <tr> + 
<td>DAYS_1</td> + <td>Days component of the created timestamp?</td> + <td>0x68</td> + <td>Integer</td> + <td>0</td> + </tr> + <tr> + <td>SECONDS_2</td> + <td>Seconds component of the modified timestamp?</td> + <td>0x6C</td> + <td>Integer</td> + <td>0</td> + </tr> + <tr> + <td>DAYS_2</td> + <td>Days component of the modified timestamp?</td> + <td>0x70</td> + <td>Integer</td> + <td>0</td> + </tr> + <tr> + <td>START_BLOCK</td> + <td>Starting block of the file, used as the first block + in the file and the pointer to the next block from + the BAT</td> + <td>0x74</td> + <td>Integer</td> + <td>Required</td> + </tr> + <tr> + <td>SIZE</td> + <td>Actual size of the file this property points + to. (used to truncate the blocks to the real + size).</td> + <td>0x78</td> + <td>Integer</td> + <td>0</td> + </tr> + </table> + </section> + </section> + </section> + </body> +</document> diff --git a/src/documentation/content/xdocs/poifs/how-to.xml b/src/documentation/content/xdocs/poifs/how-to.xml new file mode 100644 index 0000000000..ee525b5485 --- /dev/null +++ b/src/documentation/content/xdocs/poifs/how-to.xml @@ -0,0 +1,354 @@ +<?xml version="1.0" encoding="UTF-8"?> +<!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V1.1//EN" "../dtd/document-v11.dtd"> +<document> + <header> + <title>How To Use the POIFS APIs</title> + <authors> + <person email="mjohnson@apache.org" name="Marc Johnson" id="MJ"/> + </authors> + </header> + <body> + <section><title>How To Use the POIFS APIs</title> + <p>This document describes how to use the POIFS APIs to read, write, and modify files that employ a POIFS-compatible data structure to organize their content.</p> + <section><title>Revision History</title> + <ul> + <li>02.10.2002 - completely rewritten from original documents on <link href="https://sourceforge.net/cvs/?group_id=32701">sourceforge</link></li> + </ul> + </section> + <section><title>Target Audience</title> + <p>This document is intended for Java developers who need to use the POIFS 
APIs to read, write, or modify files that employ a POIFS-compatible data structure to organize their content. It is not necessary for developers to understand the POIFS data structures, and an explanation of those data structures is beyond the scope of this document. It is expected that the members of the target audience will understand the rudiments of a hierarchical file system, and familiarity with the event pattern employed by Java APIs such as AWT would be helpful.</p> + </section> + <section><title>Glossary</title> + <p>This document attempts to be consistent in its terminology, which is defined here:</p> + <table> + <tr> + <td><em>Term</em></td> + <td><em>Definition</em></td> + </tr> + <tr> + <td>Directory</td> + <td>A special file that may contain other directories and documents.</td> + </tr> + <tr> + <td>DirectoryEntry</td> + <td>Representation of a directory within another directory.</td> + </tr> + <tr> + <td>Document</td> + <td>A file containing data, such as word processing data or a spreadsheet workbook.</td> + </tr> + <tr> + <td>DocumentEntry</td> + <td>Representation of a document within a directory.</td> + </tr> + <tr> + <td>Entry</td> + <td>Representation of a file in a directory.</td> + </tr> + <tr> + <td>File</td> + <td>A named entity, managed and contained by the file system.</td> + </tr> + <tr> + <td>File System</td> + <td>The POIFS data structures, plus the contained directories and documents, which are maintained in a hierarchical directory structure.</td> + </tr> + <tr> + <td>Root Directory</td> + <td>The directory at the base of a file system. All file systems have a root directory. The POIFS APIs will not allow the root directory to be removed or renamed, but it can be accessed for the purpose of reading its contents or adding files (directories and documents) to it.</td> + </tr> + </table> + </section> + </section> + <section><title>Reading a File System</title> + <p>This section covers reading a file system. 
There are two ways to read a file system; these techniques are sketched out in the following table, and then explained in greater depth in the sections following the table.</p> + <table> + <tr> + <td><em>Technique</em></td> + <td><em>Advantages</em></td> + <td><em>Disadvantages</em></td> + </tr> + <tr> + <td>Conventional Reading</td> + <td> + Simpler API similar to reading a conventional file system.<br/> + Can read documents in any order. + </td> + <td> + All files are resident in memory, whether your application needs them or not. + </td> + </tr> + <tr> + <td>Event-Driven Reading</td> + <td> + Reduced footprint -- only the documents you care about are processed.<br/> + Improved performance -- no time is wasted reading the documents you're not interested in. + </td> + <td> + More complicated API.<br/> + Need to know in advance which documents you want to read.<br/> + No control over the order in which the documents are read.<br/> + No way to go back and get additional documents except to re-read the file system, which may not be possible, e.g., if the file system is being read from an input stream that lacks random access support. + </td> + </tr> + </table> + <section><title>Conventional Reading</title> + <p>In this technique for reading, the entire file system is loaded into memory, and the entire directory tree can be walked by an application, reading specific documents at the application's leisure.</p> + <section><title>Preparation</title> + <p>Before an application can read a file from the file system, the file system needs to be loaded into memory. This is done by using the <code>org.apache.poi.poifs.filesystem.POIFSFileSystem</code> class. Once the file system has been loaded into memory, the application may need the root directory. 
The following code fragment will accomplish this preparation stage:</p> + <source> +// need an open InputStream; for a file-based system, this would be appropriate: +// InputStream inputStream = new FileInputStream(fileName); +POIFSFileSystem fs; +try +{ + fs = new POIFSFileSystem(inputStream); + DirectoryEntry root = fs.getRoot(); +} +catch (IOException e) +{ + // an I/O error occurred, or the InputStream did not provide a compatible + // POIFS data structure +}</source> + <p>Assuming no exception was thrown, the file system can then be read.</p> + <p>Note: loading the file system can take noticeable time, particularly for large file systems.</p> + </section> + <section><title>Reading the Directory Tree</title> + <p>Once the file system has been loaded into memory and the root directory has been obtained, the root directory can be read. The following code fragment shows how to read the entries in an <code>org.apache.poi.poifs.filesystem.DirectoryEntry</code> instance:</p> + <source> +// dir is an instance of DirectoryEntry ... +for (Iterator iter = dir.getEntries(); iter.hasNext(); ) +{ + Entry entry = (Entry)iter.next(); + System.out.println("found entry: " + entry.getName()); + if (entry instanceof DirectoryEntry) + { + // .. recurse into this directory + } + else if (entry instanceof DocumentEntry) + { + // entry is a document, which you can read + } + else + { + // currently, either an Entry is a DirectoryEntry or a DocumentEntry, + // but in the future, there may be other entry subinterfaces. The + // internal data structure certainly allows for a lot more entry types. + } +}</source> + </section> + <section><title>Reading a Specific Document</title> + <p>There are a couple of ways to read a document, depending on whether the document resides in the root directory or in another directory. 
Either way, you will obtain an <code>org.apache.poi.poifs.filesystem.DocumentInputStream</code> instance.</p> + <section><title>DocumentInputStream</title> + <p>The DocumentInputStream class is a simple implementation of InputStream that makes a few guarantees worth noting:</p> + <ul> + <li><code>available()</code> always returns the number of bytes remaining in the document from your current position in the document.</li> + <li><code>markSupported()</code> returns <code>true</code>.</li> + <li><code>mark(int limit)</code> ignores the limit parameter; basically the method marks the current position in the document.</li> + <li><code>reset()</code> takes you back to the position at which <code>mark()</code> was last called, or to the beginning of the document if <code>mark()</code> has not been called.</li> + <li><code>skip(long n)</code> will take you to your current position + n (but not past the end of the document).</li> + </ul> + <p>The behavior of <code>available</code> means you can read in a document in a single read call like this:</p> + <source> +byte[] content = new byte[ stream.available() ]; +stream.read(content); +stream.close();</source> + <p>The combination of <code>mark</code>, <code>reset</code>, and <code>skip</code> provides the basic mechanisms needed for random access of the document contents.</p> + </section> + <section><title>Reading a Document From the Root Directory</title> + <p>If the document resides in the root directory, you can obtain a <code>DocumentInputStream</code> like this:</p> + <source> +// load file system +try +{ + DocumentInputStream stream = filesystem.createDocumentInputStream(documentName); + // process data from stream +} +catch (IOException e) +{ + // no such document, or the Entry represented by documentName is not a + // DocumentEntry +}</source> + </section> + <section><title>Reading a Document From an Arbitrary Directory</title> + <p>A more generic technique for reading a document is to obtain an 
<code>org.apache.poi.poifs.filesystem.DirectoryEntry</code> instance for the directory containing the desired document (recall that you can use <code>getRoot()</code> to obtain the root directory from its file system). From that DirectoryEntry, you can then obtain a <code>DocumentInputStream</code> like this:</p> + <source> +DocumentEntry document = (DocumentEntry)directory.getEntry(documentName); +DocumentInputStream stream = new DocumentInputStream(document); +</source> + </section> + </section> + </section> + <section><title>Event-Driven Reading</title> + <p>The event-driven API for reading documents is a little more complicated and requires that your application know, in advance, which files it wants to read. The benefit of using this API is that each document is in memory just long enough for your application to read it, and documents that you never read are never held in memory at all. When you're finished reading the documents you wanted, the file system has no data structures associated with it at all and can be discarded.</p> + <section><title>Preparation</title> + <p>The preparation phase involves creating an instance of <code>org.apache.poi.poifs.eventfilesystem.POIFSReader</code> and then registering one or more <code>org.apache.poi.poifs.eventfilesystem.POIFSReaderListener</code> instances with the <code>POIFSReader</code>.</p> + <source> +POIFSReader reader = new POIFSReader(); +// register for everything +reader.registerListener(myOmnivorousListener); +// register for selective files in the root directory +reader.registerListener(myPickyListener, "foo"); +reader.registerListener(myPickyListener, "bar"); +// register for selective files in specific directories +reader.registerListener(myOtherPickyListener, new POIFSDocumentPath(), + "fubar"); +reader.registerListener(myOtherPickyListener, new POIFSDocumentPath( + new String[] { "usr", "bin" }), "fubar");</source> + </section> + <section><title>POIFSReaderListener</title> + <p><code>org.apache.poi.poifs.eventfilesystem.POIFSReaderListener</code> is 
an interface used to register for documents. When a matching document is read by the <code>org.apache.poi.poifs.eventfilesystem.POIFSReader</code>, the <code>POIFSReaderListener</code> instance receives an <code>org.apache.poi.poifs.eventfilesystem.POIFSReaderEvent</code> instance, which contains an open <code>DocumentInputStream</code> and information about the document.</p> + <p>A <code>POIFSReaderListener</code> instance can register for individual documents, or it can register for all documents; once it has registered for all documents, subsequent (and previous!) registration requests for individual documents are ignored. There is no way to unregister a <code>POIFSReaderListener</code>.</p> + <p>Thus, it is possible to register a single <code>POIFSReaderListener</code> for multiple documents - one, some, or all documents. It is guaranteed that a single <code>POIFSReaderListener</code> will receive exactly one notification per registered document. There is no guarantee as to the order in which it will receive notification of its documents, as future implementations of <code>POIFSReader</code> are free to change the algorithm for walking the file system's directory structure.</p> + <p>It is also permitted to register more than one <code>POIFSReaderListener</code> for the same document. There is no guarantee of ordering for notification of <code>POIFSReaderListener</code> instances that have registered for the same document when <code>POIFSReader</code> processes that document.</p> + <p>It is guaranteed that all notifications occur in the same thread. 
A future enhancement may be made to provide multi-threaded notifications, but such an enhancement would very probably be made in a new reader class, a <code>ThreadedPOIFSReader</code> perhaps.</p> + <p>The following table describes the three ways to register a <code>POIFSReaderListener</code> for a document or set of documents:</p> + <table> + <tr> + <td><em>Method Signature</em></td> + <td><em>What it does</em></td> + </tr> + <tr> + <td>registerListener(POIFSReaderListener <em>listener</em>)</td> + <td>registers <em>listener</em> for all documents.</td> + </tr> + <tr> + <td>registerListener(POIFSReaderListener <em>listener</em>, String <em>name</em>)</td> + <td>registers <em>listener</em> for a document with the specified <em>name</em> in the root directory.</td> + </tr> + <tr> + <td>registerListener(POIFSReaderListener <em>listener</em>, POIFSDocumentPath <em>path</em>, String <em>name</em>)</td> + <td>registers <em>listener</em> for a document with the specified <em>name</em> in the directory described by <em>path</em></td> + </tr> + </table> + </section> + <section><title>POIFSDocumentPath</title> + <p>The <code>org.apache.poi.poifs.filesystem.POIFSDocumentPath</code> class is used to describe a directory in a POIFS file system. Since there are no reserved characters in the name of a file in a POIFS file system, a more traditional string-based solution for describing a directory, with special characters delimiting the components of the directory name, is not feasible. 
The constructors for the class are used as follows:</p> + <table> + <tr> + <td><em>Constructor example</em></td> + <td><em>Directory described</em></td> + </tr> + <tr> + <td>new POIFSDocumentPath()</td> + <td>The root directory.</td> + </tr> + <tr> + <td>new POIFSDocumentPath(null)</td> + <td>The root directory.</td> + </tr> + <tr> + <td>new POIFSDocumentPath(new String[ 0 ])</td> + <td>The root directory.</td> + </tr> + <tr> + <td>new POIFSDocumentPath(new String[ ] { "foo", "bar"} )</td> + <td>in Unix terminology, "/foo/bar".</td> + </tr> + <tr> + <td>new POIFSDocumentPath(new POIFSDocumentPath(new String[] { "foo" }), new String[ ] { "fu", "bar"} )</td> + <td>in Unix terminology, "/foo/fu/bar".</td> + </tr> + </table> + </section> + <section><title>Processing POIFSReaderEvent Events</title> + <p>Processing <code>org.apache.poi.poifs.eventfilesystem.POIFSReaderEvent</code> events is relatively easy. After all of the <code>POIFSReaderListener</code> instances have been registered with <code>POIFSReader</code>, the <code>POIFSReader.read(InputStream stream)</code> method is called.</p> + <p>Assuming that there are no problems with the data, as the <code>POIFSReader</code> processes the documents in the specified <code>InputStream</code>'s data, it calls registered <code>POIFSReaderListener</code> instances' <code>processPOIFSReaderEvent</code> method with a <code>POIFSReaderEvent</code> instance.</p> + <p>The <code>POIFSReaderEvent</code> instance contains information to identify the document (a <code>POIFSDocumentPath</code> object to identify the directory that the document is in, and the document name), and an open <code>DocumentInputStream</code> instance from which to read the document.</p> + </section> + </section> + </section> + <section><title>Writing a File System</title> + <p>Writing a file system is very much like reading a file system in that there are multiple ways to do so. 
You can load an existing file system into memory, modify it (removing or renaming files) and/or add new files to it, and then write it out, or you can start with a new, empty file system:</p> + <source> +POIFSFileSystem fs = new POIFSFileSystem();</source> + <section><title>The Naming of Names</title> + <p>There are two restrictions on the names of files in a file system that must be considered when creating files:</p> + <ol> + <li>The name of the file must not exceed 31 characters. If it does, the POIFS API will silently truncate the name to fit.</li> + <li>The name of the file must be unique within its containing directory. This seems pretty obvious, but if it isn't spelled out, there'll be hell to pay, to be sure. Uniqueness, of course, is determined <em>after</em> the name has been truncated, if the original name was too long to begin with.</li> + </ol> + </section> + <section><title>Creating a Document</title> + <p>A document can be created by acquiring a <code>DirectoryEntry</code> and calling one of the two <code>createDocument</code> methods:</p> + <table> + <tr> + <td><em>Method Signature</em></td> + <td><em>Advantages</em></td> + <td><em>Disadvantages</em></td> + </tr> + <tr> + <td>createDocument(String name, InputStream stream)</td> + <td> + Simple API. + </td> + <td> + Increased memory footprint (document is in memory until file system is written). + </td> + </tr> + <tr> + <td>createDocument(String name, int size, POIFSWriterListener writer)</td> + <td> + Decreased memory footprint (only very small documents are held in memory, and then only for a short time). + </td> + <td> + More complex API.<br/> + Determining document size in advance may be difficult.<br/> + Lose control over when document is to be written. 
+ </td> + </tr> + </table> + <p>Unlike reading, you don't have to choose between the in-memory and event-driven writing models; both can co-exist in the same file system.</p> + <p>Writing is initiated when the <code>POIFSFileSystem</code> instance's <code>writeFilesystem()</code> method is called with an <code>OutputStream</code> to write to.</p> + <p>The event-driven model is quite similar to the event-driven model for reading, in that the file system calls your <code>org.apache.poi.poifs.filesystem.POIFSWriterListener</code> when it's time to write your document, just as the <code>POIFSReader</code> calls your <code>POIFSReaderListener</code> when it's time to read your document. Internally, when <code>writeFilesystem()</code> is called, the final POIFS data structures are created and are written to the specified <code>OutputStream</code>. When the file system needs to write a document out that was created with the event-driven model, it calls the <code>POIFSWriterListener</code> back, calling its <code>processPOIFSWriterEvent()</code> method, passing an <code>org.apache.poi.poifs.filesystem.POIFSWriterEvent</code> instance. This object contains the <code>POIFSDocumentPath</code> and name of the document, its size, and an open <code>org.apache.poi.poifs.filesystem.DocumentOutputStream</code> to which to write. 
A <code>DocumentOutputStream</code> is a wrapper over the <code>OutputStream</code> that was provided to the <code>POIFSFileSystem</code> to write to, and has the responsibility of making sure that the document your application writes fits within the size you specified for it.</p> + </section> + <section><title>Creating a Directory</title> + <p>Creating a directory is similar to creating a document, except that there's only one way to do so:</p> + <source> +DirectoryEntry createdDir = existingDir.createDirectory(name);</source> + </section> + <section><title>Using POIFSFileSystem Directly To Create a Document Or Directory</title> + <p>As with reading documents, it is possible to create a new document or directory in the root directory by using convenience methods of POIFSFileSystem.</p> + <table> + <tr> + <td>DirectoryEntry Method Signature</td> + <td>POIFSFileSystem Method Signature</td> + </tr> + <tr> + <td>createDocument(String name, InputStream stream)</td> + <td>createDocument(InputStream stream, String name)</td> + </tr> + <tr> + <td>createDocument(String name, int size, POIFSWriterListener writer)</td> + <td>createDocument(String name, int size, POIFSWriterListener writer)</td> + </tr> + <tr> + <td>createDirectory(String name)</td> + <td>createDirectory(String name)</td> + </tr> + </table> + </section> + </section> + <section><title>Modifying a File System</title> + <p>It is possible to modify an existing POIFS file system, whether it's one your application has loaded into memory, or one which you are creating on the fly.</p> + <section><title>Removing a Document</title> + <p>Removing a document is simple: you get the <code>Entry</code> corresponding to the document and call its <code>delete()</code> method. 
This is a boolean method, but it should always return <code>true</code>, indicating that the operation succeeded.</p> + </section> + <section><title>Removing a Directory</title> + <p>Removing a directory is also simple: you get the <code>Entry</code> corresponding to the directory and call its <code>delete()</code> method. This is a boolean method, but, unlike deleting a document, it may return <code>false</code>, indicating that the operation failed. Here are the reasons why the operation may fail:</p> + <ul> + <li>The directory still has files in it (to check, call <code>isEmpty()</code> on its DirectoryEntry; a return value of <code>false</code> means the directory still has entries).</li> + <li>The directory is the root directory. You cannot remove the root directory.</li> + </ul> + </section> + <section><title>Renaming a File</title> + <p>Regardless of whether the file is a directory or a document, it can be renamed, with one exception - the root directory has a special name that is expected by the components of a major software vendor's office suite, and the POIFS API will not let that name be changed. Renaming is done by acquiring the file's corresponding <code>Entry</code> instance and calling its <code>renameTo</code> method, passing in the new name.</p> + <p>Like <code>delete</code>, <code>renameTo</code> returns <code>true</code> if the operation succeeded, otherwise <code>false</code>. Reasons for failure include these:</p> + <ul> + <li>The new name is the same as that of another file in the same directory. And don't forget - if the new name is longer than 31 characters, it <em>will</em> be silently truncated. 
In its original length, the new name may have been unique, but truncated to 31 characters, it may not be unique any longer.</li> + <li>You tried to rename the root directory.</li> + </ul> + </section> + </section> + </body> +</document> diff --git a/src/documentation/content/xdocs/poifs/html/POIFSDesignDocument.html b/src/documentation/content/xdocs/poifs/html/POIFSDesignDocument.html new file mode 100644 index 0000000000..0f18722fd6 --- /dev/null +++ b/src/documentation/content/xdocs/poifs/html/POIFSDesignDocument.html @@ -0,0 +1,1279 @@ +<HTML> + <HEAD> + <TITLE>POIFS Design Document</TITLE> + </HEAD> + <BODY> + <FONT SIZE="+3"><B>POIFS Design Document</B></FONT> + <P> + This document describes the design of the POIFS system. It is + organized as follows: + </P> + <UL> + <LI> + <A HREF="#Scope">Scope</A> A description of the limitations of + this document. + </LI> + <LI> + <A HREF="#Assumptions">Assumptions</A> The assumptions on + which this design is based. + </LI> + <LI> + <A HREF="#Considerations">Design Considerations</A> The + constraints and goals applied to the design. + </LI> + <LI> + <A HREF="#Design">Design</A> The design of the POIFS system. + </LI> + </UL> + <P></P> + <OL TYPE="I"> + <LI> + <A NAME="Scope"><FONT + SIZE="+2"><B>Scope</B></FONT></A> + <P> + This document is written as part of an iterative process. + As that process is not yet complete, neither is this + document. + </P> + </LI> + <LI> + <A NAME="Assumptions"><FONT + SIZE="+2"><B>Assumptions</B></FONT></A> + <P> + The design of POIFS is not dependent on the code written + for the proof-of-concept prototype POIFS package. + </P> + </LI> + <LI> + <A NAME="Considerations"><FONT SIZE="+2"><B>Design + Considerations</B></FONT></A> + <P> + As usual, the primary considerations in the design of the + POIFS system involve the classic space-time tradeoff. + In this case, the main consideration has to involve + minimizing the memory footprint of POIFS. 
POIFS may be + called upon to create relatively large documents, and in a + web application server, it may be called upon to create + several documents simultaneously, and it will likely + co-exist with other Serializer systems, competing with + those other systems for space on the server. + </P> + <P> + We've addressed the risk of being too slow through a + proof-of-concept prototype. This prototype for POIFS + involved reading an existing file, decomposing it into its + constituent documents, composing a new POIFS from the + constituent documents, writing the POIFS file back to + disk, and verifying that the output file, while not + necessarily a byte-for-byte image of the input file, could + be read by the application that generated the input file. + This prototype proved to be quite fast, reading, + decomposing, and re-generating a large (300K) file in 2 to + 2.5 seconds. + </P> + <P> + While the POIFS format allows great flexibility in laying + out the documents and the other internal data structures, + the layout of the filesystem will be kept as simple as + possible. + </P> + </LI> + <LI> + <A NAME="Design"><FONT + SIZE="+2"><B>Design</B></FONT></A> + <P> + The design of POIFS is broken down into two parts: + <A HREF="#Classes">discussion of the classes and + interfaces</A>, and <A HREF="#Scenarios">discussion of how + these classes and interfaces will be used to convert an + appropriate Java InputStream (such as an XML stream) to a + POIFS output stream containing an HSSF document</A>. 
+ </P> + <A NAME="Classes"><FONT SIZE="+1"><B>Classes and Interfaces</B></FONT></A> + <P> + The classes and interfaces used in the POIFS are broken + down as follows: + </P> + <TABLE BORDER="1"> + <TR> + <TH><B>Package</B></TH> + <TH><B>Contents</B></TH> + </TR> + <TR> + <TD><A + HREF="#BlockClasses">net.sourceforge.poi.poifs.storage</A></TD> + <TD>Block classes and interfaces</TD> + </TR> + <TR> + <TD><A + HREF="#PropertyClasses">net.sourceforge.poi.poifs.property</A></TD> + <TD>Property classes and interfaces</TD> + </TR> + <TR> + <TD><A + HREF="#FilesystemClasses">net.sourceforge.poi.poifs.filesystem</A></TD> + <TD>Filesystem classes and interfaces</TD> + </TR> + <TR> + <TD><A + HREF="#UtilityClasses">net.sourceforge.poi.util</A></TD> + <TD>Utility classes and interfaces</TD> + </TR> + </TABLE> + <OL> + <LI> + <A NAME="BlockClasses"><B>Block Classes and + Interfaces</B></A> + <P> + The block classes and interfaces are shown + in the following class diagram. + </P> + <P> + <IMG SRC="BlockClassDiagram.gif"> + </P> + <TABLE BORDER="1"> + <TR> + <TH><B>Class/Interface</B></TH> + <TH><B>Description</B></TH> + </TR> + <TR> + <TD><A + NAME="BATBlock"><B>BATBlock</B></A></TD> + <TD>The <B>BATBlock</B> class + represents a single big block + containing 128 <A + HREF="POIFSFormat.html#BAT">BAT + entries</A>.<BR>Its + <CODE><I>_fields</I></CODE> array is + used to read and write the BAT entries + into the <CODE><I>_data</I></CODE> + array.<BR>Its + <CODE><I>createBATBlocks</I></CODE> + method is used to create an array of + BATBlock instances from an array of + int BAT entries.<BR>Its + <CODE><I>calculateStorageRequirements</I></CODE> + method calculates the number of BAT + blocks necessary to hold the specified + number of BAT entries.</TD> + </TR> + <TR> + <TD><A + NAME="BigBlock"><B>BigBlock</B></A></TD> + <TD>The <B>BigBlock</B> class is an + abstract class representing the common + big block of 512 bytes. 
It implements + <A + HREF="#BlockWritable">BlockWritable</A>, + trivially delegating the + <CODE><I>writeBlocks</I></CODE> method + of BlockWritable to its own abstract + <CODE><I>writeData</I></CODE> + method.</TD> + </TR> + <TR> + <TD><A + NAME="BlockWritable"><B>BlockWritable</B></A></TD> + <TD>The <B>BlockWritable</B> interface + defines a single method, + <CODE><I>writeBlocks</I></CODE>, that + is used to write an implementation's + block data to an + <CODE>OutputStream</CODE>.</TD> + </TR> + <TR> + <TD><A + NAME="DocumentBlock"><B>DocumentBlock</B></A></TD> + <TD>The <B>DocumentBlock</B> class is + used by a <A + HREF="#Document">Document</A> to hold + its raw data. It also retains the + number of bytes read, as this is used + by the Document class to determine the + total size of the data, and is also + used internally to determine whether + the block was filled by the + <CODE>InputStream</CODE> or + not.<BR>The + <CODE><I>DocumentBlock</I></CODE> + constructor is passed an + <CODE>InputStream</CODE> from which to + fill its <CODE><I>_data</I></CODE> + array.<BR>The <CODE><I>size</I></CODE> + method returns the number of bytes + read (<CODE><I>_bytes_read</I></CODE>) + when the instance was + constructed.<BR>The + <CODE><I>partiallyRead</I></CODE> + method returns true if the + <CODE><I>_data</I></CODE> array was + not completely filled, which may be + interpreted by the Document as having + reached the end-of-file + point.<BR>Typical use of the + DocumentBlock class is like + this:<BR><CODE>while + (true)<BR>{<BR> DocumentBlock + block = new + DocumentBlock(stream);<BR> blocks.add(block);<BR> size + += + block.size();<BR> if + (block.partiallyRead())<BR> {<BR> break;<BR> }<BR>}</CODE></TD> + </TR> + <TR> + <TD><A + NAME="HeaderBlock"><B>HeaderBlock</B></A></TD> + <TD>The <B>HeaderBlock</B> class is + used to contain the data found in a + POIFS header.<BR>Its <A + HREF="#IntegerField">IntegerField</A> + members are used to read and write the + appropriate 
entries into the + <CODE><I>_data</I></CODE> + array.<BR>Its + <CODE><I>setBATBlocks</I></CODE>, + <CODE><I>setPropertyStart</I></CODE>, + and <CODE><I>setXBATStart</I></CODE> + methods are used to set the + appropriate fields in the + <CODE><I>_data</I></CODE> + array.<BR>The + <CODE><I>calculateXBATStorageRequirements</I></CODE> + method is used to determine how many + XBAT blocks are necessary to + accommodate the specified number of + BAT blocks. + </TD> + </TR> + <TR> + <TD><A + NAME="PropertyBlock"><B>PropertyBlock</B></A></TD> + <TD>The <B>PropertyBlock</B> class is + used to contain <A + HREF="#Property">Property</A> + instances for the <A + HREF="#PropertyTable">PropertyTable</A> + class.<BR>It contains an array, + <CODE><I>_properties</I></CODE> of 4 + Property instances, which together + comprise the 512 bytes of a <A + HREF="#BigBlock">BigBlock</A>.<BR>The + <CODE><I>createPropertyBlockArray</I></CODE> + method is used to convert a + <CODE>List</CODE> of Property + instances into an array of + PropertyBlock instances. The number of + Property instances is rounded up to a + multiple of 4 by creating empty + anonymous inner class extensions of + Property.</TD> + </TR> + </TABLE> + </LI> + <LI> + <A NAME="PropertyClasses"><B>Property Classes + and Interfaces</B></A> + <P> + The property classes and interfaces are + shown in the following class diagram. + </P> + <P> + <IMG SRC="PropertyTableClassDiagram.gif"> + </P> + <TABLE BORDER="1"> + <TR> + <TH><B>Class/Interface</B></TH> + <TH><B>Description</B></TH> + </TR> + <TR> + <TD><A + NAME="Directory"><B>Directory</B></A></TD> + <TD>The <B>Directory</B> interface is + implemented by the <A + HREF="#RootProperty">RootProperty</A> + class. 
It is not strictly necessary + for the initial POIFS implementation, + but when the POIFS supports <A + HREF="POIFSFormat.html#directoryEntry">directory + elements</A>, this interface will be + more widely implemented, and so is + included in the design at this point + to ease the eventual support of + directory elements.<BR>Its methods are + a getter/setter pair, + <CODE><I>getChildren</I></CODE>, + returning an <CODE>Iterator</CODE> of + <A HREF="#Property">Property</A> + instances; and + <CODE><I>addChild</I></CODE>, which + will allow the caller to add another + Property instance to the Directory's + children.</TD> + </TR> + <TR> + <TD><A + NAME="DocumentProperty"><B>DocumentProperty</B></A></TD> + <TD>The <B>DocumentProperty</B> class + is a trivial extension of <A + HREF="#Property">Property</A> and is + used by <A + HREF="#Document">Document</A> to keep + track of its associated entry in the + <A + HREF="#PropertyTable">PropertyTable</A>.<BR>Its + constructor takes a name and the + document size, on the assumption that + the Document will not create a + DocumentProperty until after it has + created the storage for the document + data and therefore knows how much data + there is.</TD> + </TR> + <TR> + <TD><A + NAME="File"><B>File</B></A></TD> + <TD>The <B>File</B> interface + specifies the behavior of reading and + writing the next and previous child + fields of a <A + HREF="#Property">Property</A>.</TD> + </TR> + <TR> + <TD><A + NAME="Property"><B>Property</B></A></TD> + <TD>The <B>Property</B> class is an + abstract class that defines the basic + data structure of an element of the <A + HREF="POIFSFormat.html#PropertyTable">Property + Table</A>.<BR>Its <A + HREF="#ByteField">ByteField</A>, <A + HREF="#ShortField">ShortField</A>, and + <A + HREF="#IntegerField">IntegerField</A> + members are used to read and write + data into the appropriate locations in + the <CODE><I>_raw_data</I></CODE> + array.<BR>The + <CODE><I>_index</I></CODE> member is + used to hold 
a Property instance's + index in the <CODE>List</CODE> of + Property instances maintained by <A + HREF="#PropertyTable">PropertyTable</A>, + which is used to populate the child + property of parent <A + HREF="#Directory">Directory</A> + properties and the next property and + previous property of sibling <A + HREF="#File">File</A> + properties.<BR>The + <CODE><I>_name</I></CODE>, + <CODE><I>_next_file</I></CODE>, and + <CODE><I>_previous_file</I></CODE> + members are used to help fill the + appropriate fields of the _raw_data + array.<BR>Setters are provided for + some of the fields (name, property + type, node color, child property, + size, index, start block), as well as + a few getters (index, child + property).<BR>The + <CODE><I>preWrite</I></CODE> method is + abstract and is used by the owning + PropertyTable to iterate through its + Property instances and prepare each + for writing.<BR>The + <CODE><I>shouldUseSmallBlocks</I></CODE> + method returns true if the Property's + size is sufficiently small - how small + is none of the caller's business. 
+ </TD> + </TR> + <TR> + <TD><B>PropertyBlock</B></TD> + <TD>See the description in <A + HREF="#PropertyBlock">PropertyBlock</A>.</TD> + </TR> + <TR> + <TD><A + NAME="PropertyTable"><B>PropertyTable</B></A></TD> + <TD>The <B>PropertyTable</B> class + holds all of the <A + HREF="#DocumentProperty">DocumentProperty</A> + instances and the <A + HREF="#RootProperty">RootProperty</A> + instance for a <A + HREF="#Filesystem">Filesystem</A> + instance.<BR>It maintains a + <CODE>List</CODE> of its <A + HREF="#Property">Property</A> + instances + (<CODE><I>_properties</I></CODE>), and + when prepared to write its data by a + call to <CODE><I>preWrite</I></CODE>, + it gets and holds an array of <A + HREF="#PropertyBlock">PropertyBlock</A> + instances + (<CODE><I>_blocks</I></CODE>).<BR>It + also maintains its start block in its + <CODE><I>_start_block</I></CODE> + member.<BR>It has a method, + <CODE><I>getRoot</I></CODE>, to get + the RootProperty, returning it as an + implementation of <A + HREF="#Directory">Directory</A>, and a + method to add a Property, + <CODE><I>addProperty</I></CODE>, and a + method to get its start block, + <CODE><I>getStartBlock</I></CODE>.</TD> + </TR> + <TR> + <TD><A + NAME="RootProperty"><B>RootProperty</B></A></TD> + <TD>The <B>RootProperty</B> class acts + as the <A + HREF="#Directory">Directory</A> for + all of the <A + HREF="#DocumentProperty">DocumentProperty</A> + instances. 
As such, it is more of a + pure <A + HREF="POIFSFormat.html#directoryEntry">directory + entry</A> than a proper <A + HREF="POIFSFormat.html#RootEntry">root + entry</A> in the <A + HREF="POIFSFormat.html#PropertyTable">Property + Table</A>, but the initial POIFS + implementation does not warrant the + additional complexity of a full-blown + root entry, and so it is not modeled + in this design.<BR>It maintains a + <CODE>List</CODE> of its children, + <CODE><I>_children</I></CODE>, in + order to perform its + directory-oriented duties.</TD> + </TR> + </TABLE> + </LI> + <LI> + <A NAME="FilesystemClasses"><B>Filesystem + Classes and Interfaces</B></A> + <P> + The filesystem classes and interfaces are + shown in the following class diagram. + </P> + <P> + <IMG SRC="POIFSClassDiagram.gif"> + </P> + <TABLE BORDER="1"> + <TR> + <TH><B>Class/Interface</B></TH> + <TH><B>Description</B></TH> + </TR> + <TR> + <TD><A + NAME="Filesystem"><B>Filesystem</B></A></TD> + <TD>The <B>Filesystem</B> class is the + top-level class that manages the + creation of a POIFS document.<BR>It + maintains a <A + HREF="#PropertyTable">PropertyTable</A> + instance in its + <CODE><I>_property_table</I></CODE> + member, a <A + HREF="#HeaderBlock">HeaderBlock</A> + instance in its + <CODE><I>_header_block</I></CODE> + member, and a <CODE>List</CODE> of its + <A HREF="#Document">Document</A> + instances in its + <CODE><I>_documents</I></CODE> + member.<BR>It provides a method for a + client to create a document + (<CODE><I>createDocument</I></CODE>), + and a method to write the Filesystem + to an <CODE>OutputStream</CODE> + (<CODE><I>writeFilesystem</I></CODE>).</TD> + </TR> + <TR> + <TD><B>BATBlock</B></TD> + <TD>See the description in <A + HREF="#BATBlock">BATBlock</A></TD> + </TR> + <TR> + <TD><A + NAME="BATManaged"><B>BATManaged</B></A></TD> + <TD>The <B>BATManaged</B> interface + defines common behavior for objects + whose location in the written file is + managed by the <A + 
HREF="POIFSFormat.html#BAT">Block + Allocation Table</A>.<BR>It defines + methods to get a count of the + implementation's <A + HREF="#BigBlock">BigBlock</A> + instances + (<CODE><I>countBlocks</I></CODE>), and + to set an implementation's start block + (<CODE><I>setStartBlock</I></CODE>).</TD> + </TR> + <TR> + <TD><A + NAME="BlockAllocationTable"><B>BlockAllocationTable</B></A></TD> + <TD>The <B>BlockAllocationTable</B> is + an implementation of the POIFS <A + HREF="POIFSFormat.html#BAT">Block + Allocation Table</A>. It is only + created when the <A + HREF="#Filesystem">Filesystem</A> is + about to be written to an + <CODE>OutputStream</CODE>.<BR>It + contains an <A + HREF="#IntList">IntList</A> of block + numbers for all of the <A + HREF="#BATManaged">BATManaged</A> + implementations owned by the + Filesystem, + <CODE><I>_entries</I></CODE>, which is + filled by calls to + <CODE><I>allocateSpace</I></CODE>.<BR>It + fills its array, + <CODE><I>_blocks</I></CODE>, of <A + HREF="#BATBlock">BATBlock</A> + instances when its + <CODE><I>createBATBlocks</I></CODE> + method is called. 
This method has to + take into account its own storage + requirements, as well as those of the + XBAT blocks, and so calls + <CODE><I>BATBlock.calculateStorageRequirements</I></CODE> + and + <CODE><I>HeaderBlock.calculateXBATStorageRequirements</I></CODE> + repeatedly until the counts returned + by those methods stabilize.<BR>The + <CODE><I>countBlocks</I></CODE> method + returns the number of BATBlock + instances created by the preceding + call to <CODE><I>createBATBlocks</I></CODE>.</TD> + </TR> + <TR> + <TD><B>BlockWritable</B></TD> + <TD>See the description in <A + HREF="#BlockWritable">BlockWritable</A></TD> + </TR> + <TR> + <TD><A + NAME="Document"><B>Document</B></A></TD> + <TD>The <B>Document</B> class is used + to contain a document, such as an HSSF + workbook.<BR>It has its own <A + HREF="#DocumentProperty">DocumentProperty</A> + (<CODE><I>_property</I></CODE>) and + stores its data in a collection of <A + HREF="#DocumentBlock">DocumentBlock</A> + instances + (<CODE><I>_blocks</I></CODE>).<BR>It + has a method, + <CODE><I>getDocumentProperty</I></CODE>, + to get its DocumentProperty.</TD> + </TR> + <TR> + <TD><B>DocumentBlock</B></TD> + <TD>See the description in <A + HREF="#DocumentBlock">DocumentBlock</A></TD> + </TR> + <TR> + <TD><B>DocumentProperty</B></TD> + <TD>See the description in <A + HREF="#DocumentProperty">DocumentProperty</A></TD> + </TR> + <TR> + <TD><B>HeaderBlock</B></TD> + <TD>See the description in <A + HREF="#HeaderBlock">HeaderBlock</A></TD> + </TR> + <TR> + <TD><B>PropertyTable</B></TD> + <TD>See the description in <A + HREF="#PropertyTable">PropertyTable</A></TD> + </TR> + </TABLE> + </LI> + <LI> + <A NAME="UtilityClasses"><B>Utility Classes + and Interfaces</B></A> + <P> + The utility classes and interfaces are + shown in the following class diagram. 
+ </P> + <P> + <IMG SRC="utilClasses.gif"> + </P> + <TABLE BORDER="1"> + <TR> + <TH><B>Class/Interface</B></TH> + <TH><B>Description</B></TH> + </TR> + <TR> + <TD><A + NAME="BitField"><B>BitField</B></A></TD> + <TD>The <B>BitField</B> class is used + primarily by HSSF code to manage + bit-mapped fields of HSSF records. It + is not likely to be used in the POIFS + code itself and is only included here + for the sake of complete documentation + of the POI utility classes.</TD> + </TR> + <TR> + <TD><A + NAME="ByteField"><B>ByteField</B></A></TD> + <TD>The <B>ByteField</B> class is an + implementation of <A + HREF="#FixedField">FixedField</A> for + the purpose of managing reading and + writing to a byte-wide field in an + array of <CODE>bytes</CODE>.</TD> + </TR> + <TR> + <TD><A + NAME="FixedField"><B>FixedField</B></A></TD> + <TD>The <B>FixedField</B> interface + defines a set of methods for reading a + field from an array of + <CODE>bytes</CODE> or from an + <CODE>InputStream</CODE>, and for + writing a field to an array of + <CODE>bytes</CODE>. Implementations + typically require an offset in their + constructors that, for the purposes of + reading and writing to an array of + <CODE>bytes</CODE>, makes sure that + the correct <CODE>bytes</CODE> in the + array are read or written.</TD> + </TR> + <TR> + <TD><A + NAME="HexDump"><B>HexDump</B></A></TD> + <TD>The <B>HexDump</B> class is a + debugging class that can be used to + dump an array of <CODE>bytes</CODE> to + an <CODE>OutputStream</CODE>. 
The + static method <CODE><I>dump</I></CODE> + takes an array of <CODE>bytes</CODE>, + a <CODE>long</CODE> offset that is + used to label the output, an open + <CODE>OutputStream</CODE>, and an + <CODE>int</CODE> index that specifies + the starting index within the array of + <CODE>bytes</CODE>.<BR>The data is + displayed 16 bytes per line, with each + byte displayed in hexadecimal format + and again in printable form, if + possible (a byte is considered + printable if its value is in the range + of 32 ... 126).<BR>Here is an example + of a small array of <CODE>bytes</CODE> + with an offset of + 0x110:<BR><CODE>00000110 C8 00 00 00 FF 7F 90 01 00 00 00 00 00 00 05 01 ................<BR>00000120 41 00 72 00 69 00 61 00 6C 00 A.r.i.a.l.</CODE></TD> + </TR> + <TR> + <TD><A + NAME="IntegerField"><B>IntegerField</B></A></TD> + <TD>The <B>IntegerField</B> class is + an implementation of <A + HREF="#FixedField">FixedField</A> for + the purpose of managing reading and + writing to an integer-wide field in an + array of <CODE>bytes</CODE>.</TD> + </TR> + <TR> + <TD><A + NAME="IntList"><B>IntList</B></A></TD> + <TD>The <B>IntList</B> class is a + work-around for functionality missing + in Java (see <A + HREF="http://developer.java.sun.com/developer/bugParade/bugs/4487555.html">http://developer.java.sun.com/developer/bugParade/bugs/4487555.html</A> + for details); it is a simple growable + array of <CODE>ints</CODE> that gets + around the requirement of wrapping and + unwrapping <CODE>ints</CODE> in + <CODE>Integer</CODE> instances in + order to use the + <CODE>java.util.List</CODE> + interface.<BR><B>IntList</B> mimics + the functionality of the + <CODE>java.util.List</CODE> interface + as much as possible.</TD> + </TR> + <TR> + <TD><A + NAME="LittleEndian"><B>LittleEndian</B></A></TD> + <TD>The <B>LittleEndian</B> class + provides a set of static methods for + reading and writing + <CODE>shorts</CODE>, + <CODE>ints</CODE>, <CODE>longs</CODE>, + and <CODE>doubles</CODE> in 
and out of + <CODE>byte</CODE> arrays, and out of + <CODE>InputStreams</CODE>, preserving + the Intel byte ordering and encoding + of these values.</TD> + </TR> + <TR> + <TD><A + NAME="LittleEndianConsts"><B>LittleEndianConsts</B></A></TD> + <TD>The <B>LittleEndianConsts</B> + interface defines the width of a + <CODE>short</CODE>, <CODE>int</CODE>, + <CODE>long</CODE>, and + <CODE>double</CODE> as stored by Intel + processors.</TD> + </TR> + <TR> + <TD><A + NAME="LongField"><B>LongField</B></A></TD> + <TD>The <B>LongField</B> class is an + implementation of <A + HREF="#FixedField">FixedField</A> for + the purpose of managing reading and + writing to a long-wide field in an + array of <CODE>bytes</CODE>.</TD> + </TR> + <TR> + <TD><A + NAME="ShortField"><B>ShortField</B></A></TD> + <TD>The <B>ShortField</B> class is an + implementation of <A + HREF="#FixedField">FixedField</A> for + the purpose of managing reading and + writing to a short-wide field in an + array of <CODE>bytes</CODE>.</TD> + </TR> + <TR> + <TD><A + NAME="ShortList"><B>ShortList</B></A></TD> + <TD>The <B>ShortList</B> class is a + work-around for functionality missing + in Java (see <A + HREF="http://developer.java.sun.com/developer/bugParade/bugs/4487555.html">http://developer.java.sun.com/developer/bugParade/bugs/4487555.html</A> + for details); it is a simple growable + array of <CODE>shorts</CODE> that gets + around the requirement of wrapping and + unwrapping <CODE>shorts</CODE> in + <CODE>Short</CODE> instances in order + to use the <CODE>java.util.List</CODE> + interface.<BR> <B>ShortList</B> mimics + the functionality of the + <CODE>java.util.List</CODE> interface + as much as possible.</TD> + </TR> + <TR> + <TD><A + NAME="StringUtil"><B>StringUtil</B></A></TD> + <TD>The <B>StringUtil</B> class + manages the processing of Unicode + strings.</TD> + </TR> + </TABLE> + </LI> + </OL> + <A NAME="Scenarios"><FONT + SIZE="+1"><B>Scenarios</B></FONT></A> + <P> + This section describes the scenarios 
of how the + POIFS classes and interfaces will be used to + convert an appropriate XML stream to a POIFS + output stream containing an HSSF document. + </P> + <P> + It is broken down as suggested by the following + scenario diagram: + </P> + <P> + <IMG SRC="POIFSLifeCycle.gif"> + </P> + <TABLE BORDER="1"> + <TR> + <TH><B>Step</B></TH> + <TH><B>Description</B></TH> + </TR> + <TR> + <TD><B>1</B></TD> + <TD><A HREF="#Initialization">The Filesystem is + created by the client application.</A></TD> + </TR> + <TR> + <TD><B>2</B></TD> + <TD><A HREF="#CreateDocument">The client + application tells the Filesystem to create a + document</A>, providing an + <CODE>InputStream</CODE> and the name of the + document. This may be repeated several + times.</TD> + </TR> + <TR> + <TD><B>3</B></TD> + <TD><A HREF="#WriteFilesystem">The client + application asks the Filesystem to write its + data to an <CODE>OutputStream</CODE>.</A></TD> + </TR> + </TABLE> + <OL> + <LI> + <P> + <A + NAME="Initialization">Initialization</A> + </P> + <P> + Initialization of the POIFS system is + shown in the following scenario diagram: + </P> + <P> + <IMG SRC="POIFSInitialization.gif"> + </P> + <TABLE BORDER="1"> + <TR> + <TH><B>Step</B></TH> + <TH><B>Description</B></TH> + </TR> + <TR> + <TD><B>1</B></TD> + <TD>The <A + HREF="#Filesystem">Filesystem</A> + object, which is created for each + request to convert an appropriate XML + stream to a POIFS output stream + containing an HSSF document, creates + its <A + HREF="#PropertyTable">PropertyTable</A>.</TD> + </TR> + <TR> + <TD><B>2</B></TD> + <TD>The <A + HREF="#PropertyTable">PropertyTable</A> + creates its <A + HREF="#RootProperty">RootProperty</A> + instance, making the RootProperty the + first <A HREF="#Property">Property</A> + in its <CODE>List</CODE> of Property + instances.</TD> + </TR> + <TR> + <TD><B>3</B></TD> + <TD>The <A + HREF="#Filesystem">Filesystem</A> + creates its <A + HREF="#HeaderBlock">HeaderBlock</A> + instance. 
It should be noted that the + decision to create the HeaderBlock at + Filesystem initialization is + arbitrary; creation of the HeaderBlock + could easily and harmlessly be + postponed to the appropriate moment in + <A HREF="#WriteFilesystem">writing the + filesystem</A>.</TD> + </TR> + </TABLE> + </LI> + <LI> + <P> + <A NAME="CreateDocument">Creating a + Document</A> + </P> + <P> + Creating and adding a document to a POIFS + system is shown in the following scenario + diagram: + </P> + <P> + <IMG SRC="POIFSAddDocument.gif"> + </P> + <TABLE BORDER="1"> + <TR> + <TH><B>Step</B></TH> + <TH><B>Description</B></TH> + </TR> + <TR> + <TD><B>1</B></TD> + <TD>The <A + HREF="#Filesystem">Filesystem</A> + instance creates a new <A + HREF="#Document">Document</A> + instance. It will store the newly + created Document in a + <CODE>List</CODE> of <A + HREF="#BATManaged">BATManaged</A> + instances.</TD> + </TR> + <TR> + <TD><B>2</B></TD> + <TD>The <A + HREF="#Document">Document</A> reads + data from the provided + <CODE>InputStream</CODE>, storing the + data in <A + HREF="#DocumentBlock">DocumentBlock</A> + instances. It keeps track of the byte + count as it reads the data.</TD> + </TR> + <TR> + <TD><B>3</B></TD> + <TD>The <A + HREF="#Document">Document</A> creates + a <A + HREF="#DocumentProperty">DocumentProperty</A> + to keep track of its property + data. 
The byte count is stored in the + newly created DocumentProperty + instance.</TD> + </TR> + <TR> + <TD><B>4</B></TD> + <TD>The <A + HREF="#Filesystem">Filesystem</A> + requests the newly created <A + HREF="#DocumentProperty">DocumentProperty</A> + from the newly created <A + HREF="#Document">Document</A> + instance.</TD> + </TR> + <TR> + <TD><B>5</B></TD> + <TD>The <A + HREF="#Filesystem">Filesystem</A> + sends the newly created <A + HREF="#DocumentProperty">DocumentProperty</A> + to the Filesystem's <A + HREF="#PropertyTable">PropertyTable</A> + so that the PropertyTable can add the + DocumentProperty to its + <CODE>List</CODE> of <A + HREF="#Property">Property</A> + instances.</TD> + </TR> + <TR> + <TD><B>6</B></TD> + <TD>The <A + HREF="#Filesystem">Filesystem</A> gets + the <A + HREF="#RootProperty">RootProperty</A> + from its <A + HREF="#PropertyTable">PropertyTable</A>.</TD> + </TR> + <TR> + <TD><B>7</B></TD> + <TD>The <A + HREF="#Filesystem">Filesystem</A> adds + the newly created <A + HREF="#DocumentProperty">DocumentProperty</A> + to the <A + HREF="#RootProperty">RootProperty</A>.</TD> + </TR> + </TABLE> + <P> + Although typical deployment of the POIFS + system will only entail adding a single <A + HREF="#Document">Document</A> (the + workbook) to the <A + HREF="#Filesystem">Filesystem</A>, there + is nothing in the design to prevent + multiple Documents from being added to the + Filesystem. This flexibility can be + employed to write summary information + document(s) in addition to the workbook. 
+ </P> + </LI> + <LI> + <P> + <A NAME="WriteFilesystem">Writing the + Filesystem</A> + </P> + <P> + Writing the filesystem is shown in the + following scenario diagram: + </P> + <P> + <IMG SRC="POIFSWriteFilesystem.gif"> + </P> + <TABLE BORDER="1"> + <TR> + <TH><B>Step</B></TH> + <TH COLSPAN="2"><B>Description</B></TH> + </TR> + <TR> + <TD><B>1</B></TD> + <TD COLSPAN="2">The <A + HREF="#Filesystem">Filesystem</A> adds + the <A + HREF="#PropertyTable">PropertyTable</A> + to its <CODE>List</CODE> of <A + HREF="#BATManaged">BATManaged</A> + instances and calls the + PropertyTable's + <CODE><I>preWrite</I></CODE> + method. The action taken by the + PropertyTable is shown in the <A + HREF="#PropertyTablePreWrite">PropertyTable + preWrite scenario diagram</A>.</TD> + </TR> + <TR> + <TD><B>2</B></TD> + <TD COLSPAN="2">The <A + HREF="#Filesystem">Filesystem</A> + creates the <A + HREF="#BlockAllocationTable">BlockAllocationTable</A>.</TD> + </TR> + <TR> + <TD><B>3</B></TD> + <TD>The <A + HREF="#Filesystem">Filesystem</A> gets + the block count from the <A + HREF="#BATManaged">BATManaged</A> + instance.</TD> <TD + ROWSPAN="3"><B>These three steps are + repeated for each <A + HREF="#BATManaged">BATManaged</A> + instance in the <A + HREF="#Filesystem">Filesystem</A>'s + <CODE>List</CODE> of BATManaged + instances (i.e., the <A + HREF="#Document">Documents</A>, in + order of their addition to the + Filesystem, followed by the <A + HREF="#PropertyTable">PropertyTable</A>).</B></TD> + </TR> + <TR> + <TD><B>4</B></TD> + <TD>The <A + HREF="#Filesystem">Filesystem</A> + sends the block count to the <A + HREF="#BlockAllocationTable">BlockAllocationTable</A>, + which adds the appropriate entries to + its <A HREF="#IntList">IntList</A> of + entries, returning the starting block + for the newly added entries.</TD> + </TR> + <TR> + <TD><B>5</B></TD> + <TD>The <A + HREF="#Filesystem">Filesystem</A> + gives the start block number to the <A + HREF="#BATManaged">BATManaged</A> + instance. 
If the BATManaged instance + is a <A HREF="#Document">Document</A>, + it sets the start block field in its + <A + HREF="#DocumentProperty">DocumentProperty</A>.</TD> + </TR> + <TR> + <TD><B>6</B></TD> + <TD COLSPAN="2">The <A + HREF="#Filesystem">Filesystem</A> + tells the <A + HREF="#BlockAllocationTable">BlockAllocationTable</A> + to create its <A + HREF="#BATBlock">BATBlocks</A>.</TD> + </TR> + <TR> + <TD><B>7</B></TD> + <TD COLSPAN="2">The <A + HREF="#Filesystem">Filesystem</A> + gives the BAT information to the <A + HREF="#HeaderBlock">HeaderBlock</A> so + that it can set its BAT fields and, if + necessary, create XBAT blocks.</TD> + </TR> + <TR> + <TD><B>8</B></TD> + <TD COLSPAN="2">If the filesystem is + unusually large (over <B>7MB</B>), the + <A HREF="#HeaderBlock">HeaderBlock</A> + will create XBAT blocks to contain the + BAT data that it cannot hold + directly. In this case, the <A + HREF="#Filesystem">Filesystem</A> + tells the HeaderBlock where those + additional blocks will be stored.</TD> + </TR> + <TR> + <TD><B>9</B></TD> + <TD COLSPAN="2">The <A + HREF="#Filesystem">Filesystem</A> + gives the <A + HREF="#PropertyTable">PropertyTable</A> + start block to the <A + HREF="#HeaderBlock">HeaderBlock</A>.</TD> + </TR> + <TR> + <TD><B>10</B></TD> + <TD COLSPAN="2">The <A + HREF="#Filesystem">Filesystem</A> + tells the <A + HREF="#BlockWritable">BlockWritable</A> + instance to write its blocks to the + provided + <CODE>OutputStream</CODE>.<BR>This + step is repeated for each + BlockWritable instance, in this + order:<BR> + <OL> + <LI> + The <A + HREF="#HeaderBlock">HeaderBlock</A>. + </LI> + <LI> + Each <A + HREF="#Document">Document</A>, + in the order in which it was + added to the <A + HREF="#Filesystem">Filesystem</A>. + </LI> + <LI> + The <A + HREF="#PropertyTable">PropertyTable</A>. 
+ </LI> + <LI> + The <A + HREF="#BlockAllocationTable">BlockAllocationTable</A> + </LI> + <LI> + The XBAT blocks created by the + <A + HREF="#HeaderBlock">HeaderBlock</A>, + if any. + </LI> + </OL></TD> + </TR> + </TABLE> + <P> + <A + NAME="PropertyTablePreWrite"><B>PropertyTable + preWrite scenario diagram</B></A> + </P> + <P> + <IMG SRC="POIFSPropertyTablePreWrite.gif"> + </P> + <TABLE BORDER="1"> + <TR> + <TH><B>Step</B></TH> + <TH><B>Description</B></TH> + </TR> + <TR> + <TD><B>1</B></TD> + <TD>The <A + HREF="#PropertyTable">PropertyTable</A> + calls <CODE><I>setIndex</I></CODE> for + each of its <A + HREF="#Property">Property</A> + instances, so that each Property now + knows its index within the + PropertyTable's <CODE>List</CODE> of + Property instances.</TD> + </TR> + <TR> + <TD><B>2</B></TD> <TD>The <A + HREF="#PropertyTable">PropertyTable</A> + requests the <A + HREF="#PropertyBlock">PropertyBlock</A> + class to create an array of <A + HREF="#PropertyBlock">PropertyBlock</A> + instances.</TD> + </TR> + <TR> + <TD><B>3</B></TD> + + <TD>The <A + HREF="#PropertyBlock">PropertyBlock</A> + calculates the number of empty <A + HREF="#Property">Property</A> + instances it needs to create and + creates them. The algorithm for the + number to create is:<BR> + <CODE>block_count = (properties.size() + + 3) / 4;<BR> emptyPropertiesNeeded = + (block_count * 4) - + properties.size();</CODE></TD> + </TR> + <TR> + <TD><B>4</B></TD> <TD>The <A + HREF="#PropertyBlock">PropertyBlock</A> + creates the required number of <A + HREF="#PropertyBlock">PropertyBlock</A> + instances from the <CODE>List</CODE> + of <A HREF="#Property">Property</A> + instances, including the newly created + empty <A HREF="#Property">Property</A> + instances.</TD> + </TR> + <TR> + <TD><B>5</B></TD> + <TD>The <A + HREF="#PropertyTable">PropertyTable</A> + calls <CODE><I>preWrite</I></CODE> on + each of its <A + HREF="#Property">Property</A> + instances. 
For <A + HREF="#DocumentProperty">DocumentProperty</A> + instances, this call is a no-op. For + the <A + HREF="#RootProperty">RootProperty</A>, + the action taken is shown in the <A + HREF="#RootPropertyPreWrite">RootProperty + preWrite scenario diagram</A>.</TD> + </TR> + </TABLE> + <P> + <A + NAME="RootPropertyPreWrite"><B>RootProperty + preWrite scenario diagram</B></A> + </P> + <P> + <IMG SRC="POIFSRootPropertyPreWrite.gif"> + </P> + <TABLE BORDER="1"> + <TR> + <TH><B>Step</B></TH> + <TH COLSPAN="2"><B>Description</B></TH> + </TR> + <TR> + <TD><B>1</B></TD> + <TD COLSPAN="2">The <A + HREF="#RootProperty">RootProperty</A> + sets its child property with the index + of the child <A + HREF="#Property">Property</A> that is + first in its <CODE>List</CODE> of + children.</TD> + </TR> + <TR> + <TD><B>2</B></TD> + <TD>The <A + HREF="#RootProperty">RootProperty</A> + sets its child's next property field + with the index of the child's next + sibling in the RootProperty's + <CODE>List</CODE> of children. If the + child is the last in the + <CODE>List</CODE>, its next property + field is set to <CODE>-1</CODE>.</TD> + <TD ROWSPAN="2"><B>These two steps are + repeated for each <A + HREF="#File">File</A> in the <A + HREF="#RootProperty">RootProperty</A>'s + <CODE>List</CODE> of + children.</B></TD> + </TR> + <TR> + <TD><B>3</B></TD> + <TD>The <A + HREF="#RootProperty">RootProperty</A> + sets its child's previous property + field with a value of + <CODE>-1</CODE>.</TD> + </TR> + </TABLE> + </LI> + </OL> + </LI> + </OL> + </BODY> +</HTML>
\ No newline at end of file diff --git a/src/documentation/content/xdocs/poifs/index.xml b/src/documentation/content/xdocs/poifs/index.xml new file mode 100644 index 0000000000..f0597a42cb --- /dev/null +++ b/src/documentation/content/xdocs/poifs/index.xml @@ -0,0 +1,40 @@ +<?xml version="1.0" encoding="UTF-8"?> +<!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V1.1//EN" "../dtd/document-v11.dtd"> +<document> + <header> + <title>PoiFS</title> + <subtitle>Overview</subtitle> + <authors> + <person name="Andrew C. Oliver" email="acoliver@apache.org"/> + <person name="Nicola Ken Barozzi" email="barozzi@nicolaken.com"/> + </authors> + </header> + <body> + <section><title>Overview</title> + <p>POIFS is a pure Java implementation of the OLE 2 Compound + Document format.</p> + <p>By definition, all APIs developed by the POI project are + based somehow on the POIFS API.</p> + <p>A common confusion is on just what POIFS buys you or what OLE + 2 Compound Document format is exactly. POIFS does not buy you + DOC, or XLS, but is necessary to generate or read DOC or XLS + files. You see, all file formats based on the OLE 2 Compound + Document Format have a common structure. The OLE 2 Compound + Document Format is essentially a convoluted archive + format. Think of POIFS as a "zip" library. Once you can get + the data in a zip file you still need to interpret the + data. As a general rule, while all of our formats <em>use</em> + POIFS, most of them attempt to abstract you from it. There + are some circumstances where this is not possible, but as a + general rule this is true.</p> + <p>If you're an end user type just looking to generate XLS + files, then you'd be looking for HSSF not POIFS; however, if + you have legacy code that uses MFC property sets, POIFS is + for you! Regardless, you may or may not need to know how to + use POIFS but ultimately if you use technologies that come + from the POI project, you're using POIFS underneath. 
Perhaps + we should have a branding campaign "POIFS Inside!". ;-)</p> + + </section> + </body> +</document> diff --git a/src/documentation/content/xdocs/poifs/usecases.xml b/src/documentation/content/xdocs/poifs/usecases.xml new file mode 100644 index 0000000000..0a505cba54 --- /dev/null +++ b/src/documentation/content/xdocs/poifs/usecases.xml @@ -0,0 +1,635 @@ +<?xml version="1.0" encoding="UTF-8"?> +<!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V1.1//EN" "../dtd/document-v11.dtd"> +<document> + <header> + <title>POIFS Use Cases</title> + <authors> + <person email="mjohnson@apache.org" name="Marc Johnson" id="MJ"/> + </authors> + </header> + <body> + <section><title>POIFS Use Cases</title> + <section><title>Use Case 1: Read existing file system</title> + <table> + <tr> + <td><em>Primary Actor:</em></td> + <td>POIFS client</td> + </tr> + <tr> + <td><em>Scope:</em></td> + <td>POIFS</td> + </tr> + <tr> + <td><em>Level:</em></td> + <td>Summary</td> + </tr> + <tr> + <td><em>Stakeholders and Interests:</em></td> + <td> + POIFS client- wants to read content of file + system<br/> + POIFS - understands POIFS file system + </td> + </tr> + <tr> + <td><em>Precondition:</em></td> + <td>None</td> + </tr> + <tr> + <td><em>Minimal Guarantee:</em></td> + <td>None</td> + </tr> + <tr> + <td><em>Main Success Guarantee:</em></td> + <td> + 1. POIFS client requests POIFS to read a POIFS file + system, providing an + <code>InputStream</code> + containing POIFS file system in question.<br/> + 2. POIFS reads from the + <code>InputStream</code> in + 512 byte blocks.<br/> + 3. POIFS verifies that the first block begins with + the well known signature + ( + <code>0xE11AB1A1E011CFD0</code>)<br/> + 4. POIFS reads the Block Allocation Table from the + first block and, if necessary, from the XBAT + blocks.<br/> + 5. POIFS obtains the start block of the Property + Table and reads the Property Table (use case 9, + read file)<br/> + 6. 
POIFS reads the individual entries in the Property + Table<br/> + 7. POIFS obtains the start block of the Small Block + Allocation Table and reads the Small Block + Allocation Table (use case 9, read file)<br/> + 8. POIFS obtains the start block of the Small Block + store from the first entry in the Property Table + and reads the Small Block Array (use case 9, read + file)<br/> + </td> + </tr> + <tr> + <td><em>Extensions:</em></td> + <td> + 2a. If the last block read is not a 512 byte + block, the + <code>InputStream</code> is not that of + a POIFS file system, and POIFS throws an + appropriate exception. + <br/> + 3a. If the signature is incorrect, the + <code>InputStream</code> is not that of a POIFS + file system, and POIFS throws an appropriate + exception.<br/> + </td> + </tr> + </table> + </section> + <section><title>Use Case 2: Write file system</title> + <table> + <tr> + <th>Primary Actor:</th> + <td>POIFS client</td> + </tr> + <tr> + <th>Scope:</th> + <td>POIFS</td> + </tr> + <tr> + <th>Level:</th> + <td>Summary</td> + </tr> + <tr> + <th>Stakeholders and Interests:</th> + <td> + POIFS client- wants to write file system out.<br/> + POIFS - knows how to write file system out. + </td> + </tr> + <tr> + <th>Precondition:</th> + <td> + File system has been read (use case 1, read + existing file system) and subsequently modified + (use case 4, replace file in file system; use case + 5, delete file from file system; or use case 6, + write new file to file system; in any + combination) + <br/>or<br/> + File system has been created (use case 3, create + new file system) + </td> + </tr> + <tr> + <th>Minimal Guarantee:</th> + <td>None</td> + </tr> + <tr> + <th>Main Success Guarantee:</th> + <td> + 1. POIFS client provides an + <code>OutputStream</code> + to write the file system to. + <br/> + 2. POIFS gets the sizes of the Property Table and + each file in the file system.<br/> + 3. 
If any file in the file system requires storage + in a Small Block Array, POIFS creates a Small + Block Array of sufficient size to hold all of the + small files.<br/> + 4. POIFS calculates the number of big blocks needed + to hold all of the large files, the Property + Table, and, if necessary, the Small Block Array + and the Small Block Allocation Table.<br/> + 5. POIFS creates a set of big blocks sufficient to + store the Block Allocation Table<br/> + 6. POIFS creates and writes the header block<br/> + 7. POIFS writes out the XBAT blocks, if needed.<br/> + 8. POIFS writes out the Small Block Array, if + needed<br/> + 9. POIFS writes out the Small Block Allocation Table, + if needed<br/> + 10. POIFS writes out the Property Table<br/> + 11. POIFS writes out the large files, if needed<br/> + 12. POIFS closes the <code>OutputStream</code>. + </td> + </tr> + <tr> + <th>Extensions:</th> + <td> + 6a. Exceptions writing to the + <code>OutputStream</code> will be propagated back + to the POIFS client. + <br/> + 7a. Exceptions writing to the + <code>OutputStream</code> will be propagated back + to the POIFS client. + <br/> + 8a. Exceptions writing to the + <code>OutputStream</code> will be propagated back + to the POIFS client. + <br/> + 9a. Exceptions writing to the + <code>OutputStream</code> will be propagated back + to the POIFS client. + <br/> + 10a. Exceptions writing to the + <code>OutputStream</code> will be propagated back + to the POIFS client. + <br/> + 11a. Exceptions writing to the + <code>OutputStream</code> will be propagated back + to the POIFS client. + <br/> + 12a. Exceptions closing the + <code>OutputStream</code> will be propagated back + to the POIFS client. 
+ <br/> + </td> + </tr> + </table> + </section> + <section><title>Use Case 3: Create new file system</title> + <table> + <tr> + <th>Primary Actor:</th> + <td>POIFS client</td> + </tr> + <tr> + <th>Scope:</th> + <td>POIFS</td> + </tr> + <tr> + <th>Level:</th> + <td>Summary</td> + </tr> + <tr> + <th>Stakeholders and Interests:</th> + <td> + POIFS client- wants to create a new file + system<br/> + POIFS - knows how to create a new file system + </td> + </tr> + <tr> + <th>Precondition:</th> + <td>None</td> + </tr> + <tr> + <th>Minimal Guarantee:</th> + <td>None</td> + </tr> + <tr> + <th>Main Success Guarantee:</th> + <td> + POIFS creates an empty Property Table. + </td> + </tr> + <tr> + <th>Extensions:</th> + <td>None</td> + </tr> + </table> + </section> + <section><title>Use Case 4: Replace file in file system</title> + <table> + <tr> + <td><em>Primary Actor:</em></td> + <td>POIFS client</td> + </tr> + <tr> + <td><em>Scope:</em></td> + <td>POIFS</td> + </tr> + <tr> + <td><em>Level:</em></td> + <td>Summary</td> + </tr> + <tr> + <td><em>Stakeholders and Interests:</em></td> + <td> + 1. POIFS client- wants to replace an existing file in + the file system<br/> + 2. POIFS - knows how to manage the file system + </td> + </tr> + <tr> + <td><em>Precondition:</em></td> + <td> + Either + <br/><br/> + The file system has been read (use case 1, read + existing file system) and a file has been + extracted from the file system (use case 7, read + existing file from file system) + <br/><br/>or<br/><br/> + The file system has been created (use case 3, + create new file system) and a file has been + written to the file system (use case 6, write new + file to file system) + </td> + </tr> + <tr> + <td><em>Minimal Guarantee:</em></td> + <td>None</td> + </tr> + <tr> + <td><em>Main Success Guarantee:</em></td> + <td> + 1. POIFS discards storage of the existing file.<br/> + 2. POIFS updates the existing file's entry in the + Property Table<br/> + 3. 
POIFS stores the new file's data + </td> + </tr> + <tr> + <td><em>Extensions:</em></td> + <td> + 1a. POIFS throws an exception if the file does not + exist. + </td> + </tr> + </table> + </section> + <section><title>Use Case 5: Delete file from file system</title> + <table> + <tr> + <td><em>Primary Actor:</em></td> + <td>POIFS client</td> + </tr> + <tr> + <td><em>Scope:</em></td> + <td>POIFS</td> + </tr> + <tr> + <td><em>Level:</em></td> + <td>Summary</td> + </tr> + <tr> + <td><em>Stakeholders and Interests:</em></td> + <td> + * POIFS client- wants to remove a file from a file + system<br/> + * POIFS - knows how to manage the file system + </td> + </tr> + <tr> + <td><em>Precondition:</em></td> + <td> + Either<br/><br/> + The file system has been read (use case 1, read + existing file system) and a file has been + extracted from the file system (use case 7, read + existing file from file system)<br/> + <br/> + or<br/> + <br/> + The file system has been created (use case 3, + create new file system) and a file has been + written to the file system (use case 6, write new + file to file system) + </td> + </tr> + <tr> + <td><em>Minimal Guarantee:</em></td> + <td>None</td> + </tr> + <tr> + <td><em>Main Success Guarantee:</em></td> + <td> + 1. POIFS discards the specified file's storage.<br/> + 2. POIFS discards the file's Property Table + entry. + </td> + </tr> + <tr> + <td><em>Extensions:</em></td> + <td> + 1a. POIFS throws an exception if the file does not + exist. 
+ </td> + </tr> + </table> + </section> + <section><title>Use Case 6: Write new file to file system</title> + <table> + <tr> + <td><em>Primary Actor:</em></td> + <td>POIFS client</td> + </tr> + <tr> + <td><em>Scope:</em></td> + <td>POIFS</td> + </tr> + <tr> + <td><em>Level:</em></td> + <td>Summary</td> + </tr> + <tr> + <td><em>Stakeholders and Interests:</em></td> + <td> + * POIFS client- wants to add a new file to the file + system<br/> + * POIFS - knows how to manage the file system + </td> + </tr> + <tr> + <td><em>Precondition:</em></td> + <td>The specified file does not yet exist in the file + system</td> + </tr> + <tr> + <td><em>Minimal Guarantee:</em></td> + <td>None</td> + </tr> + <tr> + <td><em>Main Success Guarantee:</em></td> + <td> + 1. The POIFS client provides a file name<br/> + 2. POIFS creates a new Property Table entry for the + new file<br/> + 3. POIFS provides the POIFS client with an + <code>OutputStream</code> to write to.<br/> + 4. The POIFS client writes data to the provided + <code>OutputStream</code>.<br/> + 5. The POIFS client closes the provided + <code>OutputStream</code><br/> + 6. POIFS updates the Property Table entry with the + new file's size + </td> + </tr> + <tr> + <td><em>Extensions:</em></td> + <td> + 1a. POIFS throws an exception if a file with the + specified name already exists in the file + system.<br/> + 1b. POIFS throws an exception if the file name is + too long. The limit on file name length is 31 + characters. 
+ </td> + </tr> + </table> + </section> + <section><title>Use Case 7: Read existing file from file system</title> + <table> + <tr> + <td><em>Primary Actor:</em></td> + <td>POIFS client</td> + </tr> + <tr> + <td><em>Scope:</em></td> + <td>POIFS</td> + </tr> + <tr> + <td><em>Level:</em></td> + <td>Summary</td> + </tr> + <tr> + <td><em>Stakeholders and Interests:</em></td> + <td> + * POIFS client- wants to read a file from the file + system<br/> + * POIFS - knows how to manage the file system + </td> + </tr> + <tr> + <td><em>Precondition:</em></td> + <td> + * The file system has been read (use case 1, read + existing file system) or has been created and + written to (use case 3, create new file system; + use case 6, write new file to file system).<br/> + * The specified file exists in the file system. + </td> + </tr> + <tr> + <td><em>Minimal Guarantee:</em></td> + <td>None</td> + </tr> + <tr> + <td><em>Main Success Guarantee:</em></td> + <td> + * The POIFS client provides the name of a file to be read <br/> + * POIFS provides an <code>InputStream</code> to read from. <br/> + * The POIFS client reads from the <code>InputStream</code>.<br/> + * The POIFS client closes the <code>InputStream</code>. + </td> + </tr> + <tr> + <td><em>Extensions:</em></td> + <td>1a. 
POIFS throws an exception if no file with the + specified name exists.</td> + </tr> + </table> + </section> + <section><title>Use Case 8: Read file system directory</title> + <table> + <tr> + <td><em>Primary Actor:</em></td> + <td>POIFS client</td> + </tr> + <tr> + <td><em>Scope:</em></td> + <td>POIFS</td> + </tr> + <tr> + <td><em>Level:</em></td> + <td>Summary</td> + </tr> + <tr> + <td><em>Stakeholders and Interests:</em></td> + <td> + * POIFS client - wants to know what files exist in + the file system<br/> + * POIFS - knows how to manage the file system + </td> + </tr> + <tr> + <td><em>Precondition:</em></td> + <td>The file system has been read (use case 1, read + existing file system) or created (use case 3, create + new file system)</td> + </tr> + <tr> + <td><em>Minimal Guarantee:</em></td> + <td>None</td> + </tr> + <tr> + <td><em>Main Success Guarantee:</em></td> + <td> + 1. The POIFS client requests the file system + directory.<br/> + 2. POIFS returns an <code>Iterator</code>. The + <code>Iterator</code> will not include the root + entry in the Property Table, and may be an + <code>Iterator</code> over an empty + <code>Collection</code>. + </td> + </tr> + <tr> + <td><em>Extensions:</em></td> + <td>None</td> + </tr> + </table> + </section> + <section><title>Use Case 9: Read file</title> + <table> + <tr> + <td><em>Primary Actor:</em></td> + <td>POIFS</td> + </tr> + <tr> + <td><em>Scope:</em></td> + <td>POIFS</td> + </tr> + <tr> + <td><em>Level:</em></td> + <td>Summary</td> + </tr> + <tr> + <td><em>Stakeholders and Interests:</em></td> + <td> + POIFS - POIFS needs to read a file, or something + resembling a file (i.e., the Property Table, the + Small Block Array, or the Small Block Allocation + Table) + </td> + </tr> + <tr> + <td><em>Precondition:</em></td> + <td>None</td> + </tr> + <tr> + <td><em>Minimal Guarantee:</em></td> + <td>None</td> + </tr> + <tr> + <td><em>Main Success Guarantee:</em></td> + <td> + 1. 
POIFS begins with a start block, a file size, and + a flag indicating whether to use the Big Block + Allocation Table or the Small Block Allocation + Table.<br/> + 2. POIFS returns an <code>InputStream</code>.<br/> + 3. Reads from the <code>InputStream</code> are + performed by walking the specified Block + Allocation Table and reading the blocks + indicated.<br/> + 4. POIFS closes the <code>InputStream</code> when + it has finished reading the file, or when its client + asks to close the <code>InputStream</code>. + </td> + </tr> + <tr> + <td><em>Extensions:</em></td> + <td>3a. An exception will be thrown if the specified Block + Allocation Table is corrupt, as evidenced by an index + pointing to a non-existent block, or by a chain + extending past the known size of the file.</td> + </tr> + </table> + </section> + <section><title>Use Case 10: Rename existing file in the file system</title> + <table> + <tr> + <td><em>Primary Actor:</em></td> + <td>POIFS client</td> + </tr> + <tr> + <td><em>Scope:</em></td> + <td>POIFS</td> + </tr> + <tr> + <td><em>Level:</em></td> + <td>Summary</td> + </tr> + <tr> + <td><em>Stakeholders and Interests:</em></td> + <td> + * POIFS client - wants to rename an existing file in + the file system.<br/> + * POIFS - knows how to manage the file system. + </td> + </tr> + <tr> + <td><em>Precondition:</em></td> + <td> + * The file system has been read (use case 1, read + existing file system) or has been created and + written to (use case 3, create new file system; + use case 6, write new file to file system).<br/> + * The specified file exists in the file system.<br/> + * The new name for the file does not duplicate + another file in the file system. + </td> + </tr> + <tr> + <td><em>Minimal Guarantee:</em></td> + <td>None</td> + </tr> + <tr> + <td><em>Main Success Guarantee:</em></td> + <td> + 1. POIFS updates the Property Table entry for the + specified file with its new name. + </td> + </tr> + <tr> + <td><em>Extensions:</em></td> + <td> + * 1a. 
If the old file name is not in the file + system, POIFS throws an exception.<br/> + * 1b. If the new file name already exists in the + file system, POIFS throws an exception.<br/> + * 1c. If the new file name is too long (the limit is + 31 characters), POIFS throws an exception. + </td> + </tr> + </table> + </section> + </section> + </body> +</document> |