|
|
|
|
|
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN"> |
|
|
|
<HTML> |
|
|
|
<HEAD> |
|
|
|
<META HTTP-EQUIV="CONTENT-TYPE" CONTENT="text/html; charset=iso-8859-1"> |
|
|
|
<TITLE></TITLE> |
|
|
|
<META NAME="GENERATOR" CONTENT="StarOffice/5.2 (Linux)"> |
|
|
|
<META NAME="AUTHOR" CONTENT=" "> |
|
|
|
<META NAME="CREATED" CONTENT="20010728;10223600"> |
|
|
|
<META NAME="CHANGEDBY" CONTENT="Marc Johnson"> |
|
|
|
<META NAME="CHANGED" CONTENT="20010810;13415800"> |
|
|
|
<STYLE> |
|
|
|
<!-- |
|
|
|
@page { margin-left: 1.25in; margin-right: 1.25in; margin-top: 1in; margin-bottom: 1in } |
|
|
|
H1 { margin-bottom: 0.08in; font-size: 16pt } |
|
|
|
TD P { margin-bottom: 0.08in } |
|
|
|
H2 { margin-bottom: 0.08in; font-size: 14pt; font-style: italic } |
|
|
|
H3 { margin-bottom: 0.08in } |
|
|
|
H4 { margin-bottom: 0.08in; font-size: 11pt; font-style: italic } |
|
|
|
P { margin-bottom: 0.08in } |
|
|
|
--> |
|
|
|
</STYLE> |
|
|
|
</HEAD> |
|
|
|
<BODY> |
|
|
|
<H1>POI Filesystem format</H1> |
|
|
|
<H2>Introduction</H2> |
|
|
|
<P STYLE="margin-bottom: 0in; font-weight: medium"> |
|
|
|
The POI file format is essentially an archive wrapper |
|
|
|
around files. It is intended to mimic a filesystem. For |
|
|
|
the remainder of this document it is referred to as a |
|
|
|
filesystem in order to avoid confusion with the |
|
|
|
"files" it contains. |
|
|
|
</P> |
|
|
|
<P STYLE="margin-bottom: 0in; font-weight: medium; text-decoration: none"> |
|
|
|
POI filesystems are compatible with those document formats |
|
|
|
used by a well-known software company's popular office |
|
|
|
productivity suite and programs outputting compatible |
|
|
|
data. Because the POI filesystem does not provide |
|
|
|
compression, encryption or any other worthwhile feature, |
|
|
|
it's not a good choice unless you require interoperability
|
|
|
with these programs. |
|
|
|
</P> |
|
|
|
<P STYLE="margin-bottom: 0in; font-weight: medium"> |
|
|
|
The POI filesystem does not encode the documents |
|
|
|
themselves. For example, if you had a word processor file |
|
|
|
with the extension ".doc", you would actually |
|
|
|
have a POI filesystem with a document file archived inside |
|
|
|
of the filesystem. |
|
|
|
</P> |
|
|
|
<H2>Document Conventions</H2> |
|
|
|
<P STYLE="margin-bottom: 0in"> |
|
|
|
This document utilizes the numeric types as described by |
|
|
|
the Java Language Specification, which can be found at |
|
|
|
java.sun.com. In short: |
|
|
|
</P> |
|
|
|
<UL> |
|
|
|
<LI> |
|
|
|
<P STYLE="margin-bottom: 0in"> |
|
|
|
a byte is an 8 bit signed integer ranging from |
|
|
|
(-128) to 127. |
|
|
|
</P> |
|
|
|
</LI> |
|
|
|
<LI> |
|
|
|
<P STYLE="margin-bottom: 0in"> |
|
|
|
a short is a 16 bit signed integer ranging from |
|
|
|
(-32768) to 32767 |
|
|
|
</P> |
|
|
|
</LI> |
|
|
|
<LI> |
|
|
|
<P STYLE="margin-bottom: 0in"> |
|
|
|
an int is a 32 bit signed integer ranging from |
|
|
|
(-2147483648) to 2147483647
|
|
|
</P> |
|
|
|
</LI> |
|
|
|
<LI> |
|
|
|
<P STYLE="margin-bottom: 0in"> |
|
|
|
a long is a 64 bit signed integer ranging from |
|
|
|
(-9.22e+18) to 9.22e+18 |
|
|
|
</P> |
|
|
|
</LI> |
|
|
|
</UL> |
|
|
|
<P STYLE="margin-bottom: 0in"> |
|
|
|
The Java Language Specification spells out a number of |
|
|
|
other types that are not referred to by this document. |
|
|
|
</P> |
|
|
|
<P STYLE="margin-bottom: 0in"> |
|
|
|
Where this document makes references to "endian |
|
|
|
conversion" it is referring to the byte order of |
|
|
|
stored numbers. Numbers in "little-endian order" |
|
|
|
are stored with the LEAST significant byte first. In order |
|
|
|
to properly read a short, for example, you'd read two
bytes, shift the second byte 8 bits to the left, and then
<CODE>or</CODE> it with the first byte, masking the first
byte with <CODE>0xff</CODE> to strip its "sign" extension.
The following code illustrates this method:
|
|
|
</P> |
|
|
|
<P STYLE="text-decoration: none"> |
|
|
|
<FONT FACE="Courier, monospace"><FONT |
|
|
|
SIZE=2><B>public int getShort (byte[ ] rec) |
|
|
|
{</B></FONT></FONT> |
|
|
|
</P> |
|
|
|
<P> |
|
|
|
<FONT FACE="Courier, monospace"><FONT SIZE=2><B>return ( |
|
|
|
(rec[1] << 8) | (rec[0] & 0xff) |
|
|
|
);</B></FONT></FONT> |
|
|
|
</P> |
|
|
|
<P> |
|
|
|
<FONT FACE="Courier, monospace"><FONT |
|
|
|
SIZE=2><B>}</B></FONT></FONT> |
|
|
|
</P> |
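<P STYLE="margin-bottom: 0in">
The same technique extends to the 4-byte ints used by most header and
property fields. The following is a minimal sketch rather than part of
the original text; the class name <CODE>LittleEndianInt</CODE> and the
<CODE>offset</CODE> parameter are illustrative assumptions:
</P>

```java
// Illustrative helper (hypothetical name): assembles a little-endian int
// from four consecutive bytes, least significant byte first.
class LittleEndianInt {
    static int getInt(byte[] rec, int offset) {
        return (rec[offset] & 0xff)                 // lowest 8 bits
             | ((rec[offset + 1] & 0xff) << 8)
             | ((rec[offset + 2] & 0xff) << 16)
             | (rec[offset + 3] << 24);             // top byte carries the sign
    }
}
```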
|
|
|
<H2>Filesystem Introduction</H2> |
|
|
|
<P STYLE="margin-bottom: 0in"> |
|
|
|
POI filesystems are essentially normal files stored on a |
|
|
|
Java-compatible platform's native filesystem. They are |
|
|
|
identified by names ending in a four-character extension (a dot plus three letters)
|
|
|
noting what type of data they contain. For example, a file |
|
|
|
ending in ".xls" would likely contain |
|
|
|
spreadsheet data, and a file ending in ".doc" |
|
|
|
would probably contain a word processing document. POI |
|
|
|
filesystems are called "filesystems" because
|
|
|
they contain multiple embedded files in a manner similar |
|
|
|
to traditional filesystems. Along functional lines, it |
|
|
|
would be more accurate to call these POI archives. |
|
|
|
</P> |
|
|
|
<P STYLE="margin-bottom: 0in"> |
|
|
|
POI filesystems do not provide encryption, compression, or |
|
|
|
any other feature of a modern archive and are therefore a |
|
|
|
poor choice for implementing new file formats. It is |
|
|
|
suggested that POI filesystems are most useful for |
|
|
|
interoperability with legacy applications that use a |
|
|
|
compatible file format. |
|
|
|
</P> |
|
|
|
<H2>Filesystem Walkthrough</H2> |
|
|
|
<P STYLE="margin-bottom: 0in"> |
|
|
|
This is a walkthrough of a POI filesystem and how it is |
|
|
|
put together. It is not intended to give a precise
|
|
|
description but to give a "big picture" of the |
|
|
|
general structure and how it's interpreted. |
|
|
|
</P> |
|
|
|
<P STYLE="margin-bottom: 0in"> |
|
|
|
A POI filesystem begins with a <A |
|
|
|
HREF="#HeaderBlock"><B><I>header</I></B></A>. This header
|
|
|
identifies locations in the file by function and provides |
|
|
|
a sanity check identifying a native filesystem file as |
|
|
|
indeed a POI filesystem. |
|
|
|
</P> |
|
|
|
<P STYLE="margin-bottom: 0in"> |
|
|
|
The first 64 bits of the header compose a <B><I>magic |
|
|
|
number identifier</I></B>. This identifier is a "sanity
check" that tells client software the file is indeed a POI filesystem and not
some other format. The header also contains an <B><I>array
|
|
|
of block numbers</I></B>. These block numbers refer to |
|
|
|
blocks in the file. When these blocks are read together |
|
|
|
they form the <A HREF="#BAT"><B><I>Block Allocation |
|
|
|
Table</I></B></A>. The header also contains a pointer to |
|
|
|
the first element in the <A |
|
|
|
HREF="#PropertyTable"><B><I>property table</I></B></A>,
also known as the <A HREF="#RootEntry"><B><I>root
|
|
|
element</I></B></A>, and a pointer to the <B>small Block |
|
|
|
Allocation Table (SBAT)</B>. |
|
|
|
</P> |
|
|
|
<P STYLE="margin-bottom: 0in"> |
|
|
|
The <A HREF="#BAT"><B><I>block allocation |
|
|
|
table</I></B></A> or <B><I>BAT</I></B>, along with the <A |
|
|
|
HREF="#PropertyTable"><B><I>property table</I></B></A>,
specifies which blocks in the filesystem belong to which
|
|
|
files. It is somewhat hard to conceptualize the Block |
|
|
|
Allocation Table at first. The block allocation table is |
|
|
|
essentially an array of integers that point at each |
|
|
|
other. These elements form chains. |
|
|
|
</P> |
|
|
|
<P STYLE="margin-bottom: 0in"> |
|
|
|
To read the <A HREF="#BAT"><B><I>block allocation |
|
|
|
table</I></B></A> you must first read the <B><I>start |
|
|
|
block </I></B>of the file from the <A |
|
|
|
HREF="#PropertyTable"><B><I>property |
|
|
|
table</I></B></A>. This is both your index for the next |
|
|
|
element in the <B><I>BAT </I></B>array as well as the |
|
|
|
index of the first block in your file. For instance: if |
|
|
|
the <B><I>start block</I></B> from your file's property is |
|
|
|
0 then you read block 0 (the first block after the header) |
|
|
|
from your filesystem as the first block of your file. You |
|
|
|
also read element 0 from the <B><I>BAT array</I></B>. |
|
|
|
Supposing this element has a value equal to 2, you'd read |
|
|
|
block 2 from your filesystem as the next block of your |
|
|
|
file and element 2 from your <B><I>BAT array</I></B>. |
|
|
|
This will be covered further later in this document. |
|
|
|
</P> |
|
|
|
<P STYLE="margin-bottom: 0in"> |
|
|
|
The <A HREF="#PropertyTable"><B><I>Property |
|
|
|
Table</I></B></A> is essentially the directory structure |
|
|
|
for the filesystem. It consists of the name of the file or |
|
|
|
directory, its <B><I>start block</I></B> in both the |
|
|
|
filesystem and <B><I>BAT</I></B>, and its actual size. |
|
|
|
The first property in the <A |
|
|
|
HREF="#PropertyTable">property table</A> is the <A |
|
|
|
HREF="#RootEntry"><B><I>root element</I></B></A>. Its real
|
|
|
purpose is to hold the start block for the <B><I>small |
|
|
|
blocks.</I></B> |
|
|
|
</P> |
|
|
|
<H3>Filesystem Structure</H3> |
|
|
|
<P STYLE="margin-bottom: 0in; font-weight: medium"> |
|
|
|
All values in the POI filesystem are stored in |
|
|
|
"little-endian" order, meaning the bytes of each multi-byte
value appear in the file in reverse order and must be reassembled
before use. The values shown below are written in their logical form,
not in the byte order in which they are stored.
|
|
|
</P> |
|
|
|
<P STYLE="margin-bottom: 0in; font-weight: medium"> |
|
|
|
The POI filesystem is divided into 512 byte blocks. Each |
|
|
|
block has an implicit block type. The order and
layout of these block types are described below.
|
|
|
</P> |
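<P STYLE="margin-bottom: 0in">
Because the header occupies the first 512 bytes and block 0 is the
first block after the header, a block index can be turned into a byte
offset in the native file. This is a minimal sketch under that
assumption; the names <CODE>BlockOffsets</CODE> and
<CODE>offsetOfBlock</CODE> are hypothetical:
</P>

```java
// Sketch: maps a big block index to its byte offset in the native file,
// assuming a 512-byte header followed immediately by block 0.
class BlockOffsets {
    static final int BLOCK_SIZE = 512;

    static long offsetOfBlock(int blockIndex) {
        // +1 skips the header, which is itself one block long
        return (blockIndex + 1L) * BLOCK_SIZE;
    }
}
```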
|
|
|
<A NAME="HeaderBlock"><H3>Header Block</H3></A> |
|
|
|
<P STYLE="margin-bottom: 0in; font-weight: medium"> |
|
|
|
The POI filesystem begins with a <B><I>header |
|
|
|
block</I></B>. The first 64 bits of the header form a long |
|
|
|
<B><I>file type id</I></B> or <B><I>magic number |
|
|
|
identifier</I></B> of |
|
|
|
<CODE>0xE11AB1A1E011CFD0L</CODE>. This is basically a |
|
|
|
sanity check. If this isn't the first thing in the header |
|
|
|
(and consequently the filesystem) then this is not a POI |
|
|
|
filesystem and should be read with some other library. |
|
|
|
</P> |
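<P STYLE="margin-bottom: 0in">
The sanity check can be sketched as follows. This is an illustration
only; the class and method names are assumptions, and the check relies
on the standard <CODE>java.nio.ByteBuffer</CODE> class to do the
little-endian conversion:
</P>

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Sketch: tests whether a header starts with the magic number identifier.
class MagicCheck {
    static final long MAGIC = 0xE11AB1A1E011CFD0L;

    static boolean hasPoiSignature(byte[] header) {
        if (header.length < 8) {
            return false; // too short to hold the 64-bit identifier
        }
        long id = ByteBuffer.wrap(header)
                            .order(ByteOrder.LITTLE_ENDIAN)
                            .getLong(0);
        return id == MAGIC;
    }
}
```

<P STYLE="margin-bottom: 0in">
On disk the identifier therefore appears as the byte sequence
<CODE>D0 CF 11 E0 A1 B1 1A E1</CODE>.
</P>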
|
|
|
<P STYLE="margin-bottom: 0in; font-weight: medium"> |
|
|
|
The most important fields of the header are discussed in the rest of
this section.
|
|
|
</P> |
|
|
|
<H4>BATs</H4> |
|
|
|
<P STYLE="margin-bottom: 0in"> |
|
|
|
At offset <B>0x2c</B> is an int specifying the number of |
|
|
|
elements in the <B><I>BAT array</I></B>. The array at |
|
|
|
<B>0x4c</B> is an array of ints. This array contains the
|
|
|
indices of every block in the <A HREF="#BAT">Block |
|
|
|
Allocation Table</A>. |
|
|
|
</P> |
|
|
|
<H4><I><B>XBATs</B></I></H4> |
|
|
|
<P STYLE="margin-bottom: 0in"> |
|
|
|
Very large POI archives may have more blocks than can be |
|
|
|
addressed by the BAT blocks enumerated in the header |
|
|
|
block. How large? Well, the BAT array in the header can |
|
|
|
contain up to 109 BAT block indices; each BAT block |
|
|
|
references up to 128 blocks, and each block is 512 bytes, |
|
|
|
so we're talking about 109 * 128 * 512 = 6.8MB. That's a |
|
|
|
pretty respectable document! But, you could have much more |
|
|
|
data than that, and in today's world of cheap gigabyte |
|
|
|
drives, why not? So, the BAT may be extended in that |
|
|
|
event. The integer value at offset <B>0x44</B> of the |
|
|
|
header is the index of the first <B><I>extended BAT (XBAT) |
|
|
|
block</I></B>. At offset <B>0x48</B> of the header, there |
|
|
|
is an int value that specifies how many XBAT blocks there |
|
|
|
are. The XBAT blocks begin at the specified index into the |
|
|
|
array of blocks making up the POI filesystem, and continue |
|
|
|
in sequence for the specified count of XBAT blocks. |
|
|
|
</p> |
|
|
|
<p> |
|
|
|
Each XBAT block contains the indices of up to 128 BAT |
|
|
|
blocks, so the document size can be expanded by another |
|
|
|
8MB for each XBAT block. The BAT blocks indexed by an XBAT |
|
|
|
block are appended to the end of the list of BAT blocks |
|
|
|
enumerated in the header block. Thus the BAT blocks |
|
|
|
enumerated in the header block are BAT blocks 0 through |
|
|
|
108, the BAT blocks enumerated in the first XBAT block are |
|
|
|
BAT blocks 109 through 236, the BAT blocks enumerated in |
|
|
|
the second XBAT block are BAT blocks 237 through 364, and |
|
|
|
so on. |
|
|
|
</P> |
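<P STYLE="margin-bottom: 0in">
The arithmetic above can be checked with a short sketch. It assumes the
figures given in this section (109 header slots, 128 entries per block,
512-byte blocks); the class and method names are hypothetical:
</P>

```java
// Sketch of the BAT/XBAT capacity arithmetic described in the text.
class BatCapacity {
    static final int HEADER_BAT_SLOTS = 109;
    static final int ENTRIES_PER_BLOCK = 128;
    static final int BLOCK_SIZE = 512;

    // Bytes addressable with the header's BAT array plus the given number of XBAT blocks.
    static long maxDataBytes(int xbatBlocks) {
        long batBlocks = HEADER_BAT_SLOTS + (long) ENTRIES_PER_BLOCK * xbatBlocks;
        return batBlocks * ENTRIES_PER_BLOCK * BLOCK_SIZE;
    }

    // Which XBAT block (0-based) lists the given BAT block; -1 means the header lists it.
    static int xbatBlockFor(int batBlockNumber) {
        if (batBlockNumber < HEADER_BAT_SLOTS) {
            return -1;
        }
        return (batBlockNumber - HEADER_BAT_SLOTS) / ENTRIES_PER_BLOCK;
    }
}
```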
|
|
|
<p> |
|
|
|
Through the use of XBAT blocks, the limit on the overall |
|
|
|
document size is that imposed by the 4-byte block indices; |
|
|
|
if the indices are unsigned ints, the maximum file size is |
|
|
|
2 terabytes, 1 terabyte if the indices are treated as |
|
|
|
signed ints. Either way, I have yet to see a disk drive |
|
|
|
large enough to accommodate such a file on the shelves at |
|
|
|
the local office supply stores. |
|
|
|
</p> |
|
|
|
<H4>SBATs</H4> |
|
|
|
<P STYLE="margin-bottom: 0in"> |
|
|
|
If a file contained in a POI archive is smaller than 4096 |
|
|
|
bytes, it is stored in small blocks. Small blocks are 64 |
|
|
|
bytes in length and are contained within big blocks, up to |
|
|
|
8 to a big block. As the main BAT is used to navigate the |
|
|
|
array of big blocks, so the <B><I>small block allocation |
|
|
|
table</I></B> is used to navigate the array of small |
|
|
|
blocks. The SBAT's start block index is found at offset |
|
|
|
<B>0x3C</B> of the header block, and remaining blocks |
|
|
|
constituting the SBAT are found by walking the main BAT as |
|
|
|
if it were an ordinary file in the POI filesystem (this |
|
|
|
process is described below). |
|
|
|
</P> |
|
|
|
<H4>Property Table Start Index</H4> |
|
|
|
<P STYLE="margin-bottom: 0in"> |
|
|
|
An integer at offset <B>0x30</B> specifies the start
|
|
|
index of the <A HREF="#PropertyTable">property |
|
|
|
table</A>. This integer is specified as a |
|
|
|
<B><I>"block index". </I></B>The <A |
|
|
|
HREF="#PropertyTable">Property Table</A> is stored, as is |
|
|
|
almost everything in a POI file system, in big blocks and |
|
|
|
walked via the BAT. The <A HREF="#PropertyTable">Property |
|
|
|
Table</A> is described below. |
|
|
|
</P> |
|
|
|
<A NAME="PropertyTable"><H3>Property Table</H3></A> |
|
|
|
<P STYLE="margin-bottom: 0in"> |
|
|
|
The property table is essentially nothing more than the |
|
|
|
directory system. Properties are 128 byte records |
|
|
|
contained within the 512 byte blocks. The first property |
|
|
|
is always the <A HREF="#RootEntry">Root Entry</A>. The
|
|
|
following applies to individual properties within a |
|
|
|
property table: |
|
|
|
</P> |
|
|
|
<P STYLE="margin-bottom: 0in"> |
|
|
|
At offset <B>0x00</B> in the property is the |
|
|
|
"<B><I>name</I></B>". This is stored as an |
|
|
|
uncompressed 16 bit Unicode string. In short, for plain ASCII names every other
byte holds an "ASCII" character and the bytes in between are zero. The
|
|
|
size of this string is stored at offset <B>0x40</B> |
|
|
|
(<B><I>string size</I></B>) as a short. |
|
|
|
</P> |
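<P STYLE="margin-bottom: 0in">
Decoding the name might look like the sketch below. It assumes the
name is little-endian UTF-16 and null-terminated within the 64-byte
field, and it stops at the terminator instead of relying on the string
size field; <CODE>PropertyName</CODE> and <CODE>readName</CODE> are
hypothetical names:
</P>

```java
import java.nio.charset.StandardCharsets;

// Sketch: extracts the name from the first 64 bytes of a property record.
class PropertyName {
    static final int NAME_FIELD_BYTES = 0x40;

    static String readName(byte[] property) {
        int end = 0;
        // Advance two bytes at a time until a 16-bit zero or the end of the field.
        while (end + 1 < NAME_FIELD_BYTES
                && (property[end] != 0 || property[end + 1] != 0)) {
            end += 2;
        }
        return new String(property, 0, end, StandardCharsets.UTF_16LE);
    }
}
```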
|
|
|
<P STYLE="margin-bottom: 0in"> |
|
|
|
At offset <B>0x42</B> is the <B><I>property type</I></B> |
|
|
|
(byte). The type is 1 for directory, 2 for file or 5 for |
|
|
|
the Root Entry. |
|
|
|
</P> |
|
|
|
<P STYLE="margin-bottom: 0in"> |
|
|
|
At offset <B>0x43</B> is the <B><I>node color</I></B> |
|
|
|
(byte). The color is either 1 (black) or 0 (red).
Properties are apparently meant to be arranged in a
|
|
|
red-black binary tree, subject to the following rules: |
|
|
|
<A name="node_rules"></A> |
|
|
|
<OL> |
|
|
|
<LI>The root of the tree is always black |
|
|
|
<LI>Two consecutive nodes cannot both be red |
|
|
|
<LI>A property is less than another property if its |
|
|
|
name length is less than the other property's name |
|
|
|
length |
|
|
|
<LI>If two properties have the same name length, the |
|
|
|
sort order is determined by the sort order of the |
|
|
|
properties' names. |
|
|
|
</OL> |
|
|
|
</P> |
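<P STYLE="margin-bottom: 0in">
The two ordering rules can be sketched as a comparison function. The
text does not say how names of equal length are compared, so the use of
<CODE>String.compareTo</CODE> here is an assumption, as is the class
name:
</P>

```java
// Sketch of the property ordering rules: shorter names sort first,
// and names of equal length fall back to a string comparison.
class PropertyNameOrder {
    static int compare(String a, String b) {
        if (a.length() != b.length()) {
            return Integer.compare(a.length(), b.length());
        }
        // Assumption: plain String ordering; the source leaves this unspecified.
        return a.compareTo(b);
    }
}
```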
|
|
|
<P STYLE="margin-bottom: 0in"> |
|
|
|
At offset <B>0x44</B> is the index (int) of the |
|
|
|
<B><I>previous property</I></B>. |
|
|
|
</P> |
|
|
|
<P STYLE="margin-bottom: 0in"> |
|
|
|
At offset <B>0x48</B> is the index (int) of the <B><I>next |
|
|
|
property</I></B>. |
|
|
|
</P> |
|
|
|
<P STYLE="margin-bottom: 0in"> |
|
|
|
At offset <B>0x4C</B> is the index (int) of the |
|
|
|
<B><I>first directory entry</I></B>. |
|
|
|
</P> |
|
|
|
<P STYLE="margin-bottom: 0in"> |
|
|
|
At offset <B>0x74</B> is an integer giving the <B><I>start |
|
|
|
block</I></B> for the file described by this |
|
|
|
property. This index corresponds to an index in the array |
|
|
|
of indices that is the Block Allocation Table (or the |
|
|
|
Small Block Allocation Table) as well as the index of the |
|
|
|
first block in the file. |
|
|
|
</P> |
|
|
|
<P STYLE="margin-bottom: 0in"> |
|
|
|
At offset <B>0x78</B> is an integer giving the total |
|
|
|
<B><I>actual size</I></B> of the file pointed at by this |
|
|
|
property. If the file size is less than 4096, the file is |
|
|
|
stored in small blocks and the SBAT is used to walk the |
|
|
|
small blocks making up the file. If the file size is 4096 |
|
|
|
or larger, the file is stored in big blocks and the main |
|
|
|
BAT is used to walk the big blocks making up the file. The |
|
|
|
exception to this rule is the <B><I>Root Entry</I></B>, |
|
|
|
which, regardless of its size, is ALWAYS stored in big |
|
|
|
blocks and the main BAT is used to walk the big blocks |
|
|
|
making up this special file. |
|
|
|
</P> |
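<P STYLE="margin-bottom: 0in">
The choice between the SBAT and the main BAT reduces to a single test.
A minimal sketch, with hypothetical names, using the type code 5 for
the Root Entry given earlier:
</P>

```java
// Sketch: decides which allocation table is used to walk a property's blocks.
class BlockKind {
    static final int ROOT_ENTRY_TYPE = 5;
    static final int SMALL_FILE_LIMIT = 4096;

    static boolean usesSmallBlocks(int propertyType, int size) {
        // The Root Entry always lives in big blocks, whatever its size.
        return propertyType != ROOT_ENTRY_TYPE && size < SMALL_FILE_LIMIT;
    }
}
```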
|
|
|
<A NAME="RootEntry"><H3>Root Entry</H3></A> |
|
|
|
<P STYLE="margin-bottom: 0in"> |
|
|
|
The <B><I>Root Entry</I></B> in the <A |
|
|
|
HREF="#PropertyTable"><B><I>Property Table</I></B></A> |
|
|
|
contains the information necessary to read and write small |
|
|
|
files, which are files less than 4096 bytes long. The |
|
|
|
start block field of the Root Entry is the start index of |
|
|
|
the <B><I>Small Block Array</I></B>, which is read like |
|
|
|
any other file in the POI filesystem. Since the SBAT
|
|
|
cannot be used without the Small Block Array, the Root |
|
|
|
Entry MUST be read or written using the <A |
|
|
|
HREF="#BAT"><B><I>Block Allocation Table</I></B></A>. The |
|
|
|
blocks making up the Small Block Array are divided into |
|
|
|
64-byte small blocks, up to the size indicated in the Root |
|
|
|
Entry (which should always be a multiple of 64).
|
|
|
</P> |
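<P STYLE="margin-bottom: 0in">
Locating a small block inside the Small Block Array is then a matter of
division. The sketch below assumes small blocks are packed
consecutively, 8 per big block, across the chain of big blocks that
starts at the Root Entry's start block; the names are hypothetical:
</P>

```java
// Sketch: finds where a small block sits within the Small Block Array.
class SmallBlockLocator {
    static final int SMALL_BLOCK_SIZE = 64;
    static final int BIG_BLOCK_SIZE = 512;
    static final int SMALL_PER_BIG = BIG_BLOCK_SIZE / SMALL_BLOCK_SIZE; // 8

    // Position (0-based) of the containing big block within the Small Block Array's chain.
    static int bigBlockOrdinal(int smallBlockIndex) {
        return smallBlockIndex / SMALL_PER_BIG;
    }

    // Byte offset of the small block inside that big block.
    static int offsetInBigBlock(int smallBlockIndex) {
        return (smallBlockIndex % SMALL_PER_BIG) * SMALL_BLOCK_SIZE;
    }
}
```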
|
|
|
<H3>Walking the Nodes of the <A HREF="#PropertyTable">Property |
|
|
|
Table</A></H3> |
|
|
|
<P STYLE="margin-bottom: 0in"> |
|
|
|
The individual properties form a directory tree, with the |
|
|
|
<B><I>Root Entry</I></B> as the directory tree's root, as |
|
|
|
shown in the accompanying drawing. Note the numbers in |
|
|
|
parentheses in each node; they represent the node's index |
|
|
|
in the array of properties. The <B>NEXT_PROP</B>, |
|
|
|
<B>PREVIOUS_PROP</B>, and <B>CHILD_PROP</B> fields hold |
|
|
|
these indices, and are used to navigate the tree. |
|
|
|
</P> |
|
|
|
<P> |
|
|
|
<IMG SRC="PropertySet.jpg"> |
|
|
|
</P> |
|
|
|
<P STYLE="margin-bottom: 0in"> |
|
|
|
Each <A NAME="directoryEntry">directory entry</A> (i.e., a |
|
|
|
property whose type is <B><I>directory</I></B> or |
|
|
|
<B><I>root entry</I></B>) uses its <B>CHILD_PROP</B> field |
|
|
|
to point to one of its subordinate (child) properties. It |
|
|
|
doesn't seem to matter which of its children it points |
|
|
|
to. Thus in the previous drawing, the Root Entry's |
|
|
|
CHILD_PROP field may contain 1, 4, or the index of one of |
|
|
|
its other children. Similarly, the directory node (index |
|
|
|
1) may have, in its CHILD_PROP field, 2, 3, or the index |
|
|
|
of one of its other children. |
|
|
|
</P> |
|
|
|
<P STYLE="margin-bottom: 0in"> |
|
|
|
The children of a given <A |
|
|
|
HREF="#directoryEntry">directory property</A> point to |
|
|
|
each other in a similar fashion by using their |
|
|
|
<B>NEXT_PROP</B> and <B>PREVIOUS_PROP</B> fields. The |
|
|
|
ordering of the children is governed by rules described <a |
|
|
|
href="#node_rules">here</a>.
|
|
|
</P> |
|
|
|
<P STYLE="margin-bottom: 0in"> |
|
|
|
Unused <B>NEXT_PROP</B>, <B>PREVIOUS_PROP</B>, and |
|
|
|
<B>CHILD_PROP</B> fields contain the marker value of |
|
|
|
-1. For example, all file properties have a value of -1 in their
CHILD_PROP fields.
|
|
|
</P> |
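<P STYLE="margin-bottom: 0in">
Listing the children of a directory can be sketched as an in-order walk
of the PREVIOUS_PROP and NEXT_PROP links, starting from the entry named
by CHILD_PROP. The parallel arrays and the class name below are
assumptions made for illustration, and a real reader would also need to
guard against cyclic links in a damaged file:
</P>

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: collects the indices of a directory's children, in sorted order.
class PropertyTreeWalk {
    static final int UNUSED = -1;

    static List<Integer> childrenOf(int dir, int[] prev, int[] next, int[] child) {
        List<Integer> out = new ArrayList<>();
        collect(child[dir], prev, next, out);
        return out;
    }

    // In-order traversal: PREVIOUS_PROP subtree, the node itself, NEXT_PROP subtree.
    private static void collect(int node, int[] prev, int[] next, List<Integer> out) {
        if (node == UNUSED) {
            return;
        }
        collect(prev[node], prev, next, out);
        out.add(node);
        collect(next[node], prev, next, out);
    }
}
```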
|
|
|
<A NAME="BAT"><H3>Block Allocation Table</H3></A> |
|
|
|
<P STYLE="margin-bottom: 0in"> |
|
|
|
The <B><I>BAT blocks</I></B> are pointed at by the BAT
array contained in the <A HREF="#HeaderBlock">header</A>
|
|
|
and supplemented, if necessary, by the <B><I>XBAT |
|
|
|
blocks</I></B>. These blocks form a large table of |
|
|
|
integers. These integers are block numbers. The |
|
|
|
<B><I>Block Allocation Table</I></B> holds chains of |
|
|
|
integers. These chains are terminated with -2. The |
|
|
|
elements in these chains refer to blocks in the files. The |
|
|
|
starting block of a file is NOT specified in the BAT. It |
|
|
|
is specified by the <B><I>property</I></B> for a given |
|
|
|
file. The elements in this BAT are both the block number |
|
|
|
(within the file minus the header) AND the number of the |
|
|
|
next BAT element in the chain. This can be thought of as a |
|
|
|
linked list of blocks. The BAT array contains the links |
|
|
|
from one block to the next, including the end of chain |
|
|
|
marker. |
|
|
|
</P> |
|
|
|
<P> |
|
|
|
Here's an example: Let's assume that the BAT begins as |
|
|
|
follows: |
|
|
|
</P> |
|
|
|
<P STYLE="margin-bottom: 0in"> |
|
|
|
<FONT FACE="Courier, monospace"><B>BAT[ 0 ] = 2</B></FONT> |
|
|
|
</P> |
|
|
|
<P STYLE="margin-bottom: 0in"> |
|
|
|
<FONT FACE="Courier, monospace"><B>BAT[ 1 ] = 5</B></FONT> |
|
|
|
</P> |
|
|
|
<P STYLE="margin-bottom: 0in"> |
|
|
|
<FONT FACE="Courier, monospace"><B>BAT[ 2 ] = 3</B></FONT> |
|
|
|
</P> |
|
|
|
<P STYLE="margin-bottom: 0in"> |
|
|
|
<FONT FACE="Courier, monospace"><B>BAT[ 3 ] = 4</B></FONT> |
|
|
|
</P> |
|
|
|
<P STYLE="margin-bottom: 0in"> |
|
|
|
<FONT FACE="Courier, monospace"><B>BAT[ 4 ] = 6</B></FONT> |
|
|
|
</P> |
|
|
|
<P STYLE="margin-bottom: 0in"> |
|
|
|
<FONT FACE="Courier, monospace"><B>BAT[ 5 ] = |
|
|
|
-2</B></FONT> |
|
|
|
</P> |
|
|
|
<P STYLE="margin-bottom: 0in"> |
|
|
|
<FONT FACE="Courier, monospace"><B>BAT[ 6 ] = 7</B></FONT> |
|
|
|
</P> |
|
|
|
<P STYLE="margin-bottom: 0in"> |
|
|
|
<FONT FACE="Courier, monospace"><B>BAT[ 7 ] = |
|
|
|
-2</B></FONT> |
|
|
|
</P> |
|
|
|
<P STYLE="margin-bottom: 0in"> |
|
|
|
<B>...</B> |
|
|
|
</P> |
|
|
|
<P STYLE="margin-bottom: 0in"> |
|
|
|
Now, if we have a file whose <A |
|
|
|
HREF="#PropertyTable">Property Table</A> entry says it |
|
|
|
begins with index 0, we walk the BAT array and see that |
|
|
|
the file consists of blocks 0 (because the start block is |
|
|
|
0), 2 (because BAT[ 0 ] is 2), 3 (BAT[ 2 ] is 3), 4 (BAT[ |
|
|
|
3 ] is 4), 6 (BAT[ 4 ] is 6), and 7 (BAT[ 6 ] is 7). It |
|
|
|
ends at block 7 because BAT[ 7 ] is -2, which is the end |
|
|
|
of chain marker. |
|
|
|
</P> |
|
|
|
<P STYLE="margin-bottom: 0in"> |
|
|
|
Similarly, a file beginning at index 1 consists of |
|
|
|
blocks 1 and 5. |
|
|
|
</P> |
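<P STYLE="margin-bottom: 0in">
The walk just described can be sketched in a few lines. This is an
illustration under the assumption that the whole BAT has already been
read into one int array; the class and method names are hypothetical,
and the bounds check is an added safeguard, not something the format
requires:
</P>

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: follows a chain through the BAT and returns the file's block numbers.
class BatChain {
    static final int END_OF_CHAIN = -2;

    static List<Integer> blocksOf(int startBlock, int[] bat) {
        List<Integer> blocks = new ArrayList<>();
        int current = startBlock;
        while (current != END_OF_CHAIN) {
            // Reject out-of-range links and chains longer than the table (cycles).
            if (current < 0 || current >= bat.length || blocks.size() > bat.length) {
                throw new IllegalStateException("corrupt chain at " + current);
            }
            blocks.add(current);
            current = bat[current]; // the element is also the next block number
        }
        return blocks;
    }
}
```

<P STYLE="margin-bottom: 0in">
With the example table above, a start block of 0 yields blocks 0, 2, 3,
4, 6 and 7, and a start block of 1 yields blocks 1 and 5.
</P>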
|
|
|
<P STYLE="margin-bottom: 0in"> |
|
|
|
Other special numbers in a BAT array are: |
|
|
|
</P> |
|
|
|
<UL> |
|
|
|
<LI> |
|
|
|
<P STYLE="margin-bottom: 0in"> |
|
|
|
-1, which indicates an unused block |
|
|
|
</P> |
|
|
|
</LI> |
|
|
|
<LI> |
|
|
|
<P STYLE="margin-bottom: 0in"> |
|
|
|
-3, which indicates a "special" block, |
|
|
|
such as a block used to make up the Small Block |
|
|
|
Array, the <A HREF="#PropertyTable">Property |
|
|
|
Table</A>, the main BAT, or the SBAT |
|
|
|
</P> |
|
|
|
</LI> |
|
|
|
</UL> |
|
|
|
<H2>Filesystem Structures</H2> |
|
|
|
<P> |
|
|
|
The following outlines the basic filesystem structures. |
|
|
|
</P> |
|
|
|
<H3>Header (block 1) -- 512 (0x200) bytes</H3> |
|
|
|
<TABLE BORDER=0 CELLPADDING=4 CELLSPACING=0> |
|
|
|
<TR VALIGN=TOP> |
|
|
|
<TD><B>Field</B></TD> |
|
|
|
<TD><B>Description</B></TD> |
|
|
|
<TD><B>Offset</B></TD> |
|
|
|
<TD><B>Length</B></TD> |
|
|
|
<TD><B>Default value or const</B></TD> |
|
|
|
</TR> |
|
|
|
<TR VALIGN=TOP> |
|
|
|
<TD>FILETYPE</TD> |
|
|
|
<TD>Magic number identifying this as a POI |
|
|
|
filesystem.</TD> |
|
|
|
<TD>0x0000</TD> |
|
|
|
<TD>Long</TD> |
|
|
|
<TD>0xE11AB1A1E011CFD0</TD> |
|
|
|
</TR> |
|
|
|
<TR VALIGN=TOP> |
|
|
|
<TD>UK1</TD> |
|
|
|
<TD>Unknown constant</TD> |
|
|
|
<TD>0x0008</TD> |
|
|
|
<TD>Integer</TD> |
|
|
|
<TD>0</TD> |
|
|
|
</TR> |
|
|
|
<TR VALIGN=TOP> |
|
|
|
<TD>UK2</TD> |
|
|
|
<TD>Unknown Constant</TD> |
|
|
|
<TD>0x000C</TD> |
|
|
|
<TD>Integer</TD> |
|
|
|
<TD>0</TD> |
|
|
|
</TR> |
|
|
|
<TR VALIGN=TOP> |
|
|
|
<TD>UK3</TD> |
|
|
|
<TD>Unknown Constant</TD> |
|
|
|
<TD>0x0014</TD> |
|
|
|
<TD>Integer</TD> |
|
|
|
<TD>0</TD> |
|
|
|
</TR> |
|
|
|
<TR VALIGN=TOP> |
|
|
|
<TD>UK4</TD> |
|
|
|
<TD>Unknown Constant (revision?)</TD> |
|
|
|
<TD>0x0018</TD> |
|
|
|
<TD>Short</TD> |
|
|
|
<TD>0x003B</TD> |
|
|
|
</TR> |
|
|
|
<TR VALIGN=TOP> |
|
|
|
<TD>UK5</TD> |
|
|
|
<TD>Unknown Constant (version?)</TD> |
|
|
|
<TD>0x001A</TD> |
|
|
|
<TD>Short</TD> |
|
|
|
<TD>0x0003</TD> |
|
|
|
</TR> |
|
|
|
<TR VALIGN=TOP> |
|
|
|
<TD>UK6</TD> |
|
|
|
<TD>Unknown Constant</TD> |
|
|
|
<TD>0x001C</TD> |
|
|
|
<TD>Short</TD> |
|
|
|
<TD>-2</TD> |
|
|
|
</TR> |
|
|
|
<TR VALIGN=TOP> |
|
|
|
<TD>LOG_2_BIG_BLOCK_SIZE</TD> |
|
|
|
<TD>Log, base 2, of the big block size</TD> |
|
|
|
<TD>0x001E</TD> |
|
|
|
<TD>Short</TD> |
|
|
|
<TD>9 (2 ^ 9 = 512 bytes)</TD> |
|
|
|
</TR> |
|
|
|
<TR VALIGN=TOP> |
|
|
|
<TD>LOG_2_SMALL_BLOCK_SIZE</TD> |
|
|
|
<TD>Log, base 2, of the small block size</TD> |
|
|
|
<TD>0x0020</TD> |
|
|
|
<TD>Integer</TD> |
|
|
|
<TD>6 (2 ^ 6 = 64 bytes)</TD> |
|
|
|
</TR> |
|
|
|
<TR VALIGN=TOP> |
|
|
|
<TD>UK7</TD> |
|
|
|
<TD>Unknown Constant</TD> |
|
|
|
<TD>0x0024</TD> |
|
|
|
<TD>Integer</TD> |
|
|
|
<TD>0</TD> |
|
|
|
</TR> |
|
|
|
<TR VALIGN=TOP> |
|
|
|
<TD>UK8</TD> |
|
|
|
<TD>Unknown Constant</TD> |
|
|
|
<TD>0x0028</TD> |
|
|
|
<TD>Integer</TD> |
|
|
|
<TD>0</TD> |
|
|
|
</TR> |
|
|
|
<TR VALIGN=TOP> |
|
|
|
<TD>BAT_COUNT</TD> |
|
|
|
<TD>Number of elements in the BAT array</TD> |
|
|
|
<TD>0x002C</TD> |
|
|
|
<TD>Integer</TD> |
|
|
|
<TD>required</TD> |
|
|
|
</TR> |
|
|
|
<TR VALIGN=TOP> |
|
|
|
<TD>PROPERTIES_START</TD> |
|
|
|
<TD>Block index of the first block of the <A |
|
|
|
HREF="#PropertyTable">property table</A></TD> |
|
|
|
<TD>0x0030</TD> |
|
|
|
<TD>Integer</TD> |
|
|
|
<TD>required</TD> |
|
|
|
</TR> |
|
|
|
<TR VALIGN=TOP> |
|
|
|
<TD>UK9</TD> |
|
|
|
<TD>Unknown Constant</TD> |
|
|
|
<TD>0x0034</TD> |
|
|
|
<TD>Integer</TD> |
|
|
|
<TD>0</TD> |
|
|
|
</TR> |
|
|
|
<TR VALIGN=TOP> |
|
|
|
<TD>UK10</TD> |
|
|
|
<TD>Unknown Constant</TD> |
|
|
|
<TD>0x0038</TD> |
|
|
|
<TD>Integer</TD> |
|
|
|
<TD>0x00001000</TD> |
|
|
|
</TR> |
|
|
|
<TR VALIGN=TOP> |
|
|
|
<TD>SBAT_START</TD> |
|
|
|
<TD>Block index of first big block containing the |
|
|
|
small block allocation table (SBAT)</TD> |
|
|
|
<TD>0x003C</TD> |
|
|
|
<TD>Integer</TD> |
|
|
|
<TD>-2</TD> |
|
|
|
</TR> |
|
|
|
<TR VALIGN=TOP> |
|
|
|
<TD>UK11</TD> |
|
|
|
<TD>Unknown Constant</TD> |
|
|
|
<TD>0x0040</TD> |
|
|
|
<TD>Integer</TD> |
|
|
|
<TD>1</TD> |
|
|
|
</TR> |
|
|
|
<TR VALIGN=TOP> |
|
|
|
<TD>XBAT_START</TD> |
|
|
|
<TD>Block index of the first block in the Extended |
|
|
|
Block Allocation Table (XBAT)</TD> |
|
|
|
<TD>0x0044</TD> |
|
|
|
<TD>Integer</TD> |
|
|
|
<TD>-2</TD> |
|
|
|
</TR> |
|
|
|
<TR VALIGN=TOP> |
|
|
|
<TD>XBAT_COUNT</TD> |
|
|
|
<TD>Number of elements in the Extended Block |
|
|
|
Allocation Table (to be added to the BAT)</TD> |
|
|
|
<TD>0x0048</TD> |
|
|
|
<TD>Integer</TD> |
|
|
|
<TD>0</TD> |
|
|
|
</TR> |
|
|
|
<TR VALIGN=TOP> |
|
|
|
<TD>BAT_ARRAY</TD> |
|
|
|
<TD>Array of block indices constituting the <A
|
|
|
HREF="#BAT">Block Allocation Table (BAT)</A></TD> |
|
|
|
<TD>0x004C, 0x0050, 0x0054 ... 0x01FC</TD> |
|
|
|
<TD>Integer[ ]</TD> |
|
|
|
<TD>-1 for unused elements, at least first element |
|
|
|
must be filled.</TD> |
|
|
|
</TR> |
|
|
|
<TR VALIGN=TOP> |
|
|
|
<TD>N/A</TD> |
|
|
|
<TD>Header block data not otherwise described in this |
|
|
|
table</TD> |
|
|
|
<TD>N/A</TD> |
|
|
|
<TD>N/A</TD> |
|
|
|
<TD>-1</TD> |
|
|
|
</TR> |
|
|
|
</TABLE> |
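<P STYLE="margin-bottom: 0in">
Reading the required header fields at the offsets listed in the table
might look like the following sketch. The class
<CODE>HeaderFields</CODE> and its field names are hypothetical, and the
unknown constants are simply skipped:
</P>

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Sketch: pulls the documented fields out of a 512-byte header block.
class HeaderFields {
    static final int BAT_ARRAY_SLOTS = 109;

    final int batCount, propertiesStart, sbatStart, xbatStart, xbatCount;
    final int[] batArray = new int[BAT_ARRAY_SLOTS];

    HeaderFields(byte[] header) {
        ByteBuffer buf = ByteBuffer.wrap(header).order(ByteOrder.LITTLE_ENDIAN);
        batCount        = buf.getInt(0x2C);
        propertiesStart = buf.getInt(0x30);
        sbatStart       = buf.getInt(0x3C);
        xbatStart       = buf.getInt(0x44);
        xbatCount       = buf.getInt(0x48);
        // BAT_ARRAY runs from 0x4C to the end of the block; unused slots hold -1.
        for (int i = 0; i < BAT_ARRAY_SLOTS; i++) {
            batArray[i] = buf.getInt(0x4C + 4 * i);
        }
    }
}
```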
|
|
|
<A HREF="#BAT"><H3><B>Block Allocation Table Block -- 512 |
|
|
|
(0x200) bytes</B></H3></A> |
|
|
|
<TABLE BORDER=0 CELLPADDING=4 CELLSPACING=0> |
|
|
|
<TR VALIGN=TOP> |
|
|
|
<TD><B>Field</B></TD> |
|
|
|
<TD><B>Description</B></TD> |
|
|
|
<TD><B>Offset</B></TD> |
|
|
|
<TD><B>Length</B></TD> |
|
|
|
<TD><B>Default value or const</B></TD> |
|
|
|
</TR> |
|
|
|
<TR VALIGN=TOP> |
|
|
|
<TD>BAT_ELEMENT</TD> |
|
|
|
<TD>Any given element in the BAT block</TD> |
|
|
|
<TD>0x0000, 0x0004, 0x0008, ... 0x01FC</TD> |
|
|
|
<TD>Integer</TD> |
|
|
|
<TD>-1 = unused<BR> |
|
|
|
-2 = end of chain<BR> |
|
|
|
-3 = special (e.g., BAT block)<BR> |
|
|
|
All other values point to the next element in the |
|
|
|
chain and the next index of a block composing the |
|
|
|
file.</TD> |
|
|
|
</TR> |
|
|
|
</TABLE> |
|
|
|
<H3>Property Block -- 512 (0x200) byte block</H3> |
|
|
|
<TABLE BORDER=0 CELLPADDING=4 CELLSPACING=0> |
|
|
|
<TR VALIGN=TOP> |
|
|
|
<TD><B>Field</B></TD> |
|
|
|
<TD><B>Description</B></TD> |
|
|
|
<TD><B>Offset</B></TD> |
|
|
|
<TD><B>Length</B></TD> |
|
|
|
<TD><B>Default value or const</B></TD> |
|
|
|
</TR> |
|
|
|
<TR VALIGN=TOP> |
|
|
|
<TD>Properties[ ]</TD> |
|
|
|
<TD>This block contains the properties.</TD> |
|
|
|
<TD>0x0000, 0x0080, 0x0100, 0x0180</TD> |
|
|
|
<TD>128 bytes</TD> |
|
|
|
<TD>All unused space is set to -1.</TD> |
|
|
|
</TR> |
|
|
|
</TABLE> |
|
|
|
<H3>Property -- 128 (0x80) byte block</H3> |
|
|
|
<TABLE BORDER=0 CELLPADDING=4 CELLSPACING=0> |
|
|
|
<TR VALIGN=TOP> |
|
|
|
<TD><B>Field</B></TD> |
|
|
|
<TD><B>Description</B></TD> |
|
|
|
<TD><B>Offset</B></TD> |
|
|
|
<TD><B>Length</B></TD> |
|
|
|
<TD><B>Default value or const</B></TD> |
|
|
|
</TR> |
|
|
|
<TR VALIGN=TOP> |
|
|
|
<TD>NAME</TD> |
|
|
|
<TD>A unicode null-terminated uncompressed 16bit |
|
|
|
string (lose the high bytes) containing the name |
|
|
|
of the property.</TD> |
|
|
|
<TD>0x00, 0x02, 0x04, ... 0x3E</TD> |
|
|
|
<TD>Short[ ]</TD> |
|
|
|
<TD>0x0000 for unused elements, field required, 32 |
|
|
|
(0x20) element max, occupying 64 (0x40) bytes</TD>
|
|
|
</TR> |
|
|
|
<TR VALIGN=TOP> |
|
|
|
<TD>NAME_SIZE</TD> |
|
|
|
<TD>Number of characters in the NAME field</TD> |
|
|
|
<TD>0x40</TD> |
|
|
|
<TD>Short</TD> |
|
|
|
<TD>Required</TD> |
|
|
|
</TR> |
|
|
|
<TR VALIGN=TOP> |
|
|
|
<TD>PROPERTY_TYPE</TD> |
|
|
|
<TD>Property type (directory, file, or root)</TD> |
|
|
|
<TD>0x42</TD> |
|
|
|
<TD>Byte</TD> |
|
|
|
<TD>1 (directory), 2 (file), or 5 (root entry)</TD> |
|
|
|
</TR> |
|
|
|
<TR VALIGN=TOP> |
|
|
|
<TD>NODE_COLOR</TD> |
|
|
|
<TD>Node color</TD> |
|
|
|
<TD>0x43</TD> |
|
|
|
<TD>Byte</TD> |
|
|
|
<TD>0 (red) or 1 (black)</TD> |
|
|
|
</TR> |
|
|
|
<TR VALIGN=TOP> |
|
|
|
<TD>PREVIOUS_PROP</TD> |
|
|
|
<TD>Previous property index</TD> |
|
|
|
<TD>0x44</TD> |
|
|
|
<TD>Integer</TD> |
|
|
|
<TD>-1</TD> |
|
|
|
</TR> |
|
|
|
<TR VALIGN=TOP> |
|
|
|
<TD>NEXT_PROP</TD> |
|
|
|
<TD>Next property index</TD> |
|
|
|
<TD>0x48</TD> |
|
|
|
<TD>Integer</TD> |
|
|
|
<TD>-1</TD> |
|
|
|
</TR> |
|
|
|
<TR VALIGN=TOP> |
|
|
|
<TD>CHILD_PROP</TD> |
|
|
|
<TD>First child property index</TD> |
|
|
|
<TD>0x4c</TD> |
|
|
|
<TD>Integer</TD> |
|
|
|
<TD>-1</TD> |
|
|
|
</TR> |
|
|
|
<TR VALIGN=TOP> |
|
|
|
<TD>SECONDS_1</TD> |
|
|
|
<TD>Seconds component of the created timestamp?</TD> |
|
|
|
<TD>0x64</TD> |
|
|
|
<TD>Integer</TD> |
|
|
|
<TD>0</TD> |
|
|
|
</TR> |
|
|
|
<TR VALIGN=TOP> |
|
|
|
<TD>DAYS_1</TD> |
|
|
|
<TD>Days since epoch component of the created |
|
|
|
timestamp?</TD> |
|
|
|
<TD>0x68</TD> |
|
|
|
<TD>Integer</TD> |
|
|
|
<TD>0</TD> |
|
|
|
</TR> |
|
|
|
<TR VALIGN=TOP> |
|
|
|
<TD>SECONDS_2</TD> |
|
|
|
<TD>Seconds component of the modified timestamp?</TD> |
|
|
|
<TD>0x6C</TD> |
|
|
|
<TD>Integer</TD> |
|
|
|
<TD>0</TD> |
|
|
|
</TR> |
|
|
|
<TR VALIGN=TOP> |
|
|
|
<TD>DAYS_2</TD> |
|
|
|
<TD>Days since epoch component of the modified |
|
|
|
timestamp?</TD> |
|
|
|
<TD>0x70</TD> |
|
|
|
<TD>Integer</TD> |
|
|
|
<TD>0</TD> |
|
|
|
</TR> |
|
|
|
<TR VALIGN=TOP> |
|
|
|
<TD>START_BLOCK</TD> |
|
|
|
<TD>Starting block of the file, used as the first |
|
|
|
block in the file and the pointer to the next |
|
|
|
block from the BAT</TD> |
|
|
|
<TD>0x74</TD> |
|
|
|
<TD>Integer</TD> |
|
|
|
<TD>Required</TD> |
|
|
|
</TR> |
|
|
|
<TR VALIGN=TOP> |
|
|
|
<TD>SIZE</TD> |
|
|
|
<TD>Actual size of the file this property points |
|
|
|
to. (used to truncate the blocks to the real |
|
|
|
size).</TD> |
|
|
|
<TD>0x78</TD> |
|
|
|
<TD>Integer</TD> |
|
|
|
<TD>0</TD> |
|
|
|
</TR> |
|
|
|
</TABLE> |
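<P STYLE="margin-bottom: 0in">
A single property can be decoded from its 128-byte record using the
offsets in the table above. The sketch below is illustrative only; the
class <CODE>PropertyRecord</CODE> and its field names are assumptions,
and the timestamp fields are left out because their meaning is
uncertain:
</P>

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Sketch: decodes the documented fields of one 128-byte property record.
class PropertyRecord {
    final int nameSize, type, color, previous, next, child, startBlock, size;

    // offset is where the record starts inside the block: 0x000, 0x080, 0x100 or 0x180.
    PropertyRecord(byte[] block, int offset) {
        ByteBuffer buf = ByteBuffer.wrap(block).order(ByteOrder.LITTLE_ENDIAN);
        nameSize   = buf.getShort(offset + 0x40);
        type       = buf.get(offset + 0x42);      // 1 directory, 2 file, 5 root entry
        color      = buf.get(offset + 0x43);      // 0 red, 1 black
        previous   = buf.getInt(offset + 0x44);   // -1 when unused
        next       = buf.getInt(offset + 0x48);
        child      = buf.getInt(offset + 0x4C);
        startBlock = buf.getInt(offset + 0x74);
        size       = buf.getInt(offset + 0x78);
    }
}
```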
|
|
|
</BODY> |
|
|
|
</HTML> |