path: root/build/jakarta-poi/docs/hpsf/how-to.html
Diffstat (limited to 'build/jakarta-poi/docs/hpsf/how-to.html')
-rw-r--r-- build/jakarta-poi/docs/hpsf/how-to.html 622
1 files changed, 0 insertions, 622 deletions
diff --git a/build/jakarta-poi/docs/hpsf/how-to.html b/build/jakarta-poi/docs/hpsf/how-to.html
deleted file mode 100644
index 12655dd39a..0000000000
--- a/build/jakarta-poi/docs/hpsf/how-to.html
+++ /dev/null
@@ -1,622 +0,0 @@
-<html>
-<head>
-<META http-equiv="Content-Type" content="text/html; charset=UTF-8">
-<meta content="text/html; charset=ISO-8859-1">
-<title>HPSF HOW-TO</title>
-<style type="text/css">
- body { background-color: white; font-size: normal; color: black ; }
- a { color: #525d76; }
- a.black { color: #000000;}
- table {border-width: 0; width: 100%}
- table.centered {text-align: center}
- table.title {text-align: center; width: 80%}
- img{border-width: 0;}
- span.s1 {font-family: Helvetica, Arial, sans-serif; font-weight: bold; color: #000000; }
- span.s1_white { font-family: Helvetica, Arial, sans-serif; font-weight: bold; color: #ffffff; }
- span.title {font-family: Helvetica, Arial, sans-serif; font-weight: bold; color: #000000; }
- span.c1 {color: #000000; font-family: Helvetica, Arial, sans-serif}
- tr.left {text-align: left}
- hr { width: 100%; size: 2}
-</style>
-</head>
-<body>
-<table width="100%" cellspacing="0" cellpadding="0" border="0">
-<tr>
-<td valign="top" align="left"><a href="http://jakarta.apache.org/index.html"><img hspace="0" vspace="0" border="0" src="images/jakarta-logo.gif"></a></td><td width="100%" valign="top" align="left" bgcolor="#ffffff"><img hspace="0" vspace="0" border="0" align="right" src="images/header.gif"></td>
-</tr>
-<tr>
-<td colspan="2" bgcolor="#525d76"><span class="c1"><a class="black" href="http://www.apache.org/">www.apache.org &gt;</a><a class="black" href="http://jakarta.apache.org/">jakarta.apache.org &gt;</a><a href="http://jakarta.apache.org/poi/" class="black">jakarta.apache.org/poi</a></span></td>
-</tr>
-<tr>
-<td height="8"></td>
-</tr>
-</table>
-<table border="0" cellpadding="0" cellspacing="0" width="100%">
-<tr>
-<td width="1%">
-<br>
-</td><td nowrap="1" valign="top" width="14%">
-<br>
-<span class="s1">Navigation</span>
-<br>
-<a class="s1" href="../index.html">Main</a>
-<br>
-<br>
-<span class="s1">HPSF</span>
-<br>
-<a class="s1" href="index.html">Overview</a>
-<br>
-<a class="s1" href="how-to.html">How To</a>
-<br>
-<a class="s1" href="internals.html">Internals</a>
-<br>
-<a class="s1" href="todo.html">To Do</a>
-<br>
-</td><td width="1%">
-<br>
-</td><td align="left" valign="top" width="*">
-<table width="100%" align="center" class="centered">
-<tbody>
-<tr>
-<td align="center">
-<table border="0" cellpadding="1" cellspacing="0" class="title">
-<tbody>
-<tr>
-<td bgcolor="#525d76">
-<table width="100%" border="0" cellpadding="2" cellspacing="0" class="centered">
-<tbody>
-<tr>
-<td bgcolor="#f3dd61"><span class="title">HPSF HOW-TO</span></td>
-</tr>
-</tbody>
-</table>
-</td>
-</tr>
-</tbody>
-</table>
-</td>
-</tr>
-</tbody>
-</table>
-<div align="right">
-<table cellspacing="0" cellpadding="2" border="0" width="100%">
-<tr>
-<td bgcolor="#525D76"><font color="#ffffff" size="+1"><font face="Arial,sans-serif"><b>How To Use the HPSF APIs</b></font></font></td>
-</tr>
-<tr>
-<td>
-<br>
-
-
-<p align="justify">This HOW-TO is organized in three sections. You should read them
- sequentially because the later sections build upon the earlier ones.</p>
-
-
-<ol>
-
-<li>
-
-<p align="justify">The <a href="#sec1">first section</a> explains how to read
- the most important standard properties of a Microsoft Office
- document. Standard properties are things like title, author, creation
- date, etc. You will quite likely find what you need here and
- won't have to read the other sections.</p>
-
-</li>
-
-
-<li>
-
-<p align="justify">The <a href="#sec2">second section</a> goes a small step
- further and focuses on reading additional standard properties. It also
- covers the exceptions that may be thrown when dealing with HPSF and
- shows how you can read properties of embedded objects.</p>
-
-</li>
-
-
-<li>
-
-<p align="justify">The <a href="#sec3">third section</a> tells how to read
- non-standard properties. Non-standard properties are application-specific
- name/value/type triples.</p>
-
-</li>
-
-</ol>
-
-
-<anchor id="sec1"></anchor>
-
-<div align="right">
-<table cellspacing="0" cellpadding="2" border="0" width="99%">
-<tr>
-<td bgcolor="#525D76"><font color="#ffffff" size="+0"><font face="Arial,sans-serif"><b>Reading Standard Properties</b></font></font></td>
-</tr>
-<tr>
-<td>
-<br>
-
-
-<note>This section explains how to read
- the most important standard properties of a Microsoft Office
- document. Standard properties are things like title, author, creation
- date, etc. Chances are that you will find what you need here and
- won't have to read the other sections.</note>
-
-
-<p align="justify">The first thing you should understand is that properties are stored in
- separate documents inside the POI filesystem. (If you don't know what a
- POI filesystem is, read its <a href="../poifs/index.html">documentation</a>.) A document in a POI
- filesystem is also called a <em>stream</em>.</p>
-
-
-<p align="justify">The following example shows how to read a POI filesystem's
- "title" property. Reading other properties is similar. Consult the API
- documentation of <code>org.apache.poi.hpsf.SummaryInformation</code> for details.</p>
-
-
-<p align="justify">The standard properties this section focuses on can be
- found in a document called <em>\005SummaryInformation</em> in the root of
- the POI filesystem. The notation <em>\005</em> in the document's name
- means the character with the decimal value of 5. In order to read the
- title, an application has to perform the following steps:</p>
-
-
-<ol>
-
-<li>
-
-<p align="justify">Open the document <em>\005SummaryInformation</em> located in the root
- of the POI filesystem.</p>
-
-</li>
-
-<li>
-
-<p align="justify">Create an instance of the class
- <code>SummaryInformation</code> from that
- document.</p>
-
-</li>
-
-<li>
-
-<p align="justify">Call the <code>SummaryInformation</code> instance's
- <code>getTitle()</code> method.</p>
-
-</li>
-
-</ol>
-
-
-<p align="justify">Sounds easy, doesn't it? Here are the steps in detail.</p>
-
-
-
-<div align="right">
-<table cellspacing="0" cellpadding="2" border="0" width="98%">
-<tr>
-<td bgcolor="#525D76"><font color="#ffffff" size="-1"><font face="Arial,sans-serif"><b>Open the document \005SummaryInformation in the root of the POI filesystem</b></font></font></td>
-</tr>
-<tr>
-<td>
-<br>
-
-
-<p align="justify">An application that wants to open a document in a POI filesystem
- (POIFS) proceeds as shown by the following code fragment. (The full
- source code of the sample application is available in the
- <em>examples</em> section of the POI source tree as
- <em>ReadTitle.java</em>.)</p>
-
-
-<div align="center">
-<table cellspacing="2" cellpadding="2" border="1">
-<tr>
-<td>
-<pre>
-import java.io.*;
-import org.apache.poi.hpsf.*;
-import org.apache.poi.poifs.eventfilesystem.*;
-
-// ...
-
-public static void main(String[] args)
-    throws IOException
-{
-    final String filename = args[0];
-    POIFSReader r = new POIFSReader();
-    r.registerListener(new MyPOIFSReaderListener(),
-                       "\005SummaryInformation");
-    r.read(new FileInputStream(filename));
-}</pre>
-</td>
-</tr>
-</table>
-</div>
-
-
-<p align="justify">The first interesting statement is</p>
-
-
-<div align="center">
-<table cellspacing="2" cellpadding="2" border="1">
-<tr>
-<td>
-<pre>POIFSReader r = new POIFSReader();</pre>
-</td>
-</tr>
-</table>
-</div>
-
-
-<p align="justify">It creates a
- <code>org.apache.poi.poifs.eventfilesystem.POIFSReader</code> instance
- which we shall need to read the POI filesystem. Before the application
- actually opens the POI filesystem we have to tell the
- <code>POIFSReader</code> which documents we are interested in. In this
- case the application should do something with the document
- <em>\005SummaryInformation</em>.</p>
-
-
-<div align="center">
-<table cellspacing="2" cellpadding="2" border="1">
-<tr>
-<td>
-<pre>
-r.registerListener(new MyPOIFSReaderListener(),
- "\005SummaryInformation");</pre>
-</td>
-</tr>
-</table>
-</div>
-
-
-<p align="justify">This method call registers a
- <code>org.apache.poi.poifs.eventfilesystem.POIFSReaderListener</code>
- with the <code>POIFSReader</code>. The <code>POIFSReaderListener</code>
- interface specifies the method <code>processPOIFSReaderEvent</code>
- which processes a document. The class
- <code>MyPOIFSReaderListener</code> implements the
- <code>POIFSReaderListener</code> interface and thus the
- <code>processPOIFSReaderEvent</code> method. The event-driven POI filesystem
- reader calls this method when it finds the <em>\005SummaryInformation</em>
- document. (In the sample application <code>MyPOIFSReaderListener</code> is
- a static class in the <em>ReadTitle.java</em> source file.)</p>
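-<p align="justify">The stream name passed to <code>registerListener</code> is written
- with the <em>\005</em> notation. In Java source code these characters form an
- octal escape, so the string literal begins with a single character whose value
- is 5. The following minimal sketch checks this with the JDK alone. The class
- <code>StreamNameDemo</code> and its helper method are made up for illustration
- and are not part of POI.</p>

```java
// Illustrative only: StreamNameDemo is a hypothetical class, not part of POI.
class StreamNameDemo
{
    /** The stream name exactly as written in the code fragments of this HOW-TO. */
    static final String SUMMARY_INFORMATION = "\005SummaryInformation";

    /** Returns the numeric value of the first character of the given name. */
    static int firstCharValue(String name)
    {
        return name.charAt(0);
    }

    public static void main(String[] args)
    {
        // "\005" is an octal escape: one character, not four.
        System.out.println(firstCharValue(SUMMARY_INFORMATION));
        // The escape therefore adds exactly one character to the name's length.
        System.out.println(SUMMARY_INFORMATION.length());
    }
}
```

-<p align="justify">If the name were typed without the escape, it would probably not
- match the stream in the file, and the listener would then not be called.</p>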
-
-
-<p align="justify">Now everything is prepared and reading the POI filesystem can
- start:</p>
-
-
-<div align="center">
-<table cellspacing="2" cellpadding="2" border="1">
-<tr>
-<td>
-<pre>r.read(new FileInputStream(filename));</pre>
-</td>
-</tr>
-</table>
-</div>
-
-
-<p align="justify">The following source code fragment shows the
- <code>MyPOIFSReaderListener</code> class and how it retrieves the
- title.</p>
-
-
-<div align="center">
-<table cellspacing="2" cellpadding="2" border="1">
-<tr>
-<td>
-<pre>
-static class MyPOIFSReaderListener implements POIFSReaderListener
-{
-    public void processPOIFSReaderEvent(POIFSReaderEvent e)
-    {
-        SummaryInformation si = null;
-        try
-        {
-            // Parse the stream handed over by the reader into a property set.
-            si = (SummaryInformation)
-                PropertySetFactory.create(e.getStream());
-        }
-        catch (Exception ex)
-        {
-            // Report which stream failed; the event parameter is named "e".
-            throw new RuntimeException
-                ("Property set stream \"" +
-                 e.getPath() + e.getName() + "\": " + ex);
-        }
-        final String title = si.getTitle();
-        if (title != null)
-            System.out.println("Title: \"" + title + "\"");
-        else
-            System.out.println("Document has no title.");
-    }
-}
-</pre>
-</td>
-</tr>
-</table>
-</div>
-
-
-<p align="justify">The line</p>
-
-
-<div align="center">
-<table cellspacing="2" cellpadding="2" border="1">
-<tr>
-<td>
-<pre>SummaryInformation si = null;</pre>
-</td>
-</tr>
-</table>
-</div>
-
-
-<p align="justify">declares a <code>SummaryInformation</code> variable and initializes it
- with <code>null</code>. We need an instance of this class to access the
- title. The instance is created in a <code>try</code> block:</p>
-
-
-<div align="center">
-<table cellspacing="2" cellpadding="2" border="1">
-<tr>
-<td>
-<pre>si = (SummaryInformation)
- PropertySetFactory.create(e.getStream());</pre>
-</td>
-</tr>
-</table>
-</div>
-
-
-<p align="justify">The expression <code>e.getStream()</code> returns the input stream
- containing the bytes of the property set stream named
- <em>\005SummaryInformation</em>. This stream is passed into the
- <code>create</code> method of the factory class
- <code>org.apache.poi.hpsf.PropertySetFactory</code> which returns
- a <code>org.apache.poi.hpsf.PropertySet</code> instance. It is more or
- less safe to cast this result to <code>SummaryInformation</code>, a
- convenience class with methods like <code>getTitle()</code>,
- <code>getAuthor()</code> etc.</p>
-
-
-<p align="justify">The <code>PropertySetFactory.create</code> method may throw all sorts
- of exceptions. We'll deal with them in the next section. For now we just
- catch all exceptions and throw a <code>RuntimeException</code>
- containing the message text of the original exception.</p>
-
-
-<p align="justify">If all goes well, the sample application retrieves the title and prints
- it to the standard output. As you can see, you must be prepared for the
- case that the document does not have a title.</p>
-
-
-<div align="center">
-<table cellspacing="2" cellpadding="2" border="1">
-<tr>
-<td>
-<pre>final String title = si.getTitle();
-if (title != null)
-    System.out.println("Title: \"" + title + "\"");
-else
-    System.out.println("Document has no title.");</pre>
-</td>
-</tr>
-</table>
-</div>
-
-
-<p align="justify">Please note that a Microsoft Office document does not necessarily
- contain the <em>\005SummaryInformation</em> stream. The documents created
- by the Microsoft Office suite have one, as far as I know. However, an
- Excel spreadsheet exported from StarOffice 5.2 won't have a
- <em>\005SummaryInformation</em> stream. In this case the application
- won't throw an exception; the <code>POIFSReader</code> simply never calls the
- <code>processPOIFSReaderEvent</code> method. You have been warned!</p>
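-<p align="justify">One way to notice a missing stream is to record whether the
- listener was ever called and to check that record after reading has finished.
- The sketch below shows only this bookkeeping idea. It runs without POI:
- <code>MiniReader</code>, <code>StreamListener</code> and
- <code>MissingStreamDemo</code> are hypothetical stand-ins invented for this
- example, and the list of stream names is made-up test data.</p>

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative only: MiniReader and StreamListener are hypothetical stand-ins
// for an event-driven reader. They are not POI classes.
class MissingStreamDemo
{
    interface StreamListener
    {
        void onStream(String name);
    }

    /** Calls a registered listener only for stream names that are actually present. */
    static class MiniReader
    {
        private final Map<String, StreamListener> listeners = new HashMap<>();

        void registerListener(StreamListener listener, String name)
        {
            listeners.put(name, listener);
        }

        void read(String[] streamNames)
        {
            for (String name : streamNames)
            {
                StreamListener listener = listeners.get(name);
                if (listener != null)
                    listener.onStream(name);
            }
        }
    }

    /** Returns true if a stream with the wanted name was reported while reading. */
    static boolean wasFound(String wanted, String[] streamNames)
    {
        // The flag stays false if the reader never calls the listener.
        final boolean[] called = { false };
        MiniReader r = new MiniReader();
        r.registerListener(name -> called[0] = true, wanted);
        r.read(streamNames);
        return called[0];
    }

    public static void main(String[] args)
    {
        String wanted = "\005SummaryInformation";
        System.out.println(wasFound(wanted, new String[] { "Workbook", wanted }));
        System.out.println(wasFound(wanted, new String[] { "Workbook" }));
    }
}
```

-<p align="justify">In a real application the same flag could be set inside
- <code>processPOIFSReaderEvent</code> and inspected once <code>read</code> has
- returned.</p>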
-
-</td>
-</tr>
-</table>
-</div>
-<br>
-
-</td>
-</tr>
-</table>
-</div>
-<br>
-
-
-<anchor id="sec2"></anchor>
-
-<div align="right">
-<table cellspacing="0" cellpadding="2" border="0" width="99%">
-<tr>
-<td bgcolor="#525D76"><font color="#ffffff" size="+0"><font face="Arial,sans-serif"><b>Additional Standard Properties, Exceptions And Embedded Objects</b></font></font></td>
-</tr>
-<tr>
-<td>
-<br>
-
-
-<note>This section focuses on reading additional standard properties. It
- also covers the exceptions that may be thrown when dealing with HPSF and
- shows how you can read properties of embedded objects.</note>
-
-
-<p align="justify">A couple of <em>additional standard properties</em> are not
- contained in the <em>\005SummaryInformation</em> stream explained above,
- for example a document's category or the number of multimedia clips in a
- PowerPoint presentation. Microsoft has invented an additional stream named
- <em>\005DocumentSummaryInformation</em> to hold these properties. With two
- minor exceptions you can proceed exactly as described above to read the
- properties stored in <em>\005DocumentSummaryInformation</em>:</p>
-
-
-<ul>
-
-<li>
-<p align="justify">Instead of <em>\005SummaryInformation</em> use
- <em>\005DocumentSummaryInformation</em> as the stream's name.</p>
-</li>
-
-<li>
-<p align="justify">Replace all occurrences of the class
- <code>SummaryInformation</code> by
- <code>DocumentSummaryInformation</code>.</p>
-</li>
-
-</ul>
-
-
-<p align="justify">And of course you cannot call <code>getTitle()</code> because
- <code>DocumentSummaryInformation</code> has different query methods. See
- the API documentation for the details!</p>
-
-
-<p align="justify">In the previous section the application simply caught all
- <em>exceptions</em> and was in no way interested in any
- details. However, a real application will likely want to know what went
- wrong and act appropriately. Besides any I/O exceptions there are three
- HPSF- or POI-specific exceptions you should know about:</p>
-
-
-<dl>
-
-<dt>
-<code>NoPropertySetStreamException</code>
-</dt>
-
-<dd>
-<p align="justify">This exception is thrown if the application tries to create a
- <code>PropertySet</code> or one of its subclasses
- <code>SummaryInformation</code> and
- <code>DocumentSummaryInformation</code> from a stream that is not a
- property set stream. A faulty property set stream counts as not being a
- property set stream at all. An application should be prepared to deal
- with this case even if it only opens streams named
- <em>\005SummaryInformation</em> or
- <em>\005DocumentSummaryInformation</em>. These are just names. A
- stream's name by itself does not ensure that the stream contains the
- expected contents or that these contents are correct.</p>
-</dd>
-
-
-<dt>
-<code>UnexpectedPropertySetTypeException</code>
-</dt>
-
-<dd>
-<p align="justify">This exception is thrown if a certain type of property set is
- expected somewhere (e.g. a <code>SummaryInformation</code> or
- <code>DocumentSummaryInformation</code>) but the provided property
- set is not of that type.</p>
-</dd>
-
-
-<dt>
-<code>MarkUnsupportedException</code>
-</dt>
-
-<dd>
-<p align="justify">This exception is thrown if an input stream that is to be parsed
- into a property set does not support the
- <code>InputStream.mark(int)</code> operation. The POI filesystem uses
- the <code>DocumentInputStream</code> class which does support this
- operation, so you are safe here. However, if you read a property set
- stream from another kind of input stream things may be
- different.</p>
-</dd>
-
-</dl>
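-<p align="justify">Whether an input stream supports the mark operation can be checked
- with the JDK method <code>InputStream.markSupported()</code>. The sketch below
- shows a general JDK technique: a stream without mark support is wrapped in a
- <code>java.io.BufferedInputStream</code>, which does support it. The class
- <code>MarkSupportDemo</code> and its methods are hypothetical helpers that do
- not use POI. That such a wrapped stream is acceptable to HPSF is an assumption
- you should verify against the API documentation of your POI version.</p>

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Illustrative only: MarkSupportDemo is a hypothetical helper, not part of POI.
class MarkSupportDemo
{
    /** A bare InputStream subclass; it inherits markSupported() == false. */
    static InputStream plainStream(final byte[] data)
    {
        return new InputStream()
        {
            private int pos = 0;

            @Override
            public int read()
            {
                return pos < data.length ? (data[pos++] & 0xff) : -1;
            }
        };
    }

    /** Wraps the stream in a BufferedInputStream unless it already supports mark. */
    static InputStream ensureMarkSupported(InputStream in)
    {
        return in.markSupported() ? in : new BufferedInputStream(in);
    }

    /** Marks, reads one byte, resets, and reports whether the same byte is read again. */
    static boolean rereadsFirstByte(InputStream in)
    {
        try
        {
            in.mark(16);
            int first = in.read();
            in.reset();
            return first == in.read();
        }
        catch (IOException ex)
        {
            throw new UncheckedIOException(ex);
        }
    }

    public static void main(String[] args)
    {
        InputStream plain = plainStream(new byte[] { 1, 2, 3 });
        System.out.println(plain.markSupported());
        InputStream safe = ensureMarkSupported(plain);
        System.out.println(safe.markSupported());
        System.out.println(rereadsFirstByte(safe));
    }
}
```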
-
-
-<p align="justify">Many Microsoft Office documents contain <em>embedded
- objects</em>, for example an Excel sheet on a page in a Word
- document. Embedded objects may have property sets of their own. An
- application can open these property set streams as described above. The
- only difference is that they are not located in the POI filesystem's root
- but in a nested directory instead. Just register a
- <code>POIFSReaderListener</code> for the property set streams you are
- interested in. For example, the <em>POIBrowser</em> application in the
- contrib section tries to open each and every document in a POI filesystem
- as a property set stream. If the operation succeeds, it displays the
- properties.</p>
-
-</td>
-</tr>
-</table>
-</div>
-<br>
-
-
-<anchor id="sec3"></anchor>
-
-<div align="right">
-<table cellspacing="0" cellpadding="2" border="0" width="99%">
-<tr>
-<td bgcolor="#525D76"><font color="#ffffff" size="+0"><font face="Arial,sans-serif"><b>Reading Non-Standard Properties</b></font></font></td>
-</tr>
-<tr>
-<td>
-<br>
-
-
-<note>This section tells how to read
- non-standard properties. Non-standard properties are application-specific
- name/value/type triples.</note>
-
-
-<div align="center">
-<table cellspacing="2" cellpadding="2" border="1">
-<tr>
-<td bgcolor="#c0c0c0"><font size="-1" color="#023264">Write this section!</font></td>
-</tr>
-</table>
-</div>
-
-</td>
-</tr>
-</table>
-</div>
-<br>
-
-</td>
-</tr>
-</table>
-</div>
-<br>
-</td>
-</tr>
-</table>
-<br>
-<table width="100%" border="0" cellspacing="0" cellpadding="0">
-<tbody>
-<tr>
-<td>
-<hr noshade="" size="1">
-</td>
-</tr>
-<tr>
-<td align="center"><i>Copyright &copy; 2002 Apache Software Foundation</i></td>
-</tr>
-<tr>
-<td align="right" width="100%">
-<br>
-</td>
-</tr>
-<tr>
-<td align="right" width="100%"><a href="http://krysalis.org/"><img alt="Krysalis Logo" src="images/krysalis-compatible.jpg"></a><a href="http://xml.apache.org/cocoon/"><img alt="Cocoon Logo" src="images/built-with-cocoon.gif"></a></td>
-</tr>
-</tbody>
-</table>
-</body>
-</html>