Diffstat (limited to 'build/jakarta-poi/docs/hpsf/how-to.html')
-rw-r--r-- | build/jakarta-poi/docs/hpsf/how-to.html | 622 |
1 files changed, 0 insertions, 622 deletions
diff --git a/build/jakarta-poi/docs/hpsf/how-to.html b/build/jakarta-poi/docs/hpsf/how-to.html deleted file mode 100644 index 12655dd39a..0000000000 --- a/build/jakarta-poi/docs/hpsf/how-to.html +++ /dev/null @@ -1,622 +0,0 @@ -<html> -<head> -<META http-equiv="Content-Type" content="text/html; charset=UTF-8"> -<meta content="text/html; charset=ISO-8859-1"> -<title>HPSF HOW-TO</title> -<style type="text/css"> - body { background-color: white; font-size: normal; color: black ; } - a { color: #525d76; } - a.black { color: #000000;} - table {border-width: 0; width: 100%} - table.centered {text-align: center} - table.title {text-align: center; width: 80%} - img{border-width: 0;} - span.s1 {font-family: Helvetica, Arial, sans-serif; font-weight: bold; color: #000000; } - span.s1_white { font-family: Helvetica, Arial, sans-serif; font-weight: bold; color: #ffffff; } - span.title {font-family: Helvetica, Arial, sans-serif; font-weight: bold; color: #000000; } - span.c1 {color: #000000; font-family: Helvetica, Arial, sans-serif} - tr.left {text-align: left} - hr { width: 100%; size: 2} -</style> -</head> -<body> -<table width="100%" cellspacing="0" cellpadding="0" border="0"> -<tr> -<td valign="top" align="left"><a href="http://jakarta.apache.org/index.html"><img hspace="0" vspace="0" border="0" src="images/jakarta-logo.gif"></a></td><td width="100%" valign="top" align="left" bgcolor="#ffffff"><img hspace="0" vspace="0" border="0" align="right" src="images/header.gif"></td> -</tr> -<tr> -<td colspan="2" bgcolor="#525d76"><span class="c1"><a class="black" href="http://www.apache.org/">www.apache.org ></a><a class="black" href="http://jakarta.apache.org/">jakarta.apache.org ></a><a href="http://jakarta.apache.org/poi/" class="black">jakarta.apache.org/poi</a></span></td> -</tr> -<tr> -<td height="8"></td> -</tr> -</table> -<table border="0" cellpadding="0" cellspacing="0" width="100%"> -<tr> -<td width="1%"> -<br> -</td><td nowrap="1" valign="top" width="14%"> -<br> -<span 
class="s1">Navigation</span> -<br> -<a class="s1" href="../index.html">Main</a> -<br> -<br> -<span class="s1">HPSF</span> -<br> -<a class="s1" href="index.html">Overview</a> -<br> -<a class="s1" href="how-to.html">How To</a> -<br> -<a class="s1" href="internals.html">Internals</a> -<br> -<a class="s1" href="todo.html">To Do</a> -<br> -</td><td width="1%"> -<br> -</td><td align="left" valign="top" width="*"> -<title>HPSF HOW-TO</title> -<table width="100%" align="center" class="centered"> -<tbody> -<tr> -<td align="center"> -<table border="0" cellpadding="1" cellspacing="0" class="title"> -<tbody> -<tr> -<td bgcolor="#525d76"> -<table width="100%" border="0" cellpadding="2" cellspacing="0" class="centered"> -<tbody> -<tr> -<td bgcolor="#f3dd61"><span class="title">HPSF HOW-TO</span></td> -</tr> -</tbody> -</table> -</td> -</tr> -</tbody> -</table> -</td> -</tr> -</tbody> -</table> -<font size="-2" color="#000000"> -<p> -<a href="mailto:"></a> -</p> -</font> -<div align="right"> -<table cellspacing="0" cellpadding="2" border="0" width="100%"> -<tr> -<td bgcolor="#525D76"><font color="#ffffff" size="+1"><font face="Arial,sans-serif"><b>How To Use the HPSF APIs</b></font></font></td> -</tr> -<tr> -<td> -<br> - - -<p align="justify">This HOW-TO is organized in three section. You should read them - sequentially because the later sections build upon the earlier ones.</p> - - -<ol> - -<li> - -<p align="justify">The <a href="#sec1">first section</a> explains how to read - the most important standard properties of a Microsoft Office - document. Standard properties are things like title, author, creation - date etc. It is quite likely that you will find here what you need and - don't have to read the other sections.</p> - -</li> - - -<li> - -<p align="justify">The <a href="#sec2">second section</a> goes a small step - further and focusses on reading additional standard properties. 
It also - talks about exceptions that may be thrown when dealing with HPSF and - shows how you can read properties of embedded objects.</p> - -</li> - - -<li> - -<p align="justify">The <a href="#sec3">third section</a> tells how to read - non-standard properties. Non-standard properties are application-specific - name/value/type triples.</p> - -</li> - -</ol> - - -<anchor id="sec1"></anchor> - -<div align="right"> -<table cellspacing="0" cellpadding="2" border="0" width="99%"> -<tr> -<td bgcolor="#525D76"><font color="#ffffff" size="+0"><font face="Arial,sans-serif"><b>Reading Standard Properties</b></font></font></td> -</tr> -<tr> -<td> -<br> - - -<note>This section explains how to read - the most important standard properties of a Microsoft Office - document. Standard properties are things like title, author, creation - date etc. Chances are that you will find here what you need and - don't have to read the other sections.</note> - - -<p align="justify">The first thing you should understand is that properties are stored in - separate documents inside the POI filesystem. (If you don't know what a - POI filesystem is, read its <a href="../poifs/index.html">documentation</a>.) A document in a POI - filesystem is also called a <em>stream</em>.</p> - - -<p align="justify">The following example shows how to read a POI filesystem's - "title" property. Reading other properties is similar. Consider the API - documentation of <code>org.apache.poi.hpsf.SummaryInformation</code>.</p> - - -<p align="justify">The standard properties this section focusses on can be - found in a document called <em>\005SummaryInformation</em> in the root of - the POI filesystem. The notation <em>\005</em> in the document's name - means the character with the decimal value of 5. 
In order to read the - title, an application has to perform the following steps:</p> - - -<ol> - -<li> - -<p align="justify">Open the document <em>\005SummaryInformation</em> located in the root - of the POI filesystem.</p> - -</li> - -<li> - -<p align="justify">Create an instance of the class - <code>SummaryInformation</code> from that - document.</p> - -</li> - -<li> - -<p align="justify">Call the <code>SummaryInformation</code> instance's - <code>getTitle()</code> method.</p> - -</li> - -</ol> - - -<p align="justify">Sounds easy, doesn't it? Here are the steps in detail.</p> - - - -<div align="right"> -<table cellspacing="0" cellpadding="2" border="0" width="98%"> -<tr> -<td bgcolor="#525D76"><font color="#ffffff" size="-1"><font face="Arial,sans-serif"><b>Open the document \005SummaryInformation in the root of the POI filesystem</b></font></font></td> -</tr> -<tr> -<td> -<br> - - -<p align="justify">An application that wants to open a document in a POI filesystem - (POIFS) proceeds as shown by the following code fragment. (The full - source code of the sample application is available in the - <em>examples</em> section of the POI source tree as - <em>ReadTitle.java</em>.)</p> - - -<div align="center"> -<table cellspacing="2" cellpadding="2" border="1"> -<tr> -<td> -<pre> -import java.io.*; -import org.apache.poi.hpsf.*; -import org.apache.poi.poifs.eventfilesystem.*; - -// ... 
- -public static void main(String[] args) - throws IOException -{ - final String filename = args[0]; - POIFSReader r = new POIFSReader(); - r.registerListener(new MyPOIFSReaderListener(), - "\005SummaryInformation"); - r.read(new FileInputStream(filename)); -}</pre> -</td> -</tr> -</table> -</div> - - -<p align="justify">The first interesting statement is</p> - - -<div align="center"> -<table cellspacing="2" cellpadding="2" border="1"> -<tr> -<td> -<pre>POIFSReader r = new POIFSReader();</pre> -</td> -</tr> -</table> -</div> - - -<p align="justify">It creates a - <code>org.apache.poi.poifs.eventfilesystem.POIFSReader</code> instance - which we shall need to read the POI filesystem. Before the application - actually opens the POI filesystem we have to tell the - <code>POIFSReader</code> which documents we are interested in. In this - case the application should do something with the document - <em>\005SummaryInformation</em>.</p> - - -<div align="center"> -<table cellspacing="2" cellpadding="2" border="1"> -<tr> -<td> -<pre> -r.registerListener(new MyPOIFSReaderListener(), - "\005SummaryInformation");</pre> -</td> -</tr> -</table> -</div> - - -<p align="justify">This method call registers a - <code>org.apache.poi.poifs.eventfilesystem.POIFSReaderListener</code> - with the <code>POIFSReader</code>. The <code>POIFSReaderListener</code> - interface specifies the method <code>processPOIFSReaderEvent</code> - which processes a document. The class - <code>MyPOIFSReaderListener</code> implements the - <code>POIFSReaderListener</code> and thus the - <code>processPOIFSReaderEvent</code> method. The eventing POI filesystem - calls this method when it finds the <em>\005SummaryInformation</em> - document. 
In the sample application <code>MyPOIFSReaderListener</code> is - a static class in the <em>ReadTitle.java</em> source file.</p> - - -<p align="justify">Now everything is prepared and reading the POI filesystem can - start:</p> - - -<div align="center"> -<table cellspacing="2" cellpadding="2" border="1"> -<tr> -<td> -<pre>r.read(new FileInputStream(filename));</pre> -</td> -</tr> -</table> -</div> - - -<p align="justify">The following source code fragment shows the - <code>MyPOIFSReaderListener</code> class and how it retrieves the - title.</p> - - -<div align="center"> -<table cellspacing="2" cellpadding="2" border="1"> -<tr> -<td> -<pre> -static class MyPOIFSReaderListener implements POIFSReaderListener -{ - public void processPOIFSReaderEvent(POIFSReaderEvent e) - { - SummaryInformation si = null; - try - { - si = (SummaryInformation) - PropertySetFactory.create(e.getStream()); - } - catch (Exception ex) - { - throw new RuntimeException - ("Property set stream \"" + - e.getPath() + e.getName() + "\": " + ex); - } - final String title = si.getTitle(); - if (title != null) - System.out.println("Title: \"" + title + "\""); - else - System.out.println("Document has no title."); - } -} -</pre> -</td> -</tr> -</table> -</div> - - -<p align="justify">The line</p> - - -<div align="center"> -<table cellspacing="2" cellpadding="2" border="1"> -<tr> -<td> -<pre>SummaryInformation si = null;</pre> -</td> -</tr> -</table> -</div> - - -<p align="justify">declares a <code>SummaryInformation</code> variable and initializes it - with <code>null</code>. We need an instance of this class to access the - title. 
The instance is created in a <code>try</code> block:</p> - - -<div align="center"> -<table cellspacing="2" cellpadding="2" border="1"> -<tr> -<td> -<pre>si = (SummaryInformation) - PropertySetFactory.create(e.getStream());</pre> -</td> -</tr> -</table> -</div> - - -<p align="justify">The expression <code>e.getStream()</code> returns the input stream - containing the bytes of the property set stream named - <em>\005SummaryInformation</em>. This stream is passed into the - <code>create</code> method of the factory class - <code>org.apache.poi.hpsf.PropertySetFactory</code> which returns - a <code>org.apache.poi.hpsf.PropertySet</code> instance. It is more or - less safe to cast this result to <code>SummaryInformation</code>, a - convenience class with methods like <code>getTitle()</code>, - <code>getAuthor()</code> etc.</p> - - -<p align="justify">The <code>PropertySetFactory.create</code> method may throw all sorts - of exceptions. We'll deal with them in the next sections. For now we just - catch all exceptions and throw a <code>RuntimeException</code> - containing the message text of the original exception.</p> - - -<p align="justify">If all goes well, the sample application retrieves the title and prints - it to the standard output. As you can see, you must be prepared for the - case that the POI filesystem does not have a title.</p> - - -<div align="center"> -<table cellspacing="2" cellpadding="2" border="1"> -<tr> -<td> -<pre>final String title = si.getTitle(); - if (title != null) - System.out.println("Title: \"" + title + "\""); - else - System.out.println("Document has no title.");</pre> -</td> -</tr> -</table> -</div> - - -<p align="justify">Please note that a Microsoft Office document does not necessarily - contain the <em>\005SummaryInformation</em> stream. The documents created - by the Microsoft Office suite have one, as far as I know. However, an - Excel spreadsheet exported from StarOffice 5.2 won't have a - <em>\005SummaryInformation</em> stream. 
In this case the application - won't throw an exception; the POI filesystem simply never calls the - <code>processPOIFSReaderEvent</code> method. You have been warned!</p> - -</td> -</tr> -</table> -</div> -<br> - -</td> -</tr> -</table> -</div> -<br> - - -<anchor id="sec2"></anchor> - -<div align="right"> -<table cellspacing="0" cellpadding="2" border="0" width="99%"> -<tr> -<td bgcolor="#525D76"><font color="#ffffff" size="+0"><font face="Arial,sans-serif"><b>Additional Standard Properties, Exceptions And Embedded Objects</b></font></font></td> -</tr> -<tr> -<td> -<br> - - -<note>This section focusses on reading additional standard properties. It - also talks about exceptions that may be thrown when dealing with HPSF and - shows how you can read properties of embedded objects.</note> - - -<p align="justify">A couple of <em>additional standard properties</em> are not - contained in the <em>\005SummaryInformation</em> stream explained above, - for example a document's category or the number of multimedia clips in a - PowerPoint presentation. Microsoft has invented an additional stream named - <em>\005DocumentSummaryInformation</em> to hold these properties. With two - minor exceptions you can proceed exactly as described above to read the - properties stored in <em>\005DocumentSummaryInformation</em>:</p> - - -<ul> - -<li> -<p align="justify">Instead of <em>\005SummaryInformation</em> use - <em>\005DocumentSummaryInformation</em> as the stream's name.</p> -</li> - -<li> -<p align="justify">Replace all occurrences of the class - <code>SummaryInformation</code> with - <code>DocumentSummaryInformation</code>.</p> -</li> - -</ul> - - -<p align="justify">And of course you cannot call <code>getTitle()</code> because - <code>DocumentSummaryInformation</code> has different query methods. See - the API documentation for the details!</p> - - -<p align="justify">In the previous section the application simply caught all - <em>exceptions</em> and was in no way interested in any - details. 
However, a real application will likely want to know what went - wrong and act appropriately. Besides any I/O exceptions there are three - HPSF- or POI-specific exceptions you should know about:</p> - - -<dl> - -<dt> -<code>NoPropertySetStreamException</code></dt> - -<dd> -<p align="justify">This exception is thrown if the application tries to create a - <code>PropertySet</code> or one of its subclasses - <code>SummaryInformation</code> and - <code>DocumentSummaryInformation</code> from a stream that is not a - property set stream. A faulty property set stream counts as not being a - property set stream at all. An application should be prepared to deal - with this case even if it opens streams named - <em>\005SummaryInformation</em> or - <em>\005DocumentSummaryInformation</em> only. These are just names. A - stream's name by itself does not ensure that the stream contains the - expected contents or that these contents are correct.</p> -</dd> - - -<dt> -<code>UnexpectedPropertySetTypeException</code> -</dt> - -<dd> -<p align="justify">This exception is thrown if a certain type of property set is - expected somewhere (e.g. a <code>SummaryInformation</code> or - <code>DocumentSummaryInformation</code>) but the provided property - set is not of that type.</p> -</dd> - - -<dt> -<code>MarkUnsupportedException</code> -</dt> - -<dd> -<p align="justify">This exception is thrown if an input stream that is to be parsed - into a property set does not support the - <code>InputStream.mark(int)</code> operation. The POI filesystem uses - the <code>DocumentInputStream</code> class which does support this - operation, so you are safe here. However, if you read a property set - stream from another kind of input stream things may be - different.</p> -</dd> - -</dl> - - -<p align="justify">Many Microsoft Office documents contain <em>embedded - objects</em>, for example an Excel sheet on a page in a Word - document. Embedded objects may have property sets of their own. 
An - application can open these property set streams as described above. The - only difference is that they are not located in the POI filesystem's root - but in a nested directory instead. Just register a - <code>POIFSReaderListener</code> for the property set streams you are - interested in. For example, the <em>POIBrowser</em> application in the - contrib section tries to open each and every document in a POI filesystem - as a property set stream. If this operation was successful it displays the - properties.</p> - -</td> -</tr> -</table> -</div> -<br> - - -<anchor id="sec3"></anchor> - -<div align="right"> -<table cellspacing="0" cellpadding="2" border="0" width="99%"> -<tr> -<td bgcolor="#525D76"><font color="#ffffff" size="+0"><font face="Arial,sans-serif"><b>Reading Non-Standard Properties</b></font></font></td> -</tr> -<tr> -<td> -<br> - - -<note>This section tells how to read - non-standard properties. Non-standard properties are application-specific - name/value/type triples.</note> - - -<div align="center"> -<table cellspacing="2" cellpadding="2" border="1"> -<tr> -<td bgcolor="#c0c0c0"><font size="-1" color="#023264">Write this section!</font></td> -</tr> -</table> -</div> - -</td> -</tr> -</table> -</div> -<br> - -</td> -</tr> -</table> -</div> -<br> -</td> -</tr> -</table> -<br> -<table width="100%" border="0" cellspacing="0" cellpadding="0"> -<tbody> -<tr> -<td> -<hr noshade="" size="1"> -</td> -</tr> -<tr> -<td align="center"><i>Copyright © 2002 Apache Software Foundation</i></td> -</tr> -<tr> -<td align="right" width="100%"> -<br> -</td> -</tr> -<tr> -<td align="right" width="100%"><a href="http://krysalis.org/"><img alt="Krysalis Logo" src="images/krysalis-compatible.jpg"></a><a href="http://xml.apache.org/cocoon/"><img alt="Cocoon Logo" src="images/built-with-cocoon.gif"></a></td> -</tr> -</tbody> -</table> -</body> -</html> |