<menu-item label="HSSF" href="hssf/index.html"/>
<menu-item label="HWPF" href="hwpf/index.html"/>
<menu-item label="HPSF" href="hpsf/index.html"/>
- <menu-item label="POI-Ruby" href="poi-ruby.html"/>
+ <menu-item label="HSLF" href="hslf/index.html"/>
+ <menu-item label="POI-Ruby" href="poi-ruby.html"/>
<menu-item label="POI-Utils" href="utils/index.html"/>
<menu-item label="Download" href="ext:download"/>
--- /dev/null
+<?xml version="1.0"?>
+<!-- Copyright (C) 2005 The Apache Software Foundation. All rights reserved. -->
+<!DOCTYPE book PUBLIC "-//APACHE//DTD Cocoon Documentation Book V1.0//EN" "../dtd/book-cocoon-v10.dtd">
+<book software="POI Project"
+ title="HSLF"
+ copyright="@year@ POI Project">
+ <menu label="Jakarta POI">
+ <menu-item label="Top" href="../index.html"/>
+ </menu>
+ <menu label="HSLF">
+ <menu-item label="Overview" href="index.html"/>
+ <menu-item label="Quick Guide" href="quick-guide.html"/>
+ </menu>
--- /dev/null
+<?xml version="1.0" encoding="UTF-8"?>
+<!-- Copyright (C) 2004 The Apache Software Foundation. All rights reserved. -->
+<!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V1.1//EN" "../dtd/document-v11.dtd">
+ <header>
+ <title>POI-HSLF - Java API To Access Microsoft PowerPoint Format Files</title>
+ <subtitle>Overview</subtitle>
+ <authors>
+ <person name="Avik Sengupta" email="avik at apache dot org"/>
+ </authors>
+ </header>
+ <body>
+ <section>
+ <title>Overview</title>
+ <p>HSLF is the POI Project's pure Java implementation of the PowerPoint file format.</p>
+ <p>HSLF provides a way to read PowerPoint presentations and extract text from them.
+ It also provides some (currently limited) editing capabilities.
+ </p>
+ <note> This code currently lives in the scratchpad area of the POI CVS repository.
+ Ensure that you have the scratchpad jar or the scratchpad build area in your
+ classpath before experimenting with this code.
+ </note>
+ <p>The <link href="./quick-guide.html">quick guide</link> explains how to use
+ this API. Comments and fixes are gratefully accepted on the POI
+ dev mailing lists.</p>
+ </section>
+ </body>
--- /dev/null
+<?xml version="1.0" encoding="UTF-8"?>
+<!-- Copyright (C) 2004 The Apache Software Foundation. All rights reserved. -->
+<!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V1.1//EN" "../dtd/document-v11.dtd">
+ <header>
+ <title>POI-HSLF - A Quick Guide</title>
+ <subtitle>Overview</subtitle>
+ <authors>
+ <person name="Nick Burch" email="nick at torchbox dot com"/>
+ </authors>
+ </header>
+ <body>
+ <section><title>Basic Text Extraction</title>
+ <p>For basic text extraction, make use of
+<code>org.apache.poi.hslf.extractor.PowerPointExtractor</code>. It accepts a file or an input
+stream. The <code>getText()</code> method can be used to get the text from the slides,
+from the notes, or from both.
+ </p>
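+ <p>As a minimal sketch of the extraction described above, the following reads a
+presentation and prints its text. It assumes the scratchpad jar is on the classpath, that
+the constructor accepts a file name as a <code>String</code>, and that a two-boolean
+<code>getText</code> overload selects slide text, note text, or both. The class name
+<code>BasicTextExtraction</code> and the command-line argument are illustrative only.
+ </p>
+ <source><![CDATA[
```java
import org.apache.poi.hslf.extractor.PowerPointExtractor;

public class BasicTextExtraction {
    public static void main(String[] args) throws Exception {
        // args[0]: path to a .ppt file (hypothetical input supplied by the caller)
        PowerPointExtractor extractor = new PowerPointExtractor(args[0]);

        // Default call: text from the slides
        System.out.println(extractor.getText());

        // Assumed overload: (includeSlideText, includeNoteText); here, both
        System.out.println(extractor.getText(true, true));
    }
}
```
+ ]]></source>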
+ </section>
+ <section><title>Specific Text Extraction</title>
+ <p>To get specific bits of text, first create an <code>org.apache.poi.hslf.usermodel.SlideShow</code>
+(from an <code>org.apache.poi.hslf.HSLFSlideShow</code>, which accepts a file or an input
+stream). Use <code>getSlides()</code> and <code>getNotes()</code> to get the slides and notes.
+These can be queried to get their page ID (though they should already be returned
+in the right order). You can also call <code>getTextRuns()</code> on these to get their
+blocks of text. From a <code>TextRun</code>, you can extract the text and check
+what type of text it is (e.g. Body, Title).
+ </p>
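+ <p>The steps above can be sketched roughly as follows. This is an illustration under
+assumptions, not a definitive recipe: it assumes <code>Slide</code> and <code>TextRun</code>
+live in <code>org.apache.poi.hslf.model</code>, that <code>getSlides()</code> and
+<code>getTextRuns()</code> return arrays, and that <code>TextRun</code> exposes
+<code>getText()</code> and <code>getRunType()</code>. The class name
+<code>SpecificTextExtraction</code> is hypothetical.
+ </p>
+ <source><![CDATA[
```java
import org.apache.poi.hslf.HSLFSlideShow;
import org.apache.poi.hslf.model.Slide;
import org.apache.poi.hslf.model.TextRun;
import org.apache.poi.hslf.usermodel.SlideShow;

public class SpecificTextExtraction {
    public static void main(String[] args) throws Exception {
        // args[0]: path to a .ppt file (hypothetical input supplied by the caller)
        SlideShow show = new SlideShow(new HSLFSlideShow(args[0]));

        Slide[] slides = show.getSlides();
        for (int i = 0; i < slides.length; i++) {
            // Each slide exposes its blocks of text as TextRuns
            TextRun[] runs = slides[i].getTextRuns();
            for (int j = 0; j < runs.length; j++) {
                // getRunType() is assumed to return a numeric code (e.g. title or body)
                System.out.println("slide " + i
                        + ", run type " + runs[j].getRunType()
                        + ": " + runs[j].getText());
            }
        }
    }
}
```
+ ]]></source>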
+ </section>
+ <section><title>Changing Text</title>
+ <p>It is possible to change the text via <code>TextRun.setText(String)</code>. However, if
+the length of the text changes, the resulting file is likely to be broken: PowerPoint files
+contain internal references expressed as byte offsets, and not all of these are yet
+updated when the size changes.
+ </p>
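+ <p>To illustrate why a length change breaks byte-offset references, here is a small
+self-contained toy model. It is not the PowerPoint record layout and uses no HSLF
+classes; the names <code>OffsetDemo</code>, <code>splice</code> and <code>shiftAfter</code>
+are hypothetical. It shows a stored offset going stale after an edit, and the kind of
+adjustment that would be needed to keep it valid.
+ </p>
+ <source><![CDATA[
```java
import java.nio.charset.StandardCharsets;

public class OffsetDemo {
    /** Reads len bytes starting at offset as ASCII text. */
    static String read(byte[] data, int offset, int len) {
        return new String(data, offset, len, StandardCharsets.US_ASCII);
    }

    /** Returns a copy of data with oldLen bytes at offset replaced by replacement. */
    static byte[] splice(byte[] data, int offset, int oldLen, String replacement) {
        byte[] rep = replacement.getBytes(StandardCharsets.US_ASCII);
        byte[] out = new byte[data.length - oldLen + rep.length];
        System.arraycopy(data, 0, out, 0, offset);
        System.arraycopy(rep, 0, out, offset, rep.length);
        System.arraycopy(data, offset + oldLen, out, offset + rep.length,
                data.length - offset - oldLen);
        return out;
    }

    /** Returns a copy of offsets with every entry beyond editOffset moved by delta. */
    static int[] shiftAfter(int[] offsets, int editOffset, int delta) {
        int[] out = offsets.clone();
        for (int i = 0; i < out.length; i++) {
            if (out[i] > editOffset) {
                out[i] += delta;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Toy "file": two text blocks, with a table of their starting byte offsets
        byte[] file = "TitleBody".getBytes(StandardCharsets.US_ASCII);
        int[] offsets = {0, 5};

        // Replace the 5-byte "Title" with the 7-byte "Heading" (delta = +2)
        byte[] edited = splice(file, 0, 5, "Heading");

        // The stale offset now points into the middle of the new text
        System.out.println(read(edited, offsets[1], 4)); // prints "ngBo"

        // Shifting the later offsets by the size change restores the reference
        int[] fixed = shiftAfter(offsets, 0, 2);
        System.out.println(read(edited, fixed[1], 4)); // prints "Body"
    }
}
```
+ ]]></source>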
+ </section>
+ <section><title>Guide to key classes</title>
+ <ul>
+ <li><code>org.apache.poi.hslf.HSLFSlideShow</code>:
+ Handles reading in and writing out files. Generates a tree of the records
+ in the file.
+ </li>
+ <li><code>org.apache.poi.hslf.usermodel.SlideShow</code>:
+ Builds up model entries from the records, and presents a user-facing
+ view of the file.
+ </li>
+ <li><code>org.apache.poi.hslf.extractor.PowerPointExtractor</code>:
+ Uses the model code to allow extraction of text from files.
+ </li>
+ </ul>
+ </section>
+ </body>
\ No newline at end of file