author | Andrew C. Oliver <acoliver@apache.org> | 2002-02-03 03:33:18 +0000 |
---|---|---|
committer | Andrew C. Oliver <acoliver@apache.org> | 2002-02-03 03:33:18 +0000 |
commit | d2a7221a5d4a7d405fdf736d3d16849dc0a82cd4 (patch) | |
tree | 596b7060fc43ec68383328f7fdb1a957971fae36 | |
parent | e19352347b3cf5742595bf37e03b37df8cb64f24 (diff) | |
added vision
git-svn-id: https://svn.apache.org/repos/asf/jakarta/poi/trunk@352075 13f79535-47bb-0310-9956-ffa450edef68
-rw-r--r-- | src/documentation/xdocs/plan/POI20Vision.xml | 604 |
1 files changed, 604 insertions, 0 deletions
diff --git a/src/documentation/xdocs/plan/POI20Vision.xml b/src/documentation/xdocs/plan/POI20Vision.xml new file mode 100644 index 0000000000..c744f15e2e --- /dev/null +++ b/src/documentation/xdocs/plan/POI20Vision.xml @@ -0,0 +1,604 @@ +<?xml version="1.0" encoding="UTF-8"?> +<!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V1.0//EN" "dtd/document-v10.dtd"> + +<document> + <header> + <title>POI 2.0 Vision Document</title> + <authors> + <person name="Andrew C. Oliver" email="acoliver2@users.sourceforge.net"/> + <person name="Marcus W. Johnson" email="mjohnson@apache.org"/> + <person name="Glen Stampoultzis" email="gstamp@iprimus.com.au"/> + <person name="Nicola Ken Barozzi" email="barozzi@nicolaken.com"/> + </authors> + </header> + + <body> + + <s1 title="Preface"> + <p> + This is the POI 2.0 cycle vision document. Although the vision + has not changed and this document is certainly not out of date, + the structure of the project has + changed a bit. We're not going to change the vision document to + reflect this (however proper that may be) because it would only + involve deletion. There is no purpose in providing less + information, provided we give clarification. + </p> + <p> + This document was created before the POI components for + <link href="http://xml.apache.org/cocoon">Apache Cocoon</link> + were accepted into the Cocoon project itself. It was also + written before POI was accepted into Jakarta. So while the + vision hasn't changed, some of the components are actually now + part of other projects. We'll still be working on them on roughly the + same timeline (minus the overhead of coordination with + other groups), but they are no longer technically part of the + POI project itself. + </p> + </s1> + + <s1 title="1. 
Introduction"> + <s2 title="1.1 Purpose of this document"> + <p> + The purpose of this document is to + collect, analyze and define high-level requirements, user needs, + and features of the second release of the POI project software. + The POI project currently consists of the following components: + the HSSF Serializer, the HSSF library and the POIFS library. + </p> + <ul> + <li> + The HSSF Serializer is a set of Java classes whose main + class supports the Serializer interface from the Cocoon + 2 project and outputs the serialized data in a format + compatible with the spreadsheet program Microsoft Excel + 97. + </li> + <li> + The HSSF library is a set of classes for reading and + writing the Microsoft Excel 97 file format using pure Java. + </li> + <li> + The POIFS library is a set of classes for reading and + writing Microsoft's OLE 2 Compound Document format using + pure Java. + </li> + </ul> + <p>By the completion of this release cycle the POI project will also + include the HSSF Generator and the HDF library. + </p> + <ul> + <li>The HSSF Generator will be responsible for using HSSF to read + in the XLS (Excel 97) file format and create SAX events. The HSSF + Generator will support the applicable interfaces specified by the + Apache Cocoon 2 project. + </li> + <li>The HDF library will provide a set of high level interfaces + for reading and writing the Microsoft Word 97 file format using pure + Java.</li> + </ul> + + </s2> + + + <s2 title="1.2 Project Overview"> + <p> + The first release of the POI project + was an astounding success. This release seeks to build on that + success. It will: + </p> + <ul> + <li> + Refactor POIFS into input and + output classes and add an event-driven API for reading. + </li> + <li> + Refactor HSSF for greater + performance and add an event-driven API for reading. + </li> + <li> + Extend HSSF by adding the ability to read and write formulas. + </li> + <li> + Extend HSSF by adding the ability to read and write + user-defined styles. 
+ </li> + <li> + Create a Cocoon 2 Generator for HSSF using the same tags + as the HSSF Serializer. + </li> + <li> + Create a new library (HDF) for reading and writing + the Microsoft Word DOC format. + </li> + <li> + Refactor the HSSFSerializer into a separate extensible + POIFSSerializer and HSSFSerializer. + </li> + <li> + Provide the ability to create Excel charts (write only). + </li> + </ul> + </s2> + </s1> + <s1 title="2. User Description"> + <s2 title="2.1 User/Market Demographics"> + <p> + There are a number of enthusiastic + users of XML, UNIX and Java technology. Furthermore, the Microsoft + solution for outputting Office Document formats often involves + actually manipulating the software as an OLE Server. This method + provides extremely low performance, extremely high overhead and is + only capable of handling one document at a time. + </p> + <ol> + <li> + Our intended audience for the HSSF + Serializer portion of this project is developers writing reports or + data extracts in XML format. + </li> + <li> + Our intended audience for the HSSF + library portion of this project is ourselves, as we are developing + the HSSF Serializer, and anyone who needs to read and write Excel + spreadsheets in a non-XML Java environment, or who has specific + needs not addressed by the Serializer. + </li> + <li> + Our intended audience for the + POIFS library is ourselves, as we are developing the HSSF and HDF + libraries, and anyone wishing to provide other libraries for + reading/writing other file formats utilizing the OLE 2 Compound + Document Format in Java. + </li> + <li> + Our intended audience for the HSSF + Generator is developers who need to export Excel spreadsheets to + XML in a non-proprietary environment. + </li> + <li> + Our intended audience for the HDF + library is ourselves, as we will be developing an HDF Serializer in a + later release, and anyone wishing to add .DOC file processing and + creation to their projects. + </li> + </ol> + </s2> + <s2 title="2.2. 
User environment"> + <p> + The users of this software shall be + developers in a Java environment on any operating system, or power + users who are capable of XML document generation/deployment. + </p> + </s2> + <s2 title="2.3. Key User Needs"> + <p> + The HSSF library currently requires a + full object representation to be created before reading values. This + results in very high memory utilization. We need to reduce this + substantially for reading. It would be preferable to do this for + writing as well, but it may not be possible due to the constraints imposed by + the file format itself. Memory utilization during read is our top + user complaint. + </p> + <p> + The POIFS library currently requires a + full object representation to be created before reading values. This + results in very high memory utilization. We need to reduce this + substantially for reading. + </p> + <p> + The HSSF library currently ignores + formula cells and identifies them as "UnknownRecord" at the + lower level of the API. We must provide a way to read and write + formulas. This is now the most requested feature. + </p> + <p> + The HSSF library currently does not support + charts. This is a key requirement of some users who wish to use HSSF + in a reporting engine. + </p> + <p> + The HSSF Serializer currently does not + provide serialization for cell styling. Users will want stylish + spreadsheets to result from their XML. + </p> + <p> + There is currently no way to generate + XML from an XLS file that is consistent with the format used by the + HSSF Serializer. + </p> + <p> + There should be a way to read and write + the DOC file format using pure Java. + </p> + + </s2> + <s2 title="2.4. Alternatives and Competition"> + <p> + Alternatives to using HSSF to manipulate Excel files include: + </p> + <ol> + <li>Buy the $10,000 Formula 1 library + (<link href="http://www.f1j.com/">www.tidestone.com</link>) + now owned by Actuate and accept its crude API and limitations. 
+ </li> + <li>Give up XML, Java, and operating system independence, and + write Visual Basic code in a Microsoft Windows based environment. + </li> + <li>Try writing output in Microsoft's poorly documented XHTML + for Office format. + </li> + </ol> + <p> + There is also a decent library for + reading Excel documents, xlReader, written by Andy Khan + (<link href="http://www.sourceforge.net/projects/xlrd">http://www.sourceforge.net/projects/xlrd</link>). + It does not provide write ability. + </p> + <p> + There are a number of Perl and C alternatives. + None are consistent. + </p> + </s2> + </s1> + <s1 title="3. Project Overview"> + <s2 title="3.1. Project Perspective"> + <p> + The produced code shall be licensed under + the Apache License as used by the Cocoon 2 project (APL 1.1) and + maintained at <link href="http://poi.sourceforge.net/">http://poi.sourceforge.net</link> + and <link href="http://sourcefoge.net/projects/poi">http://sourcefoge.net/projects/poi</link>. + It is our hope to integrate at some point with the various Apache + projects (xml.apache.org and jakarta.apache.org), at which point we'd + turn the copyright over to them. + </p> + </s2> + <s2 title="3.2. Project Position Statement"> + <p> + For developers in a Java and/or XML + environment, this project will provide all the tools necessary for + outputting XML data in the Microsoft Excel format. This project seeks + to make the use of Microsoft Windows based servers unnecessary for + file format considerations and to fully document the OLE 2 Compound + Document format. The project aims not only to provide the tools for + serializing XML to Excel and Word file formats and the tools for + writing to those file formats from Java, but also to provide the + tools for later projects to convert other OLE 2 Compound Document + formats to pure Java APIs. + </p> + </s2> + <s2 title="3.3. 
Summary of Capabilities"> + <p> + HSSF Serializer for Apache Cocoon 2 + </p> + <table> + <tr> + <td> + Benefit + </td> + <td> + Supporting Features + </td> + </tr> + <tr> + <td> + Ability to serialize styles from XML spreadsheets. + </td> + <td> + HSSFSerializer will support styles. + </td> + </tr> + <tr> + <td> + Ability to read and write formulas in XLS files. + </td> + <td> + HSSF will support reading/writing formulas. + </td> + </tr> + <tr> + <td> + Ability to output in MS Word on any platform using Java. + </td> + <td> + The project will develop an API that outputs in Word format + using pure Java. + </td> + </tr> + <tr> + <td> + Enhanced performance for reading and writing XLS files. + </td> + <td> + HSSF will undergo a number of performance enhancements. HSSF + will include a new event-based API for reading XLS files. POIFS + will support a new event-based API for reading OLE2 CDF files. + </td> + </tr> + <tr> + <td> + Ability to generate XML from XLS files. + </td> + <td> + The project will develop an HSSF Generator. + </td> + </tr> + <tr> + <td> + Ability to generate charts. + </td> + <td> + HSSF will provide low level support for chart records as well + as high level API support for generating charts. The ability + to read chart information will not initially be provided. + </td> + </tr> + + </table> + </s2> + <s2 title="3.4. Assumptions and Dependencies"> + <ul> + <li> + The HSSF Serializer and Generator + will support the Gnumeric 1.0 XML tag language. + </li> + <li> + The HSSF Generator and HSSF + Serializer will be mutually validating. It should be possible to + have an XLS file created by the Serializer run through the Generator + and the output back through the Serializer (via the Cocoon pipeline) + and get the same file or a reasonable facsimile (no one cares if it + differs by the order of the binary records in some minor but + non-visually recognizable manner). 
+ </li> + <li> + The HSSF Generator will run on any + Java 2 supporting platform with Apache Cocoon 2 installed along with + the HSSF and POIFS APIs. + </li> + <li> + The HSSF Serializer will run on + any Java 2 supporting platform with Apache Cocoon 2 installed along + with the HSSF and POIFS APIs. + </li> + <li> + The HDF API requires a Java 2 + implementation and the POIFS API. + </li> + <li> + The HSSF API requires a Java 2 + implementation and the POIFS API. + </li> + <li> + The POIFS API requires a Java 2 + implementation. + </li> + + </ul> + </s2> + </s1> + <s1 title="4. Project Features"> + <p> + Enhancements to the POIFS API will + include: + </p> + <ul> + <li> + An event driven API for reading + POIFS Filesystems. + </li> + <li> + A low-level API for + creating/manipulating POI filesystems. + </li> + <li> + Code improvements supporting + greater separation between read and write structures. + </li> + </ul> + <p> + Enhancements to the HSSF API will + include: + </p> + <ul> + <li> + An event driven API for reading + XLS files. + </li> + <li> + Performance improvements. + </li> + <li> + Formula support (read/write) + </li> + <li> + Support for user-defined data + formats + </li> + <li> + Better documentation of the file + format and structure. + </li> + <li> + An API for creation of charts. + </li> + </ul> + <p> + The HSSF Generator will include: + </p> + <ul> + <li> + A set of classes supporting the + Cocoon 2 Generator interfaces providing a method for reading XLS + files and outputting SAX events. + </li> + <li> + The same tag format used by the + HSSFSerializer in any given release. + </li> + </ul> + <p> + The HDF API will include: + </p> + <ul> + <li> + An event driven API for reading + DOC files. + </li> + <li> + A set of high and low level APIs + for reading and writing DOC files. + </li> + <li> + Documentation of the DOC file + format or enhancements to existing documentation. + </li> + </ul> + </s1> + <s1 title="5. 
Other Product Requirements"> + <s2 title="5.1. Applicable Standards"> + <p> + All Java code will be 100% pure Java. + </p> + </s2> + <s2 title="5.2. System Requirements"> + <p> + The minimum system requirements for the POIFS API are: + </p> + <ul> + <li>64 Mbytes memory</li> + <li>Java 2 environment</li> + <li>Pentium or better processor (or equivalent on other platforms)</li> + </ul> + <p> + The minimum system requirements for the HSSF API are: + </p> + <ul> + <li>64 Mbytes memory</li> + <li>Java 2 environment</li> + <li>Pentium or better processor (or equivalent on other platforms)</li> + <li>POIFS API</li> + </ul> + <p> + The minimum system requirements for the HDF API are: + </p> + <ul> + <li>64 Mbytes memory</li> + <li>Java 2 environment</li> + <li>Pentium or better processor (or equivalent on other platforms)</li> + <li>POIFS API</li> + </ul> + + <p> + The minimum system requirements for the HSSF Serializer are: + </p> + <ul> + <li>64 Mbytes memory</li> + <li>Java 2 environment</li> + <li>Pentium or better processor (or equivalent on other platforms)</li> + <li>Cocoon 2</li> + <li>HSSF API</li> + <li>POI API</li> + </ul> + </s2> + <s2 title="5.3. Performance Requirements"> + <p> + All components must perform well enough + to be practical for use in a web server environment (especially + the "killer trio": Cocoon2/Tomcat/Apache combo). + </p> + </s2> + <s2 title="5.4. Environmental Requirements"> + <p> + The software will run primarily in + developer environments. We should make some allowances for + not-highly-technical users to write XML documents for the HSSF + Serializer. All other components will assume intermediate Java 2 + knowledge. No XML knowledge will be required except for using the + HSSF Serializer. As much documentation as is practical shall be + required for all components, as XML is relatively new, and the + concepts introduced for writing spreadsheets and POI filesystems + will be brand new to Java and many Java developers. 
+ </p> + </s2> + </s1> + <s1 title="6. Documentation Requirements"> + <s2 title="6.1 POI Filesystem"> + <p> + The filesystem as read and written by + POI shall be fully documented and explained so that the average Java + developer can understand it. + </p> + </s2> + <s2 title="6.2. POI API"> + <p> + The POI API will be fully documented + through Javadoc. A walkthrough of using the high level POI API shall + be provided. No documentation outside of the Javadoc shall be + provided for the low-level POI APIs. + </p> + </s2> + <s2 title="6.3. HSSF File Format"> + <p> + The HSSF File Format as implemented by + the HSSF API will be fully documented. No documentation will be + provided for features that are not supported by HSSF API that are + supported by the Excel 97 File Format. Care will be taken not to + infringe on any "legal stuff". Additionally, we are + collaborating with the fine folks at OpenOffice.org on + *free* documentation of the format. + </p> + </s2> + <s2 title="6.4. HSSF API"> + <p> + The HSSF API will be documented by + javadoc. A walkthrough of using the high level HSSF API shall be + provided. No documentation outside of the Javadoc shall be provided + for the low level HSSF APIs. + </p> + </s2> + <s2 title="6.5 HDF API"> + <p> + The HDF API will be documented by + javadoc. A walkthrough of using the high level HDF API shall be + provided. No documentation outside of the Javadoc shall be provided + for the low level HDF APIs. + </p> + </s2> + <s2 title="6.6 HSSF Serializer"> + <p> + The HSSF Serializer will be documented + by javadoc. + </p> + </s2> + <s2 title="6.7 HSSF Generator"> + <p> + The HSSF Generator will be documented + by javadoc. + </p> + </s2> + <s2 title="6.8 HSSF Serializer Tag language"> + <p> + The XML tag language along with + function and usage shall be fully documented. Examples will be + provided as well. + </p> + </s2> + </s1> + <s1 title="7. 
Terminology"> + <s2 title="7.1 Filesystem"> + <p> + filesystem shall refer only to the POI formatted archive. + </p> + </s2> + <s2 title="7.2 File"> + <p> + file shall refer to the embedded data stream within a + POI filesystem. This will be the actual embedded document. + </p> + </s2> + </s1> +</body> +</document> + |