author | Dave Fisher <wave@apache.org> | 2009-11-19 21:22:21 +0000 |
---|---|---|
committer | Dave Fisher <wave@apache.org> | 2009-11-19 21:22:21 +0000 |
commit | b5cdef8c2e9d6346e2064d4d04cb8df22129c6cb (patch) | |
tree | a4af67739f50b84c99cd248b4f72832e822295f0 /src/documentation/content/xdocs | |
parent | 48368af6c9bbc25c7c28cba9db0c6d6b71173832 (diff) | |
Many documentation changes. See https://issues.apache.org/bugzilla/show_bug.cgi?id=48242
git-svn-id: https://svn.apache.org/repos/asf/poi/trunk@882301 13f79535-47bb-0310-9956-ffa450edef68
Diffstat (limited to 'src/documentation/content/xdocs')
-rw-r--r-- | src/documentation/content/xdocs/book.xml | 28 |
-rw-r--r-- | src/documentation/content/xdocs/casestudies.xml | 17 |
-rwxr-xr-x | src/documentation/content/xdocs/download.xml | 74 |
-rw-r--r-- | src/documentation/content/xdocs/faq.xml | 393 |
-rw-r--r-- | src/documentation/content/xdocs/guidelines.xml | 197 |
-rw-r--r-- | src/documentation/content/xdocs/howtobuild.xml | 80 |
-rw-r--r-- | src/documentation/content/xdocs/index.xml | 194 |
-rw-r--r-- | src/documentation/content/xdocs/legal.xml | 20 |
-rw-r--r-- | src/documentation/content/xdocs/mailinglists.xml | 3 |
-rw-r--r-- | src/documentation/content/xdocs/overview.xml | 290 |
-rw-r--r-- | src/documentation/content/xdocs/subversion.xml | 13 |
-rw-r--r-- | src/documentation/content/xdocs/text-extraction.xml | 81 |
-rw-r--r-- | src/documentation/content/xdocs/who.xml | 39 |
13 files changed, 881 insertions, 548 deletions
diff --git a/src/documentation/content/xdocs/book.xml b/src/documentation/content/xdocs/book.xml index afcf3a7a7b..048ca3f444 100644 --- a/src/documentation/content/xdocs/book.xml +++ b/src/documentation/content/xdocs/book.xml @@ -24,33 +24,37 @@ copyright="@year@ POI Project" xmlns:xlink="http://www.w3.org/1999/xlink"> - <menu label="Apache POI"> + <menu label="Overview"> <menu-item label="Home" href="index.html"/> <menu-item label="Download" href="download.html"/> - <menu-item label="Changes" href="changes.html"/> + <menu-item label="Components" href="overview.html"/> + <menu-item label="Text Extraction" href="text-extraction.html"/> + <menu-item label="Case Studies" href="casestudies.html"/> + <menu-item label="Legal" href="legal.html"/> </menu> - <menu label="Documentation"> - <menu-item label="Source" href="subversion.html"/> - <menu-item label="Building" href="howtobuild.html"/> + <menu label="Help"> <menu-item label="Javadocs" href="ext:javadoc"/> - <menu-item label="Case Studies" href="casestudies.html"/> <menu-item label="FAQ" href="faq.html"/> <menu-item label="Mailing Lists" href="mailinglists.html"/> <menu-item label="Bug Database" href="http://issues.apache.org/bugzilla/buglist.cgi?product=POI"/> - <menu-item label="Get Involved" href="getinvolved/index.html"/> + <menu-item label="Changes Log" href="changes.html"/> + </menu> + + <menu label="Getting Involved"> + <menu-item label="Subversion Repository" href="subversion.html"/> + <menu-item label="How To Build" href="howtobuild.html"/> + <menu-item label="Contribution Guidelines" href="guidelines.html"/> <menu-item label="Who We Are" href="who.html"/> - <menu-item label="Legal" href="legal.html"/> </menu> - <menu label="Modules"> - <menu-item label="Text Extraction" href="text-extraction.html"/> + <menu label="Component APIs"> <menu-item label="Excel (SS=HSSF+XSSF)" href="spreadsheet/index.html"/> <menu-item label="Word (HWPF+XWPF)" href="hwpf/index.html"/> <menu-item label="PowerPoint (HSLF+XSLF)" 
href="slideshow/index.html"/> + <menu-item label="OpenXML4J (OOXML)" href="oxml4j/index.html"/> <menu-item label="OLE2 Filesystem (POIFS)" href="poifs/index.html"/> - <menu-item label="OLE2 Properties (HPSF)" href="hpsf/index.html"/> - <menu-item label="OpenXML4J" href="oxml4j/index.html"/> + <menu-item label="OLE2 Document Props (HPSF)" href="hpsf/index.html"/> <menu-item label="Outlook (HSMF)" href="hsmf/index.html"/> <menu-item label="Visio (HDGF)" href="hdgf/index.html"/> <menu-item label="Publisher (HPBF)" href="hpbf/index.html"/> diff --git a/src/documentation/content/xdocs/casestudies.xml b/src/documentation/content/xdocs/casestudies.xml index 6b3b4e5c9b..3646c42b19 100644 --- a/src/documentation/content/xdocs/casestudies.xml +++ b/src/documentation/content/xdocs/casestudies.xml @@ -25,6 +25,7 @@ <authors> <person id="AO" name="Andrew C. Oliver" email="acoliver@apache.org"/> <person id="CR" name="Cameron Riley" email="crileyNO@SPAMekmail.com"/> + <person id="DF" name="David Fisher" email="dfisher@jmlafferty.com"/> </authors> </header> @@ -196,6 +197,22 @@ format, <p>It is obvious that Microsoft Excel files are also supported. POI has been used to successfully implement this support in ETL4ALL.</p> </section> + <section> + <title>JM Lafferty Associates, Inc.</title> + <p> + On its <link href="http://www.forecastworks.com/public/">ForecastWorks</link> website + <link href="http://www.jmlafferty.com/">JM Lafferty Associates, Inc.</link> produces dynamic on demand + financial analyses of companies and institutional funds. The pages produced are selected and exported + in several file formats including PPT and XLS. 
+ </p> + <ul> + <li>The PPT files produced are of high quality which is on a par with similar PDF files.</li> + <li>The XLS files produced contain a complex forecasting model built from a template with a VBA Macro.</li> + </ul> + <p> + David Fisher (dfisher@jmlafferty.com) + </p> + </section> </section> </body> diff --git a/src/documentation/content/xdocs/download.xml b/src/documentation/content/xdocs/download.xml index 72a58735ea..6a8d2bb89b 100755 --- a/src/documentation/content/xdocs/download.xml +++ b/src/documentation/content/xdocs/download.xml @@ -21,27 +21,28 @@ <document> <header> - <title>Apache POI - Download</title> + <title>Apache POI - Download Release Artifacts</title> </header> <body> - <section><title>Downloads</title> + <section><title>Available Downloads</title> <p> - Use the links below to download Apache POI releases from one of our mirrors. - You should <link href="download.html#verify">verify the integrity</link> - of the files using the signatures and checksums available from this page. + This page provides instructions on how to download and verify Apache POI release artifacts. </p> <ul> - <li>Latest stable release: <link href="download.html#POI-3.5-FINAL">Apache POI 3.5</link></li> - <li><link href="http://archive.apache.org/dist/poi/">Release Archive</link></li> + <li><link href="download.html#POI-3.5-FINAL">The latest stable release is Apache POI 3.5</link></li> + <li><link href="download.html#archive">Archives of all prior releases</link></li> </ul> <p> Apache POI releases are available under the <link href="http://www.apache.org/licenses/LICENSE-2.0">Apache License, Version 2.0.</link> See the NOTICE file contained in each release artifact for applicable copyright attribution notices. - </p> + </p> + <p> + To insure that you have downloaded the true release you should <link href="download.html#verify">verify the integrity</link> + of the files using the signatures and checksums available from this page. 
+ </p> </section> - <section><title>28 September 2009 - POI 3.5-FINAL available</title> - <anchor id="POI-3.5-FINAL"/> + <section id="POI-3.5-FINAL"><title>28 September 2009 - POI 3.5-FINAL available</title> <p>The Apache POI team is pleased to announce the release of 3.5 FINAL. This release brings many improvements including support for the new OOXML formats introduced in Office 2007, such as XLSX and DOCX. </p> @@ -81,11 +82,11 @@ </ul> </section> </section> - <section><title>Verify</title> - <anchor id="verify"/> + <section id="verify"><title>Verify</title> <p> It is essential that you verify the integrity of the downloaded files using the PGP or MD5 signatures. - Please read Verifying Apache HTTP Server Releases for more information on why you should verify our releases. + Please read <link href="http://httpd.apache.org/dev/verification.html">Verifying Apache HTTP Server Releases</link> + for more information on why you should verify our releases. This page provides detailed instructions which you can use for POI artifacts. </p> <p> The PGP signatures can be verified using PGP or GPG. First download the KEYS file as well as the .asc signature files @@ -93,20 +94,55 @@ rather than from a mirror. 
Then verify the signatures using </p> <source> - % pgpk -a KEYS - % pgpv poi-X.Y.Z.jar.asc +% pgpk -a KEYS +% pgpv poi-X.Y.Z.jar.asc </source> <p>or</p> <source> - % pgp -ka KEYS - % pgp poi-X.Y.Z.jar.asc +% pgp -ka KEYS +% pgp poi-X.Y.Z.jar.asc </source> <p>or</p> <source> - % gpg --import KEYS - % gpg --verify poi-X.Y.Z.jar.asc +% gpg --import KEYS +% gpg --verify poi-X.Y.Z.jar.asc + </source> + <p>Sample verification of poi-bin-3.5-FINAL-20090928.tar.gz</p> + <source> +% gpg --import KEYS +gpg: key 12DAE9BE: "Glen Stampoultzis <glens at apache dot org>" not changed +gpg: key 4CEED75F: "Nick Burch <nick at gagravarr dot org>" not changed +gpg: key 84B5A42E: "Rainer Klute <rainer.klute at gmx dot de>" not changed +gpg: key F5BB52CD: "Yegor Kozlov <yegor.kozlov at gmail dot com>" not changed +gpg: Total number processed: 4 +gpg: unchanged: 4 +% gpg --verify poi-bin-3.5-FINAL-20090928.tar.gz.asc +gpg: Signature made Mon Sep 28 10:28:25 2009 PDT using DSA key ID F5BB52CD +gpg: Good signature from "Yegor Kozlov <yegor.kozlov at gmail dot com>" +gpg: aka "Yegor Kozlov <yegor at dinom dot ru>" +gpg: aka "Yegor Kozlov <yegor at apache dot org>" +Primary key fingerprint: 7D77 0C77 6CE7 754E E6AF 23AA 6934 0A02 F5BB 52CD +% gpg --fingerprint F5BB52CD +pub 1024D/F5BB52CD 2007-06-18 [expires: 2012-06-16] + Key fingerprint = 7D77 0C77 6CE7 754E E6AF 23AA 6934 0A02 F5BB 52CD +uid Yegor Kozlov <yegor.kozlov at gmail dot com> +uid Yegor Kozlov <yegor at dinom dot ru> +uid Yegor Kozlov <yegor at apache dot org> +sub 4096g/7B45A98A 2007-06-18 [expires: 2012-06-16] </source> </section> + <section id="archive"><title>Release Archives</title> + <p> + Apache POI became a top level project in June 2007 and POI 3.0 aritfacts were re-released. 
+ Prior to that date POI was a sub-project of <link href="http://jakarta.apache.org/">Apache Jakarta.</link> + </p> + <ul> + <li><link href="http://archive.apache.org/dist/poi/release/bin/">Binary Artifacts</link></li> + <li><link href="http://archive.apache.org/dist/poi/release/src/">Source Artifacts</link></li> + <li><link href="http://archive.apache.org/dist/poi/">Keys</link></li> + <li><link href="http://archive.apache.org/dist/jakarta/poi/release/">Artifacts from prior to 3.0</link></li> + </ul> + </section> </body> <footer> <legal> diff --git a/src/documentation/content/xdocs/faq.xml b/src/documentation/content/xdocs/faq.xml index a0cd7b8b80..87ed8c990b 100644 --- a/src/documentation/content/xdocs/faq.xml +++ b/src/documentation/content/xdocs/faq.xml @@ -20,89 +20,90 @@ <!DOCTYPE faqs PUBLIC "-//APACHE//DTD FAQ V1.1//EN" "./dtd/faq-v11.dtd"> <faqs title="Frequently Asked Questions"> - <faq> - <question> - My code uses some new feature, compiles fine but fails when live with a "MethodNotFoundException" - </question> - <answer> - <p>You almost certainly have an older version of POI - on your classpath. Quite a few runtimes and other packages - will ship an older version of POI, so this is an easy problem - to hit without your realising.</p> - <p>The best way to identify the offending earlier jar file is - with a few lines of java. These will load one of the core POI - classes, and report where it came from.</p> - <source> -ClassLoader classloader = org.apache.poi.poifs.filesystem.POIFSFileSystem.class.getClassLoader(); -URL res = classloader.getResource("org/apache/poi/poifs/filesystem/POIFSFileSystem.class"> + <faq> + <question> + My code uses some new feature, compiles fine but fails when live with a "MethodNotFoundException" + </question> + <answer> + <p>You almost certainly have an older version of POI + on your classpath. 
Quite a few runtimes and other packages + will ship an older version of POI, so this is an easy problem + to hit without your realising.</p> + <p>The best way to identify the offending earlier jar file is + with a few lines of java. These will load one of the core POI + classes, and report where it came from.</p> + <source> +ClassLoader classloader = + org.apache.poi.poifs.filesystem.POIFSFileSystem.class.getClassLoader(); +URL res = classloader.getResource( + "org/apache/poi/poifs/filesystem/POIFSFileSystem.class"); String path = res.getPath(); System.out.println("Core POI came from " + path); - </source> - </answer> - </faq> - <faq> - <question> - My code uses the scratchpad, compiles fine but fails to run with a "MethodNotFoundException" - </question> - <answer> - <p>You almost certainly have an older version earlier on your - classpath. See the prior answer.</p> - </answer> - </faq> - <faq> - <question> - Why is reading a simple sheet taking so long? - </question> - <answer> - <p>You've probably enabled logging. Logging is intended only for - autopsy style debugging. Having it enabled will reduce performance - by a factor of at least 100. Logging is helpful for understanding - why POI can't read some file or developing POI itself. Important - errors are thrown as exceptions, which means you probably don't need - logging.</p> - </answer> - </faq> - <faq> - <question> - What is the HSSF "eventmodel"? - </question> - <answer> - <p>The SS eventmodel package is an API for reading Excel files without loading the whole spreadsheet into memory. It does - require more knowledge on the part of the user, but reduces memory consumption by more than - tenfold. It is based on the AWT event model in combination with SAX. If you need read-only - access, this is the best way to do it.</p> - </answer> - - </faq> - <faq> - <question> - Why can't read the document I created using Star Office 5.1? 
- </question> - <answer> - <p>Star Office 5.1 writes some records using the older BIFF standard. This causes some problems - with POI which supports only BIFF8.</p> - </answer> - </faq> - <faq> - <question> - Why am I getting an exception each time I attempt to read my spreadsheet? - </question> - <answer> - <p>It's possible your spreadsheet contains a feature that is not currently supported by POI. - If you encounter this then please create the simplest file that demonstrates the trouble and submit it to - <link href="http://issues.apache.org/bugzilla/buglist.cgi?product=POI">Bugzilla.</link></p> - </answer> - </faq> - <faq> - <question> - How do you tell if a spreadsheet cell contains a date? - </question> - <answer> - <p>Excel stores dates as numbers therefore the only way to determine if a cell is - actually stored as a date is to look at the formatting. There is a helper method - in HSSFDateUtil that checks for this. - Thanks to Jason Hoffman for providing the solution.</p> - <source> + </source> + </answer> + </faq> + <faq> + <question> + My code uses the scratchpad, compiles fine but fails to run with a "MethodNotFoundException" + </question> + <answer> + <p>You almost certainly have an older version earlier on your + classpath. See the prior answer.</p> + </answer> + </faq> + <faq> + <question> + Why is reading a simple sheet taking so long? + </question> + <answer> + <p>You've probably enabled logging. Logging is intended only for + autopsy style debugging. Having it enabled will reduce performance + by a factor of at least 100. Logging is helpful for understanding + why POI can't read some file or developing POI itself. Important + errors are thrown as exceptions, which means you probably don't need + logging.</p> + </answer> + </faq> + <faq> + <question> + What is the HSSF "eventmodel"? + </question> + <answer> + <p>The SS eventmodel package is an API for reading Excel files without loading the whole spreadsheet into memory. 
It does + require more knowledge on the part of the user, but reduces memory consumption by more than + tenfold. It is based on the AWT event model in combination with SAX. If you need read-only + access, this is the best way to do it.</p> + </answer> + </faq> + <faq> + <question> + Why can't read the document I created using Star Office 5.1? + </question> + <answer> + <p>Star Office 5.1 writes some records using the older BIFF standard. This causes some problems + with POI which supports only BIFF8.</p> + </answer> + </faq> + <faq> + <question> + Why am I getting an exception each time I attempt to read my spreadsheet? + </question> + <answer> + <p>It's possible your spreadsheet contains a feature that is not currently supported by POI. + If you encounter this then please create the simplest file that demonstrates the trouble and submit it to + <link href="http://issues.apache.org/bugzilla/buglist.cgi?product=POI">Bugzilla.</link></p> + </answer> + </faq> + <faq> + <question> + How do you tell if a spreadsheet cell contains a date? + </question> + <answer> + <p>Excel stores dates as numbers therefore the only way to determine if a cell is + actually stored as a date is to look at the formatting. There is a helper method + in HSSFDateUtil that checks for this. + Thanks to Jason Hoffman for providing the solution.</p> + <source> case HSSFCell.CELL_TYPE_NUMERIC: double d = cell.getNumericCellValue(); // test if a date! @@ -114,94 +115,93 @@ System.out.println("Core POI came from " + path); cellText = cal.get(Calendar.MONTH)+1 + "/" + cal.get(Calendar.DAY_OF_MONTH) + "/" + cellText; - } </source> - </answer> - </faq> - <faq> - <question> - I'm trying to stream an XLS file from a servlet and I'm having some trouble. What's the problem? - </question> - <answer> - <p> - The problem usually manifests itself as the junk characters being shown on - screen. The problem persists even though you have set the correct mime type. 
- </p> - <p> - The short answer is, don't depend on IE to display a binary file type properly if you stream it via a - servlet. Every minor version of IE has different bugs on this issue. - </p> - <p> - The problem in most versions of IE is that it does not use the mime type on - the HTTP response to determine the file type; rather it uses the file extension - on the request. Thus you might want to add a - <strong>.xls</strong> to your request - string. For example - <em>http://yourserver.com/myServelet.xls?param1=xx</em>. This is - easily accomplished through URL mapping in any servlet container. Sometimes - a request like - <em>http://yourserver.com/myServelet?param1=xx&dummy=file.xls</em> is also - known to work. - - </p> - <p> - To guarantee opening the file properly in Excel from IE, write out your file to a - temporary file under your web root from your servelet. Then send an http response - to the browser to do a client side redirection to your temp file. (Note that using a - server side redirect using RequestDispatcher will not be effective in this case) - </p> - <p> - Note also that when you request a document that is opened with an - external handler, IE sometimes makes two requests to the webserver. So if your - generating process is heavy, it makes sense to write out to a temporary file, so that multiple - requests happen for a static file. - </p> - <p> - None of this is particular to Excel. The same problem arises when you try to - generate any binary file dynamically to an IE client. For example, if you generate - pdf files using - <link href="http://xml.apache.org/fop">FOP</link>, you will come across many of the same issues. - - </p> - <!-- Thanks to Avik for the answer --> - </answer> - </faq> - <faq> - <question> - I want to set a cell format (Data format of a cell) of a excel sheet as ###,###,###.#### or ###,###,###.0000. Is it possible using POI ? - </question> - <answer> - <p> - Yes. 
You first need to get a DataFormat object from the workbook and call getFormat with the desired format. Some examples are <link href="spreadsheet/quick-guide.html#DataFormats">here</link>. - </p> - </answer> - </faq> - <faq> - <question> - I want to set a cell format (Data format of a cell) of a excel sheet as text. Is it possible using POI ? - </question> - <answer> - <p> - Yes. This is a built-in format for excel that you can get from DataFormat object using the format string "@". Also, the string "text" will alias this format. - </p> - </answer> - </faq> - <faq> - <question> - How do I add a border around a merged cell? - </question> - <answer> - <p>Add blank cells around where the cells normally would have been and set the borders individually for each cell. - We will probably enhance HSSF in the future to make this process easier.</p> - </answer> - </faq> - <faq> - <question> - I am using styles when creating a workbook in POI, but Excel refuses to open the file, complaining about "Too Many Styles". - </question> - <answer> - <p>You just create the styles OUTSIDE of the loop in which you create cells.</p> - <p>GOOD:</p> - <source> + } + </source> + </answer> + </faq> + <faq> + <question> + I'm trying to stream an XLS file from a servlet and I'm having some trouble. What's the problem? + </question> + <answer> + <p> + The problem usually manifests itself as the junk characters being shown on + screen. The problem persists even though you have set the correct mime type. + </p> + <p> + The short answer is, don't depend on IE to display a binary file type properly if you stream it via a + servlet. Every minor version of IE has different bugs on this issue. + </p> + <p> + The problem in most versions of IE is that it does not use the mime type on + the HTTP response to determine the file type; rather it uses the file extension + on the request. Thus you might want to add a + <strong>.xls</strong> to your request + string. 
For example + <em>http://yourserver.com/myServelet.xls?param1=xx</em>. This is + easily accomplished through URL mapping in any servlet container. Sometimes + a request like + <em>http://yourserver.com/myServelet?param1=xx&dummy=file.xls</em> is also + known to work. + </p> + <p> + To guarantee opening the file properly in Excel from IE, write out your file to a + temporary file under your web root from your servelet. Then send an http response + to the browser to do a client side redirection to your temp file. (Note that using a + server side redirect using RequestDispatcher will not be effective in this case) + </p> + <p> + Note also that when you request a document that is opened with an + external handler, IE sometimes makes two requests to the webserver. So if your + generating process is heavy, it makes sense to write out to a temporary file, so that multiple + requests happen for a static file. + </p> + <p> + None of this is particular to Excel. The same problem arises when you try to + generate any binary file dynamically to an IE client. For example, if you generate + pdf files using + <link href="http://xml.apache.org/fop">FOP</link>, you will come across many of the same issues. + </p> + <!-- Thanks to Avik for the answer --> + </answer> + </faq> + <faq> + <question> + I want to set a cell format (Data format of a cell) of a excel sheet as ###,###,###.#### or ###,###,###.0000. Is it possible using POI ? + </question> + <answer> + <p> + Yes. You first need to get a DataFormat object from the workbook and call getFormat with the desired format. Some examples are <link href="spreadsheet/quick-guide.html#DataFormats">here</link>. + </p> + </answer> + </faq> + <faq> + <question> + I want to set a cell format (Data format of a cell) of a excel sheet as text. Is it possible using POI ? + </question> + <answer> + <p> + Yes. This is a built-in format for excel that you can get from DataFormat object using the format string "@". 
Also, the string "text" will alias this format. + </p> + </answer> + </faq> + <faq> + <question> + How do I add a border around a merged cell? + </question> + <answer> + <p>Add blank cells around where the cells normally would have been and set the borders individually for each cell. + We will probably enhance HSSF in the future to make this process easier.</p> + </answer> + </faq> + <faq> + <question> + I am using styles when creating a workbook in POI, but Excel refuses to open the file, complaining about "Too Many Styles". + </question> + <answer> + <p>You just create the styles OUTSIDE of the loop in which you create cells.</p> + <p>GOOD:</p> + <source> HSSFWorkbook wb = new HSSFWorkbook(); HSSFSheet sheet = wb.createSheet("new sheet"); HSSFRow row = null; @@ -214,7 +214,8 @@ System.out.println("Core POI came from " + path); cell.setCellValue("X"); cell.setCellStyle(style); - // Orange "foreground", foreground being the fill foreground not the font color. + // Orange "foreground", + // foreground being the fill foreground not the font color. 
style = wb.createCellStyle(); style.setFillForegroundColor(HSSFColor.ORANGE.index); style.setFillPattern(HSSFCellStyle.SOLID_FOREGROUND); @@ -234,42 +235,44 @@ System.out.println("Core POI came from " + path); // Write the output to a file FileOutputStream fileOut = new FileOutputStream("workbook.xls"); wb.write(fileOut); - fileOut.close(); </source> - - <p>BAD:</p> - <source> + fileOut.close(); + </source> + <p>BAD:</p> + <source> HSSFWorkbook wb = new HSSFWorkbook(); HSSFSheet sheet = wb.createSheet("new sheet"); HSSFRow row = null; for (int x = 0; x < 1000; x++) { - // Aqua background - HSSFCellStyle style = wb.createCellStyle(); - style.setFillBackgroundColor(HSSFColor.AQUA.index); - style.setFillPattern(HSSFCellStyle.BIG_SPOTS); - HSSFCell cell = row.createCell((short) 1); - cell.setCellValue("X"); - cell.setCellStyle(style); + // Aqua background + HSSFCellStyle style = wb.createCellStyle(); + style.setFillBackgroundColor(HSSFColor.AQUA.index); + style.setFillPattern(HSSFCellStyle.BIG_SPOTS); + HSSFCell cell = row.createCell((short) 1); + cell.setCellValue("X"); + cell.setCellStyle(style); - // Orange "foreground", foreground being the fill foreground not the font color. - style = wb.createCellStyle(); - style.setFillForegroundColor(HSSFColor.ORANGE.index); - style.setFillPattern(HSSFCellStyle.SOLID_FOREGROUND); + // Orange "foreground", + // foreground being the fill foreground not the font color. + style = wb.createCellStyle(); + style.setFillForegroundColor(HSSFColor.ORANGE.index); + style.setFillPattern(HSSFCellStyle.SOLID_FOREGROUND); - // Create a row and put some cells in it. Rows are 0 based. - row = sheet.createRow((short) k); + // Create a row and put some cells in it. Rows are 0 based. 
+ row = sheet.createRow((short) k); - for (int y = 0; y < 100; y++) { - cell = row.createCell((short) k); - cell.setCellValue("X"); - cell.setCellStyle(style); - } + for (int y = 0; y < 100; y++) { + cell = row.createCell((short) k); + cell.setCellValue("X"); + cell.setCellStyle(style); + } } // Write the output to a file FileOutputStream fileOut = new FileOutputStream("workbook.xls"); wb.write(fileOut); - fileOut.close(); </source> - </answer> - </faq> + fileOut.close(); + </source> + </answer> + </faq> </faqs> diff --git a/src/documentation/content/xdocs/guidelines.xml b/src/documentation/content/xdocs/guidelines.xml new file mode 100644 index 0000000000..485d6db8fb --- /dev/null +++ b/src/documentation/content/xdocs/guidelines.xml @@ -0,0 +1,197 @@ +<?xml version="1.0" encoding="UTF-8"?> +<!-- + ==================================================================== + Licensed to the Apache Software Foundation (ASF) under one or more + contributor license agreements. See the NOTICE file distributed with + this work for additional information regarding copyright ownership. + The ASF licenses this file to You under the Apache License, Version 2.0 + (the "License"); you may not use this file except in compliance with + the License. You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. 
+ ==================================================================== +--> +<!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V1.1//EN" "./dtd/document-v11.dtd"> + +<document> + <header> + <title>Apache POI - Contribution Guidelines</title> + <authors> + <person name="Nicola Ken Barozzi" email="barozzi@nicolaken.com"/> + <person name="Marc Johnson" email="mjohnson@apache.org"/> + <person name="Andrew C. Oliver" email="acoliver@apache.org"/> + <person name="Tetsuya Kitahata" email="tetsuya.kitahata@nifty.com"/> + </authors> + </header> + + <body> + + <section><title>Introduction</title> + <section><title>Disclaimer</title> + <p> + Any information in here that might be perceived as legal information is + informational only. We're not lawyers, so consult a legal professional + if needed. + </p> + </section> + <section><title>The Licensing</title> + <p> + The POI project is <link href="http://www.opensource.org">OpenSource</link> + and developed/distributed under the <link + href="http://www.apache.org/foundation/licence-FAQ.html"> + Apache Software License</link>. Unlike other licenses this license allows + free open source development; however, it does not require you to release + your source or use any particular license for your source. If you wish + to contribute to POI (which you're very welcome and encouraged to do so) + then you must agree to release the rights of your source to us under this + license. + </p> + </section> + <section><title>Publicly Available Information on the file formats</title> + <p> + In early 2008, Microsoft made a fairly complete set of documentation + on the binary file formats freely and publicly available. These were + released under the <link href="http://www.microsoft.com/interop/osp">Open + Specification Promise</link>, which does allow us to use them for + building open source software under the <link + href="http://www.apache.org/foundation/licence-FAQ.html"> + Apache Software License</link>. 
+ </p> + <p> + You can download the documentation on Excel, Word, PowerPoint and + Escher (drawing) from + <link href="http://www.microsoft.com/interop/docs/OfficeBinaryFormats.mspx">http://www.microsoft.com/interop/docs/OfficeBinaryFormats.mspx</link>. + Documentation on a few of the supporting technologies used in these + file formats can be downloaded from + <link href="http://www.microsoft.com/interop/docs/supportingtechnologies.mspx">http://www.microsoft.com/interop/docs/supportingtechnologies.mspx</link>. + </p> + <p> + Previously, Microsoft published a book on the Excel 97 file format. + It can still be of plenty of use, and is handy dead tree form. Pick up + a copy of "Excel 97 Developer's Kit" from your favourite second hand + book store. + </p> + <p> + The newer Office Open XML (ooxml) file formats are documented as part + of the ECMA / ISO standardisation effort for the formats. This + documentation is quite large, but you can normally find the bit you + need without too much effort! This can be downloaded from + <link href="http://www.ecma-international.org/publications/standards/Ecma-376.htm">http://www.ecma-international.org/publications/standards/Ecma-376.htm</link>, + and is also under the <link href="http://www.microsoft.com/interop/osp">OSP</link>. + </p> + <p> + It is also worth checking the documentation and code of the other + open source implementations of the file formats. + </p> + </section> + <section><title>I just signed an NDA to get a spec from Microsoft and I'd like to contribute</title> + <p> + In short, stay away, stay far far away. Implementing these file formats + in POI is done strictly by using public information. Public information + includes sources from other open source projects, books that state the + purpose intended is for allowing implementation of the file format and + do not require any non-disclosure agreement and just hard work. + We are intent on keeping it + legal, by contributing patches you agree to do the same. 
+ </p> + <p> + If you've ever received information regarding the OLE 2 Compound Document + Format under any type of exclusionary agreement from Microsoft, or + (possibly illegally) received such information from a person bound by + such an agreement, you cannot participate in this project. (Sorry) + </p> + <p> + Those submitting patches that show insight into the file format may be + asked to state explicitly that they have only ever read the publicly + available file format information, and not any received under an NDA + or similar. + </p> + </section> + </section> + <section><title>I just want to get involved but don't know where to start</title> + <ul> + <li>Read the rest of the website, understand what POI is and what it does, + the project vision, etc.</li> + <li>Use POI a bit, look for gaps in the documentation and examples.</li> + <li>Join the mail lists and share your knowledge with others.</li> + <li>Get <link href="subversion.html">Subversion</link> and check out the POI source tree</li> + <li>Documentation is always the best place to start contributing, maybe you found that if the documentation just told you how to do X then it would make more sense, modify the documentation.</li> + <li>Get used to building POI, you'll be doing it a lot, be one with the build, know its targets, etc.</li> + <li>Write Unit Tests. Great way to understand POI. 
Look for classes that aren't tested, or aren't tested on a public/protected method level, and start there.</li> + <li>Download the file format documentation from Microsoft - + <link href="http://www.microsoft.com/interop/docs/OfficeBinaryFormats.mspx">OLE2 Binary File Formats</link> or + <link href="http://www.ecma-international.org/publications/standards/Ecma-376.htm">OOXML XML File Formats</link></li> + <li>Submit patches (see below) of your contributions, modifications.</li> + <li>Fill out new features, see <link href="http://issues.apache.org/bugzilla/buglist.cgi?product=POI">Bug database</link> for suggestions.</li> + </ul> + </section> + <section><title>Submitting Patches</title> + <p> + Create patches by getting the latest sources from Subversion. + Alter or add files as appropriate. Then, from the poi directory, + type svn diff > mypatch.patch. This will capture all of your changes + in a patch file of the appropriate format. However, svn diff won't + capture any new files you may have added. So, if you've added any + files, create an archive (tar.bz2 preferred as it's the smallest) in a + path-preserving archive format, relative to your poi directory. + You'll attach both files in the next step. + </p> + <p> + Patches are submitted via the <link href="http://issues.apache.org/bugzilla/buglist.cgi?product=POI">Bug Database</link>. + Create a new bug, set the subject to [PATCH] followed by a brief description. Explain your patch and any special instructions and submit/save it. + Next, go back to the bug, and create attachments for the patch files you + created. Be sure to describe not only the file's purpose, but its format. + (Is that ZIP or a tgz or a bz2 or what?). + </p> + <p> + Make sure your patches include the @author tag on any files you've altered + or created. Make sure you've documented your changes and altered the + examples/etc to reflect them. Any new additions should have unit tests. + Lastly, ensure that you've provided appropriate javadoc.
(see + <link href="http://poi.apache.org/resolutions/res001.html">Coding + Standards</link>). Patches that are of low quality may be rejected or + the contributor may be asked to bring them up to spec. + </p> + <p>If you use a unix shell, you may find the following + sequence of commands useful for building the files to attach.</p> + <source> +# Run this in the root of the checkout, i.e. the directory holding +# build.xml and poi.pom + +# Build the directory to hold new files +mkdir /tmp/poi-patch/ +mkdir /tmp/poi-patch/new-files/ + +# Get changes to existing files +svn diff > /tmp/poi-patch/diff.txt + +# Capture any new files, as svn diff won't include them +# Preserve the path +svn status | grep "^\?" | \ + awk '{printf "cp --parents %s /tmp/poi-patch/new-files/\n", $2 }' | sh -s + +# tar up the new files +cd /tmp/poi-patch/new-files/ +tar jcvf ../new-files.tar.bz2 * +cd .. + +# Upload these to bugzilla +echo "Please upload to bugzilla:" +echo " /tmp/poi-patch/diff.txt" +echo " /tmp/poi-patch/new-files.tar.bz2" + </source> + </section> + +</body> + <footer> + <legal> + Copyright (c) @year@ The Apache Software Foundation. All rights reserved. + </legal> + </footer> +</document> diff --git a/src/documentation/content/xdocs/howtobuild.xml b/src/documentation/content/xdocs/howtobuild.xml index f6610703e1..27a5e9e21a 100644 --- a/src/documentation/content/xdocs/howtobuild.xml +++ b/src/documentation/content/xdocs/howtobuild.xml @@ -21,69 +21,72 @@ <document> <header> - <title>How To Build POI</title> + <title>Apache POI - How To Build</title> <authors> <person email="user@poi.apache.org" name="Glen Stampoultzis" id="GS"/> <person email="tetsuya@apache.org" name="Tetsuya Kitahata" id="TK"/> + <person email="dfisher@jmlafferty.com" name="David Fisher" id="DF"/> </authors> </header> <body> <section> - <title>JDK</title> + <title>JDK Version</title> <p> POI 3.5 and later requires the JDK version 1.5 or later.
Versions prior to 3.5 require JDK 1.4+ </p> </section> <section> - <title>Installing Ant</title> + <title>Install Apache Ant</title> <p> - The POI build system requires two components to perform a - build. - <link href="ext:ant.apache.org/">Ant</link> and - <link href="ext:xml.apache.org/forrest">Forrest</link>. + The POI build system requires <link href="http://ant.apache.org/bindownload.cgi">Apache Ant</link> </p> <p> Specifically the build has been tested to work with Ant version - 1.7.1 and Forrest 0.5. To install these products download - the distributions and follow the instructions in their - documentation. Make sure you don't forget to set the - environment variables FORREST_HOME and ANT_HOME. The - ANT_HOME/bin directory should be in the path. + 1.7.1. To install the product download the distribution and follow the instructions. + </p> + <p> + Remember to set the ANT_HOME environment variable and add ANT_HOME/bin + to your shell's PATH. </p> + </section> + <section> + <title>Install JUnit</title> <p> - One these products are installed you will also need to - download some extra jar files required by the build. + Running unit tests and building a distribution requires <link href="http://www.junit.org/">JUnit</link>. </p> - <table> - <tr> - <th>Library</th> - <th>Location</th> - </tr> - <tr> - <td>junit</td> - <td>http://www.junit.org</td> - </tr> - </table> <p> - Just pick the latest versions of these jars and place - them in ANT_HOME/lib and make sure that optional.jar is - in ANT_HOME/lib . + Just pick the latest versions of the jars from + <link href="http://sourceforge.net/projects/junit/files/junit/">SourceForge</link> and place + them in ANT_HOME/lib. Make sure that optional.jar is in ANT_HOME/lib. + </p> + </section> + <section> + <title>Install Apache Forrest</title> + <p> + The POI build system requires <link href="http://forrest.apache.org/">Apache Forrest</link> to build the documentation. 
+ </p> + <p> + Specifically the build has been tested to work with Forrest 0.5. This is an old release which is available + <link href="http://archive.apache.org/dist/forrest/pre-0.6/">here</link>. + </p> + <p> + Remember to set the FORREST_HOME environment variable. </p> </section> <section> - <title>Running the Build</title> + <title>Building Targets with Ant</title> <p> The main targets of interest to our users are: </p> <table> <tr> - <th>Target</th> + <th>Ant Target</th> <th>Description</th> </tr> <tr> <td>clean</td> - <td>Erase all build work products (ie, everything in the + <td>Erase all build work products (ie. everything in the build directory</td> </tr> <tr> @@ -92,23 +95,28 @@ </tr> <tr> <td>test</td> - <td>Run all unit tests from main, contrib and scratchpad</td> - </tr> - <tr> - <td>docs</td> - <td>Generate all documentation for the system</td> + <td>Run all unit tests from main, contrib and scratchpad (Requires JUnit)</td> </tr> <tr> <td>jar</td> <td>Produce jar files</td> </tr> <tr> + <td>docs</td> + <td>Generate all documentation (Requires Apache Forrest)</td> + </tr> + <tr> <td>dist</td> - <td>Create a distribution.</td> + <td>Create a distribution (Requires JUnit and Apache Forrest)</td> </tr> </table> </section> </body> + <footer> + <legal> + Copyright (c) @year@ The Apache Software Foundation. All rights reserved. + </legal> + </footer> </document> diff --git a/src/documentation/content/xdocs/index.xml b/src/documentation/content/xdocs/index.xml index 7532599e11..bca06da917 100644 --- a/src/documentation/content/xdocs/index.xml +++ b/src/documentation/content/xdocs/index.xml @@ -21,54 +21,70 @@ <document> <header> - <title>Apache POI - Java API To Access Microsoft Format Files</title> + <title>Apache POI - the Java API for Microsoft Documents</title> <authors> <person id="AO" name="Andrew C. 
Oliver" email="acoliver@apache.org"/> <person id="GJS" name="Glen Stampoultzis" email="user@poi.apache.org"/> <person id="AS" name="Avik Sengupta" email="user@poi.apache.org"/> <person id="RK" name="Rainer Klute" email="klute@apache.org"/> + <person id="DF" name="David Fisher" email="dfisher@jmlafferty.com"/> </authors> </header> <body> - <section><title>28 September 2009 - POI 3.5-FINAL available</title> + <section><title>28 September 2009 - POI 3.5-FINAL is now available</title> <p>The Apache POI team is pleased to announce the release of 3.5 FINAL. This release brings many improvements including support for the new OOXML formats introduced in Office 2007, such as XLSX and DOCX. </p> <p>A full list of changes is available in the <link href="changes.html">change log</link>. - People interested should also follow the <link href="mailinglists.html">dev list</link> to track progress.</p> + People interested should also follow the <link href="mailinglists.html">dev mailing list</link> to track further progress.</p> <p>See the <link href="download.html">downloads</link> page for more details.</p> </section> - <section><title>Purpose</title> + <section><title>Mission Statement</title> <p> - The POI project consists of APIs for manipulating various file formats - based upon Microsoft's OLE 2 Compound Document format, and Office OpenXML format, using - pure Java. In short, you can read and write MS Excel files using Java. In addition, - you can read and write MS Word and MS PowerPoint files using Java. POI is your Java Excel - solution (for Excel 97-2007). However, we have a complete API for porting other OLE 2 - Compound Document formats and welcome others to participate. + The Apache POI Project's mission is to create and maintain Java APIs for manipulating various file formats + based upon the Office Open XML standards (OOXML) and Microsoft's OLE 2 Compound Document format (OLE2). + In short, you can read and write MS Excel files using Java. 
+ In addition, you can read and write MS Word and MS PowerPoint files using Java. Apache POI is your Java Excel + solution (for Excel 97-2008). We have a complete API for porting other OOXML and OLE2 formats and welcome others to participate. </p> <p> - OLE 2 Compound Document Format based files include most Microsoft Office - files such as XLS and DOC as well as MFC serialization API based file formats. + OLE2 files include most Microsoft Office files such as XLS, DOC, and PPT as well as MFC serialization API based file formats. + The project provides APIs for the <link href="poifs/index.html">OLE2 Filesystem (POIFS)</link> and + <link href="hpsf/index.html">OLE2 Document Properties (HPSF)</link>. </p> <p> - Office OpenXML Format based files include the new (2007+) xml based file formats, - including Microsoft office files such as XLSX, DOCX and PPTX. + Office OpenXML Format is the new standards based XML file format found in Microsoft Office 2007 and 2008. + This includes XLSX, DOCX and PPTX. The project provides a low level API to support the Open Packaging Conventions + using <link href="oxml4j/index.html">openxml4j</link>. </p> <p> - As a general policy we try to collaborate as much as possible with other projects to + For each MS Office application there exists a component module that attempts to provide a common high level Java api to both OLE2 and OOXML + document formats. This is most developed for <link href="spreadsheet/index.html">Excel workbooks (SS=HSSF+XSSF)</link>. + Work is progressing for <link href="hwpf/index.html">Word documents (HWPF+XWPF)</link> and + <link href="slideshow/index.html">PowerPoint presentations (HSLF+XSLF)</link>. + </p> + <p> + The project has recently added support for <link href="hsmf/index.html">Outlook (HSMF)</link>. Microsoft opened the specifications + to this format in October 2007. We would welcome contributions. 
+ </p> + <p> + There are also projects for <link href="hdgf/index.html">Visio (HDGF)</link> and <link href="hpbf/index.html">Publisher (HPBF)</link>. + </p> + <p> + As a general policy we collaborate as much as possible with other projects to provide this functionality. Examples include: <link href="http://xml.apache.org/cocoon">Cocoon</link> for which there are serializers for HSSF; <link href="http://www.openoffice.org">Open Office.org</link> with whom we collaborate in documenting the XLS format; and <link href="http://lucene.apache.org/">Lucene</link> for which we provide format interpretors. When practical, we donate - components directly to those projects for POI-enabling them. + components directly to those projects for POI-enabling them. + </p> + <section><title>Why should I use Apache POI?</title> + <p> + A major use of the Apache POI api is for <link href="text-extraction.html">Text Extraction</link> applications + such as web spiders, index builders, and content management systems. </p> - <section><title>Why/when would I use POI?</title> - <p> - We'll tackle this on a component level. POI refers to the whole project. - </p> <p> So why should you use POIFS, HSSF or XSSF? </p> @@ -85,136 +101,26 @@ using Java. </p> </section> + <section><title>Components</title> + <p> + The Apache POI Project provides several component modules some of which may not be of interest to you. + Use the information on our <link href="overview.html#components">Components</link> page to determine which + jar files to include in your classpath. + </p> + </section> </section> - - <section><title>Components To Date</title> - <section><title>Overview</title> - <p>The following are components of the entire POI project and a brief - summary of their purpose.</p> - </section> - <section><title>POIFS for OLE 2 Documents</title> - <p>POIFS is the oldest and most stable part of the project. It is our port of the OLE 2 Compound Document Format to - pure Java. 
It supports both read and write functionality. All of our components ultimately rely on it by - definition. Please see <link href="./poifs/index.html">the POIFS project page</link> for more information.</p> - </section> - <section><title>HSSF and XSSF for Excel Documents</title> - <p>HSSF is our port of the Microsoft Excel 97(-2007) file format (BIFF8) to pure - Java. XSSF is our port of the Microsoft Excel XML (2007+) file format (OOXML) to - pure Java. They both supports read and write capability. Please see - <link href="./spreadsheet/index.html">the HSSF+XSSF project page</link> for more - information.</p> - </section> - <section><title>HWPF for Word Documents</title> - <p>HWPF is our port of the Microsoft Word 97 file format to pure - Java. It supports read, and limited write capabilities. Please see <link - href="./hwpf/index.html">the HWPF project page for more - information</link>. This component is in the early stages of - development. It can already read and write simple files.</p> - <p>Presently we are looking for a contributor to foster the HWPF - development. Jump in!</p> - </section> - <section><title>HSLF for PowerPoint Documents</title> - <p>HSLF is our port of the Microsoft PowerPoint 97(-2003) file format to pure - Java. It supports read and write capabilities. Please see <link - href="./slideshow/index.html">the HSLF project page for more - information</link>.</p> - </section> - <section><title>HPSF for Document Properties</title> - <p>HPSF is our port of the OLE 2 property set format to pure - Java. 
Property sets are mostly use to store a document's properties - (title, author, date of last modification etc.), but they can be used - for application-specific purposes as well.</p> - - <p>HPSF supports both reading and writing of properties.</p> - <p>Please see <link href="./hpsf/index.html">the HPSF project - page</link> for more information.</p> - </section> - <section><title>HDGF for Visio Documents</title> - <p>HDGF is our port of the Microsoft Viso 97(-2003) file format to pure - Java. It currently only supports reading at a very low level, and - simple text extraction. Please see <link - href="./hdgf/index.html">the HDGF project page for more - information</link>.</p> - </section> - <section><title>HPBF for Publisher Documents</title> - <p>HPBF is our port of the Microsoft Publisher 98(-2007) file format to pure - Java. It currently only supports reading at a low level for around - half of the file parts, and simple text extraction. Please see <link - href="./hpbf/index.html">the HPBF project page for more - information</link>.</p> - </section> - <section><title>Component map</title> - <anchor id="components"/> - <p> - The POI distribution consists of several JAR files. Not all of them are needed in every case. The following table - shows the relationships between POI components and the JAR files. 
- </p> - <table> - <tr> - <th>Component</th> - <th>JAR</th> - <th>Maven artifactId</th> - </tr> - <tr> - <td><link href="./poifs/index.html">POIFS</link></td> - <td>poi-version-yyyymmdd.jar</td> - <td>poi</td> - </tr> - <tr> - <td><link href="./hpsf/index.html">HPSF</link></td> - <td>poi-version-yyyymmdd.jar</td> - <td>poi</td> - </tr> - <tr> - <td><link href="./spreadsheet/index.html">HSSF</link></td> - <td>poi-version-yyyymmdd.jar</td> - <td>poi</td> - </tr> - <tr> - <td><link href="./spreadsheet/index.html">XSSF</link></td> - <td>poi-ooxml-version-yyyymmdd.jar</td> - <td>poi-ooxml</td> - </tr> - <tr> - <td><link href="./slideshow/index.html">HLSF</link></td> - <td>poi-scratchpad-version-yyyymmdd.jar</td> - <td>poi-scratchpad</td> - </tr> - <tr> - <td><link href="./hwpf/index.html">HWPF</link></td> - <td>poi-scratchpad-version-yyyymmdd.jar</td> - <td>poi-scratchpad</td> - </tr> - <tr> - <td><link href="./hdgf/index.html">HDGF</link></td> - <td>poi-scratchpad-version-yyyymmdd.jar</td> - <td>poi-scratchpad</td> - </tr> - <tr> - <td><link href="./hpbf/index.html">HPBF</link></td> - <td>poi-scratchpad-version-yyyymmdd.jar</td> - <td>poi-scratchpad</td> - </tr> - <tr> - <td><link href="./hsmf/index.html">HSMF</link></td> - <td>poi-scratchpad-version-yyyymmdd.jar</td> - <td>poi-scratchpad</td> - </tr> - </table> - </section> - </section> <section><title>Contributing </title> <p> So you'd like to contribute to the project? Great! We need enthusiastic, hard-working, talented folks to help - us on the project in several areas. The first is bug reports and feature requests! The second is documentation - - we'll be at your every beck and call if you've got a critique or you'd like to contribute or otherwise improve - the documentation. Last, but not least, we could use some binary crunching Java coders to chew through the - complexity that characterizes Microsoft's file formats and help us port new ones to a superior Java platform! 
+ </p> - <p>So if you're motivated, ready, and have the time, join the mail lists and we'll be happy to help you get started on the - project! + us on the project. So if you're motivated, ready, and have the time, download the source from the + <link href="subversion.html">Subversion Repository</link>, <link href="howtobuild.html">build the code</link>, + join the <link href="mailinglists.html">mailing lists</link> and we'll be happy to help you get started on the project! </p> + <p> + Please read our <link href="guidelines.html">Contribution Guidelines</link>. When your contribution is ready + submit a patch to our <link href="https://issues.apache.org/bugzilla/buglist.cgi?product=POI">Bug Database</link>. + </p> </section> diff --git a/src/documentation/content/xdocs/legal.xml b/src/documentation/content/xdocs/legal.xml index 4950187155..f96087c13d 100644 --- a/src/documentation/content/xdocs/legal.xml +++ b/src/documentation/content/xdocs/legal.xml @@ -24,28 +24,34 @@ <title>Apache POI - Legal Stuff</title> <authors> <person id="TK" name="Tetsuya Kitahata" email="tetsuya@apache.org"/> + <person id="DF" name="David Fisher" email="dfisher@jmlafferty.com"/> </authors> </header> <body> - <section><title>Apache POI - Legal Stuff</title> + <section><title>License and Notice</title> +<p> + Apache POI releases are available under the <link href="http://www.apache.org/licenses/LICENSE-2.0">Apache License, Version 2.0.</link> + See the NOTICE file contained in each release artifact for applicable copyright attribution notices. Release artifacts are available + from the <link href="download.html">Download</link> page. +</p> +</section> + <section><title>Copyrights and Trademarks</title> <p> All material on this website is Copyright © 2002-2009, The Apache -Software Foundation +Software Foundation. </p> - <p> Sun, Sun Microsystems, Solaris, Java, JavaServer Web Development Kit, and JavaServer Pages are trademarks or registered trademarks of Sun Microsystems, Inc.
UNIX is a registered trademark in the United States and other countries, exclusively licensed through 'The Open Group'. -Microsoft, Windows, WindowsNT, Excel, Word, PowerPoint and Win32 are -registered trademarks of -Microsoft Corporation. Linux is a registered trademark of Linus Torvalds. +Microsoft, Windows, WindowsNT, Excel, Word, PowerPoint, Visio, Publisher, Outlook, +and Win32 are registered trademarks of Microsoft Corporation. +Linux is a registered trademark of Linus Torvalds. All other product names mentioned herein and throughout the entire web site are trademarks of their respective owners. </p> - <section><title>Cryptography Notice</title> <p> This distribution includes cryptographic software. The country in diff --git a/src/documentation/content/xdocs/mailinglists.xml b/src/documentation/content/xdocs/mailinglists.xml index b984399370..833bcb500f 100644 --- a/src/documentation/content/xdocs/mailinglists.xml +++ b/src/documentation/content/xdocs/mailinglists.xml @@ -90,8 +90,7 @@ </body> <footer> <legal> - Copyright 2007 The Apache Software Foundation or its licensors, as applicable. - $Revision: 496536 $ $Date: 2007-01-15 23:11:09 +0000 (Mon, 15 Jan 2007) $ + Copyright (c) @year@ The Apache Software Foundation. All rights reserved. </legal> </footer> </document> diff --git a/src/documentation/content/xdocs/overview.xml b/src/documentation/content/xdocs/overview.xml index 19dd9f62eb..7e284f130c 100644 --- a/src/documentation/content/xdocs/overview.xml +++ b/src/documentation/content/xdocs/overview.xml @@ -21,104 +21,256 @@ <document> <header> - <title>Overview</title> + <title>Apache POI - Component Overview</title> <authors> <person id="AO" name="Andrew C.
Oliver" email="acoliver@apache.org"/> <person id="RK" name="Rainer Klute" email="klute@apache.org"/> + <person id="DF" name="David Fisher" email="dfisher@jmlafferty.com"/> </authors> </header> - <body> + <section><title>Apache POI Project Components</title> + <section><title>POIFS for OLE 2 Documents</title> + <p> + POIFS is the oldest and most stable part of the project. It is our port of the OLE 2 Compound Document Format to + pure Java. It supports both read and write functionality. All of our components ultimately rely on it by + definition. Please see <link href="./poifs/index.html">the POIFS project page</link> for more information. + </p> + </section> + <section><title>HSSF and XSSF for Excel Documents</title> + <p> + HSSF is our port of the Microsoft Excel 97(-2007) file format (BIFF8) to pure + Java. XSSF is our port of the Microsoft Excel XML (2007+) file format (OOXML) to + pure Java. SS is a package that provides common support for both formats with a common API. + They both support read and write capability. Please see + <link href="./spreadsheet/index.html">the HSSF+XSSF project page</link> for more + information. + </p> + </section> + <section><title>HWPF and XWPF for Word Documents</title> + <p> + HWPF is our port of the Microsoft Word 97 (-2003) file format to pure + Java. It supports read, and limited write capabilities. Please see <link + href="./hwpf/index.html">the HWPF project page for more + information</link>. This component remains in early stages of + development. It can already read and write simple files. + </p> + <p> + We are also working on the XWPF for the WordprocessingML (2007+) format from the + OOXML specification. + </p> + </section> + <section><title>HSLF and XSLF for PowerPoint Documents</title> + <p> + HSLF is our port of the Microsoft PowerPoint 97(-2003) file format to pure + Java. It supports read and write capabilities. Please see <link + href="./slideshow/index.html">the HSLF project page for more + information</link>. 
+ </p> + <p> + We are also working on the XSLF for the PresentationML (2007+) format from the + OOXML specification. + </p> + </section> + <section><title>HPSF for OLE 2 Document Properties</title> + <p> + HPSF is our port of the OLE 2 property set format to pure + Java. Property sets are mostly used to store a document's properties + (title, author, date of last modification etc.), but they can be used + for application-specific purposes as well. + </p> + <p> + HPSF supports both reading and writing of properties. + </p> + <p> + Please see <link href="./hpsf/index.html">the HPSF project + page</link> for more information. + </p> + </section> + <section><title>HDGF for Visio Documents</title> + <p> + HDGF is our port of the Microsoft Visio 97(-2003) file format to pure + Java. It currently only supports reading at a very low level, and + simple text extraction. Please see <link + href="./hdgf/index.html">the HDGF project page for more + information</link>. + </p> + </section> + <section><title>HPBF for Publisher Documents</title> + <p> + HPBF is our port of the Microsoft Publisher 98(-2007) file format to pure + Java. It currently only supports reading at a low level for around + half of the file parts, and simple text extraction. Please see <link + href="./hpbf/index.html">the HPBF project page for more + information</link>. + </p> + </section> + <section><title>HSMF for Outlook Messages</title> + <p> + HSMF is our port of the Microsoft Outlook message file format to pure + Java. It currently only supports some of the textual content of MSG files. + Further support and documentation are expected over the coming weeks and months. + For now, users are advised to consult the unit tests for example use. + Please see <link href="./hsmf/index.html">the HSMF project page for more + information</link>. + </p> + <p> + Microsoft has recently added the Outlook file format to its OSP. More information + is now available making implementation of this API an easier task.
+ </p> + </section> + </section> <section><title>What is it?</title> - <p>The POI project is the master project for developing pure + <p>The Apache POI project is the master project for developing pure Java ports of file formats based on Microsoft's OLE 2 Compound Document Format. OLE 2 Compound Document Format is used by Microsoft Office Documents, as well as by programs using MFC property sets to serialize their document objects. </p> + <p>Apache POI is also the master project for developing pure + Java ports of file formats based on Office Open XML (ooxml.) + OOXML is part of an ECMA / ISO standardisation effort. This + documentation is quite large, but you can normally find the bit you + need without too much effort! + <link href="http://www.ecma-international.org/publications/standards/Ecma-376.htm">ECMA-376 standard is here</link>, + and is also under the <link href="http://www.microsoft.com/interop/osp">Microsoft OSP</link>. + </p> </section> - <section><title>Sub-Projects</title> + <section id="components"><title>Component Map</title> <p> - There following are ports, packages or components contained in the POI project. + The Apache POI distribution consists of support for many document file formats. This support is provided + in several Jar files. Not all of the Jars are needed for every format. The following tables + show the relationships between POI components, Maven repository tags, and the project's Jar files. 
+ </p> + <table> + <tr> + <th>Component</th> + <th>Application type</th> + <th>Maven artifactId</th> + </tr> + <tr> + <td><link href="./poifs/index.html">POIFS</link></td> + <td>OLE2 Filesystem</td> + <td>poi</td> + </tr> + <tr> + <td><link href="./hpsf/index.html">HPSF</link></td> + <td>OLE2 Property Sets</td> + <td>poi</td> + </tr> + <tr> + <td><link href="./spreadsheet/index.html">HSSF</link></td> + <td>Excel XLS</td> + <td>poi</td> + </tr> + <tr> + <td><link href="./slideshow/index.html">HSLF</link></td> + <td>PowerPoint PPT</td> + <td>poi-scratchpad</td> + </tr> + <tr> + <td><link href="./hwpf/index.html">HWPF</link></td> + <td>Word DOC</td> + <td>poi-scratchpad</td> + </tr> + <tr> + <td><link href="./hdgf/index.html">HDGF</link></td> + <td>Visio VSD</td> + <td>poi-scratchpad</td> + </tr> + <tr> + <td><link href="./hpbf/index.html">HPBF</link></td> + <td>Publisher PUB</td> + <td>poi-scratchpad</td> + </tr> + <tr> + <td><link href="./hsmf/index.html">HSMF</link></td> + <td>Outlook MSG</td> + <td>poi-scratchpad</td> + </tr> + <tr> + <td><link href="./spreadsheet/index.html">XSSF</link></td> + <td>Excel XLSX</td> + <td>poi-ooxml</td> + </tr> + <tr> + <td><link href="./slideshow/index.html">XSLF</link></td> + <td>PowerPoint PPTX</td> + <td>poi-ooxml</td> + </tr> + <tr> + <td><link href="./hwpf/index.html">XWPF</link></td> + <td>Word DOCX</td> + <td>poi-ooxml</td> + </tr> + <tr> + <td><link href="./oxml4j/index.html">OpenXML4J</link></td> + <td>OOXML</td> + <td>poi-ooxml, ooxml-schemas</td> + </tr> + </table> + <p> + This table maps artifacts into the jar file name. "version-yyyymmdd" is the POI version stamp. For the latest release it is + 3.5-FINAL-20090928. 
+ </p> + <table> + <tr> + <th>Maven artifactId</th> + <th>JAR</th> + </tr> + <tr> + <td>poi</td> + <td>poi-version-yyyymmdd.jar</td> + </tr> + <tr> + <td>poi-scratchpad</td> + <td>poi-scratchpad-version-yyyymmdd.jar</td> + </tr> + <tr> + <td>poi-ooxml</td> + <td>poi-ooxml-version-yyyymmdd.jar</td> + </tr> + <tr> + <td>ooxml-schemas</td> + <td>ooxml-schemas-1.0.jar</td> + </tr> + </table> + <p> + OOXML file formats (poi-ooxml) also require ooxml-schemas-1.0.jar. This jar is large and we are experimenting with smaller versions. + Details are available on the dev mailing list. </p> - <section><title>POIFS</title> - <p> - <link href="poifs/index.html">POIFS</link> is the set of APIs - for reading and writing OLE 2 Compound Document Formats using (only) Java. - </p> - </section> - - <section><title>HSSF and XSSF</title> - <p> - <link href="spreadsheet/index.html">HSSF and XSSF</link> are - the set of APIs for reading and writing Microsoft Excel - 97-2007 and OOXML spreadsheets using (only) Java. - </p> - </section> - - <section><title>HWPF</title> - <p> - <link href="hwpf/index.html">HWPF</link> is the set of APIs - for reading and writing Microsoft Word 97(-XP) documents using (only) Java. - </p> - </section> - - <section><title>HSLF</title> - <p> - <link href="slideshow/index.html">HSLF</link> is the set of APIs - for reading and writing Microsoft PowerPoint 97(-XP) documents using (only) Java. - </p> - </section> - - <section><title>HPSF</title> - <p> - <link href="hpsf/index.html">HPSF</link> is the set of APIs - for reading property sets using (only) Java. - </p> - </section> - - <section><title>POI-Utils</title> - <p> - <link href="utils/index.html">POI-Utils</link> are general purpose artifacts - from POI development that have not yet been implemented elsewhere. We're - always looking to donate these and maintain them as part of a general library - used in another project. These are things we need to complete our mission but - are generally outside of it. 
- </p> - </section> </section> - - <section> - <title>Examples</title> - - <p>Small sample programs using the POI API are available in the + <section><title>Examples</title> + <p> + Small sample programs using the POI API are available in the <em>src/examples</em> directory of the source distribution. Before studying the source code you might want to have a look at the - "Examples" section of the <link - href="apidocs/overview-summary.html">POI API - documentation</link>.</p> + "Examples" section of the <link href="apidocs/overview-summary.html">POI API + documentation</link>. + </p> </section> - <section><title>Contributed Software</title> - <p>Besides the "official" components outlined above there is some further - software distributed with POI. This is called "contributed" software. It + <p> + Besides the "official" components outlined above there is some further + software distributed with POI. This is called "contributed" software. It is not explicitly recommended or even maintained by the POI team, but - it might still be useful to you.</p> - - <section> - <title>POI Browser</title> - <p>The POI Browser is a very simple Swing GUI tool that displays the + it might still be useful to you. + </p> + <section><title>POI Browser</title> + <p> + The POI Browser is a very simple Swing GUI tool that displays the internal structure of a Microsoft Office file and especially the property set streams. Further information and instructions how to execute it can be found in the <link - href="apidocs/org/apache/poi/contrib/poibrowser/package-summary.html#package_description">POI - Browser package description</link>.</p> + href="apidocs/org/apache/poi/contrib/poibrowser/package-summary.html#package_description">POI + Browser package description</link>. + </p> </section> </section> </body> <footer> <legal> - Copyright 2007 The Apache Software Foundation or its licensors, as applicable. + Copyright (c) @year@ The Apache Software Foundation. All rights reserved. 
</legal> </footer> </document> diff --git a/src/documentation/content/xdocs/subversion.xml b/src/documentation/content/xdocs/subversion.xml index 61c097622a..6e5d342eb6 100644 --- a/src/documentation/content/xdocs/subversion.xml +++ b/src/documentation/content/xdocs/subversion.xml @@ -21,7 +21,7 @@ <document> <header> - <title>Source Code Repository</title> + <title>Apache POI - Source Code Repository</title> <authors> <person id="NB" name="Nick Burch" email="nick@apache.org"/> </authors> @@ -42,9 +42,11 @@ see the <link href="http://www.apache.org/dev/version-control.html">version control page.</link> </p> - - <p>Subversion is an open-source version control system. The root url - of the ASF Subversion repository is + <p>Subversion is an open-source version control system. It has been contributed to the Apache Software Foundation and is + now available <link href="http://incubator.apache.org/projects/subversion.html">here</link>. + </p> + <p> + The root url of the ASF Subversion repository is <link href="http://svn.apache.org/repos/asf/">http://svn.apache.org/repos/asf/</link> for non-committers and <link href="https://svn.apache.org/repos/asf/">https://svn.apache.org/repos/asf/</link> @@ -82,8 +84,7 @@ </body> <footer> <legal> - Copyright 2007 The Apache Software Foundation or its licensors, as applicable. - $Revision: 496536 $ $Date: 2007-01-15 23:11:09 +0000 (Mon, 15 Jan 2007) $ + Copyright (c) @year@ The Apache Software Foundation. All rights reserved. 
</legal> </footer> </document> diff --git a/src/documentation/content/xdocs/text-extraction.xml b/src/documentation/content/xdocs/text-extraction.xml index fa7474bc0f..61bc5c4643 100644 --- a/src/documentation/content/xdocs/text-extraction.xml +++ b/src/documentation/content/xdocs/text-extraction.xml @@ -21,7 +21,7 @@ <document> <header> - <title>POI Text Extraction</title> + <title>Apache POI - Text Extraction</title> <authors> <person id="NB" name="Nick Burch" email="nick@apache.org"/> </authors> @@ -29,7 +29,7 @@ <body> <section><title>Overview</title> - <p>POI provides text extraction for all the supported file + <p>Apache POI provides text extraction for all the supported file formats. In addition, it provides access to the metadata associated with a given file, such as title and author.</p> <p>In addition to providing direct text extraction classes, @@ -108,50 +108,53 @@ if one of these objects is embedded into a worksheet, the ExtractorFactory class can be used to recover an extractor for it. </p> <source> - FileInputStream fis = new FileInputStream(inputFile); - POIFSFileSystem fileSystem = new POIFSFileSystem(fis); - // Firstly, get an extractor for the Workbook - POIOLE2TextExtractor oleTextExtractor = ExtractorFactory.createExtractor(fileSystem); - // Then a List of extractors for any embedded Excel, Word, PowerPoint - // or Visio objects embedded into it. - POITextExtractor[] embeddedExtractors = ExtractorFactory.getEmbededDocsTextExtractors(oleTextExtractor); - for (POITextExtractor textExtractor : embeddedExtractors) { - // If the embedded object was an Excel spreadsheet. 
-        if (textExtractor instanceof ExcelExtractor) {
-            ExcelExtractor excelExtractor = (ExcelExtractor) textExtractor;
-            System.out.println(excelExtractor.getText());
+FileInputStream fis = new FileInputStream(inputFile);
+POIFSFileSystem fileSystem = new POIFSFileSystem(fis);
+// Firstly, get an extractor for the Workbook
+POIOLE2TextExtractor oleTextExtractor =
+    ExtractorFactory.createExtractor(fileSystem);
+// Then a List of extractors for any embedded Excel, Word, PowerPoint
+// or Visio objects embedded into it.
+POITextExtractor[] embeddedExtractors =
+    ExtractorFactory.getEmbededDocsTextExtractors(oleTextExtractor);
+for (POITextExtractor textExtractor : embeddedExtractors) {
+    // If the embedded object was an Excel spreadsheet.
+    if (textExtractor instanceof ExcelExtractor) {
+        ExcelExtractor excelExtractor = (ExcelExtractor) textExtractor;
+        System.out.println(excelExtractor.getText());
+    }
+    // A Word Document
+    else if (textExtractor instanceof WordExtractor) {
+        WordExtractor wordExtractor = (WordExtractor) textExtractor;
+        String[] paragraphText = wordExtractor.getParagraphText();
+        for (String paragraph : paragraphText) {
+            System.out.println(paragraph);
         }
-        // A Word Document
-        else if (textExtractor instanceof WordExtractor) {
-            WordExtractor wordExtractor = (WordExtractor) textExtractor;
-            String[] paragraphText = wordExtractor.getParagraphText();
-            for (String paragraph : paragraphText) {
-                System.out.println(paragraph);
-            }
-            // Display the document's header and footer text
-            System.out.println("Footer text: " + wordExtractor.getFooterText());
-            System.out.println("Header text: " + wordExtractor.getHeaderText());
-        }
-        // PowerPoint Presentation.
-        else if (textExtractor instanceof PowerPointExtractor) {
-            PowerPointExtractor powerPointExtractor = (PowerPointExtractor) textExtractor;
-            System.out.println("Text: " + powerPointExtractor.getText());
-            System.out.println("Notes: " + powerPointExtractor.getNotes());
-        }
-        // Visio Drawing
-        else if (textExtractor instanceof VisioTextExtractor) {
-            VisioTextExtractor visioTextExtractor = (VisioTextExtractor) textExtractor;
-            System.out.println("Text: " + visioTextExtractor.getText());
-        }
-    }
+        // Display the document's header and footer text
+        System.out.println("Footer text: " + wordExtractor.getFooterText());
+        System.out.println("Header text: " + wordExtractor.getHeaderText());
+    }
+    // PowerPoint Presentation.
+    else if (textExtractor instanceof PowerPointExtractor) {
+        PowerPointExtractor powerPointExtractor =
+            (PowerPointExtractor) textExtractor;
+        System.out.println("Text: " + powerPointExtractor.getText());
+        System.out.println("Notes: " + powerPointExtractor.getNotes());
+    }
+    // Visio Drawing
+    else if (textExtractor instanceof VisioTextExtractor) {
+        VisioTextExtractor visioTextExtractor =
+            (VisioTextExtractor) textExtractor;
+        System.out.println("Text: " + visioTextExtractor.getText());
+    }
+}
     </source>
   </section>
 </body>
 <footer>
   <legal>
-    Copyright 2005 The Apache Software Foundation or its licensors, as applicable.
-    $Revision: 639487 $ $Date: 2008-03-20 22:31:15 +0000 (Thu, 20 Mar 2008) $
+    Copyright (c) @year@ The Apache Software Foundation. All rights reserved.
</legal> </footer> </document> diff --git a/src/documentation/content/xdocs/who.xml b/src/documentation/content/xdocs/who.xml index 928fc4c499..37f354daba 100644 --- a/src/documentation/content/xdocs/who.xml +++ b/src/documentation/content/xdocs/who.xml @@ -21,9 +21,9 @@ <document> <header> - <title>Apache POI - Who we are</title> + <title>Apache POI - Who We Are</title> <authors> - <person name="Apache POI Documentation Team" email="dev@poi.apache.org"/> + <person name="Apache POI Developers" email="dev@poi.apache.org"/> </authors> </header> @@ -31,7 +31,7 @@ <section><title>Apache POI - Who we are</title> <p> - The POI Project operates on a meritocracy: the more you do, the more + The Apache POI Project operates on a meritocracy: the more you do, the more responsibility you will obtain. This page lists all of the people who have gone the extra mile and are Committers. If you would like to get involved, the first step is to join the <link href="mailinglists.html">mailing lists</link>. @@ -41,11 +41,14 @@ We ask that you please do not send us emails privately asking for support. We are non-paid volunteers who help out with the project and we do not necessarily have the time or energy to help people on an individual basis. - Instead, we have set up mailing lists which often contain hundreds of - individuals who will help answer detailed requests for help. The benefit of - using mailing lists over private communication is that it is a shared - resource where others can also learn from common mistakes and as a - community we all grow together. + The <link href="mailinglists.html">mailing lists</link> have many individuals + who will help answer detailed requests for help. The benefit of + using mailing lists over private communication is that they are a shared + resource where others can also learn from common questions. + </p> + <p> + POI Developers count on feedback from the mailing lists. Many developers do take + an active role on the lists. 
</p> <!-- <section><title>Advisors</title>--> @@ -60,25 +63,15 @@ <li>Nick Burch (nick at apache dot org)</li> </ul> </section> - - <section><title>Emeritus Committers</title> - <ul> - <li><link href="http://people.apache.org/~acoliver/">Andrew C. Oliver</link> (acoliver at apache dot org)</li> - <li>Nicola Ken Barozzi (barozzi at nicolaken dot com)</li> - <li>Ryan Ackley (sackley at apache dot org)</li> - </ul> - </section> - <section><title>Committers</title> <ul> - <li><link href="http://www.marcj.com/">Marc Johnson</link> (mjohnson at apache dot org)</li> + <li>Marc Johnson (mjohnson at apache dot org)</li> <li><link href="http://members.iinet.net.au/~gstamp/glen/">Glen Stampoultzis</link> (glens at apache.org)</li> <li><link href="http://www.rainer-klute.de/">Rainer Klute</link> (klute at apache dot org)</li> <li>Avik Sengupta (avik at apache dot org)</li> <li>Shawn Laubach (slaubach at apache dot org)</li> <li>Danny Mui (dmui at apache dot org)</li> <li>Jason Height (jheight at apache dot org)</li> - <li><link href="http://www.apache.org/~tetsuya/">Tetsuya Kitahata</link> (tetsuya at apache dot org)</li> <li>Yegor Kozlov (yegor at apache dot org)</li> <li>Amol S Deshmukh (amol at apache dot org)</li> <li>David Fisher (wave at apache dot org)</li> @@ -90,6 +83,14 @@ <li>(Please add your name here!!)</li> </ul> </section> + <section><title>Emeritus Committers</title> + <ul> + <li>Andrew C. Oliver (acoliver at gmail dot com)</li> + <li>Nicola Ken Barozzi (barozzi at nicolaken dot com)</li> + <li>Ryan Ackley (sackley at apache dot org)</li> + <li>Tetsuya Kitahata (ai at spa dot nifty dot com)</li> + </ul> + </section> </section> </body> |
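The two tables added to overview.xml in this commit map each component to a Maven artifactId, and each artifactId to a jar file name built from the "version-yyyymmdd" stamp. As a rough illustration only, the lookup could be sketched in Java as below. The class and method names (PoiJarNames, jarName, jarForComponent) are hypothetical and are not part of the POI API. The sketch also assumes that ooxml-schemas always resolves to the fixed ooxml-schemas-1.0.jar named in the table, and it leaves out OpenXML4J because that row lists two artifacts.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of POI: mirrors the two tables added to overview.xml.
class PoiJarNames {
    // Component -> Maven artifactId, copied from the first table.
    // OpenXML4J is omitted because its row lists two artifacts (poi-ooxml, ooxml-schemas).
    static final Map<String, String> ARTIFACT_BY_COMPONENT = new LinkedHashMap<>();
    static {
        ARTIFACT_BY_COMPONENT.put("POIFS", "poi");
        ARTIFACT_BY_COMPONENT.put("HPSF", "poi");
        ARTIFACT_BY_COMPONENT.put("HSSF", "poi");
        ARTIFACT_BY_COMPONENT.put("HSLF", "poi-scratchpad");
        ARTIFACT_BY_COMPONENT.put("HWPF", "poi-scratchpad");
        ARTIFACT_BY_COMPONENT.put("HDGF", "poi-scratchpad");
        ARTIFACT_BY_COMPONENT.put("HPBF", "poi-scratchpad");
        ARTIFACT_BY_COMPONENT.put("HSMF", "poi-scratchpad");
        ARTIFACT_BY_COMPONENT.put("XSSF", "poi-ooxml");
        ARTIFACT_BY_COMPONENT.put("XSLF", "poi-ooxml");
        ARTIFACT_BY_COMPONENT.put("XWPF", "poi-ooxml");
    }

    // artifactId -> jar file name, following the second table.
    public static String jarName(String artifactId, String versionStamp) {
        if ("ooxml-schemas".equals(artifactId)) {
            // The table gives a fixed name for this jar, independent of the POI stamp.
            return "ooxml-schemas-1.0.jar";
        }
        return artifactId + "-" + versionStamp + ".jar";
    }

    // Component name -> jar file name, combining both tables.
    public static String jarForComponent(String component, String versionStamp) {
        String artifactId = ARTIFACT_BY_COMPONENT.get(component);
        if (artifactId == null) {
            throw new IllegalArgumentException("Unknown component: " + component);
        }
        return jarName(artifactId, versionStamp);
    }

    public static void main(String[] args) {
        // Version stamp quoted in the diff for the latest release at the time.
        String stamp = "3.5-FINAL-20090928";
        for (String component : ARTIFACT_BY_COMPONENT.keySet()) {
            System.out.println(component + " -> " + jarForComponent(component, stamp));
        }
    }
}
```

Per the note in the diff, anything resolved to poi-ooxml also needs ooxml-schemas-1.0.jar on the classpath; the sketch does not model that dependency.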