aboutsummaryrefslogtreecommitdiffstats
path: root/src/documentation/content/xdocs/components/spreadsheet/eval.xml
diff options
context:
space:
mode:
Diffstat (limited to 'src/documentation/content/xdocs/components/spreadsheet/eval.xml')
-rw-r--r--src/documentation/content/xdocs/components/spreadsheet/eval.xml410
1 files changed, 410 insertions, 0 deletions
diff --git a/src/documentation/content/xdocs/components/spreadsheet/eval.xml b/src/documentation/content/xdocs/components/spreadsheet/eval.xml
new file mode 100644
index 0000000000..aee0c38008
--- /dev/null
+++ b/src/documentation/content/xdocs/components/spreadsheet/eval.xml
@@ -0,0 +1,410 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!--
+ ====================================================================
+ Licensed to the Apache Software Foundation (ASF) under one or more
+ contributor license agreements. See the NOTICE file distributed with
+ this work for additional information regarding copyright ownership.
+ The ASF licenses this file to You under the Apache License, Version 2.0
+ (the "License"); you may not use this file except in compliance with
+ the License. You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+ ====================================================================
+-->
+<!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V2.0//EN" "document-v20.dtd">
+
+<document>
+ <header>
+ <title>Formula Evaluation</title>
+ </header>
+ <body>
+ <section><title>Introduction</title>
+ <p>The POI formula evaluation code enables you to calculate the result of
+ formulas in Excels sheets read-in, or created in POI. This document explains
+ how to use the API to evaluate your formulas.
+ </p>
+ </section>
+
+ <anchor id="WhyEvaluate"/>
+ <section><title>Why do I need to evaluate formulas?</title>
+ <p>The Excel file format (both .xls and .xlsx) stores a "cached" result for
+ every formula along with the formula itself. This means that when the file
+ is opened, it can be quickly displayed, without needing to spend a long
+ time calculating all of the formula results. It also means that when reading
+ a file through Apache POI, the result is quickly available to you too!
+ </p>
+ <p>After making changes with Apache POI to either Formula Cells themselves,
+ or those that they depend on, you should normally perform a Formula
+ Evaluation to have these "cached" results updated. This is normally done
+ after all changes have been performed, but before you write the file out.
+ If you don't do this, there's a good chance that when you open the file in
+ Excel, until you go to the cell and hit enter or F9, you will either see
+ the old value or '#VALUE!' for the cell. (Sometimes Excel will notice
+ itself, and trigger a recalculation on load, but unless you know you are
+ using volatile functions it's generally best to trigger a <a href="#recalculation">Recalulation</a>
+ through POI)
+ </p>
+ </section>
+
+ <anchor id="Status"/>
+ <section><title>Status</title>
+ <p>The code currently provides implementations for all the arithmatic operators.
+ It also provides implementations for approx. 140 built in
+ functions in Excel. The framework however makes it easy to add
+ implementation of new functions. See the <a href="eval-devguide.html"> Formula
+ evaluation development guide</a> and <a href="../../apidocs/dev/org/apache/poi/hssf/record/formula/functions/package-summary.html">javadocs</a>
+ for details. </p>
+ <p> Both HSSFWorkbook and XSSFWorkbook are supported, so you can
+ evaluate formulas on both .xls and .xlsx files.</p>
+ <p> User-defined functions are <a href="user-defined-functions.html">supported</a>,
+ but must be rewritten in Java and registered with the macro-enabled workbook in order to be evaluated.
+ </p>
+ </section>
+ <section><title>User API How-TO</title>
+ <p>The following code demonstrates how to use the FormulaEvaluator
+ in the context of other POI excel reading code.
+ </p>
+ <p>There are several ways in which you can use the FormulaEvalutator API.</p>
+
+ <anchor id="Evaluate"/>
+ <section><title>Using FormulaEvaluator.<strong>evaluate</strong>(Cell cell)</title>
+ <p>This evaluates a given cell, and returns the new value,
+ without affecting the cell</p>
+ <source>
+FileInputStream fis = new FileInputStream("c:/temp/test.xls");
+Workbook wb = new HSSFWorkbook(fis); //or new XSSFWorkbook("c:/temp/test.xls")
+Sheet sheet = wb.getSheetAt(0);
+FormulaEvaluator evaluator = wb.getCreationHelper().createFormulaEvaluator();
+
+// suppose your formula is in B3
+CellReference cellReference = new CellReference("B3");
+Row row = sheet.getRow(cellReference.getRow());
+Cell cell = row.getCell(cellReference.getCol());
+
+CellValue cellValue = evaluator.evaluate(cell);
+
+switch (cellValue.getCellType()) {
+ case Cell.CELL_TYPE_BOOLEAN:
+ System.out.println(cellValue.getBooleanValue());
+ break;
+ case Cell.CELL_TYPE_NUMERIC:
+ System.out.println(cellValue.getNumberValue());
+ break;
+ case Cell.CELL_TYPE_STRING:
+ System.out.println(cellValue.getStringValue());
+ break;
+ case Cell.CELL_TYPE_BLANK:
+ break;
+ case Cell.CELL_TYPE_ERROR:
+ break;
+
+ // CELL_TYPE_FORMULA will never happen
+ case Cell.CELL_TYPE_FORMULA:
+ break;
+}
+ </source>
+ <p>Thus using the retrieved value (of type
+ FormulaEvaluator.CellValue - a nested class) returned
+ by FormulaEvaluator is similar to using a Cell object
+ containing the value of the formula evaluation. CellValue is
+ a simple value object and does not maintain reference
+ to the original cell.
+ </p>
+ </section>
+
+ <anchor id="EvaluateFormulaCell"/>
+ <section><title>Using FormulaEvaluator.<strong>evaluateFormulaCell</strong>(Cell cell)</title>
+ <p><strong>evaluateFormulaCell</strong>(Cell cell)
+ will check to see if the supplied cell is a formula cell.
+ If it isn't, then no changes will be made to it. If it is,
+ then the formula is evaluated. The value for the formula
+ is saved alongside it, to be displayed in excel. The
+ formula remains in the cell, just with a new value</p>
+ <p>The return of the function is the type of the
+ formula result, such as Cell.CELL_TYPE_BOOLEAN</p>
+ <source>
+FileInputStream fis = new FileInputStream("/somepath/test.xls");
+Workbook wb = new HSSFWorkbook(fis); //or new XSSFWorkbook("/somepath/test.xls")
+Sheet sheet = wb.getSheetAt(0);
+FormulaEvaluator evaluator = wb.getCreationHelper().createFormulaEvaluator();
+
+// suppose your formula is in B3
+CellReference cellReference = new CellReference("B3");
+Row row = sheet.getRow(cellReference.getRow());
+Cell cell = row.getCell(cellReference.getCol());
+
+if (cell!=null) {
+ switch (evaluator.evaluateFormulaCell(cell)) {
+ case Cell.CELL_TYPE_BOOLEAN:
+ System.out.println(cell.getBooleanCellValue());
+ break;
+ case Cell.CELL_TYPE_NUMERIC:
+ System.out.println(cell.getNumericCellValue());
+ break;
+ case Cell.CELL_TYPE_STRING:
+ System.out.println(cell.getStringCellValue());
+ break;
+ case Cell.CELL_TYPE_BLANK:
+ break;
+ case Cell.CELL_TYPE_ERROR:
+ System.out.println(cell.getErrorCellValue());
+ break;
+
+ // CELL_TYPE_FORMULA will never occur
+ case Cell.CELL_TYPE_FORMULA:
+ break;
+ }
+}
+ </source>
+ </section>
+
+ <anchor id="EvaluateInCell"/>
+ <section><title>Using FormulaEvaluator.<strong>evaluateInCell</strong>(Cell cell)</title>
+ <p><strong>evaluateInCell</strong>(Cell cell) will check to
+ see if the supplied cell is a formula cell. If it isn't,
+ then no changes will be made to it. If it is, then the
+ formula is evaluated, and the new value saved into the cell,
+ in place of the old formula.</p>
+ <source>
+FileInputStream fis = new FileInputStream("/somepath/test.xls");
+Workbook wb = new HSSFWorkbook(fis); //or new XSSFWorkbook("/somepath/test.xls")
+Sheet sheet = wb.getSheetAt(0);
+FormulaEvaluator evaluator = wb.getCreationHelper().createFormulaEvaluator();
+
+// suppose your formula is in B3
+CellReference cellReference = new CellReference("B3");
+Row row = sheet.getRow(cellReference.getRow());
+Cell cell = row.getCell(cellReference.getCol());
+
+if (cell!=null) {
+ switch (evaluator.<strong>evaluateInCell</strong>(cell).getCellType()) {
+ case Cell.CELL_TYPE_BOOLEAN:
+ System.out.println(cell.getBooleanCellValue());
+ break;
+ case Cell.CELL_TYPE_NUMERIC:
+ System.out.println(cell.getNumericCellValue());
+ break;
+ case Cell.CELL_TYPE_STRING:
+ System.out.println(cell.getStringCellValue());
+ break;
+ case Cell.CELL_TYPE_BLANK:
+ break;
+ case Cell.CELL_TYPE_ERROR:
+ System.out.println(cell.getErrorCellValue());
+ break;
+
+ // CELL_TYPE_FORMULA will never occur
+ case Cell.CELL_TYPE_FORMULA:
+ break;
+ }
+}
+
+ </source>
+ </section>
+
+ <anchor id="EvaluateAll"/>
+ <section><title>Re-calculating all formulas in a Workbook</title>
+ <source>
+FileInputStream fis = new FileInputStream("/somepath/test.xls");
+Workbook wb = new HSSFWorkbook(fis); //or new XSSFWorkbook("/somepath/test.xls")
+FormulaEvaluator evaluator = wb.getCreationHelper().createFormulaEvaluator();
+for (Sheet sheet : wb) {
+ for (Row r : sheet) {
+ for (Cell c : r) {
+ if (c.getCellType() == Cell.CELL_TYPE_FORMULA) {
+ evaluator.evaluateFormulaCell(c);
+ }
+ }
+ }
+}
+ </source>
+
+ <p>Alternately, if you know which of HSSF or XSSF you're working
+ with, then you can call the static
+ <strong>evaluateAllFormulaCells</strong> method on the appropriate
+ HSSFFormulaEvaluator or XSSFFormulaEvaluator class.</p>
+ </section>
+ </section>
+
+ <anchor id="recalculation"/>
+ <section><title>Recalculation of Formulas</title>
+ <p>
+ In certain cases you may want to force Excel to re-calculate formulas when the workbook is opened.
+ Consider the following example:
+ </p>
+ <p>
+ Open Excel and create a new workbook. On the first sheet set A1=1, B1=1, C1=A1+B1.
+ Excel automatically calculates formulas and the value in C1 is 2. So far so good.
+ </p>
+ <p>
+ Now modify the workbook with POI:
+ </p>
+ <source>
+ Workbook wb = WorkbookFactory.create(new FileInputStream("workbook.xls"));
+
+ Sheet sh = wb.getSheetAt(0);
+ sh.getRow(0).getCell(0).setCellValue(2); // set A1=2
+
+ FileOutputStream out = new FileOutputStream("workbook2.xls");
+ wb.write(out);
+ out.close();
+ </source>
+ <p>
+ Now open workbook2.xls in Excel and the value in C1 is still 2 while you expected 3. Wrong? No!
+ The point is that Excel caches previously calculated results and you need to trigger recalculation to updated them.
+ It is not an issue when you are creating new workbooks from scratch, but important to remember when you are modifing
+ existing workbooks with formulas. This can be done in two ways:
+ </p>
+ <p>
+ 1. Re-evaluate formulas with POI's FormulaEvaluator:
+ </p>
+ <source>
+ Workbook wb = WorkbookFactory.create(new FileInputStream("workbook.xls"));
+
+ Sheet sh = wb.getSheetAt(0);
+ sh.getRow(0).getCell(0).setCellValue(2); // set A1=2
+
+ wb.getCreationHelper().createFormulaEvaluator().evaluateAll();
+ </source>
+ <p>
+ 2. Delegate re-calculation to Excel. The application will perform a full recalculation when the workbook is opened:
+ </p>
+ <source>
+ Workbook wb = WorkbookFactory.create(new FileInputStream("workbook.xls"));
+
+ Sheet sh = wb.getSheetAt(0);
+ sh.getRow(0).getCell(0).setCellValue(2); // set A1=2
+
+ wb.setForceFormulaRecalculation(true);
+ </source>
+ </section>
+
+ <anchor id="external"/>
+ <section><title>External (Cross-Workbook) references</title>
+ <p>It is possible for a formula in an Excel spreadsheet to
+ refer to a Named Range or Cell in a different workbook.
+ These cross-workbook references are normally called <em>External
+ References</em>. These are formulas which look something like:</p>
+ <source>
+ =SUM([Finances.xlsx]Numbers!D10:D25)
+ =SUM('C:\Data\[Finances.xlsx]Numbers'!D10:D25)
+ =SUM([Finances.xlsx]Range20)
+ </source>
+ <p>If you don't have access to these other workbooks, then you
+ should call
+ <a href="../../apidocs/dev/org/apache/poi/ss/usermodel/FormulaEvaluator.html#setIgnoreMissingWorkbooks(boolean)">setIgnoreMissingWorkbooks(true)</a>
+ to tell the Formula Evaluator to skip evaluating any external
+ references it can't look up.</p>
+ <p>In order for POI to be able to evaluate external references, it
+ needs access to the workbooks in question. As these don't necessarily
+ have the same names on your system as in the workbook, you need to
+ give POI a map of external references to open workbooks, through
+ the
+ <a href="../../apidocs/dev/org/apache/poi/ss/usermodel/FormulaEvaluator.html#setupReferencedWorkbooks(java.util.Map)">setupReferencedWorkbooks(java.util.Map&lt;java.lang.String,FormulaEvaluator&gt; workbooks)</a>
+ method. You should normally do something like:</p>
+ <source>
+// Create a FormulaEvaluator to use
+FormulaEvaluator mainWorkbookEvaluator = workbook.getCreationHelper().createFormulaEvaluator();
+
+// Track the workbook references
+Map&lt;String,FormulaEvaluator> workbooks = new HashMap&lt;String, FormulaEvaluator>();
+// Add this workbook
+workbooks.put("report.xlsx", mainWorkbookEvaluator);
+// Add two others
+workbooks.put("input.xls", WorkbookFactory.create("C:\\temp\\input22.xls").getCreationHelper().createFormulaEvaluator());
+workbooks.put("lookups.xlsx", WorkbookFactory.create("/home/poi/data/tmp-lookups.xlsx").getCreationHelper().createFormulaEvaluator());
+
+// Attach them
+mainWorkbookEvaluator.setupReferencedWorkbooks(workbooks);
+
+// Evaluate
+mainWorkbookEvaluator.evaluateAll();
+ </source>
+ </section>
+
+ <anchor id="Performance"/>
+ <section><title>Performance Notes</title>
+ <ul>
+ <li>Generally you should have to create only one FormulaEvaluator
+ instance per Workbook. The FormulaEvaluator will cache
+ evaluations of dependent cells, so if you have multiple
+ formulas all depending on a cell then subsequent evaluations
+ will be faster.
+ </li>
+ <li>You should normally perform all of your updates to cells,
+ before triggering the evaluation, rather than doing one
+ cell at a time. By waiting until all the updates/sets are
+ performed, you'll be able to take best advantage of the caching
+ for complex formulas.
+ </li>
+ <li>If you do end up making changes to cells part way through
+ evaluation, you should call <em>notifySetFormula</em> or
+ <em>notifyUpdateCell</em> to trigger suitable cache clearance.
+ Alternately, you could instantiate a new FormulaEvaluator,
+ which will start with empty caches.
+ </li>
+ <li>Also note that FormulaEvaluator maintains a reference to
+ the sheet and workbook, so ensure that the evaluator instance
+ is available for garbage collection when you are done with it
+ (in other words don't maintain long lived reference to
+ FormulaEvaluator if you don't really need to - unless
+ all references to the sheet and workbook are removed, these
+ don't get garbage collected and continue to occupy potentially
+ large amounts of memory).
+ </li>
+ <li>CellValue instances however do not maintain reference to the
+ Cell or the sheet or workbook, so these can be long-lived
+ objects without any adverse effect on performance.
+ </li>
+ </ul>
+ </section>
+ <section><title>Formula Evaluation Debugging</title>
+ <p>POI is not perfect and you may stumble across formula evaluation problems (Java exceptions
+ or just different results) in your special use case. To support an easy detailed analysis, a special
+ logging of the full evaluation is provided.</p>
+ <p>POI 5.1.0 and above uses <a href="https://logging.apache.org/log4j/2.x/">Log4J 2.x</a> as a logging framework. Try to set up a logging
+ configuration that lets you see the info and other log messages.</p>
+ <p>Example use:</p>
+ <source>
+ // open your file
+ Workbook wb = new HSSFWorkbook(new FileInputStream("foobar.xls"));
+ FormulaEvaluator evaluator = wb.getCreationHelper().createFormulaEvaluator();
+
+ // get your cell
+ Cell cell = wb.getSheet(0).getRow(0).getCell(0); // just a dummy example
+
+ // perform debug output for the next evaluate-call only
+ evaluator.setDebugEvaluationOutputForNextEval(true);
+ evaluator.evaluateFormulaCell(cell);
+ evaluator.evaluateFormulaCell(cell); // no logging performed for this next evaluate-call
+ </source>
+ <p>The special Logger called "POI.FormulaEval" is used (useful if you use the CommonsLogger and a detailed logging configuration).
+ The used log levels are WARN and INFO (for detailed parameter info and results) - the level are so high to allow this
+ special logging without being disturbed by the bunch of DEBUG log entries from other classes.</p>
+ </section>
+
+ <anchor id="sxssf"/>
+ <section><title>Formula Evaluation and SXSSF</title>
+ <p>For versions before 3.13 final, no formula evaluation is possible with
+ SXSSF.</p>
+ <p>If you are using POI 3.13 final or newer, formula evaluation is possible with SXSSF,
+ but with some caveats.</p>
+ <p>The biggest restriction is that, since evaluating a cell needs that cell in memory
+ and any others it depends on, only pure-function formulas and formulas referencing
+ nearby cells can be evaluated with SXSSF. If a formula references a cell that hasn't
+ yet been written, or one which has already been flushed to disk, then it won't be
+ possible to evaluate it.</p>
+ <p>Because of this, a call to <em>wb.getCreationHelper().createFormulaEvaluator().evaluateAll();</em>
+ will very rarely work on SXSSF, as it's very rare that all the cells wil be available
+ and in memory at any time! Instead, it is suggested to evaluate formula cells just
+ after writing them, or shortly after when cells they depend on are added. Just make
+ sure that all cells needing or needed for evaluation are inside the window.</p>
+ </section>
+ </body>
+</document>