Walkthrough of an "evaluate()" implementation.

So here is the fun part - lets walk through the implementation of the excel - function... SQRT()

+ function... NOT()

The Code -public class Sqrt extends NumericFunction { - - private static final ValueEvalToNumericXlator NUM_XLATOR = - new ValueEvalToNumericXlator((short) - ( ValueEvalToNumericXlator.BOOL_IS_PARSED - | ValueEvalToNumericXlator.EVALUATED_REF_BOOL_IS_PARSED - | ValueEvalToNumericXlator.EVALUATED_REF_STRING_IS_PARSED - | ValueEvalToNumericXlator.REF_BOOL_IS_PARSED - | ValueEvalToNumericXlator.STRING_IS_PARSED - )); - - protected ValueEvalToNumericXlator getXlator() { - return NUM_XLATOR; - } - - public Eval evaluate(Eval[] operands, int srcRow, short srcCol) { - double d = 0; - ValueEval retval = null; - - switch (operands.length) { - default: - retval = ErrorEval.VALUE_INVALID; - break; - case 1: - ValueEval ve = singleOperandEvaluate(operands[0], srcRow, srcCol); - if (ve instanceof NumericValueEval) { - NumericValueEval ne = (NumericValueEval) ve; - d = ne.getNumberValue(); - } - else if (ve instanceof BlankEval) { - // do nothing - } - else { - retval = ErrorEval.NUM_ERROR; - } - } - - if (retval == null) { - d = Math.sqrt(d); - retval = (Double.isNaN(d)) ? (ValueEval) ErrorEval.VALUE_INVALID : new NumberEval(d); - } - return retval; - } +public final class Not implements Function { + public Eval evaluate(Eval[] args, int srcCellRow, short srcCellCol) { + if (args.length != 1) { + return ErrorEval.VALUE_INVALID; + } + boolean boolArgVal; + try { + ValueEval ve = OperandResolver.getSingleValue(args[0], srcCellRow, srcCellCol); + Boolean b = OperandResolver.coerceValueToBoolean(ve, false); + boolArgVal = b == null ? false : b.booleanValue(); + } catch (EvaluationException e) { + return e.getErrorEval(); + } + + return BoolEval.valueOf(!boolArgVal); + } } +

Implementation Details

The first thing to realise is that classes already exist, even for functions that are not yet implemented. - Just that they extend from DefaultFunctionImpl whose behaviour is to return an ErrorEval.FUNCTION_NOT_IMPLEMENTED value.
In order to implement SQRT(..), we need to: a. Extend from the correct Abstract super class; b. implement the evaluate(..) method
Hence we extend SQRT(..) from the predefined class NumericFunction
Since SQRT(..) takes a single argument, we verify the length of the operands array else set the return value to ErrorEval.VALUE_INVALID
In order to implement NOT(..), we need to implement the interface 'Function' which has a method 'evaluate(..)'
Since NOT(..) takes a single argument, we verify the length of the operands array else set the return value to ErrorEval.VALUE_INVALID
Next we normalize each operand to a limited set of ValueEval subtypes, specifically, we call the function - singleOperandEvaluate(..) to do conversions of different value eval types to one of: NumericValueEval, - BlankEval and ErrorEval. The conversion logic is configured by a ValueEvalToNumericXlator instance which - is returned by the Factory method: getXlator(..) The flags used to create the ValueEvalToNumericXlator - instance are briefly explained as follows: - BOOL_IS_PARSED means whether this function treats Boolean values as 1, - REF_BOOL_IS_PARSED means whether Boolean values in cell references are parsed or not. - So also, EVALUATED_REF_BOOL_IS_PARSED means if the operand was a RefEval that was assigned a - Boolean value as a result of evaluation of the formula that it contained. - eg. SQRT(TRUE) returns 1: This means BOOL_IS_PARSED should be set. - SQRT(A1) returns 1 when A1 has TRUE: This means REF_BOOL_IS_PARSED should be set. - SQRT(A1) returns 1 when A1 has a formula that evaluates to TRUE: This means EVALUATED_REF_BOOL_IS_PARSED should be set. - If the flag is not set for a particular case, that case is ignored (treated as if the cell is blank) _unless_ - there is a flag like: STRING_IS_INVALID_VALUE (which means that Strings should be treated as resulting in VALUE_INVALID ErrorEval) + OperandResolver.getSingleValue to do conversions of different value eval types to one of: NumericValueEval, + BlankEval and ErrorEval. The conversion logic is performed by OperandResolver.coerceValueToBoolean()
Next perform the appropriate Math function on the double value (if an error didnt occur already).
Finally before returning the NumberEval wrapping the double value that - you computed, do one final check to see if the double is a NaN, (or if it is "Infinite") - If it is return the appropriate ErrorEval instance. Note: The OpenOffice.org error codes - should NOT be preferred. Instead use the excel specific error codes like VALUE_INVALID, NUM_ERROR, DIV_ZERO etc. - (Thanks to Avik for bringing this issue up early!) The Oo.o ErrorCodes will be removed (if they havent already been :)
Next perform the appropriate java operation (if an error didnt occur already).
Finally return the BoolEval wrapping primitive boolean result. Note - in the case of numeric results + you should check for NaN and Infinity, because exel likes these translated into specific error codes like + VALUE_INVALID, NUM_ERROR, DIV_ZERO etc. + (Thanks to Avik for bringing this issue up early!)

Modelling Excel Semantics @@ -170,8 +131,8 @@ public class Sqrt extends NumericFunction { Because when you use TRUE in referenced cells with arithmetic functions, it evaluates to blank - meaning it is not evaluated - as if it was string or a blank cell. eg. "=SUM(1,A1)" when A1 is TRUE evaluates to 1. This behaviour changes depending on which function you are using. eg. SQRT(..) that was - described earlier treats a TRUE as 1 in all cases. This is why the configurable ValueEvalToNumericXlator - class had to be written. + described earlier treats a TRUE as 1 in all cases. The various conversion logic has been refactored into common places like the following classes: + OperandResolver, TextFunction, NumericFunction, MultiOperandNumericFunction and FinanceFunction.

Note that when you are extending from an abstract function class like NumericFunction (rather than implementing the interface o.a.p.hssf.record.formula.eval.Function directly) @@ -186,18 +147,16 @@ public class Sqrt extends NumericFunction {

Testing Framework

Automated testing of the implemented Function is easy. - The source code for this is in the file: o.a.p.h.record.formula.GenericFormulaTestCase.java - This class has a reference to the test xls file (not /a/ test xls, /the/ test xls :) - which may need to be changed for your environment. Once you do that, in the test xls, + The source code for this is in the file: o.a.p.h.record.formula.TestFormulasFromSpreadsheet.java + This class has a reference to the Excel test sample file 'FormulaEvalTestData.xls'. In this file, locate the entry for the function that you have implemented and enter different tests in a cell in the FORMULA row. Then copy the "value of" the formula that you entered in the cell just below it (this is easily done in excel as: [copy the formula cell] > [go to cell below] > Edit > Paste Special > Values > "ok"). You can enter multiple such formulas and paste their values in the cell below and the test framework will automatically test if the formula evaluation matches the expected - value (Again, hard to put in words, so if you will, please take time to quickly look - at the code and the currently entered tests in the patch attachment "FormulaEvalTestData.xls" - file). + value (Please take time to quickly look at the code and the currently entered tests in the + file "FormulaEvalTestData.xls").

diff --git a/src/documentation/content/xdocs/hssf/eval.xml b/src/documentation/content/xdocs/hssf/eval.xml index 32279cc4a3..61cb663747 100644 --- a/src/documentation/content/xdocs/hssf/eval.xml +++ b/src/documentation/content/xdocs/hssf/eval.xml @@ -20,13 +20,13 @@ -

- Formula Evaluation - +

+ Formula Evaluation + - -

- + +

Introduction

The POI formula evaluation code enables you to calculate the result of formulas in Excels sheets read-in, or created in POI. This document explains @@ -66,35 +66,34 @@ FileInputStream fis = new FileInputStream("c:/temp/test.xls"); HSSFWorkbook wb = new HSSFWorkbook(fis); HSSFSheet sheet = wb.getSheetAt(0); -HSSFFormulaEvaluator evaluator = new HSSFFormulaEvaluator(sheet, wb); +HSSFFormulaEvaluator evaluator = new HSSFFormulaEvaluator(wb); // suppose your formula is in B3 -CellReference cellReference = new CellReference("B3"); +CellReference cellReference = new CellReference("B3"); HSSFRow row = sheet.getRow(cellReference.getRow()); -HSSFCell cell = row.getCell(cellReference.getCol()); +HSSFCell cell = row.getCell((int)cellReference.getCol()); -evaluator.setCurrentRow(row); HSSFFormulaEvaluator.CellValue cellValue = evaluator.evaluate(cell); switch (cellValue.getCellType()) { case HSSFCell.CELL_TYPE_BOOLEAN: - System.out.println(cellValue.getBooleanValue()); - break; + System.out.println(cellValue.getBooleanValue()); + break; case HSSFCell.CELL_TYPE_NUMERIC: - System.out.println(cellValue.getNumberValue()); - break; + System.out.println(cellValue.getNumberValue()); + break; case HSSFCell.CELL_TYPE_STRING: - System.out.println(cellValue.getStringValue()); - break; + System.out.println(cellValue.getStringValue()); + break; case HSSFCell.CELL_TYPE_BLANK: - break; + break; case HSSFCell.CELL_TYPE_ERROR: - break; + break; // CELL_TYPE_FORMULA will never happen - case HSSFCell.CELL_TYPE_FORMULA: - break; -} + case HSSFCell.CELL_TYPE_FORMULA: + break; +}

Thus using the retrieved value (of type HSSFFormulaEvaluator.CellValue - a nested class) returned @@ -119,36 +118,42 @@ switch (cellValue.getCellType()) { FileInputStream fis = new FileInputStream("/somepath/test.xls"); HSSFWorkbook wb = new HSSFWorkbook(fis); HSSFSheet sheet = wb.getSheetAt(0); -HSSFFormulaEvaluator evaluator = new HSSFFormulaEvaluator(sheet, wb); +HSSFFormulaEvaluator evaluator = new HSSFFormulaEvaluator(wb); // suppose your formula is in B3 -CellReference cellReference = new CellReference("B3"); +CellReference cellReference = new CellReference("B3"); HSSFRow row = sheet.getRow(cellReference.getRow()); -HSSFCell cell = row.getCell(cellReference.getCol()); -evaluator.setCurrentRow(row); +HSSFCell cell = row.getCell((int)cellReference.getCol()); if (cell!=null) { - switch (evaluator.evaluateFormulaCell(cell)) { + int valueType = evaluator.evaluateFormulaCell(cell); + if (valueType == -1) { + // Cell was not a formula cell + // but we can read the plain value from the cell just the same as a formula result + valueType = cell.getCellType(); + } + switch (valueType) { case HSSFCell.CELL_TYPE_BOOLEAN: - System.out.println(cell.getBooleanCellValue()); - break; + System.out.println(cell.getBooleanCellValue()); + break; case HSSFCell.CELL_TYPE_NUMERIC: - System.out.println(cell.getNumberCellValue()); - break; + System.out.println(cell.getNumericCellValue()); + break; case HSSFCell.CELL_TYPE_STRING: - System.out.println(cell.getStringCellValue()); - break; + System.out.println(cell.getRichStringCellValue().getString()); + break; case HSSFCell.CELL_TYPE_BLANK: - break; + break; case HSSFCell.CELL_TYPE_ERROR: - System.out.println(cell.getErrorCellValue()); - break; - - // CELL_TYPE_FORMULA will never occur - case HSSFCell.CELL_TYPE_FORMULA: - break; + System.out.println(HSSFErrorConstants.getText(cell.getErrorCellValue())); + break; + + case HSSFCell.CELL_TYPE_FORMULA: + throw new IllegalStateException("Result-type 'formula' cannot occur"); } } + +

@@ -163,36 +168,36 @@ if (cell!=null) { FileInputStream fis = new FileInputStream("/somepath/test.xls"); HSSFWorkbook wb = new HSSFWorkbook(fis); HSSFSheet sheet = wb.getSheetAt(0); -HSSFFormulaEvaluator evaluator = new HSSFFormulaEvaluator(sheet, wb); +HSSFFormulaEvaluator evaluator = new HSSFFormulaEvaluator(wb); // suppose your formula is in B3 -CellReference cellReference = new CellReference("B3"); +CellReference cellReference = new CellReference("B3"); HSSFRow row = sheet.getRow(cellReference.getRow()); -HSSFCell cell = row.getCell(cellReference.getCol()); -evaluator.setCurrentRow(row); +HSSFCell cell = row.getCell((int)cellReference.getCol()); if (cell!=null) { - switch (evaluator.evaluateInCell(cell).getCellType()) { + switch (evaluator.evaluateInCell(cell).getCellType()) { case HSSFCell.CELL_TYPE_BOOLEAN: - System.out.println(cell.getBooleanCellValue()); - break; + System.out.println(cell.getBooleanCellValue()); + break; case HSSFCell.CELL_TYPE_NUMERIC: - System.out.println(cell.getNumberCellValue()); - break; + System.out.println(cell.getNumericCellValue()); + break; case HSSFCell.CELL_TYPE_STRING: - System.out.println(cell.getStringCellValue()); - break; + System.out.println(cell.getRichStringCellValue().getString()); + break; case HSSFCell.CELL_TYPE_BLANK: - break; + break; case HSSFCell.CELL_TYPE_ERROR: - System.out.println(cell.getErrorCellValue()); - break; - + System.out.println(cell.getErrorCellValue()); + break; + // CELL_TYPE_FORMULA will never occur - case HSSFCell.CELL_TYPE_FORMULA: - break; + case HSSFCell.CELL_TYPE_FORMULA: + break; } } +