Busy Developers' Guide to HSSF and XSSF Features
Want to use HSSF and XSSF to read and write spreadsheets in a hurry? This
guide is for you. If you're after more in-depth coverage of the HSSF and
XSSF user APIs, please consult the HOWTO
guide, as it contains actual descriptions of how to use this stuff.
Index of Features
How to create a new workbook
How to create a sheet
How to create cells
How to create date cells
Working with different types of cells
Iterate over rows and cells
Getting the cell contents
Text Extraction
Aligning cells
Working with borders
Fills and color
Merging cells
Working with fonts
Custom colors
Reading and writing
Use newlines in cells
Create user defined data formats
Fit Sheet to One Page
Set print area for a sheet
Set page numbers on the footer of a sheet
Shift rows
Set a sheet as selected
Set the zoom magnification for a sheet
Create split and freeze panes
Repeating rows and columns
Headers and Footers
Drawing Shapes
Styling Shapes
Shapes and Graphics2d
Outlining
Images
Named Ranges and Named Cells
How to set cell comments
How to adjust column width to fit the contents
Hyperlinks
Data Validation
Embedded Objects
Autofilters
Conditional Formatting
Features
New Workbook
New Sheet
Creating Cells
Creating Date Cells
Working with different types of cells
Demonstrates various alignment options
Working with borders
Iterate over rows and cells
Sometimes, you'd like to just iterate over all the rows in
a sheet, or all the cells in a row. This is possible with
a simple for loop.
Luckily, this is very easy. Row defines a
CellIterator inner class to handle iterating over
the cells (get one with a call to row.cellIterator()),
and Sheet provides a rowIterator() method to
give an iterator over all the rows.
Alternatively, Sheet and Row both implement java.lang.Iterable,
so from Java 1.5 onwards you can simply take advantage
of the built-in "foreach" support - see below.
Iterate over rows and cells using Java 1.5 foreach loops
Sometimes, you'd like to just iterate over all the rows in
a sheet, or all the cells in a row. If you are using Java
5 or later, this is especially easy, because both Sheet and Row
implement java.lang.Iterable and so work directly in foreach
loops. For Row, the loop uses the
CellIterator inner class to iterate over
the cells; for Sheet, it uses
rowIterator() to iterate over all the rows.
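The foreach support described above works for any type that implements java.lang.Iterable. The sketch below uses two hypothetical stand-in classes, GridSheet and GridRow (they are not POI types), purely to show the shape of the nested loops; with POI the same two loops would run over a Sheet and its Row objects.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for Row; NOT part of POI.
final class GridRow implements Iterable<String> {
    private final List<String> cells = new ArrayList<>();

    GridRow add(String value) {
        cells.add(value);
        return this;
    }

    @Override
    public Iterator<String> iterator() {
        return cells.iterator();
    }
}

// Hypothetical stand-in for Sheet; NOT part of POI.
final class GridSheet implements Iterable<GridRow> {
    private final List<GridRow> rows = new ArrayList<>();

    GridSheet add(GridRow row) {
        rows.add(row);
        return this;
    }

    @Override
    public Iterator<GridRow> iterator() {
        return rows.iterator();
    }
}

final class ForeachSketch {
    // Visits every cell, row by row, and joins the values with commas.
    static String join(GridSheet sheet) {
        StringBuilder out = new StringBuilder();
        for (GridRow row : sheet) {       // outer foreach: rows
            for (String cell : row) {     // inner foreach: cells
                if (out.length() > 0) {
                    out.append(',');
                }
                out.append(cell);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        GridSheet sheet = new GridSheet()
                .add(new GridRow().add("a1").add("b1"))
                .add(new GridRow().add("a2"));
        System.out.println(join(sheet));
    }
}
```

Because the loops only rely on Iterable, no explicit iterator variable is needed in the calling code.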
Getting the cell contents
To get the contents of a cell, you first need to
know what kind of cell it is (asking a string cell
for its numeric contents will get you an exception:
a NumberFormatException in older releases, an
IllegalStateException in more recent ones). So, you will
want to switch on the cell's type, and then call
the appropriate getter for that cell.
In the code below, we loop over every cell
in one sheet, print out the cell's reference
(e.g. A3), and then the cell's contents.
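A reference such as A3 combines column letters with a one-based row number. The plain-Java sketch below shows that conversion from zero-based indexes. The class and method names are hypothetical and the code does not call POI; in real code POI's CellReference class is the usual way to obtain the string.

```java
// Hypothetical helper, not part of POI: builds A1-style references.
final class CellRefSketch {
    // Converts a zero-based column index to letters: 0 -> A, 25 -> Z, 26 -> AA.
    static String columnLetters(int col) {
        if (col < 0) {
            throw new IllegalArgumentException("column index must be >= 0");
        }
        StringBuilder sb = new StringBuilder();
        // Work in one-based terms so that there is no "zero" letter.
        for (int n = col + 1; n > 0; n = (n - 1) / 26) {
            sb.insert(0, (char) ('A' + (n - 1) % 26));
        }
        return sb.toString();
    }

    // Builds a reference from zero-based row and column indexes.
    static String reference(int row, int col) {
        return columnLetters(col) + (row + 1);
    }

    public static void main(String[] args) {
        System.out.println(reference(2, 0)); // row index 2, column index 0
    }
}
```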
Text Extraction
For most text extraction requirements, the standard
ExcelExtractor class should provide all you need.
For very fancy text extraction, XLS to CSV etc,
take a look at
/src/examples/src/org/apache/poi/hssf/eventusermodel/examples/XLS2CSVmra.java
Fills and colors
Merging cells
Working with fonts
Note that the maximum number of unique fonts in a workbook is limited to 32767
(the maximum positive short). You should re-use fonts in your applications instead of
creating a font for each cell.
Examples:
Wrong:
Correct:
Custom colors
HSSF:
XSSF:
Reading and Rewriting Workbooks
Using newlines in cells
Data Formats
Fit Sheet to One Page
Set Print Area
Set Page Numbers on Footer
Using the Convenience Functions
The convenience functions provide
utility features such as setting borders around merged
regions and changing style attributes without explicitly
creating new styles.
Shift rows up or down on a sheet
Set a sheet as selected
Set the zoom magnification
The zoom is expressed as a fraction. For example to
express a zoom of 75% use 3 for the numerator and
4 for the denominator.
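The 75% example above is just the percentage reduced to lowest terms. A minimal plain-Java sketch of that reduction follows; ZoomFraction and fromPercent are hypothetical names, not POI API, and the resulting pair is what would be supplied as numerator and denominator.

```java
// Hypothetical helper, not part of POI: reduces a zoom percentage to a fraction.
final class ZoomFraction {
    // Greatest common divisor by Euclid's algorithm.
    static int gcd(int a, int b) {
        return b == 0 ? a : gcd(b, a % b);
    }

    // Returns {numerator, denominator} for a whole-number percentage.
    static int[] fromPercent(int percent) {
        if (percent <= 0) {
            throw new IllegalArgumentException("percent must be positive");
        }
        int g = gcd(percent, 100);
        return new int[] { percent / g, 100 / g };
    }

    public static void main(String[] args) {
        int[] f = fromPercent(75);
        System.out.println(f[0] + "/" + f[1]);
    }
}
```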
Splits and freeze panes
There are two types of panes you can create; freeze panes and split panes.
A freeze pane is split by columns and rows. You create
a freeze pane using the following mechanism:
sheet1.createFreezePane( 3, 2, 3, 2 );
The first two parameters are the columns and rows you
wish to split by. The second two parameters indicate
the cells that are visible in the bottom right quadrant.
Split panes appear differently and are created with createSplitPane(). The split area is
divided into four separate work areas. The split
occurs at the pixel level and the user is able to
adjust the split by dragging it to a new position.
The first parameter is the x position of the split.
This is in 1/20th of a point. A point in this case
seems to equate to a pixel. The second parameter is
the y position of the split, again in 1/20th of a point.
The last parameter indicates which pane currently has
the focus. This will be one of Sheet.PANE_LOWER_LEFT,
PANE_LOWER_RIGHT, PANE_UPPER_RIGHT or PANE_UPPER_LEFT.
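Since the split positions are given in 1/20th of a point, it can help to convert from points explicitly rather than hard-coding large numbers. The helper below is a plain-Java sketch with hypothetical names; it only does the unit arithmetic and does not call POI.

```java
// Hypothetical helper, not part of POI: converts between points and 1/20-point units.
final class SplitUnits {
    static final int UNITS_PER_POINT = 20;

    // Points -> 1/20-point units, rounded to the nearest whole unit.
    static int fromPoints(double points) {
        return (int) Math.round(points * UNITS_PER_POINT);
    }

    // 1/20-point units -> points.
    static double toPoints(int units) {
        return units / (double) UNITS_PER_POINT;
    }

    public static void main(String[] args) {
        System.out.println(fromPoints(100.0));
        System.out.println(toPoints(2000));
    }
}
```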
Repeating rows and columns
It's possible to set up repeating rows and columns in
your printouts by using the setRepeatingRowsAndColumns()
function in the HSSFWorkbook class.
This function takes five parameters.
The first parameter is the index of the sheet (0 = first sheet).
The second and third parameters specify the range of columns to repeat.
To stop the columns from repeating, pass in -1 as the start and end column.
The fourth and fifth parameters specify the range of rows to repeat.
To stop the rows from repeating, pass in -1 as the start and end row.
Headers and Footers
Example is for headers but applies directly to footers.
Drawing Shapes
POI supports drawing shapes using the Microsoft Office
drawing tools. Shapes on a sheet are organized in a
hierarchy of groups and shapes. The top-most shape
is the patriarch. This is not visible on the sheet
at all. To start drawing you need to call createPatriarch
on the HSSFSheet class. This has the
effect of erasing any other shape information stored
in that sheet. By default POI will leave shape
records alone in the sheet unless you make a call to
this method.
To create a shape you have to go through the following
steps:
Create the patriarch.
Create an anchor to position the shape on the sheet.
Ask the patriarch to create the shape.
Set the shape type (line, oval, rectangle etc.)
Set any other style details concerning the shape (e.g.
line thickness).
Text boxes are created using a different call:
It's possible to use different fonts to style parts of
the text in the textbox. Here's how:
Just as can be done manually using Excel, it is possible
to group shapes together. This is done by calling
createGroup() and then creating the shapes
using those groups.
It's also possible to create groups within groups.
Any group you create should contain at least two
other shapes or subgroups.
Here's how to create a shape group:
If you're being observant you'll notice that the shapes
that are added to the group use a new type of anchor:
the HSSFChildAnchor. What happens is that
the created group has its own coordinate space for
shapes that are placed into it. POI defaults this to
(0,0,1023,255) but you are able to change it as desired.
Here's how:
If you create a group within a group it's also going
to have its own coordinate space.
Styling Shapes
By default shapes can look a little plain. It's possible
to apply different styles to the shapes however. The
sorts of things that can currently be done are:
Change the fill color.
Make a shape with no fill color.
Change the thickness of the lines.
Change the style of the lines. Eg: dashed, dotted.
Change the line color.
Here's an example of how this is done:
Shapes and Graphics2d
While the native POI shape drawing commands are the
recommended way to draw shapes on a sheet, it's sometimes
desirable to use a standard API for compatibility with
external libraries. With this in mind we created some
wrappers for Graphics and Graphics2d.
It's important to note before continuing, however, that
Graphics2d is a poor match to the capabilities
of the Microsoft Office drawing commands. The older
Graphics class offers a closer match but is
still a square peg in a round hole.
All Graphics commands are issued into an HSSFShapeGroup.
Here's how it's done:
The first thing we do is create the group and set its coordinates
to match what we plan to draw. Next we calculate a reasonable
fontSizeMultipler, then create the EscherGraphics object.
Since what we really want is a Graphics2d
object we create an EscherGraphics2d object and pass in
the graphics object we created. Finally we call a routine
that draws into the EscherGraphics2d object.
The vertical points per pixel deserves some more explanation.
One of the difficulties in converting Graphics calls
into escher drawing calls is that Excel does not have
the concept of absolute pixel positions. It measures
its cell widths in 'characters' and the cell heights in points.
Unfortunately it's not defined exactly what type of character it's
measuring. Presumably this is due to the fact that Excel will be
using different fonts on different platforms or even within the same
platform.
Because of this constraint we've had to implement the concept of a
verticalPointsPerPixel. This is the amount the font should be scaled by when
you issue commands such as drawString(). To calculate this value
use the following formula:
The height of the group is calculated fairly simply by taking the
difference between the y coordinates of the bounding box of the shape. The
height of the group in points can be obtained by using a convenience method called
HSSFClientAnchor.getAnchorHeightInPoints().
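Reading the description above as a ratio of the group's height in points to the height of its bounding box in group coordinates, the calculation can be sketched in plain Java as follows. The ratio is an interpretation of the two inputs named in the text, and the class, method and parameter names are hypothetical; the points value would come from HSSFClientAnchor.getAnchorHeightInPoints().

```java
// Hypothetical helper, not part of POI: computes the font scaling factor.
final class PointsPerPixelSketch {
    // anchorHeightInPoints: height of the group's anchor in points.
    // y1, y2: top and bottom y coordinates of the group's bounding box.
    static float verticalPointsPerPixel(float anchorHeightInPoints, int y1, int y2) {
        int heightOfGroup = Math.abs(y2 - y1);
        if (heightOfGroup == 0) {
            throw new IllegalArgumentException("group has zero height");
        }
        // Assumed formula: group height in points / group height in coordinates.
        return anchorHeightInPoints / heightOfGroup;
    }

    public static void main(String[] args) {
        // Illustrative numbers only: a 138-point-high anchor and a 276-unit-high group.
        System.out.println(verticalPointsPerPixel(138f, 0, 276));
    }
}
```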
Many of the functions supported by the graphics classes
are not complete. Here are some of the functions that are known
to work:
fillRect()
fillOval()
drawString()
drawOval()
drawLine()
clearRect()
Functions that are not supported will return and log a message
using the POI logging infrastructure (disabled by default).
Outlining
Outlines are great for grouping sections of information
together and can be added easily to columns and rows
using the POI API. Here's how:
To collapse (or expand) an outline use the following calls:
The row/column you choose should contain an already
created group. It can be anywhere within the group.
Images
Images are part of the drawing support. To add an image just
call createPicture() on the drawing patriarch.
At the time of writing the following types are supported:
PNG
JPG
DIB
It should be noted that any existing drawings may be erased
once you add an image to a sheet.
Picture.resize() works only for JPEG and PNG. Other formats are not yet supported.
Reading images from a workbook:
Named Ranges and Named Cells
A Named Range is a way to refer to a group of cells by a name. A Named Cell is a
degenerate case of a Named Range in that the 'group of cells' contains exactly one
cell. You can create as well as refer to cells in a workbook by their named range.
When working with Named Ranges, the classes org.apache.poi.hssf.util.CellReference and
org.apache.poi.hssf.util.AreaReference are used (these
work for both XSSF and HSSF, despite the package name).
Creating Named Range / Named Cell
Reading from Named Range / Named Cell
Reading from non-contiguous Named Ranges
Note that when a cell is deleted, Excel does not delete the
attached named range. As a result, a workbook can contain
named ranges that point to cells that no longer exist.
You should check the validity of a reference before
constructing an AreaReference.
Cell Comments - HSSF and XSSF
A comment is a rich text note that is attached to and
associated with a cell, separate from other cell content.
Comment content is stored separately from the cell, and is displayed in a drawing object (like a text box)
that is separate from, but associated with, a cell.
Reading cell comments
Adjust column width to fit the contents
Note that Sheet#autoSizeColumn() does not evaluate formula cells;
the width of formula cells is calculated based on the cached formula result.
If your workbook has many formulas then it is a good idea to evaluate them before auto-sizing.
To calculate column width, Sheet.autoSizeColumn uses Java2D classes
that throw an exception if a graphical environment is not available. If no graphical environment
is available, you must tell Java that you are running in headless mode by
setting the following system property: java.awt.headless=true.
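The property can be passed on the command line as -Djava.awt.headless=true, or set in code. The sketch below shows the in-code variant; HeadlessSetup is a hypothetical class name, and the assumption is that the call runs early, before any Java2D or AWT class has been used, since the setting is normally read only once.

```java
// Hypothetical helper: switches the JVM to headless mode via the system property.
final class HeadlessSetup {
    // Sets java.awt.headless=true and reports whether the property is now true.
    // Call this before the first use of any AWT/Java2D class (assumption: the
    // JVM reads the property once, on first use).
    static boolean enableHeadless() {
        System.setProperty("java.awt.headless", "true");
        return Boolean.getBoolean("java.awt.headless");
    }

    public static void main(String[] args) {
        System.out.println(enableHeadless());
    }
}
```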
How to read hyperlinks
How to create hyperlinks
Data Validations
Check the value a user enters into a cell against one or more predefined value(s).
The following code will limit the value the user can enter into cell A1 to one of three integer values, 10, 20 or 30.
Drop Down Lists:
This code will do the same but offer the user a drop down list to select a value from.
Messages On Error:
To create a message box that will be shown to the user if the value they enter is invalid.
Replace 'Box Title' with the text you wish to display in the message box's title bar
and 'Message Text' with the text of your error message.
Prompts:
To create a prompt that the user will see when the cell containing the data validation receives focus.
The text passed as the first parameter to the createPromptBox() method will appear in bold
as the title of the prompt, whilst the second will be displayed as the text of the message.
The createExplicitListConstraint() method can be passed an array of String(s) containing integer, floating point, date or text values.
Further Data Validations:
To obtain a validation that would check the value entered was, for example, an integer between 10 and 100,
use the DVConstraint.createNumericConstraint(int, int, String, String) factory method.
Look at the javadoc for the other validation and operator types; also note that not all validation
types are supported for this method. The values passed to the two String parameters can be formulas; the '=' symbol is used to denote a formula.
It is not possible to create a drop down list if the createNumericConstraint() method is called;
the setSuppressDropDownArrow(false) method call will simply be ignored.
Date and time constraints can be created by calling the createDateConstraint(int, String, String, String)
or the createTimeConstraint(int, String, String) method. Both are very similar to the above and are explained in the javadoc.
Creating Data Validations From Spreadsheet Cells.
The contents of specific cells can be used to provide the values for the data validation
and the DVConstraint.createFormulaListConstraint(String) method supports this.
To specify that the values come from a contiguous range of cells do either of the following:
or
and in both cases the user will be able to select from a drop down list containing the values from cells A1, A2 and A3.
The data does not have to be on the same sheet as the data validation. To select the data from a different sheet, however, the sheet
must be given a name when created and that name should be used in the formula. So assuming the existence of a sheet named 'Data Sheet' this will work:
as will this:
whilst this will not:
and nor will this:
Embedded Objects
It is possible to perform more detailed processing of an embedded Excel, Word or PowerPoint document,
or to work with any other type of embedded object.
HSSF:
XSSF:
(Since POI-3.7)
Autofilters
Conditional Formatting
See more examples on Excel conditional formatting in
ConditionalFormats.java