FOP is a print formatter for XSL formatting objects.
It can be used to render an XML file containing XSL formatting objects into a page layout. The main target is PDF but other rendering targets are supported, such as AWT, PCL, text and direct printing.
FOP provides both an application and a library that converts an XSL FO document into paginated output.
The FOP command line application can be used directly to transform XML into PDF, PostScript, PCL and other formats. An AWT-based viewer is also integrated.
The library can be used in servlets and other Java applications.
FOP is an acronym for Formatting Object Processor.
FOP is distributed with Cocoon as a PDF serializer for XSL (FO) documents.
Batik can be used with FOP to transcode an SVG image into a PDF document.
XSL is a W3C standard concerned with publishing XML documents. It consists of two parts: XSLT and XSLFO. The acronym expands to eXtensible Stylesheet Language.
XSLFO is an XML vocabulary that is used to specify pagination and other styling for page layout output. The acronym “FO” stands for Formatting Objects. XSLFO can be used in conjunction with XSLT to convert from any XML format into a paginated layout ready for printing or displaying.
XSLFO defines a set of elements in XML that describes the way pages are set up. The contents of the pages are filled from flows. There can be static flows that appear on every page (for headers and footers) and the main flow which fills the body of the page.
Synonyms: XSL FO, XSL (FO), XSL:FO, XSL-FO, Formatting Objects
XSLT describes the transformation of arbitrary XML input into other XML (like XSLFO), HTML or plain text. The “T” comes from Transformation. For historical reasons, a transformation is often also called a “style sheet”.
Synonyms: XSL transformation, XSL:T, XSL style sheet.
There are numerous ways that you can help. They are outlined in the Developer's Introduction.
FOP was changed to be in accordance with the latest standard (see XSL standard). The page master for a fo:page-sequence is now referenced by the master-reference attribute. Replace the master-name attributes of your fo:page-sequence elements with master-reference attributes. You also have to do this for the fo:single-page-master-reference, fo:repeatable-page-master-reference and fo:conditional-page-master-reference elements in your page master definitions.
See also release notes.
This is typically a problem with your classpath.
If you are running FOP from the command line:
If you run FOP embedded in your servlet, web application or other Java application, check the classpath of the application.
This is usually caused by an older version of one of the FOP jars or old XML tools in the classpath. Check in particular for parser.jar, jaxp.jar, xml4j.jar or lotusxsl.jar.
Incompatible versions of Batik may also cause this problem. Use the version of Batik that comes with FOP. It is not always possible to use a more recent version of Batik.
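As a rough aid for the classpath check described above, the following sketch lists classpath entries whose file name matches one of the jars named in this answer. It uses only JDK classes. The class name ClasspathCheck and the method findSuspects are made up for illustration, and a hit is only a hint that the entry deserves a closer look.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

class ClasspathCheck {
    // Jar names mentioned above as frequent sources of conflicts.
    static final List<String> SUSPECTS =
            Arrays.asList("parser.jar", "jaxp.jar", "xml4j.jar", "lotusxsl.jar");

    /** Returns the classpath entries whose file name matches one of the suspect jars. */
    static List<String> findSuspects(String classpath, String separator) {
        List<String> hits = new ArrayList<>();
        for (String entry : classpath.split(Pattern.quote(separator))) {
            // Normalize Windows separators, then keep only the file name.
            String name = entry.replace('\\', '/');
            name = name.substring(name.lastIndexOf('/') + 1).toLowerCase();
            if (SUSPECTS.contains(name)) {
                hits.add(entry);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        String classpath = System.getProperty("java.class.path");
        String separator = System.getProperty("path.separator");
        for (String entry : findSuspects(classpath, separator)) {
            System.out.println("Suspicious entry: " + entry);
        }
    }
}
```

Inside a servlet container the effective classpath is assembled by the container's class loaders, so the java.class.path property may not tell the whole story there.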
FOP can consume quite a bit of memory, even though this has been continually improved. The memory consumption is partially inherent to the formatting process and partially caused by implementation choices. For certain layouts, all FO processors currently on the market have memory problems.
Some hints regarding your document structure:
There are also some bugs which cause FOP to go into a nonterminating loop, which often results in a memory overflow as well. A characteristic symptom is continuous box overflows. Most of them are triggered by elements not fitting in the available space, like big images and improperly specified widths of nested block elements. Look for such elements and correct them.
Reducing memory consumption in general and squashing bugs is an ongoing effort, partially addressed in the redesign.
What you probably think of as "file names" are usually URLs, in particular the src attribute of fo:external-graphic.
Because usage of URLs is growing, you should make yourself familiar with it. The relevant specification is RFC 2396.
In a nutshell, the correct syntax for an absolute file URL is file:///some/path/file.ext on Unix and file:///z:/some/path/file.ext on Windows systems. Note the triple slash, and also that only forward slashes are used, even on Windows.
A relative file URL starts with anything but a slash and doesn't have the file: prefix, for example file.ext, path/file.ext or ../file.ext. The string file:path/file.ext is not a relative URL; in fact, it isn't a valid URL at all. A relative URL is subject to a resolving process, which transforms it into an absolute URL.
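The resolving process can be tried out with the JDK's java.net.URI class. This is only a sketch of generic URI resolution, not of what FOP does internally; the class name FileUrls and the sample paths are assumptions.

```java
import java.io.File;
import java.net.URI;

class FileUrls {
    /** Resolves a relative reference against an absolute base URL. */
    static URI resolve(String base, String relative) {
        return URI.create(base).resolve(relative);
    }

    public static void main(String[] args) {
        // File.toURI() builds an absolute file URL from a local path,
        // taking care of forward slashes and escaping.
        System.out.println(new File("file.ext").getAbsoluteFile().toURI());

        // A relative reference is resolved against the URL of the referring document.
        URI resolved = resolve("file:///some/path/doc.fo", "../file.ext");
        System.out.println(resolved.getPath());
    }
}
```

Note that Java may print a resolved file URL with a single slash after the scheme; comparing the path component, as done here, avoids depending on that detail.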
This is often caused by an invalid FO document. Currently only very common errors are intercepted and produce a comprehensible error message. If you forgot container elements like fo:page-sequence or fo:flow and put blocks and inline elements directly as children of fo:root or fo:page-sequence, you'll only get a NullPointerException. Check whether your FO file has a proper structure. In some cases there are mandatory properties, like the master-reference in fo:conditional-page-master-reference; check that you got these right too.
You may find it helpful to use the validation tools to validate your FO document. This will catch most problems, but should not be relied upon to catch all.
If you use XSLT, problems in your style sheet and in your source XML also often produce a NullPointerException. Run the transformation separately to check for this; usually you'll get a detailed error message from the XSLT processor.
If you turn on debugging with the "-d" option you may be able to see more detailed information.
See the article "Review FOP's Standards Compliance".
The most likely reason is a known problem with the Java run time environment which is triggered by rendering SVGs. Sun's JDK 1.4 does not have this problem. See also FOP does not exit if a SVG is included.
Another possibility is that FOP went into a nonterminating loop. Usually this is indicated by lots of log messages of the form "[INFO]: [NNNN]", which indicate that a new page has been started, or by box overflows. After some time, FOP will crash with an OutOfMemoryError.
If you called the FOP command line application from some other program, for example from Java using Runtime.exec(), it may hang while trying to write log entries to the output pipe. You have to read the FOP output regularly to empty the pipe buffer. It is best to avoid exec'ing FOP; use the library interface instead.
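If an external process cannot be avoided, the pipe can be kept empty by reading the child's output until it ends. The sketch below uses the JDK's ProcessBuilder; the class name DrainChildOutput is made up, and the command in main is a placeholder (the running JVM's own launcher) standing in for whatever command starts FOP on your system.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.Arrays;
import java.util.List;

class DrainChildOutput {
    /** Starts the command, reads all of its output, and returns the exit code. */
    static int runAndDrain(List<String> command, StringBuilder sink)
            throws IOException, InterruptedException {
        ProcessBuilder builder = new ProcessBuilder(command);
        // Merge stderr into stdout so that a single reader drains both pipes.
        builder.redirectErrorStream(true);
        Process process = builder.start();
        try (BufferedReader reader =
                new BufferedReader(new InputStreamReader(process.getInputStream()))) {
            String line;
            while ((line = reader.readLine()) != null) {
                sink.append(line).append('\n');
            }
        }
        return process.waitFor();
    }

    public static void main(String[] args) throws Exception {
        // Placeholder command, not a FOP invocation.
        String launcher = System.getProperty("java.home") + "/bin/java";
        StringBuilder log = new StringBuilder();
        int exitCode = runAndDrain(Arrays.asList(launcher, "-version"), log);
        System.out.println("exit code: " + exitCode);
        System.out.print(log);
    }
}
```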
There is something too large to fit into the intended place, usually a large image, a table whose rows are kept together or a block with a space-before or space-after larger than the page size. Catch the first page showing this phenomenon and check it. If it is not obvious which element causes the trouble, remove stuff until the problem goes away. Decrease the dimensions of the offending element or property, or increase the dimension of the enclosing element or container, or remove keep-with-* properties.
The src attribute of the fo:external-graphic element takes a URI, not a file name.
Relative URLs are resolved against the baseDir property of FOP. For the command line FOP application, the baseDir is the directory of the input file, either the FO file or the XML source. If FOP is used embedded in a servlet, baseDir can be set explicitly. If it's not set, it is usually the current working directory of the process which runs FOP.
Did you get: "Failed to read font metrics file C:\foo\arial.xml : File "C:\foo\arial.xml" not found"? The value for the metrics-file attribute in the user config file is actually a URL, not a file name. Use "file:///C:/foo/arial.xml" instead.
If you used a relative URL, make sure your application has the working directory you expect. Currently FOP does not use the baseDir for resolving relative URLs pointing to font metric files.
This is because spec conformance has been improved.
The force-page-count property controls how an FO processor pads page sequences in order to get certain page counts or last page numbers. The default is "auto". With this setting, if the next page sequence begins with an odd page number because you set the initial-page-number, and the current page sequence also ends with an odd page number, the processor inserts a blank page to keep odd and even page numbers alternating (and similarly if the current page sequence ends with an even page number and the next page sequence starts with an even page number).
If you don't want to have this blank page, use force-page-count="no-force".
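The parity rule described above can be written down as a one-line check. This is a deliberately simplified model that covers only the case discussed here, where the next page sequence sets an explicit initial-page-number; the class and method names are hypothetical.

```java
class ForcePageCount {
    /**
     * Models force-page-count="auto" for the case described above: a blank page is
     * inserted when the last page of the current sequence and the first page of the
     * next sequence have the same parity.
     */
    static boolean needsBlankPage(int lastPage, int nextInitialPage) {
        return lastPage % 2 == nextInitialPage % 2;
    }

    public static void main(String[] args) {
        System.out.println(needsBlankPage(3, 5)); // odd followed by odd
        System.out.println(needsBlankPage(3, 4)); // odd followed by even
    }
}
```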
The Jimi image library, which is by default used for processing images in PNG and other formats, was removed from the distribution for licensing reasons. You have to download and install it yourself. Extract the file "JimiProClasses.zip" from the archive you've downloaded, rename it to "jimi-1.0.jar" and move it to FOP's lib directory.
An alternative to Jimi is to use Sun's JAI. It is much faster, but not available for all platforms.
These properties are not implemented, except on table rows. In order to take advantage of them, you have to nest stuff to be kept together in a table.
The concept is called “blind table”. The table is used for pure layout reasons and not obvious in the output.
An example of an image and the image caption to be kept together:
Check for fo:table-body around the rows. FOP up to 0.20.4 doesn't raise an error if it is omitted; it just drops the content. More recent releases will catch this problem.
Also, the fo:table-with-caption element is not implemented, so tables within such an element are dropped too. FOP generates an error message for this problem. The DocBook style sheets generate fo:table-with-caption elements, so watch out.
Clipping as specified by overflow="hidden" is not yet implemented. If you have long words overflowing table cells, try to get them hyphenated. Artificial names like product identifications or long numbers usually aren't hyphenated. You can try special processing at XSLT level, like
Check the XSL FAQ and the XSL list archive for how to perform these tasks.
This happens for fo:page-number-citation elements if the citation occurs before FOP formatted the requested page, usually in TOC or index pages.
It is caused by the problem that FOP has to guess how much space the yet unknown page number will occupy, and usually the guesses are somewhat off. You can try to use a non-proportional font like Courier to remedy this. However, this is likely to look ugly, and won't fix the problem completely.
The most common reason is that the file is not found because of an empty or wrong baseDir setting, spelling errors in the file name, in particular using the wrong case, or, if the image is retrieved over HTTP, the image was not delivered because of security settings in the server, missing cookies or other authorization information, or because of server misconfigurations. One way to check this is to cut&paste the source URL from the fo:external-graphic into the Location field of a browser on the machine where the FOP process will be running.
Several other possibilities:
Set the language attribute somewhere. Check whether you use a language for which hyphenation is supported. Supported languages can be deduced from the files in the hyph directory of the FOP source distribution.
Look at the servlet example.
A rather minimal code snippet to demonstrate the basics:
Caveat: Internet Explorer will not automatically show the PDF. That's a well-known IEx problem, not a problem with the servlet. You can download the PDF with IEx and view it later. There are other problems with this code.
Please look into Embedding FOP for all kinds of details.
Use the TraxInputHandler if both the source XML and XSL are read from files.
A demonstration:
This minimal code snippet has the same problems as the one from the question above. Please inform yourself about the details.
If your source XML is generated on the fly, for example from a database, a web service, or another servlet, you have to create a transformer object explicitly and use a SAX event stream to feed the transformation result into FOP.
A demonstration:
You don't have to call run() or render() on the driver object.
The xmlsource is a placeholder for your actual XML source. You can supply a new StreamSource(new StringReader(xmlstring)) if you have to read the XML from a string. Constructing an XML string and reparsing it is not always a good idea; consider using a SAXSource if you generate your XML. You can, of course, supply a DOMSource or whatever you like. You can also use dynamically generated XSL if you want to.
Because you have an explicit transformer object, you can set parameters for the transformation run too.
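The points above about StreamSource, SAX event streams and transformation parameters can be tried out with JDK classes alone. The following sketch is not FOP code: RecordingHandler is a hypothetical stand-in for the content handler that FOP supplies, and the stylesheet, the "title" parameter and the class name are made up for illustration.

```java
import java.io.StringReader;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXResult;
import javax.xml.transform.stream.StreamSource;
import org.xml.sax.helpers.DefaultHandler;

class SaxPipelineSketch {
    /** Stand-in for FOP's content handler; it only records the text it receives. */
    static class RecordingHandler extends DefaultHandler {
        final StringBuilder text = new StringBuilder();

        @Override
        public void characters(char[] ch, int start, int length) {
            text.append(ch, start, length);
        }
    }

    // A tiny stylesheet that wraps a parameter and the source text in an fo:block.
    static final String XSL =
        "<xsl:stylesheet version='1.0'"
        + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'"
        + " xmlns:fo='http://www.w3.org/1999/XSL/Format'>"
        + "<xsl:param name='title'/>"
        + "<xsl:template match='/'>"
        + "<fo:block><xsl:value-of select='$title'/>: <xsl:value-of select='/doc'/></fo:block>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    /** Transforms an XML string and feeds the result to the handler as SAX events. */
    static RecordingHandler run(String xml, String title) throws Exception {
        Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSL)));
        transformer.setParameter("title", title);   // parameter for this run
        RecordingHandler handler = new RecordingHandler();
        transformer.transform(new StreamSource(new StringReader(xml)),
                new SAXResult(handler));
        return handler;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(run("<doc>hello</doc>", "Report").text);
    }
}
```

In a real setup the SAXResult would wrap the content handler obtained from FOP instead of the recording handler used here.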
See the end of the answer for the question above.
Declare the fonts in the userconfig.xml file as usual. See loading the user configuration file for further steps.
Use:
or
See using a user configuration file for caveats.
Use:
No further reference to the options variable is necessary. It is recommended to load the user configuration file only once, preferably in the init() method of the servlet. If you have multiple servlets running FOP, or if you have to change the configuration often, it is best to place the configuration changing code and the FOP driver call into a synchronized method, or perhaps a singleton class, in order to avoid problems in multithreaded environments.
There are various classpath issues, and possible conflicts with existing XML/XSLT libraries. Because servlet containers often use their own classloaders for loading webapps, bugs and security problems can be bothersome as well.
Tomcat comes with detailed instructions for installing FOP and Cocoon; check the documentation. There are known bugs to be circumvented, in particular in Tomcat 4.0.3.
Websphere 3.5: See next question.
Put a copy of a working parser in some directory where WebSphere can access it, for example, if /usr/webapps/yourapp/servlets is the classpath for your servlets, copy the Xerces jar into it (any other directory would also be fine). Do not add the jar to the servlet classpath, but add it to the classpath of the application server which contains your web application. In the WebSphere administration console, click on the "environment" button in the "general" tab. Fill CLASSPATH in the "variable name" box and /usr/webapps/yourapp/servlets/Xerces.jar (or whatever your complete path is) in the value box, press "OK", then apply the change and restart the application server.
FOP is not completely thread safe. At the very least you'll have to create a Driver object for every thread unless you prefer your threads being blocked.
Even though the relevant methods of the Driver object are synchronized, there are still problems because FOP uses static variables for configuration data and loading images. Be sure not to change the configuration data while there is a Driver object rendering. It is recommended to set up the configuration only once while initialising the servlet. If you have to change the configuration data more often, or if you have several servlets within the same webapp using FOP, consider implementing a singleton class encapsulating both the configuration settings and running FOP in synchronized methods.
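One possible shape for such a singleton is sketched below using JDK classes only. RenderGate, configureOnce and render are hypothetical names; the Runnable and Supplier arguments stand in for your own configuration loading and FOP invocation code, which is not shown.

```java
import java.util.function.Supplier;

final class RenderGate {
    private static final RenderGate INSTANCE = new RenderGate();

    private boolean configured;
    private int configLoads;

    private RenderGate() {
    }

    static RenderGate getInstance() {
        return INSTANCE;
    }

    /** Loads the configuration on the first call only; later calls do nothing. */
    synchronized void configureOnce(Runnable loadConfiguration) {
        if (!configured) {
            loadConfiguration.run();
            configured = true;
            configLoads++;
        }
    }

    /** Runs one job while holding the lock, so it cannot overlap with configuration or other jobs. */
    synchronized <T> T render(Supplier<T> job) {
        return job.get();
    }

    synchronized int configLoads() {
        return configLoads;
    }

    public static void main(String[] args) {
        RenderGate gate = getInstance();
        gate.configureOnce(() -> System.out.println("loading configuration"));
        gate.configureOnce(() -> System.out.println("not printed"));
        System.out.println(gate.render(() -> "rendered"));
    }
}
```

Serializing all rendering through one lock trades throughput for safety; whether that is acceptable depends on the load of the application.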
See Placing SVG Text into PDF.
Batik uses AWT classes for rendering SVG, which in turn needs an X server on Unixish systems. If you run a server without X, or if you can't connect to the X server due to security restrictions or policies, SVG rendering will fail.
There are still several options:
Applies to older FOP versions and JDK 1.3 and older. That's because there is an AWT thread hanging around. The solution is to put a System.exit(0) somewhere.
This is really a "resolving relative URI" problem with some twists. The problem is that the #stuff URL fragment identifier is resolved within the current SVG document. So the reference must be valid within the XML subset and it cannot reference other SVG documents in the same XML file. Some options to try:
fill="url(file:///c:/refstuff/grad.svg#PurpleToWhite)".
fill="url(grad.svg#PurpleToWhite)". This may be easier to deploy.
In any case, the referenced stuff has to be pointed to by a URL. It doesn't necessarily have to be a file URL; HTTP should also work. Also, expect a performance hit in all cases, because another XML file has to be retrieved and parsed.
Ultimately, both FOP and especially Batik should be fixed to make your code work as expected, but this will take not only some time but also some effort by a standards committee in order to make the semantics of this kind of reference in embedded SVG clearer.
See also MalformedURLException.
See the Fonts page for information about embedding fonts.
There are a few fonts supplied with Acrobat Reader. If you use other fonts, the font must be available on the machine where the PDF is viewed or it must have been embedded in the PDF file. See embedding fonts.
Furthermore, if you select a certain font family, the font must contain glyphs for the desired character. There is an overview available for the default PDF fonts. For most symbols, it is better to select the symbol font explicitly, for example in order to get the symbol for the mathematical empty set, write:
The "#" shows up if the selected font does not define a glyph for the required character, for example if you try:
FOP does not currently support this feature. Possible workarounds include those mentioned in the PDF Post-Processing FAQ.
Some sample code for encrypting a FOP generated PDF with iText to get you started:
Check the iText tutorial and documentation for setting access flags, password, encryption strength and other parameters.
FOP does not currently support this feature. Possible workarounds include those mentioned in the PDF Post-Processing FAQ.
FOP does not currently support this feature. Possible workarounds:
Check the paper size in Acrobat settings and the "fit to page" print setting. Contorted printing is often caused by a mismatched paper format, for example if the setting is "US Letter" but the PDF was made for A4. Sometimes the printer driver also interferes; check its settings too.
FOP does not currently support this feature. Possible workarounds include those mentioned in the PDF Post-Processing FAQ.
This is a problem of Internet Explorer requesting the content several times. Some suggestions:
Use a URL ending in .pdf, like http://myserver/servlet/stuff.pdf. Yes, the servlet can be configured to handle this. If the URL has to contain parameters, try to have both the base URL as well as the last parameter end in .pdf; if necessary append a dummy parameter, like http://myserver/servlet/stuff.pdf?par1=a&par2=b&d=.pdf. The effect may depend on the IEx version.
It depends whether you mean "printing to a printer under control of the server" or "printing on the client's printer".
For the first problem, look at the print servlet in the FOP examples. You'll have to gather any printer settings in a HTML form and send it to the server.
For the second task, you can use some client side script to start Acrobat Reader in print mode, or use a Java applet based on the FOP print servlet. This depends heavily on the client installation and should not be relied on except in tightly controlled environments.
See also http://marc.theaimsgroup.com/?l=fop-dev&m=101065988325115&w=2
Use display-align="center". FOP implements this for block containers and table cells. A small self-contained document centering an image on a page:
You can add columns on the left and right which pad the table so that the visible part is centered.
If your table is more complicated, or if defining borders on individual cells becomes too much work, use the code above and nest your table within the middle cell.
Place different static content on odd and even pages.
There are examples in the FO distribution and in the XSL FAQ FO section.
Define a page master with alternating page masters for odd and even pages, specify appropriate regions in these page masters, and be sure to give them different names. You use these names to put different static content in these regions. A self-contained document demonstrating this:
You can insert it into the flow instead of the static content. Alternatively, use a page master referring to different page masters for the first page and the rest. It is quite similar to the odd/even page mechanism. A code sample:
A blank page can be forced by a break-before="page-even" or similar properties, or by a force-page-count="end-on-odd" on a page sequence, which ensures a new chapter or something starts on the preferred page.
You can define a conditional page master with a page master specific for blank pages. This allows you to specify static content for blank pages (by definition, a page is blank if no content from a flow is rendered on the page). You can omit your normal headers and footers, and use for example an extended header to print the "..left blank" statement.
Try to look the character up in the Unicode reference at the Unicode Consortium, in particular search the reference by name.
Use XML character references to put the character into your source XML, XSLT or FO.
For example, the following will result in a Euro sign:
The selected font family must have a glyph for the character you want to show. This is actually a somewhat tricky issue, especially for symbol characters.
Some environments (like Win2K or WinXP) also provide a character table utility, which can help you get an idea of what glyphs are available in a certain font.
Alternative: Use an embedded graphic: GIF, PNG, SVG, whatever.
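If the XML or FO is generated by a program, numeric character references can be produced mechanically from the Unicode code point. A minimal sketch, with hypothetical class and method names; the Euro sign is U+20AC:

```java
class CharRef {
    /** Returns the XML numeric character reference for a Unicode code point. */
    static String of(int codePoint) {
        return "&#x" + Integer.toHexString(codePoint).toUpperCase() + ";";
    }

    /** Replaces every non-ASCII character in the text with a numeric character reference. */
    static String escapeNonAscii(String text) {
        StringBuilder out = new StringBuilder();
        text.codePoints().forEach(cp -> {
            if (cp < 128) {
                out.appendCodePoint(cp);
            } else {
                out.append(of(cp));
            }
        });
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(of(0x20AC));
        System.out.println(escapeNonAscii("5 \u20AC"));
    }
}
```

The reference only names the character; the selected font still needs a glyph for it.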
The specification provides some properties for this: white space collapsing and line feed treatment. In FOP, use white-space-collapse="false" on an enclosing block. This will also preserve line breaks (which is actually a bug, expect this to be changed).
This is an XSL FAQ.
Put an empty block with an id at the end of the flow:
Get the number of the last page as follows:
This does not work for all problems, for example if you have multiple page sequences, an initial page number different from 1, or if you force a certain page count, thereby producing blank pages at the end.
There is no reliable way to get the real total page count with FO mechanisms, you can only get page numbers.
The FOP library provides a method to get the total page count after an FO document has been rendered. You can implement your own wrapper to do a dummy rendering, inquire the total page count and then perform the real rendering, passing the total page count to the XSLT processor to splice it into the generated FO. Sample code:
Declare and use the parameter "page-count" in your XSLT. Be aware you may run into convergence problems: replacing the "#" placeholder from the first run by the actual page count may change it.
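The convergence issue can be handled by repeating the dummy rendering until the page count that goes in equals the page count that comes out. The sketch below shows only this control loop. The renderWithCount function is a hypothetical stand-in for "transform with the page-count parameter, render, and ask for the resulting page count"; it does not use the FOP API.

```java
import java.util.function.IntUnaryOperator;

class PageCountPasses {
    /**
     * Repeats the dummy rendering until the assumed page count equals the page count
     * actually produced, or until maxPasses is reached. Returns the last count seen.
     */
    static int settle(IntUnaryOperator renderWithCount, int maxPasses) {
        int assumed = 0; // first pass: the real count is still unknown
        for (int pass = 0; pass < maxPasses; pass++) {
            int actual = renderWithCount.applyAsInt(assumed);
            if (actual == assumed) {
                return actual; // stable: splicing in the count no longer changes it
            }
            assumed = actual;
        }
        return assumed;
    }

    public static void main(String[] args) {
        // Toy model: the document grows by one page once the count needs three digits.
        IntUnaryOperator toyRenderer = count -> count < 100 ? 100 : 101;
        System.out.println(settle(toyRenderer, 5));
    }
}
```

Each pass is a full rendering, so the pass limit should be kept small.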
Contrary to popular opinion, the regions on a page may overlap. Defining a certain body region does not automatically constrain other regions; this has to be done explicitly.
If you have a header region with an extent of 20mm, you should define a margin for the body region of at least 20mm too, otherwise the header content may overwrite some stuff in the body region. This applies similarly to the extent of the after region and the bottom margin of the body region.
The overlap effect can be used creatively for some purposes.
Several possibilities:
RenderX has provided an Unofficial DTD for FO Documents. This document may be helpful in validating general FO issues.
FOP also maintains an Unofficial FOP Schema in the FOP CVS Repository. This document can be used either to validate against the FO standard, or against the actual FOP implementation. See the notes near the beginning of the document for instructions on how to use it.
How do I get a non-breaking space in FO?
Use &#160; everywhere. In your own XML, you could also use a DTD which declares the entity.
FOP rejects &uuml;, which used to work in HTML. How do I enter special characters like in HTML?
Don't use names as in HTML, use numbers (unless you have a DTD which declares the entities). For predefined HTML entities and their Unicode codepoints see Character entity references in HTML 4.
Make sure ampersands in text and attributes are written as &amp;amp;, "<" is written as &amp;lt; and ">" as &amp;gt;. It's not necessary everywhere, but do it just to be sure.
The XML parser should give the proper line and possibly column for offending characters.
Refer to the XML specification or to a good tutorial for details of the XML file format.
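When text is inserted into XML or FO by a program, the three replacements can be applied mechanically. A minimal sketch with a hypothetical class name; it covers only the three characters discussed in this answer:

```java
class XmlEscape {
    /** Escapes ampersands and angle brackets for use in XML text and attribute values. */
    static String escape(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&':
                    out.append("&amp;");
                    break;
                case '<':
                    out.append("&lt;");
                    break;
                case '>':
                    out.append("&gt;");
                    break;
                default:
                    out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(escape("a < b && c > d"));
    }
}
```

Attribute values additionally need the quote character that delimits them escaped, which this sketch leaves out.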
Usually, this is a character encoding problem. See XSL FAQ. Many software packages producing XML, in particular most XSLT processors, produce by default UTF-8 encoded files. If you view them with something not aware of the encoding, like Notepad for Win95/98/ME/NT, funny characters are displayed. A Å is a giveaway.
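The effect is easy to reproduce: encode a character as UTF-8 and decode the bytes as ISO-8859-1, which is roughly what an encoding-unaware viewer does. The class and method names in this sketch are made up, and the strings use Unicode escapes so that the example does not itself depend on the source file encoding.

```java
import java.nio.charset.StandardCharsets;

class EncodingMixup {
    /** Decodes the UTF-8 bytes of the text as ISO-8859-1, like an encoding-unaware viewer. */
    static String misread(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // U+00FC (u with diaeresis) occupies two bytes in UTF-8, so it shows up as two characters.
        String garbled = misread("\u00FC");
        System.out.println(garbled.length());
    }
}
```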
See the Bugs page for information about bugs already reported and how to report new ones.