FOP is a print formatter for XSL formatting objects.
It can be used to render an XML file containing XSL formatting objects into a page layout. The main target is PDF but other rendering targets are supported, such as AWT, PCL, text and direct printing.
FOP provides both an application and a library that converts an XSL FO document into paginated output.
The FOP command line application can be used directly to transform XML into PDF, PostScript, PCL and other formats. An AWT-based viewer is also integrated.
The library can be used in servlets and other Java applications.
It's an acronym for Formatting Objects Processor.
FOP is distributed with Cocoon as a PDF serializer for XSL (FO) documents.
Batik can be used with FOP to transcode an SVG image into a PDF document.
XSL is a W3C standard concerned with publishing XML documents. It
consists of two parts:
XSLFO is an XML vocabulary that is used to specify pagination and other styling for page layout output. The acronym “FO” stands for Formatting Objects. XSLFO can be used in conjunction with
XSLFO defines a set of elements in XML that describes the way pages are set up. The contents of the pages are filled from flows. There can be static flows that appear on every page (for headers and footers) and the main flow which fills the body of the page.
Synonyms: XSL FO, XSL (FO), XSL:FO, XSL-FO, Formatting Objects
XSLT describes the transformation of arbitrary XML input into other XML (like XSLFO), HTML or plain text. The “T” comes from Transformation. For historical reasons, a transformation is often also called a “style sheet”.
Synonyms: XSL transformation, XSL:T, XSL style sheet.
There are always plenty of things to do. See limitations and bugzilla.
FOP was changed to be in accordance with the latest standard (see XSL standard). The page master for a fo:page-sequence is now referenced by the master-reference attribute. Replace the master-name attributes of your fo:page-sequence elements with master-reference attributes. You also have to do this for fo:single-page-master-reference, fo:repeatable-page-master-reference and fo:conditional-page-master-reference elements in your page master definitions.
See also release notes.
The Jimi image library, which is used for processing images in PNG and
other formats, was removed from the distribution for licensing
reasons. You have to
This is typically a problem with your classpath.
If you are running FOP from the command line:
If you run FOP embedded in your servlet, web application or other Java application, check the classpath of the application.
This is usually caused by an older version of one of the FOP jars or old XML tools in the classpath. Check in particular for parser.jar, jaxp.jar, xml4j.jar or lotusxsl.jar.
Incompatible versions of Batik may also cause this problem. Use the version of Batik that comes with FOP.
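One way to spot a stale jar is to ask the class loader where a class is actually loaded from. The following is only a minimal sketch using plain JDK calls; the class name WhichJar and the helper locate are made up for this illustration, and the class names you pass in (for example org.apache.fop.apps.Driver) are up to you.

```java
import java.net.URL;

// Hypothetical diagnostic helper, not part of FOP.
class WhichJar {

    // Returns the URL of the .class resource, or null if the class is not visible.
    static URL locate(String className) {
        String resource = className.replace('.', '/') + ".class";
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            loader = ClassLoader.getSystemClassLoader();
        }
        return loader.getResource(resource);
    }

    public static void main(String[] args) {
        // Pass the fully qualified class names to check as arguments.
        for (String name : args) {
            URL url = locate(name);
            System.out.println(name + " -> " + (url == null ? "not found" : url.toString()));
        }
    }
}
```

If the printed URL points to a jar you did not expect, that jar probably shadows the one shipped with FOP. In a servlet, calling locate from within the webapp reports what the webapp class loader sees.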
FOP can consume quite a bit of memory, even though this has been continually improved. The memory consumption is partially inherent to the formatting process and partially caused by implementation choices. For certain layouts, all FO processors currently on the market have memory problems.
Some hints regarding your document structure:
There are also some bugs which cause FOP to go into a nonterminating loop, which also often results in a memory overflow. A characteristic symptom is continuous
Reducing memory consumption in general and squashing bugs is an ongoing effort, partially addressed in the redesign.
What you probably think of as "file names" are usually URLs, in particular the src attribute of fo:external-graphic.
Because usage of URLs is growing, you should make yourself familiar with them. The relevant specification is RFC 2396.
In a nutshell, the correct syntax for an absolute file URL is
file:///some/path/file.ext
on Unix and
file:///z:/some/path/file.ext
on Windows systems. Note the triple slash, and also that only forward slashes are used, even on Windows.
A relative file URL starts with anything but a slash, and doesn't have the file: prefix, for example file.ext, path/file.ext or ../file.ext. The string file:path/file.ext is not a relative URL; in fact, it isn't a valid URL at all. A relative URL is subject to a resolving process, which transforms it into an absolute URL.
See Understanding URIs and URLs and Understanding URL resolving.
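The resolving process can be tried out with the JDK's java.net.URI class. This is a small sketch, not FOP code: the class FileUrlSketch, the helper resolve and the base URL are assumptions chosen for the example.

```java
import java.net.URI;

// Hypothetical demo class; only java.net.URI from the JDK is used.
class FileUrlSketch {

    // Resolves a relative URL against an absolute base URL.
    static URI resolve(String base, String relative) {
        return URI.create(base).resolve(relative);
    }

    public static void main(String[] args) {
        String base = "file:///some/path/doc.fo"; // example base URL
        // Print only the path component of each resolved URL.
        System.out.println(resolve(base, "file.ext").getPath());
        System.out.println(resolve(base, "path/file.ext").getPath());
        System.out.println(resolve(base, "../file.ext").getPath());
    }
}
```

The relative URLs are resolved against the directory part of the base, so ../file.ext ends up one directory above the base document.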
Most often, you supplied an invalid FO document to FOP. Currently only very common errors are intercepted and produce a comprehensible error message. If you forgot container elements like fo:page-sequence or fo:flow and put blocks and inline elements directly as children of fo:root or fo:page-sequence, you'll only get a NullPointerException. Check whether your FO file has a proper structure. In some cases there are mandatory properties, like the master-reference in fo:conditional-page-master-reference; check whether you got them right, too.
You can use the FOP DTD or FOP Schema to validate your source. This will catch most, but still not all problems.
If you use XSLT, problems in your style sheet and in your source XML also often produce a NullPointerException. Run the transformation separately to check for this. Usually you'll get a detailed error message from the XSLT processor.
If you turn on debugging with the "-d" option you may be able to see more detailed information.
The most likely reason is a known problem with the Java run time environment which is triggered by rendering SVGs. Sun's JDK 1.4 does not have this problem. See also
Another possibility is that FOP went into a nonterminating loop. Usually this is indicated by lots of log messages of the form "[INFO]: [NNNN]" which indicate a new page has been started or
If you called the FOP command line application from some other program, for example from Java using Runtime.exec(), it may hang while trying to write log entries to the output pipe. You have to read the FOP output regularly to empty the pipe buffer. It is best to avoid exec'ing FOP; use the library interface instead.
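If you cannot avoid starting FOP as a separate process, the pipe has to be emptied while the process runs. A minimal sketch with JDK classes only follows; the class DrainOutputSketch and its helpers are hypothetical, and the command line is whatever you pass in as arguments.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper; nothing FOP-specific is assumed here.
class DrainOutputSketch {

    // Reads the stream to its end so the child never blocks on a full pipe buffer.
    static String drain(InputStream in) {
        StringBuilder log = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(in))) {
            String line;
            while ((line = reader.readLine()) != null) {
                log.append(line).append('\n');
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return log.toString();
    }

    // Starts the command, merges stderr into stdout and drains the combined output.
    static int runAndDrain(List<String> command) throws IOException, InterruptedException {
        ProcessBuilder builder = new ProcessBuilder(command);
        builder.redirectErrorStream(true); // one pipe to empty instead of two
        Process process = builder.start();
        System.out.print(drain(process.getInputStream()));
        return process.waitFor();
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            System.err.println("usage: java DrainOutputSketch <command> [args...]");
            return;
        }
        System.exit(runAndDrain(Arrays.asList(args)));
    }
}
```

Merging the error stream into standard output means a single reader is enough. With two separate pipes you would need a second thread to drain the other one.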
There is something too large to fit into the intended place, usually a large image, a table whose rows are kept together or a block with a space-before or space-after larger than the page size. Catch the first page showing this phenomenon and check it. If it is not obvious which element causes the trouble, remove stuff until the problem goes away. Decrease the dimensions of the offending element or property, or increase the dimension of the enclosing element or container, or remove keep-with-* properties.
The src attribute of the fo:external-graphic element takes a URI, not a file name.
Relative URLs are resolved against the baseDir property of FOP. For the command line FOP application, the baseDir is the directory of the input file, either the FO file or the XML source. If FOP is used embedded in a servlet, baseDir can be set explicitly. If it's not set, it is usually the current working directory of the process which runs FOP.
See Understanding URIs and URLs and Understanding URL resolving.
Did you get: «Failed to read font metrics file C:\foo\arial.xml : File "C:\foo\arial.xml" not found»? The value for the metrics-file attribute in the user config file is actually a URL, not a file name. Use "file:///C:/foo/arial.xml" instead.
If you used a relative URL, make sure your application has the working directory you expect. Currently FOP does not use the baseDir for resolving relative URLs pointing to font metric files.
These properties are not implemented, except for keep-with-next and keep-with-previous on table rows. In order to take advantage of them, you have to nest stuff to be kept together in a table.
The concept is called “blind table”. The table is used purely for layout and is not visible in the output.
An example of an image and the image caption to be kept together:
Check for fo:table-body around the rows. FOP doesn't raise an error if it is omitted; it just drops the content.
Also, the fo:table-with-caption element is not implemented, so tables within such an element are dropped too. The DocBook style sheets generate fo:table-with-caption elements, so watch out.
Clipping as specified by the overflow="hidden" is not yet implemented. If you have long words overflowing table cells, try to get them hyphenated. Artificial names like product identifications or long numbers usually aren't hyphenated. You can try special processing at XSLT level, like
Check the XSL FAQ and the XSL list archive for how to perform these tasks.
This happens for fo:page-number-citation elements if the citation occurs before FOP formatted the requested page, usually in TOC or index pages.
It is caused by the problem that FOP has to guess how much space the yet unknown page number will occupy, and usually the guesses are somewhat off. You can try to use a non-proportional font like Courier to remedy this. However, this is likely to look ugly, and won't fix the problem completely.
Several possibilities:
See also supported image formats.
Set the language attribute somewhere. Check whether you use a language for which hyphenation is supported. Supported languages can be deduced from the files in the hyph directory of the FOP source distribution.
Look at the servlet example.
A rather minimal code snippet to demonstrate the basics:
Caveat: Internet Explorer will not automatically show the PDF. That's a well-known IEx problem, not a problem with the servlet. You can download the PDF with IEx and view it later. There are other problems with this code.
Please look into Howto embed FOP in a servlet for all kinds of details.
Use the TraxInputHandler if both the source XML and XSL are read from files.
A demonstration:
This minimal code snippet has the same problems as the one from the question above. Please inform yourself about the details.
If your source XML is generated on the fly, for example from a database, a web service, or another servlet, you have to create a transformer object explicitly and use a SAX event stream to feed the transformation result into FOP.
A demonstration:
You don't have to call run() or render() on the driver object.
The xmlsource is a placeholder for your actual XML source. You can supply a new StreamSource( new StringReader(xmlstring)) if you have to read the XML from a string. Constructing an XML string and reparsing it is not always a good idea; consider using a SAXSource if you generate your XML. You can, of course, supply a DOMSource or whatever you like. You can also use dynamically generated XSL if you want to.
Because you have an explicit transformer object, you can set parameters for the transformation run too.
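The general shape of such a pipeline can be sketched with JAXP alone. In the sketch below, RecordingHandler is a stand-in for the content handler that the FOP driver would supply, and the style sheet, the parameter name title and the class SaxPipelineSketch are assumptions made up for the illustration.

```java
import java.io.StringReader;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXResult;
import javax.xml.transform.stream.StreamSource;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo: the transformation result is delivered as SAX events.
class SaxPipelineSketch {

    // Stand-in for the content handler a formatter would provide.
    static class RecordingHandler extends DefaultHandler {
        final StringBuilder seen = new StringBuilder();

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            seen.append('<').append(qName.isEmpty() ? localName : qName).append('>');
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            seen.append(ch, start, length);
        }
    }

    // Example style sheet with one parameter, producing a single fo:block.
    static final String XSL =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:param name='title'/>"
      + "<xsl:template match='/'>"
      + "<fo:block xmlns:fo='http://www.w3.org/1999/XSL/Format'>"
      + "<xsl:value-of select='$title'/>: <xsl:value-of select='/greeting'/>"
      + "</fo:block>"
      + "</xsl:template></xsl:stylesheet>";

    // Runs the transformation and returns what the handler received.
    static String transform(String xml, String title) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSL)));
            transformer.setParameter("title", title); // parameter for this run
            RecordingHandler handler = new RecordingHandler();
            transformer.transform(new StreamSource(new StringReader(xml)),
                                  new SAXResult(handler));
            return handler.seen.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform("<greeting>hello</greeting>", "Report"));
    }
}
```

No intermediate file or string is created: the handler sees the elements of the result as the transformer produces them.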
See the end of the answer for the question above.
Declare the fonts in the userconfig.xml
file as
usual. See
Use:
or
See
Use:
No further reference to the options variable is necessary. It is recommended to load the user configuration file only once, preferably in the init() method of the servlet. If you have multiple servlets running FOP, or if you have to change the configuration often, it is best to place the configuration changing code and the FOP driver call into a synchronized method, or perhaps a singleton class, in order to avoid problems in multithreaded environments.
There are various classpath issues, and possible conflicts with existing XML/XSLT libraries. Because servlet containers often use their own classloaders for loading webapps, bugs and security problems can be bothersome as well.
Tomcat comes with detailed instructions for installing FOP and Cocoon; check the documentation. There are known bugs to be circumvented, in particular in Tomcat 4.0.3.
Websphere 3.5: See next question.
Put a copy of a working parser in some directory where WebSphere can access it, for example, if /usr/webapps/yourapp/servlets is the classpath for your servlets, copy the Xerces jar into it (any other directory would also be fine). Do not add the jar to the servlet classpath, but add it to the classpath of the application server which contains your web application. In the WebSphere administration console, click on the "environment" button in the "general" tab. Fill CLASSPATH in the "variable name" box and /usr/webapps/yourapp/servlets/Xerces.jar (or whatever your complete path is) in the value box, press "OK", then apply the change and restart the application server.
FOP is not completely thread safe. At the very least you'll have to create a Driver object for every thread unless you prefer your threads being blocked.
Even though the relevant methods of the Driver object are synchronized, there are still problems because FOP uses static variables for configuration data and loading images. Be sure not to change the configuration data while there is a Driver object rendering. It is recommended to setup the configuration only once while initialising the servlet. If you have to change the configuration data more often, or if you have several servlets within the same webapp using FOP, consider implementing a singleton class encapsulating both the configuration settings and running FOP in synchronized methods.
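The singleton idea could look roughly like the sketch below. RenderGate and its methods are hypothetical names, and the configure and render arguments stand for your own configuration code and your own call into FOP; nothing here is part of the FOP API.

```java
import java.util.function.Supplier;

// Hypothetical singleton that serializes configuration changes and rendering runs.
final class RenderGate {
    private static final RenderGate INSTANCE = new RenderGate();
    private int running;    // jobs currently inside the gate
    private int maxRunning; // highest concurrency observed so far

    private RenderGate() { }

    static RenderGate getInstance() {
        return INSTANCE;
    }

    // Only one thread at a time may change the configuration and render.
    synchronized <T> T configureAndRender(Runnable configure, Supplier<T> render) {
        running++;
        maxRunning = Math.max(maxRunning, running);
        try {
            configure.run();
            return render.get();
        } finally {
            running--;
        }
    }

    synchronized int maxObservedConcurrency() {
        return maxRunning;
    }

    // Starts several threads that all go through the gate and waits for them.
    static int runConcurrently(int threads) {
        RenderGate gate = getInstance();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            final int id = i;
            workers[i] = new Thread(() -> gate.configureAndRender(
                () -> { /* set configuration values here */ },
                () -> "job " + id + " done"));
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return gate.maxObservedConcurrency();
    }

    public static void main(String[] args) {
        System.out.println("max concurrent jobs: " + runConcurrently(4));
    }
}
```

The price of this design is throughput: all rendering in the webapp is serialized, so it only makes sense if the configuration really has to change between runs.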
SVG text is rendered as shapes, and the Acrobat viewer displays it with bad quality unless you turn on smooth line art in the Acrobat preferences. The printout is always OK; only the screen view is of bad quality by default.
You can force Batik not to render SVG text as shapes by setting the strokeSVGText property to false. You can do this in the user configuration file:
In a servlet environment, you can set it directly:
See also
Batik uses AWT classes for rendering SVG, which in turn needs an X server on Unixish systems. If you run a server without X, or if you can't connect to the X server due to security restrictions or policies, SVG rendering will fail.
There are still several options:
Applies to older FOP versions and JDK 1.3 and older. That's because there is an AWT thread hanging around. The solution is to put a System.exit(0) somewhere.
This is really a "resolving relative URI" problem with some twists. The problem is that the #stuff URL fragment identifier is resolved within the current SVG document. So the reference must be valid within the XML subset and it cannot reference other SVG documents in the same XML file. Some options to try:
fill="url(file:///c:/refstuff/grad.svg#PurpleToWhite)".
fill="url(grad.svg#PurpleToWhite)". This may be easier to deploy.
fill="url(my.xsl#PurpleToWhite)".
In any case, the referenced stuff has to be pointed to by a URL. It doesn't necessarily have to be a file URL; HTTP should also work. Also, expect a performance hit in all cases, because another XML file has to be retrieved and parsed.
Ultimately, both FOP and especially Batik should be fixed to make your code work as expected, but this will take not only some time but also some effort by a standards committee to make the semantics of this kind of reference in embedded SVG clearer.
See also MalformedURLException
Fonts must be available on the target platform, and the selected font must contain glyphs for the desired character.
For example, for most symbols, the symbol font has to be selected explicitly (actually: is this a feature or a bug?):
<fo:inline font-family="Symbol">∅</fo:inline>
gives EMPTY SET, while the same character in the default font results in AE LIGATURE (which happens to occupy the same place in the default font as the EMPTY SET in the Symbol font). The "#" shows up if the selected font does not define a glyph for the translated index.
(Still applicable in 0.20.3?)
Use some other tool to postprocess the PDF (iText, or something?)
Answer: see 3.3, or use a region overlapping the flowing text and put an image there:
> From: Trevor_Campbell@kaz.com.au Use the region-before. Make it large enough to contain your image and then include a block (and if required an absolutely positioned block-container) with your image in the static-content for the region-before. Could use some code here...
Check paper size in Acrobat settings and "fit to page" (or something)
Not possible with FOP. Postprocess the PDF.
see #later
This is a problem of Internet Explorer requesting the content several times. Some suggestions:
Use a URL ending in .pdf, like http://myserver/servlet/stuff.pdf. Yes, the servlet can be configured to handle this. If the URL has to contain parameters, try to have both the base URL as well as the last parameter end in .pdf. If necessary, append a dummy parameter, like http://myserver/servlet/stuff.pdf?par1=a&par2=b&d=.pdf. The effect may depend on the IEx version.
It depends whether you mean "printing to a printer under control of the server" or "printing on the client's printer".
For the first problem, look at the print servlet in the FOP examples. You'll have to gather any printer settings in an HTML form and send it to the server.
For the second task, you can use some client side script to start Acrobat Reader in print mode, or use a Java applet based on the FOP print servlet. This depends heavily on the client installation and should not be relied on except in tightly controlled environments.
See also http://marc.theaimsgroup.com/?l=fop-dev&m=101065988325115&w=2
Use display-align="center". FOP implements this for block containers and table cells. A small self-contained document centering an image on a page:
That's about different static content on
You can insert it into the flow instead of the static content. Alternatively, use a page master referring to different page masters for the first page and the rest. It is quite similar to the odd/even page mechanism. A code sample:
There are examples in the FO distribution and in the XSL FAQ FO section http://www.dpawson.co.uk/xsl/sect3/index.html
Define a page master with alternating page masters for odd and even pages, specify appropriate regions in these page masters, and be sure to give them different names. You use these names to put different static content in these regions. A self-contained document demonstrating this:
A blank page can be forced by a break-before="page-even" or similar properties, or by a force-page-count="end-on-odd" on a page sequence, which ensures a new chapter or something starts on the preferred page.
You can define a conditional page master with a page master specific for blank pages. This allows you to specify static content for blank pages (by definition, a page is blank if no content from a flow is rendered on the page). You can omit your normal headers and footers, and use for example an extended header to print the "..left blank" statement.
Try to look it up in the Unicode reference at the Unicode Consortium, in particular search the reference by name. Use XML character references to put the character into your source XML, XSLT or FO.
Watch out for font traps, see #, change font temporarily using fo:inline if necessary.
Alternative: Use an embedded graphic: GIF, PNG, SVG, whatever.
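If the XML is generated by a program, the character references can be produced mechanically. The following sketch uses only the JDK; the class CharRefSketch and both helpers are made up for this example.

```java
// Hypothetical helper for writing numeric character references.
class CharRefSketch {

    // Formats a Unicode code point as an XML numeric character reference.
    static String toCharRef(int codePoint) {
        if (!Character.isValidCodePoint(codePoint)) {
            throw new IllegalArgumentException("not a Unicode code point: " + codePoint);
        }
        return "&#x" + Integer.toHexString(codePoint).toUpperCase() + ";";
    }

    // Replaces every non-ASCII character in the text with a character reference.
    static String escapeNonAscii(String text) {
        StringBuilder out = new StringBuilder();
        text.codePoints().forEach(cp -> {
            if (cp < 0x80) {
                out.appendCodePoint(cp);
            } else {
                out.append(toCharRef(cp));
            }
        });
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(toCharRef(0x2205));          // code point of EMPTY SET
        System.out.println(escapeNonAscii("5 \u20AC")); // text containing a euro sign
    }
}
```

A source that contains only ASCII characters and character references is also immune to most encoding mix-ups. Whether the glyph shows up still depends on the selected font.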
The specification provides some properties for this: white space collapsing and line feed treatment. In FOP, use white-space-collapse="false" on an enclosing block. This will also preserve line breaks (which is actually a bug, expect this to be changed).
(XSL FAQ)
Put an empty block with an id at the end of the flow:
Get the number of the last page as follows:
This does not work for all problems, for example if you have multiple page sequences, an initial page number different from 1, or if you force a certain page count, thereby producing blank pages at the end.
There is no reliable way to get the real total page count with FO mechanisms, you can only get page numbers.
The FOP library provides a method to get the total page count after a FO document has been rendered. You can implement your own wrapper to do a dummy rendering, inquire the total page count and then perform the real rendering, passing the total page count to the XSLT processor to splice it into the generated FO. Sample code:
Declare and use the parameter "page-count" in your XSLT. Be aware you may run into convergence problems: replacing the "#" placeholder from the first run by the actual page count may change it.
Contrary to popular opinion, the regions on a page may overlap. Defining a certain body region does not automatically constrain other regions; this has to be done explicitly.
If you have a header region with an extent of 20mm, you should define a margin for the body region of at least 20mm too, otherwise the header content may overwrite some stuff in the body region. This applies similarly to the extent of the after region and the bottom margin of the body region.
The overlap effect can be used creatively for some purposes.
Several possibilities:
Use &#160; everywhere. In your own XML, you could also use a DTD which declares the entity.
Don't use names as in HTML, use numbers (unless you have a DTD which declares the entities). For predefined HTML entities and their Unicode codepoints see Character entity references in HTML 4
Make sure ampersands in text and attributes are written as &amp;, "<" is written as &lt; and ">" as &gt;. It's not necessary everywhere, but do it just to be sure.
The XML parser should give the proper line and possibly column for offending characters.
Refer to the XML specification or to a good tutorial for details of the XML file format.
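If you assemble XML or FO text in a program, the escaping is best done in one place. A minimal sketch follows; the class XmlEscapeSketch and the helper escape are hypothetical names.

```java
// Hypothetical helper that escapes the characters with special meaning in XML.
class XmlEscapeSketch {

    // Escapes text so it can be used in element content or a quoted attribute value.
    static String escape(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                case '"': out.append("&quot;"); break;
                default: out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(escape("Fish & Chips <cheap>"));
    }
}
```

When the XML is produced by an XSLT processor or a DOM/SAX serializer, this escaping is done for you. The helper is only needed when strings are concatenated by hand.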
Usually, this is a character encoding problem. See XSL FAQ. Many software packages producing XML, in particular most XSLT processors, produce by default UTF-8 encoded files. If you view them with something not aware of the encoding, like Notepad for Win95/98/ME/NT, funny characters are displayed. A Å is a giveaway.
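The effect is easy to reproduce. The sketch below decodes UTF-8 bytes with the wrong character set, much as an encoding-unaware viewer would; the class EncodingMismatchSketch and the helper misread are made up for the illustration.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo of an encoding mismatch.
class EncodingMismatchSketch {

    // Encodes the text as UTF-8 and decodes the bytes as ISO-8859-1.
    static String misread(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String original = "caf\u00E9"; // four characters, the last one is non-ASCII
        String shown = misread(original);
        // Lengths are printed instead of the text to stay independent of the console encoding.
        System.out.println(original.length() + " characters are shown as " + shown.length());
    }
}
```

Every non-ASCII character takes two or more bytes in UTF-8, and each byte is then displayed as a separate character. The file itself is fine; only the viewer is wrong.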
See docs. See also
Decide where to post: