<?xml version="1.0" standalone="no"?>
<!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V1.1//EN"
    "http://cvs.apache.org/viewcvs.cgi/*checkout*/xml-forrest/src/resources/schema/dtd/document-v11.dtd">
<document>
  <header>
    <title>FO Tree</title>
    <subtitle>Design of FO Tree Structure</subtitle>
    <authors>
      <person name="Keiron Liddle" email="keiron@aftexsw.com"/>
    </authors>
  </header>
  <body>
      <section id="issue-fo-recycle">
        <title>Process FO Elements ASAP</title>
        <p>The issue here is that we wish to recycle FO Tree memory as much as possible. There are at least three possible places that FO Tree fragments can be passed to the Layout process, and their memory recycled:</p>
        <ul>
          <li>
            <strong>fo:block</strong> It might be tempting to start laying out pages as soon as the first fo:block object is finished. However, there are many downstream things that can affect the placement of that block on a page, such as graphics and footnotes. So, in order to maintain conformance to the XSL-FO specification, and create high-quality output, we must see more of the document.</li>
          <li>
            <strong>fo:root</strong> The other extreme is to wait until the entire document is read in before processing any of it. This essentially means that there is no memory recycling. Processing the document correctly is more important than saving memory, so this option would be used if there were no better alternative.</li>
          <li>
            <strong>fo:page-sequence</strong> The page-sequence object provides a nice clean break in the document. Content from one page-sequence will never interfere with nor affect the placement of the content of another. FOP uses this option as the optimum way to maintain compliance with the standard and to minimize memory consumption.</li>
        </ul>
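        <p>The page-sequence option can be sketched as follows. This is only an illustration under stated assumptions: <code>PageSequenceRecycler</code>, <code>LayoutSink</code> and <code>retained()</code> are hypothetical names, not FOP classes, and a list of element names stands in for real FO nodes. The point is that the builder keeps a fragment only until the end of its fo:page-sequence, hands it to layout, and then drops its reference.</p>
        <source><![CDATA[
```java
import java.util.ArrayList;
import java.util.List;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical names throughout; this is not FOP's actual tree builder.
class PageSequenceRecycler extends DefaultHandler {
    static final String FO_NS = "http://www.w3.org/1999/XSL/Format";

    /** Hypothetical hook that receives one finished page-sequence. */
    interface LayoutSink {
        void layout(List<String> pageSequenceContent);
    }

    private final LayoutSink sink;
    private List<String> current;   // content of the open page-sequence, or null

    PageSequenceRecycler(LayoutSink sink) {
        this.sink = sink;
    }

    @Override
    public void startElement(String uri, String local, String qName, Attributes atts) {
        if (FO_NS.equals(uri) && "page-sequence".equals(local)) {
            current = new ArrayList<>();      // start a fresh fragment
        } else if (current != null) {
            current.add(local);               // stand-in for building a real FO node
        }
    }

    @Override
    public void endElement(String uri, String local, String qName) {
        if (FO_NS.equals(uri) && "page-sequence".equals(local)) {
            sink.layout(current);             // hand the finished fragment to layout
            current = null;                   // drop the reference so it can be collected
        }
    }

    /** Number of stand-in nodes currently retained by the builder. */
    int retained() {
        return current == null ? 0 : current.size();
    }
}
```
]]></source>
        <p>With this shape, the memory held by the builder is bounded by the largest single page-sequence rather than by the whole document.</p>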
      </section>
      <section id="issue-fo-serialize">
        <title>Serialize FO Tree as Necessary</title>
        <p>This issue is implied by the requirement to process documents of arbitrary size. Unless some arbitrary limit is placed on the size of page-sequence objects, FOP must be able to serialize FO tree fragments as necessary.</p>
      </section>
    <section id="intro">
      <title>Introduction</title>
      <p>The FO Tree is an internal representation of the input XSL-FO document.
The tree is created by building the elements and attributes from the SAX events.
The process of building the FO Tree corresponds to the <strong>Objectify</strong> step from the spec.
The <strong>Refinement</strong> step is part of reading and using the properties, which may happen immediately or during the layout process.</p>
      <p>The FO Tree is used as an intermediate structure which is converted
into the area tree. The complete FO tree should not be held in memory,
since FOP should be able to handle FO documents of any size.</p>
      <p>The FO Tree is simply a hierarchy of Java objects that represent the FO elements from the XML.
Traversal is done by the layout or structure process, and only within the flow elements.</p>
    </section>
      <section id="fonode">
        <title>FONode</title>
        <p>The base class for all objects in the tree is FONode. The base class for
all FO Objects is FObj.</p>
        <p>The class inheritance described above only describes the nature of the
content. Every FO in FOP also has a parent, and a Vector of children. The
parent attribute (in the Java sense), in particular, is used to enforce
constraints required by the FO hierarchy.</p>
        <p>FONode, among other things, ensures that FO's have a parent and that they
may have children.</p>
        <p>Each XML element is represented by a Java object. For pagination the
classes are in <code>org.apache.fop.fo.pagination.*</code>, for elements in the flow
they are in <code>org.apache.fop.fo.flow.*</code> and some others are in
<code>org.apache.fop.fo.*</code>.</p>
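        <p>The parent and children relationship can be illustrated with a minimal sketch. <code>SketchNode</code> is a simplified stand-in, not the real <code>FONode</code>, and the single constraint it checks (fo:flow must be a child of fo:page-sequence) is chosen only as an example of how the parent reference can be used to enforce the FO hierarchy.</p>
        <source><![CDATA[
```java
import java.util.Vector;

// Simplified stand-in for FONode; names and checks are illustrative only.
class SketchNode {
    final String name;
    final SketchNode parent;                 // null only for the root
    final Vector<SketchNode> children = new Vector<>();

    SketchNode(String name, SketchNode parent) {
        this.name = name;
        this.parent = parent;
        if (parent != null) {
            parent.children.add(this);       // register with the parent on creation
        }
    }

    /** Example constraint: fo:flow is only valid as a child of fo:page-sequence. */
    boolean isValidPlacement() {
        if ("flow".equals(name)) {
            return parent != null && "page-sequence".equals(parent.name);
        }
        return true;
    }
}
```
]]></source>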
      </section>
      <section id="create-fo">
        <title>Making FO's</title>
        <p>There is a class for each element in the FO set. An object is created for
each element in the FO Tree. This object holds the properties for the FO
Object.</p>
        <p>Some validity checking is done during these steps. The user can be warned of the error and processing can continue if possible.</p>
        <p>When the object is created it is set up.
It is given its element name, the FOUserAgent (for resolving properties etc.), the logger and the attributes.
The methods <code>handleAttributes()</code> and <code>setUserAgent()</code>, common to <code>FONode</code>, are used in this process.
The object is then given any text data or child elements.
Finally the <code>end()</code> method is called.
A number of elements use <code>end()</code> as the signal that all their children have been added, so that processing which depends on the children can be done.</p>
        <p>An FO maker is looked up in a hash map using the namespace and
element name. This maker is then used to create a new object that
represents an FO element. The object is then added to the FO tree as a child
of the current parent.</p>
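        <p>A minimal sketch of such a lookup, assuming a two-level map keyed first by namespace URI and then by local element name. <code>MakerRegistry</code> and its <code>Maker</code> interface are hypothetical and simplified; FOP's own maker classes may be organised differently.</p>
        <source><![CDATA[
```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified maker registry; not FOP's actual classes.
class MakerRegistry {
    /** Creates the object that represents one element. */
    interface Maker {
        Object make(Object parent);
    }

    // namespace URI -> (local element name -> maker)
    private final Map<String, Map<String, Maker>> byNamespace = new HashMap<>();

    void register(String namespace, String localName, Maker maker) {
        byNamespace.computeIfAbsent(namespace, k -> new HashMap<>()).put(localName, maker);
    }

    /** Returns the maker for the element, or null if none is registered. */
    Maker lookup(String namespace, String localName) {
        Map<String, Maker> makers = byNamespace.get(namespace);
        return makers == null ? null : makers.get(localName);
    }
}
```
]]></source>
        <p>A <code>null</code> result is where a caller would distinguish an unknown element in a known namespace from an element in an unknown namespace.</p>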
      </section>
      <section id="properties">
        <title>Properties</title>
        <p>The XML attributes on each element are passed to the object. The objects
that represent FO objects then convert the attributes into properties.</p>
        <p>Since properties can be inherited, the PropertyList class handles resolving
properties for a particular element.
All properties are specified in an XML file. The property classes are generated
automatically during the build process.</p>
        <p>In some cases the element may be moved to have a different parent, for
example markers, or the inheritance could be different, for example
initial property set.</p>
      <p>Properties (recall that FO's have properties, areas have traits, and XML
nodes have attributes) are also a concern of <em>FOTreeBuilder</em>. It
accomplishes this by using a <em>PropertyListBuilder</em>. There is a
separate <em>PropertyListBuilder</em> for each namespace encountered
while building the FO tree. Each Builder object contains a hash of
property names and <em>their</em> respective makers. It may also
contain element-specific property maker hashes; these are based on the
<em>local name</em> of the flow object, i.e. <em>table-row</em>, not
<em>fo:table-row</em>. If an element-specific property mapping exists,
it is preferred to the generic mapping.</p>
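      <p>The lookup order described above can be sketched as follows. <code>PropertyMakerTable</code> is a hypothetical name, and plain strings stand in for property maker objects; only the precedence rule (element-specific before generic) is taken from the text.</p>
      <source><![CDATA[
```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the lookup order; String stands in for a property maker.
class PropertyMakerTable {
    // property name -> maker, used for any element
    private final Map<String, String> generic = new HashMap<>();
    // local element name (e.g. "table-row") -> (property name -> maker)
    private final Map<String, Map<String, String>> perElement = new HashMap<>();

    void addGeneric(String property, String maker) {
        generic.put(property, maker);
    }

    void addForElement(String localName, String property, String maker) {
        perElement.computeIfAbsent(localName, k -> new HashMap<>()).put(property, maker);
    }

    /** Element-specific mapping wins; otherwise fall back to the generic one. */
    String find(String localName, String property) {
        Map<String, String> specific = perElement.get(localName);
        if (specific != null && specific.containsKey(property)) {
            return specific.get(property);
        }
        return generic.get(property);
    }
}
```
]]></source>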
      <p>The base class for all
properties is <em>Property</em>, and all the property makers extend
<em>Property.Maker</em>. A more complete discussion of the property
architecture may be found in <jump href="properties.html">Properties</jump>.</p>
    </section>
    <section id="foreign">
      <title>Foreign XML</title>
      <p>FOP supports the handling of foreign XML.
The XML is converted internally into a DOM, which is then available to
the FO tree so that it can be converted into another format that can be rendered.
In the case of SVG the DOM needs to be created with Batik, so an element
mapping is used to read all elements in the SVG namespace and pass them
into the Batik DOM.</p>
      <p>The base class for foreign XML is XMLObj. This class handles creating a
DOM Element and the setting of attributes. It can also create a DOM
Document if it is a top-level element (class XMLElement).
This class must be extended for the namespace of the XML elements. For
unknown namespaces the class is UnknownXMLObj.</p>
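      <p>As a rough sketch of what creating a DOM Element and setting its attributes involves, the following uses only the standard JAXP and W3C DOM APIs. <code>ForeignDomSketch</code> and its methods are hypothetical and do not reflect the actual fields or methods of XMLObj; for SVG the Document would come from Batik rather than from a plain JAXP builder.</p>
      <source><![CDATA[
```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Illustrative only; XMLObj's real fields and methods may differ.
class ForeignDomSketch {
    /** Creates an empty namespace-aware DOM Document for a top-level foreign element. */
    static Document newDocument() {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("No usable DOM implementation", e);
        }
    }

    /** Creates a DOM Element for one foreign element and copies its attributes. */
    static Element addElement(Document doc, Element parent, String ns, String qName,
                              String[][] attributes) {
        Element element = doc.createElementNS(ns, qName);
        for (String[] attr : attributes) {
            element.setAttribute(attr[0], attr[1]);   // attr = {name, value}
        }
        if (parent == null) {
            doc.appendChild(element);     // top-level element becomes the document root
        } else {
            parent.appendChild(element);
        }
        return element;
    }
}
```
]]></source>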
      <p>If some special processing is needed then the top-level element can extend
XMLObj. For example, SVGElement creates the special DOM required by
Batik and gets the size of the SVG.</p>
      <p>Foreign XML will usually be in an fo:instream-foreign-object. The XML will
be passed to the renderer as a DOM, where the renderer will be able to handle
it. Other XML from an unknown namespace will be ignored.</p>
      <p>By using element mappings it is possible to read other XML and either</p>
      <ul>
        <li>set information on the area tree</li>
        <li>create pseudo FO Objects that create areas in the area tree</li>
        <li>create FO Objects</li>
      </ul>
    </section>
    <section id="unknown">
      <title>Unknown Elements</title>
      <p>If an element is in a known namespace but the element is unknown then an
Unknown object is created. This is mainly to provide information to the
user.
This could happen if the FO document contains an element from a different
version or the element is misspelt.</p>
    </section>
    <section id="extensions">
      <title>Extensions</title>
      <p>It is possible to add extensions to FOP to extend its capabilities
with respect to rendered output, document-specific information or extended
layout functionality.</p>
    </section>
    <section id="page-master">
      <title>Page Masters</title>
      <p>The first elements in a document are the elements for the page master setup.
There are usually only a few of these, and they are used throughout the document to create new pages.
These elements are kept as a factory to create the page and appropriate regions whenever a new page is requested by the layout.
The objects in the FO Tree that represent these elements are themselves the factory.
The root element keeps these objects as a factory for the page sequences.</p>
    </section>
    <section id="flow">
      <title>Flow</title>
      <p>The elements that are in the flow of the document are a set of elements
that is needed for the layout process. Each element is important in the
creation of areas.</p>
    </section>
    <section id="other-elements">
      <title>Other Elements</title>
      <p>The remaining FO Objects are things like page-sequence, title and color-profile.
These are handled by their parent element; i.e. the root looks after the declarations, and the declarations element maintains a list of colour profiles.
The page-sequences are direct descendants of root.</p>
    </section>
  </body>
</document>