Diffstat (limited to 'src/documentation')
-rw-r--r-- | src/documentation/content/xdocs/design/fotree.xml | 181
-rw-r--r-- | src/documentation/content/xdocs/design/layout.xml | 1
-rw-r--r-- | src/documentation/content/xdocs/design/parsing.xml | 23
-rw-r--r-- | src/documentation/content/xdocs/design/properties.xml | 32
4 files changed, 115 insertions, 122 deletions
diff --git a/src/documentation/content/xdocs/design/fotree.xml b/src/documentation/content/xdocs/design/fotree.xml index 1c2456606..f91c6117f 100644 --- a/src/documentation/content/xdocs/design/fotree.xml +++ b/src/documentation/content/xdocs/design/fotree.xml @@ -3,41 +3,75 @@ "http://cvs.apache.org/viewcvs.cgi/*checkout*/xml-forrest/src/resources/schema/dtd/document-v11.dtd"> <document> <header> - <title>FO Tree</title> - <subtitle>Design of FO Tree Structure</subtitle> + <title>FOP Design: FO Tree</title> <authors> <person name="Keiron Liddle" email="keiron@aftexsw.com"/> </authors> </header> <body> - <section id="issue-fo-recycle"> - <title>Process FO Elements ASAP</title> - <p>The issue here is that we wish to recycle FO Tree memory as much as possible. There are at least three possible places that FO Tree fragments can be passed to the Layout process, and their memory recycled:</p> - <ul> - <li> - <strong>fo:block</strong> It might be tempting to start laying out pages as soon as the first fo:block object is finished. However, there are many downstream things that can affect the placement of that block on a page, such as graphics and footnotes. So, in order to maintain conformance to the XSL-FO specification, and create high-quality output, we must see more of the document.</li> - <li> - <strong>fo:root</strong> The other extreme is to wait until the entire document is read in before processing any of it. This essentially means that there is no memory recycling. Processing the document correctly is more important than saving memory, so this option would be used if there were no better alternative.</li> - <li> - <strong>fo:page-sequence</strong> The page-sequence object provides a nice clean break in the document. Content from one page-sequence will never interfere with nor affect the placement of the content of another. 
FOP uses this option as the optimum way to maintain compliance with the standard and to minimize memory consumption.</li> - </ul> - </section> - <section id="issue-fo-serialize"> - <title>Serialize FO Tree as Necessary</title> - <p>This issue is implied by the requirement to process documents of arbitrary size. Unless some arbitrary limit is placed on the size of page-sequence objects, FOP must be able to serialize FO tree fragments as necessary.</p> - </section> <section id="intro"> <title>Introduction</title> - <p>The FO Tree is an internal representation of the input XSL-FO document. -The tree is created by building the elements and attributes from the SAX events. -The process of building the FO Tree corresponds to the <strong>Objectify</strong> step from the spec. -The <strong>Refinement</strong> step is part of reading and using the properties which may happen immediately or during the layout process.</p> - <p>The FO Tree is used as an intermediatory structure which is converted -into the area tree. The complete FO tree should not be held in memory -since FOP should be able to handle FO documents of any size.</p> - <p>The FO Tree is simply a heirarchy of java objects that represent the fo elements from xml. -The traversal is done by the layout or structure process only in the flow elements.</p> + <p>The FO Tree is an internal hierarchical representation (java objects and properties) of the input XSL-FO document, and is created from the <link href="parsing.html">parsing</link> of that XSL-FO document. +The process of building the FO Tree corresponds to the <strong>Objectify</strong> step in the XSL-FO spec. +The FO Tree is an intermediate structure which will later be <link href="layout.html">converted into the area tree</link>.</p> + </section> + <section id="process"> + <title>Processing</title> + <p>The SAX Events that are fired by the parsing process are caught by the FO Tree system. 
+Events for starting an element, ending an element, and text data are assembled by the FO Tree system into a set of objects that represent the input FO document.</p> + <p>For attributes attached to an XSL-FO element, a property list mapping is used to convert the attribute into properties of the object related to the element.</p> + <p>Elements from <link href="parsing.html#namespaces">foreign namespaces</link> that are recognized by FOP fall into the following categories:</p> + <ul> + <li>Pass-thru: These are placed into a DOM object, which is then passed through FOP directly to the renderer. SVG is an example.</li> + <li>FOP Internal: These are placed into objects that can then be used by FOP. An example of this would be an element that the layout process will use to create an area. Another example would be an element that contains setup information for the renderer.</li> + </ul> + <p>For unrecognized namespaces, a dummy object or a generic DOM is created.</p> + <p>While the tree building is mainly about creating the FO Tree, some FO Tree events trigger processes in other parts of FOP. +The end of a page-sequence element triggers the layout process for that page-sequence (see discussion of <link href="#recycle">Recycling</link>). +Also, the end of the XML document tells the renderer that it can finalize the output document.</p> + </section> + <section id="recycle"> + <title>Recycling FO Tree Memory</title> + <p>To minimize the amount of memory used by FOP, we wish to recycle FO Tree memory as much as possible. +There are at least three possible places that FO Tree fragments could be passed to the Layout process, so that their memory can be reused:</p> + <ul> + <li> + <strong>fo:block</strong> It might be tempting to start laying out pages as soon as the first fo:block object is finished. However, there are many downstream things that can affect the placement of that block on a page, such as graphics and footnotes. 
So, in order to maintain conformance to the XSL-FO specification, and create high-quality output, we must see more of the document.</li> + <li> + <strong>fo:root</strong> The other extreme is to wait until the entire document is read in before processing any of it. This essentially means that there is no memory recycling. Processing the document correctly is more important than saving memory, so this option would be used if there were no better alternative.</li> + <li> + <strong>fo:page-sequence</strong> The page-sequence object provides a nice clean break in the document. Content from one page-sequence will never interfere with nor affect the placement of the content of another. FOP uses this option as the optimum way to maintain compliance with the standard and to minimize memory consumption.</li> + </ul> + </section> + <section id="serialize"> + <title>FO Tree Serialization</title> + <p>This issue is implied by the requirement to process documents of arbitrary size. Unless some arbitrary limit is placed on the size of page-sequence objects, FOP must be able to serialize FO tree fragments as necessary.</p> </section> + <section id="specific-elements"> + <title>Notes About Specific Elements</title> + <section id="page-master"> + <title>page-master</title> + <p>The first elements in a document are the elements for the page master setup. +This is usually only a small number and will be used throughout the document to create new pages. +These elements are kept as a factory to create the page and appropriate regions whenever a new page is requested by the layout. +The objects in the FO Tree that represent these elements are themselves the factory. +The root element keeps these objects as a factory for the page sequences.</p> + </section> + <section id="flow"> + <title>flow</title> + <p>The elements that are in the flow of the document are a set of elements +that is needed for the layout process. 
Each element is important in the +creation of areas.</p> + </section> + <section id="other-elements"> + <title>Other Elements</title> + <p>The remaining FO Objects are things like page-sequence, title and color-profile. +These are handled by their parent element; i.e. the root looks after the declarations and the declarations maintains a list of colour profiles. +The page-sequences are direct descendents of root.</p> + </section> + </section> + <section id="implement"> + <title>Implementation Notes</title> <section id="fonode"> <title>FONode</title> <p>The base class for all objects in the tree is FONode. The base class for @@ -51,7 +85,8 @@ may have children.</p> <p>Each xml element is represented by a java object. For pagination the classes are in <code>org.apache.fop.fo.pagination.*</code>, for elements in the flow they are in <code>org.apache.fop.fo.flow.*</code> and some others are in -<code>org.apache.fop.fo.*.</code></p> +<code>org.apache.fop.fo.*.</code> + </p> </section> <section id="create-fo"> <title>Making FO's</title> @@ -70,91 +105,35 @@ element name. This maker is then used to create a new class that represents an FO element. This is then added to the FO tree as a child of the current parent.</p> </section> - <section id="properties"> - <title>Properties</title> - <p>The XML attributes on each element are passed to the object. The objects -that represent FO objects then convert the attributes into properties.</p> - <p>Since properties can be inherited the PropertyList class handles resolving -properties for a particular element. -All properties are specified in an XML file. 
Classes are created -automatically during the build process.</p> - <p>In some cases the element may be moved to have a different parent, for -example markers, or the inheritance could be different, for example -initial property set.</p> - <p>Properties (recall that FO's have properties, areas have traits, and XML -nodes have attributes) are also a concern of <em>FOTreeBuilder</em>. It -accomplishes this by using a <em>PropertyListBuilder</em>. There is a -separate <em>PropertyListBuilder</em> for each namespace encountered -while building the FO tree. Each Builder object contains a hash of -property names and <em>their</em> respective makers. It may also -contain element-specific property maker hashes; these are based on the -<em>local name</em> of the flow object, ie. <em>table-row</em>, not -<em>fo:table-row</em>. If an element-specific property mapping exists, -it is preferred to the generic mapping.</p> - <p>The base class for all -properties is <em>Property</em>, and all the property makers extend -<em>Property.Maker</em>. A more complete discussion of the property -architecture may be found in <jump href="properties.html">Properties</jump>.</p> - </section> - <section id="foreign"> - <title>Foreign XML</title> - <p>FOP supports the handlingof foreign XML. -The XML is converted internally into a DOM, this is then available to -the FO tree to convert the DOM into another format which can be rendered. -In the case of SVG the DOM needs to be created with Batik, so an element -mapping is used to read all elements in the SVG namespace and pass them -into the Batik DOM.</p> - <p>The base class for foreign XML is XMLObj. This class handles creating a + <section id="foreign"> + <title>Foreign XML</title> + <p>For SVG, the DOM needs to be created with Batik, so an element mapping is used to read all elements in the SVG namespace and pass them into the Batik DOM.</p> + <p>The base class for foreign XML is XMLObj. 
This class handles creating a DOM Element and the setting of attributes. It also can create a DOM Document if it is a top level element, class XMLElement. This class must be extended for the namespace of the XML elements. For unknown namespaces the class is UnknowXMLObj.</p> - <p>If some special processing is needed then the top level element can extend + <p>If some special processing is needed then the top level element can extend the XMLObj. For example the SVGElement makes the special DOM required for batik and gets the size of the svg.</p> - <p>Foreign XML will usually be in an fo:instream-foreign-object, the XML will + <p>Foreign XML will usually be in an fo:instream-foreign-object, the XML will be passed to the render as a DOM where the render will be able to handle it. Other XML from an unknwon namespace will be ignored.</p> - <p>By using element mappings it is possible to read other XML and either</p> - <ul> - <li>set information on the area tree</li> - <li>create pseudo FO Objects that create areas in the area tree</li> - <li>create FO Objects</li> - </ul> - </section> - <section id="unknown"> - <title>Unknown Elements</title> - <p>If an element is in a known namespace but the element is unknown then an + <p>By using element mappings it is possible to read other XML and either</p> + <ul> + <li>set information on the area tree</li> + <li>create pseudo FO Objects that create areas in the area tree</li> + <li>create FO Objects</li> + </ul> + </section> + <section id="unknown"> + <title>Unknown Elements</title> + <p>If an element is in a known namespace but the element is unknown then an Unknown object is created. This is mainly to provide information to the user. 
This could happen if the fo document contains an element from a different version or the element is misspelt.</p> - </section> - <section id="extensions"> - <title>Extensions</title> - <p>It is possible to add extensions to FOP so that you can extend the ability of -FOP with respect to render output, document specific information or extended -layout functionality.</p> - </section> - <section id="page-master"> - <title>Page Masters</title> - <p>The first elements in a document are the elements for the page master setup. -This is usually only a small number and will be used throughout the document to create new pages. -These elements are kept as a factory to create the page and appropriate regions whenever a new page is requested by the layout. -The objects in the FO Tree that represent these elements are themselves the factory. -The root element keeps these objects as a factory for the page sequences.</p> - </section> - <section id="flow"> - <title>Flow</title> - <p>The elements that are in the flow of the document are a set of elements -that is needed for the layout process. Each element is important in the -creation of areas.</p> - </section> - <section id="other-elements"> - <title>Other Elements</title> - <p>The remaining FO Objects are things like page-sequence, title and color-profile. -These are handled by their parent element; i.e. the root looks after the declarations and the declarations maintains a list of colour profiles. 
-The page-sequences are direct descendents of root.</p> + </section> </section> </body> </document> diff --git a/src/documentation/content/xdocs/design/layout.xml b/src/documentation/content/xdocs/design/layout.xml index 85105f801..3057e97de 100644 --- a/src/documentation/content/xdocs/design/layout.xml +++ b/src/documentation/content/xdocs/design/layout.xml @@ -27,6 +27,7 @@ Note: it may be possible to start immediately after a block formatting object ha It is also possible to layout all pages in a page sequence after each page sequence has been added from the xml.</p> <p>The layout process is handled by a set of layout managers. The block level layout managers are used to create the block areas which are added to the region area of a page.</p> + <p>The traversal is done by the layout or structure process only in the flow elements.</p> <section id="issue-simple-layout"> <title>Keep Layouts Simple</title> <p>Layout should handle floats, footnotes and keeps in a simple, straightforward way.</p> diff --git a/src/documentation/content/xdocs/design/parsing.xml b/src/documentation/content/xdocs/design/parsing.xml index 396a1a263..05c5d0ebc 100644 --- a/src/documentation/content/xdocs/design/parsing.xml +++ b/src/documentation/content/xdocs/design/parsing.xml @@ -3,7 +3,7 @@ "http://cvs.apache.org/viewcvs.cgi/*checkout*/xml-forrest/src/resources/schema/dtd/document-v11.dtd"> <document> <header> - <title>XML Parsing</title> + <title>FOP Design: Input Parsing</title> </header> <body> <section id="intro"> @@ -24,6 +24,7 @@ Instead, FOP takes SAX events and builds its own tree-like structure. Why?</p> <li>DOM contains an entire document. FOP is able to process individual fo:page-sequence objects discretely, without the need to have the entire document in memory. For documents that have only one fo:page-sequence object, FOP's approach is no advantage, but in other cases it is a huge advantage. 
A 500-page book that is broken into 100 5-page chapters, each in its own fo:page-sequence, essentially needs only 1% of the document memory that would be required if using DOM as input.</li> </ul> <p>See the <link href="../embedding.html#input">Input Section of the User Embedding Document</link> for a discussion of input usage patterns and some implementation details.</p> + <p>FOP's <link href="fotree.html">FO Tree Mechanism</link> is responsible for catching the SAX events and processing them.</p> </section> <section id="validation"> <title>Validation</title> @@ -37,25 +38,5 @@ Instead, FOP takes SAX events and builds its own tree-like structure. Why?</p> <p>See <link href="../extensions.html">User Extensions</link> for a discussion of standard extensions shipped with FOP, and their related namespaces.</p> <p>See <link href="../dev/extenstions.html">Developer Extensions</link> for a discussion of the mechanisms in place to allow developers to add their own extensions, including how to tell FOP about the foreign namespace.</p> </section> - <section> - <title>Tree Building</title> - <p>The SAX Events will fire all the information for the document with start element, end element, text data etc. -This information is used to build up a representation of the FO document. -To do this for a namespace there is a set of element mappings. -When an element + namepsace mapping is found then it can create an object for that element. -If the element is not found then it creates a dummy object or a generic DOM for unknown namespaces.</p> - <p>The object is then setup and then given attributes for the element. -For the FO Tree the attributes are converted into properties. -The FO objects use a property list mapping to convert the attributes into a list of properties for the element. -For other XML, for example SVG, a DOM of the XML is constructed. -This DOM can then be passed through to the renderer. 
-Other element mappings can be used in different ways, for example to create elements that create areas during the layout process or setup information for the renderer etc.</p> - <p>While the tree building is mainly about creating the FO Tree there are some stages that can propagate to the renderer. -At the end of a page sequence we know that all pages in the page sequence can be laid out without being effected by any further XML. -The significance of this is that the FO Tree for the page sequence may be able to be disposed of. -The end of the XML document also tells us that we can finalise the output document. -(The layout of individual pages is accomplished by the layout managers page at a time; i.e. they do not need to wait for the end of the page sequence. -The page may not yet be complete, however, containing forward page number references, for example.)</p> - </section> </body> </document> diff --git a/src/documentation/content/xdocs/design/properties.xml b/src/documentation/content/xdocs/design/properties.xml index 6947cc163..8a7797555 100644 --- a/src/documentation/content/xdocs/design/properties.xml +++ b/src/documentation/content/xdocs/design/properties.xml @@ -10,6 +10,38 @@ </authors> </header> <body> + + + <section id="properties"> + <title>Properties</title> + <p>The XML attributes on each element are passed to the object. The objects +that represent FO objects then convert the attributes into properties.</p> + <p>Since properties can be inherited the PropertyList class handles resolving +properties for a particular element. +All properties are specified in an XML file. Classes are created +automatically during the build process.</p> + <p>In some cases the element may be moved to have a different parent, for +example markers, or the inheritance could be different, for example +initial property set.</p> + <p>Properties (recall that FO's have properties, areas have traits, and XML +nodes have attributes) are also a concern of <em>FOTreeBuilder</em>. 
It +accomplishes this by using a <em>PropertyListBuilder</em>. There is a +separate <em>PropertyListBuilder</em> for each namespace encountered +while building the FO tree. Each Builder object contains a hash of +property names and <em>their</em> respective makers. It may also +contain element-specific property maker hashes; these are based on the +<em>local name</em> of the flow object, ie. <em>table-row</em>, not +<em>fo:table-row</em>. If an element-specific property mapping exists, +it is preferred to the generic mapping.</p> + <p>The base class for all +properties is <em>Property</em>, and all the property makers extend +<em>Property.Maker</em>. A more complete discussion of the property +architecture may be found in <jump href="properties.html">Properties</jump>.</p> + </section> + + +<p>The <strong>Refinement</strong> step is part of reading and using the properties which may happen immediately or during the layout process.</p> + <p>During XML Parsing, the FO tree is constructed. For each FO object (some subclass of FObj), the tree builder then passes the list of all attributes specified on the FO element to the handleAttrs method. This |