--- /dev/null
+<?xml version="1.0" encoding="ISO-8859-1"?>
+<!-- $Id: AbsolutePosition.png.xml,v 1.1 2002-01-05 14:46:32+10 pbw
+Exp pbw $ -->
+<!--
+<!DOCTYPE document SYSTEM "../xml-docs/dtd/document-v10.dtd">
+-->
+<document>
+ <header>
+ <title>AbsolutePosition diagram</title>
+ <authors>
+ <person id="pbw" name="Peter B. West"
+ email="pbwest@powerup.com.au"/>
+ </authors>
+ </header>
+ <body>
+ <!-- one of (anchor s1) -->
+ <s1 title="Properties$AbsolutePosition">
+ <figure src="AbsolutePosition.png" alt="AbsolutePosition diagram"/>
+ </s1>
+ </body>
+</document>
--- /dev/null
+<?xml version="1.0" encoding="ISO-8859-1"?>
+<!-- $Id$ -->
+<!--
+<!DOCTYPE document SYSTEM "../xml-docs/dtd/document-v10.dtd">
+-->
+<document>
+ <header>
+ <title>BorderCommonStyle diagram</title>
+ <authors>
+ <person id="pbw" name="Peter B. West"
+ email="pbwest@powerup.com.au"/>
+ </authors>
+ </header>
+ <body>
+ <!-- one of (anchor s1) -->
+ <s1 title="Properties$BorderCommonStyle">
+ <figure src="BorderCommonStyle.png" alt="BorderCommonStyle diagram"/>
+ </s1>
+ </body>
+</document>
--- /dev/null
+<?xml version="1.0" encoding="ISO-8859-1"?>
+<!-- $Id$ -->
+<!--
+<!DOCTYPE document SYSTEM "../xml-docs/dtd/document-v10.dtd">
+-->
+
+<document>
+ <header>
+ <title>..fo.PropNames diagram</title>
+ <authors>
+ <person id="pbw" name="Peter B. West"
+ email="pbwest@powerup.com.au"/>
+ </authors>
+ </header>
+ <body>
+ <!-- one of (anchor s1) -->
+ <s1 title="PropNames.class">
+ <figure src="PropNames.png" alt="PropNames.class diagram"/>
+ </s1>
+ </body>
+</document>
--- /dev/null
+<?xml version="1.0" encoding="ISO-8859-1"?>
+<!-- $Id$ -->
+<!--
+<!DOCTYPE document SYSTEM "../xml-docs/dtd/document-v10.dtd">
+-->
+
+<document>
+ <header>
+ <title>..fo.Properties diagram</title>
+ <authors>
+ <person id="pbw" name="Peter B. West"
+ email="pbwest@powerup.com.au"/>
+ </authors>
+ </header>
+ <body>
+ <!-- one of (anchor s1) -->
+ <s1 title="Properties.class">
+ <figure src="Properties.png" alt="Properties.class diagram"/>
+ </s1>
+ </body>
+</document>
--- /dev/null
+<?xml version="1.0" encoding="ISO-8859-1"?>
+<!-- $Id$ -->
+<!--
+<!DOCTYPE document SYSTEM "../xml-docs/dtd/document-v10.dtd">
+-->
+<document>
+ <header>
+ <title>..fo.PropertyConsts diagram</title>
+ <authors>
+ <person id="pbw" name="Peter B. West"
+ email="pbwest@powerup.com.au"/>
+ </authors>
+ </header>
+ <body>
+ <!-- one of (anchor s1) -->
+ <s1 title="PropertyConsts.class">
+ <figure src="PropertyConsts.png" alt="PropertyConsts.class diagram"/>
+ </s1>
+ </body>
+</document>
--- /dev/null
+<?xml version="1.0" encoding="ISO-8859-1"?>
+<!-- $Id$ -->
+<!--
+<!DOCTYPE document SYSTEM "../xml-docs/dtd/document-v10.dtd">
+-->
+
+<document>
+ <header>
+ <title>VerticalAlign diagram</title>
+ <authors>
+ <person id="pbw" name="Peter B. West"
+ email="pbwest@powerup.com.au"/>
+ </authors>
+ </header>
+ <body>
+ <!-- one of (anchor s1) -->
+ <s1 title="Properties$VerticalAlign">
+ <figure src="VerticalAlign.png" alt="VerticalAlign diagram"/>
+ </s1>
+ </body>
+</document>
--- /dev/null
+<?xml version="1.0" encoding="ISO-8859-1"?>
+<!-- $Id$ -->
+<!--
+<!DOCTYPE document SYSTEM "../xml-docs/dtd/document-v10.dtd">
+-->
+<document>
+ <header>
+ <title>Implementing Properties</title>
+ <authors>
+ <person id="pbw" name="Peter B. West" email="pbwest@powerup.com.au"/>
+ </authors>
+ </header>
+ <body>
+ <!-- one of (anchor s1) -->
+ <s1 title="An alternative properties implementation">
+ <note>
+ The following discussion focusses on the relationship between
+ Flow Objects in the Flow Object tree, and properties. There
+ is no (or only passing) discussion of the relationship between
+ properties and traits, and by extension, between properties
+ and the Area tree. The discussion is illustrated with some
+ pseudo-UML diagrams.
+ </note>
+ <p>
+ Property handling is complex and expensive. Varying numbers of
+ properties apply to individual Flow Objects
+ <strong>(FOs)</strong> in the <strong>FO
+ tree </strong> but any property may effectively be
+ assigned a value on any element of the tree. If that property
+ is inheritable, its defined value will then be available to
+ any children of the defining FO.
+ </p>
+ <note>
+ <em>(XSL 1.0 Rec)</em> <strong>5.1.4 Inheritance</strong>
+ ...The inheritable properties can be placed on any formatting
+ object.
+ </note>
+ <p>
+ Even if the value is not inheritable, it may be accessed by
+ its children through the <code>inherit</code> keyword or the
+ <code>from-parent()</code> core function, and potentially by
+ any of its descendants through the
+ <code>from-nearest-specified-value()</code> core function.
+ </p>
+ <p>
+ In addition to the assigned values of properties, almost every
+ property has an <strong>initial value</strong> which is used
+ when no value has been assigned.
+ </p>
+ <s2 title="The history problem">
+ <p>
+ The difficulty and expense of handling properties comes from
+ this universal inheritance possibility. The list of properties
+ which are assigned values on any particular <em>FO</em>
+ element will not generally be large, but a current value is
+ required for each property which applies to the <em>FO</em>
+ being processed.
+ </p>
+ <p>
+ The environment from which these values may be selected
+ includes, for each <em>FO</em>, for each applicable property,
+ the value assigned on this <em>FO</em>, the value which
+ applied to the parent of this <em>FO</em>, the nearest value
+ specified on an ancestor of this element, and the initial
+ value of the property.
+ </p>
+ </s2>
+ <s2 title="Data requirement and structure">
+ <p>
+ This determines the minimum set of properties and associated
+ property value assignments that is necessary for the
+ processing of any individual <em>FO</em>. Implicit in this
+ set is the set of properties and associated values,
+ effective on the current <em>FO</em>, that were assigned on
+ that <em>FO</em>.
+ </p>
+ <p>
+ This minimum requirement - the initial value, the
+ nearest ancestor specified value, the parent computed value
+ and the value assigned to the current element -
+ suggests a stack implementation.
+ </p>
+ </s2>
+ <s2 title="Stack considerations">
+ <p>
+ One possibility is to push to the stack only a minimal set
+ of required elements. When a value is assigned, the
+ relevant form or forms of that value (specified, computed,
+ actual) are pushed onto the stack. As long as each
+ <em>FO</em> maintains a list of the properties which were
+ assigned from it, the value can be popped when the focus of
+ FO processing retreats back up the <em>FO</em> tree.
+ </p>
+ <p>
+ The complication is that, for properties which are not
+ automatically inherited, when an <em>FO</em> is encountered
+ which does <strong>not</strong> assign a value to the
+ property, the initial value must either be already at the
+ top of the stack or be pushed onto the stack.
+ </p>
+ <p>
+ As a first approach, the simplest procedure may be to push a
+ current value onto the stack for every element - initial
+ values for non-inherited properties and the parental value
+ otherwise. Then perform any processing of assigned values.
+ This simplifies program logic at what is hopefully a small
+ cost in memory and processing time. It may be tuned in a
+ later iteration.
+ </p>
+ <s3 title="Stack implementation">
+ <p>
+ Initial attempts at this implementation have used
+ <code>LinkedList</code>s as the stacks, on the assumption
+ that
+ </p>
+ <sl>
+ <!-- one of (dl sl ul ol li) -->
+ <li>random access would not be required</li>
+ <li>
+ pushing and popping of list elements requires nearly
+ constant (low) time
+ </li>
+ <li> no penalty for first addition to an empty list</li>
+ <li>efficient access to both bottom and top of stack</li>
+ </sl>
+ <p>
+ However, it may be required to perform stack access
+ operations from an arbitrary place on the stack, in which
+ case it would probably be more efficient to use
+ <code>ArrayList</code>s instead.
+ </p>
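+ <p>
+ The "push a current value for every element" approach described
+ above might be sketched as follows. This is an illustration
+ only, not the experimental code: the class name, the two
+ properties, and their initial values are hypothetical, and
+ values are plain strings for brevity.
+ </p>
+ <source><![CDATA[
```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

/** Hypothetical sketch: one LinkedList stack per property, one push per FO. */
class PropertyStackSketch {
    // Illustrative property indices; the real set is far larger.
    static final int COLOR = 0;
    static final int BORDER_WIDTH = 1;

    // Assumed inheritance flags and initial values, for illustration only.
    static final boolean[] INHERITED = { true, false };
    static final String[] INITIAL = { "black", "medium" };

    private final List<LinkedList<String>> stacks = new ArrayList<>();

    PropertyStackSketch() {
        for (String initial : INITIAL) {
            LinkedList<String> stack = new LinkedList<>();
            stack.push(initial);          // bottom of stack: the initial value
            stacks.add(stack);
        }
    }

    /** Entering an FO: push the parental value if inherited, else the initial value. */
    void enterFo() {
        for (int i = 0; i < stacks.size(); i++) {
            LinkedList<String> stack = stacks.get(i);
            stack.push(INHERITED[i] ? stack.peek() : INITIAL[i]);
        }
    }

    /** A value assigned on the current FO replaces the value just pushed. */
    void assign(int property, String value) {
        stacks.get(property).set(0, value);
    }

    /** Leaving an FO: pop one value from every property stack. */
    void exitFo() {
        for (LinkedList<String> stack : stacks) {
            stack.pop();
        }
    }

    /** The value in effect for the FO currently being processed. */
    String current(int property) {
        return stacks.get(property).peek();
    }
}
```
+ ]]></source>
+ <p>
+ Because every FO pushes exactly one entry per property, no
+ per-FO list of assigned properties is needed in this variant;
+ that bookkeeping is traded for the extra pushes.
+ </p>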
+ </s3>
+ </s2>
+ <s2 title="Class vs instance">
+ <p>
+ An individual stack would contain values for a particular
+ property, and the context of the stack is the property class
+ as a whole. The property instances would be represented by
+ the individual values on the stack. If properties are to be
+ represented as instantiations of the class, the stack
+ entries would presumably be references to, or at least
+ referenced from, individual property objects. However, the
+ most important information about individual property
+ instances is the value assigned, and the relationship of
+ this property object to its ancestors and its descendants.
+ Other information would include the ownership of a property
+ instance by a particular <em>FO</em>, and, in the other
+ direction, the membership of the property in the set of
+ properties for which an <em>FO</em> has defined values.
+ </p>
+ <p>
+ In the presence of a stack, however, none of this required
+ information mandates the instantiation of properties. All
+ of the information mentioned so far can be effectively
+ represented by a stack position and a link to an
+ <em>FO</em>. If the property stack is maintained in
+ parallel with a stack of <em>FOs</em>, even that link is
+ implicit in the stack position.
+ </p>
+ </s2>
+ <p>
+ <strong>Next:</strong> <link href= "classes-overview.html"
+ >property classes overview.</link>
+ </p>
+ </s1>
+ </body>
+</document>
--- /dev/null
+<?xml version="1.0"?>
+
+<book title="FOP New Design Notes" copyright="2001-2002 The Apache Software Foundation">
+ <external href="http://xml.apache.org/fop/" label="About FOP"/>
+ <separator/>
+ <external href="../index.html" label="NEW DESIGN" />
+ <separator/>
+ <page id="index" label="alt.properties" source="alt.properties.xml"/>
+ <page id="classes-overview" label="Classes overview" source="classes-overview.xml"/>
+ <page id="properties-classes" label="Properties classes" source="properties-classes.xml"/>
+ <page id="Properties" label="Properties" source="Properties.png.xml"/>
+ <page id="PropertyConsts" label="PropertyConsts" source="PropertyConsts.png.xml"/>
+ <page id="PropNames" label="PropNames" source="PropNames.png.xml"/>
+ <page id="AbsolutePosition" label="AbsolutePosition" source="AbsolutePosition.png.xml"/>
+ <page id="VerticalAlign" label="VerticalAlign" source="VerticalAlign.png.xml"/>
+ <page id="BorderCommonStyle" label="BorderCommonStyle" source="BorderCommonStyle.png.xml"/>
+ <separator/>
+ <page id="xml-parsing" label="XML parsing" source="xml-parsing.xml"/>
+ <separator/>
+ <page id="property-parsing" label="Property parsing" source="propertyExpressions.xml"/>
+</book>
--- /dev/null
+<?xml version="1.0" encoding="ISO-8859-1"?>
+<!-- $Id$ -->
+<!--
+<!DOCTYPE document SYSTEM "../xml-docs/dtd/document-v10.dtd">
+-->
+
+<document>
+ <header>
+ <title>Property classes overview</title>
+ <authors>
+ <person id="pbw" name="Peter B. West"
+ email="pbwest@powerup.com.au"/>
+ </authors>
+ </header>
+ <body>
+ <!-- one of (anchor s1) -->
+ <s1 title="Classes overview">
+ <s2 title="The class of all properties">
+ <p>
+ If individual properties can have a "virtual reality" on the
+ stack, where is the stack itself to be instantiated? One
+ possibility is to have the stacks as <code>static</code>
+ data structures within the individual property classes.
+ However, the reduction of individual property instances to
+ stack entries allows the possibility of further
+ virtualization of property classes. If the individual
+ properties can be represented by an integer, i.e. a
+ <code>static final int</code>, the set of individual
+ property stacks can be collected together into one array.
+ Where to put such an overall collection? Creating an
+ über-class to accommodate everything that applies to
+ property classes as a whole allows this array to be defined
+ as a <em><code>static final</code> something[]</em>.
+ </p>
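+ <p>
+ As a rough illustration of the idea, the following sketch
+ represents each property by a <code>static final int</code> and
+ collects the per-property stacks into a single
+ <code>static final</code> array. The class name, the constant
+ names and the element type are assumptions made for the
+ example.
+ </p>
+ <source><![CDATA[
```java
import java.util.LinkedList;

/** Hypothetical container class holding one value stack per property. */
class PropertyStacksSketch {
    // Each property is represented only by an integer index.
    static final int NO_PROPERTY = 0;
    static final int ABSOLUTE_POSITION = 1;
    static final int VERTICAL_ALIGN = 2;
    static final int LAST_PROPERTY_INDEX = VERTICAL_ALIGN;

    // The "static final something[]": here, an array of LinkedList stacks.
    static final LinkedList<Object>[] propertyStacks = newStacks();

    @SuppressWarnings("unchecked")
    private static LinkedList<Object>[] newStacks() {
        // Generic arrays cannot be created directly, hence the raw type.
        LinkedList<Object>[] stacks = new LinkedList[LAST_PROPERTY_INDEX + 1];
        for (int i = 0; i < stacks.length; i++) {
            stacks[i] = new LinkedList<>();
        }
        return stacks;
    }
}
```
+ ]]></source>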
+ </s2>
+ <s2 title="The overall property classes">
+ <p>
+ This approach has been taken for the experimental code.
+ Rather than simply creating an overall class containing
+ common elements of properties and acting as a superclass,
+ advantage has been taken of the facility for nesting of
+ top-level classes. All of the individual property classes
+ are nested within the <code>Properties</code> class.
+ This has advantages and disadvantages.
+ </p>
+ <dl>
+ <dt>Disadvantages</dt>
+ <dd>
+ The file becomes extremely cumbersome. This can cause
+ problems with "intelligent" editors. E.g.
+ <em>XEmacs</em> syntax highlighting virtually grinds to a
+ halt with the current version of this file.<br/> <br/>
+
+ Possible problems with IDEs. There may be speed problems
+ or even overflow problems with various IDEs. The current
+ version of this and related files has only been tried with
+ the <em>[X]Emacs JDE</em> environment, without difficulties
+ apart from the editor speed problems mentioned
+ above.<br/> <br/>
+
+ Retro look and feel. Not the done Java thing.<br/> <br/>
+ </dd>
+ <dt>Advantages</dt>
+ <dd>
+ Everything to do with properties in the one place (more or
+ less.)<br/> <br/>
+
+ Eliminates the need for a large part of the (sometimes)
+ necessary evil of code generation. The One Big File of
+ <code>foproperties.xml</code>, with its ancillary xsl, is
+ absorbed into the One Bigger File of
+ <code>Properties.java</code>. The huge advantage of this
+ is that it <strong>is</strong> Java.
+ </dd>
+ </dl>
+ </s2>
+ <s2 title="The property information classes">
+ <p>
+ In fact, in order to keep the size of the file down to a
+ more manageable level, the property information classes of
+ static data and methods have been split tentatively into
+ three:
+ </p>
+ <figure src="PropertyStaticsOverview.png" alt="Top level
+ property classes"/>
+ <dl>
+ <dt><link href="PropNames.html">PropNames</link></dt>
+ <dd>
+ Contains an array, <code>propertyNames</code>, of the names of
+ all properties, and a set of enumeration constants, one
+ for each property name in the <code>propertyNames</code>
+ array. These constants index the names of the properties
+ in <code>propertyNames</code>, and must be manually kept in
+ sync with the entries in the array. (This was the last of
+ the classes split off from the original single class;
+ hence the naming tiredness.)
+ <br/> <br/>
+ </dd>
+ <dt><link href="PropertyConsts.html">PropertyConsts</link></dt>
+ <dd>
+ Contains two basic sets of data:<br/>
+ Property-indexed arrays and property set
+ definitions.<br/> <br/>
+
+ <strong>Property-indexed arrays</strong> are elaborations
+ of the property indexing idea discussed in relation to the
+ arrays of property stacks. One of the arrays is<br/> <br/>
+
+ <code>public static final LinkedList[]
+ propertyStacks</code><br/> <br/>
+
+ This is an array of stacks, implemented as
+ <code>LinkedList</code>s, one for each property.<br/> <br/>
+
+ The other arrays provide indexed access to fields which
+ are, in most cases, common to all of the properties. An
+ exception is<br/> <br/>
+
+ <code>public static final Method[]
+ complexMethods</code><br/> <br/>
+
+ which contains a reference to the method
+ <code>complex()</code> which is only defined for
+ properties which have complex value parsing requirements.
+ It is likely that a similar array will be defined for
+ properties which allow a value of <em>auto</em>.<br/> <br/>
+
+ The property-indexed arrays are initialized by
+ <code>static</code> initializers in this class. The
+ <code>PropNames</code> class and
+ <code>Properties</code>
+ nested classes are scanned in order to obtain or derive
+ the data necessary for initialization.<br/> <br/>
+
+ <strong>Property set definitions</strong> are
+ <code>HashSet</code>s of properties (represented by
+ integer constants) which belong to each of the categories
+ of properties defined. They are used to simplify the
+ assignment of property sets to individual FOs.
+ Representative <code>HashSet</code>s include
+ <em>backgroundProps</em> and
+ <em>tableProps</em>.<br/> <br/>
+ </dd>
+ <dt><link href="Properties.html">Properties</link></dt>
+ <dd>
+ <br/>
+ This class contains only sets of constants for use by the
+ individual property classes, but it also importantly
+ serves as a container for all of the property classes, and
+ some convenience pseudo-property classes.<br/> <br/>
+
+ <strong>Constants sets</strong> include:<br/> <br/>
+
+ <em>Datatype constants</em>. A bitmap set of
+ integer constants over a possible range of 2^0 to 2^31
+ (represented as -2147483648). E.g.<br/>
+ INTEGER = 1<br/>
+ ENUM = 524288<br/> <br/>
+ Some of the definitions are bit-ORed
+ combinations of the basic values. Used to set the
+ <em>dataTypes</em> field of the property
+ classes.<br/> <br/>
+
+ <em>Trait mapping constants</em>. A bitmap set of
+ integer constants over a possible range of 2^0 to 2^31
+ (represented as -2147483648), representing the manner in
+ which a property maps into a <em>trait</em>. Used to set
+ the <code>traitMapping</code> field of the property
+ classes.<br/> <br/>
+
+ <em>Initial value constants</em>. A sequence of
+ integer constants representing the datatype of the initial
+ value of a property. Used to set the
+ <code>initialValueType</code> field of the property
+ classes.<br/> <br/>
+
+ <em>Inheritance value constants</em>. A sequence
+ of integer constants representing the way in which the
+ property is normally inherited. Used to set the
+ <code>inherited</code> field of the property
+ classes.<br/> <br/>
+
+ <strong>Nested property classes</strong>. The
+ <em>Properties</em> class serves as the holding pen for
+ all of the individual property classes, and for property
+ pseudo-classes which contain data common to a number of
+ actual properties, e.g. <em>ColorCommon</em>.
+ </dd>
+ </dl>
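+ <p>
+ The bitmap constants described above can be illustrated with
+ a short sketch. Only <code>INTEGER = 1</code> and
+ <code>ENUM = 524288</code> are taken from the text; the other
+ names, the bit positions assigned to them, and the
+ <code>allows()</code> helper are hypothetical.
+ </p>
+ <source><![CDATA[
```java
/** Hypothetical datatype bitmap constants in the style described above. */
class DatatypeBitsSketch {
    static final int INTEGER = 1;        // 2^0, as given in the text
    static final int FLOAT   = 1 << 1;   // assumed bit position
    static final int LENGTH  = 1 << 2;   // assumed bit position
    static final int ENUM    = 1 << 19;  // 524288, as given in the text
    static final int TOP_BIT = 1 << 31;  // 2^31, which an int stores as -2147483648

    // A combined definition: the bit-OR of two basic values.
    static final int NUMBER = INTEGER | FLOAT;

    /** True if every bit set in type is also set in dataTypes. */
    static boolean allows(int dataTypes, int type) {
        return (dataTypes & type) == type;
    }
}
```
+ ]]></source>
+ <p>
+ A property whose <em>dataTypes</em> field is, say,
+ <code>LENGTH | ENUM</code> would then accept an enumeration
+ token but reject an integer, and a single <code>int</code>
+ has room for at most 32 such flags.
+ </p>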
+ </s2>
+ <p>
+ <strong>Previous:</strong> <link href=
+ "alt.properties.html" >alt.properties</link>
+ </p>
+ <p>
+ <strong>Next:</strong> <link href=
+ "properties-classes.html" >Properties classes</link>
+ </p>
+ </s1>
+ </body>
+</document>
--- /dev/null
+<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
+<html>
+ <head>
+ <title>alt.design</title>
+ </head>
+ <body>
+ <h3>Directory listing of alt.design</h3>
+ <hr>
+ <pre><code>
+drwxrwxr-x 2 pbw pbw 4096 Jan 31 17:58 .
+drwxrwxr-x 5 pbw pbw 4096 Jan 31 17:57 <a href="../dirlist.html">..</a>
+-rw-rw-r-- 1 pbw pbw 949 Jan 25 17:31 <a href="AbsolutePosition.dia">AbsolutePosition.dia</a>
+-rw-rw-r-- 1 pbw pbw 4890 Jan 25 17:31 <a href="AbsolutePosition.png">AbsolutePosition.png</a>
+-rw-r--r-- 1 pbw pbw 579 Jan 25 23:47 <a href="AbsolutePosition.png.xml">AbsolutePosition.png.xml</a>
+-rw-rw-r-- 1 pbw pbw 4140 Jan 25 17:31 <a href="BorderCommonStyle.png">BorderCommonStyle.png</a>
+-rw-r--r-- 1 pbw pbw 584 Jan 26 12:29 <a href="BorderCommonStyle.png.xml">BorderCommonStyle.png.xml</a>
+-rw-rw-r-- 1 pbw pbw 807 Jan 25 17:31 <a href="PropNames.dia">PropNames.dia</a>
+-rw-rw-r-- 1 pbw pbw 3428 Jan 25 17:31 <a href="PropNames.png">PropNames.png</a>
+-rw-r--r-- 1 pbw pbw 551 Jan 25 23:48 <a href="PropNames.png.xml">PropNames.png.xml</a>
+-rw-rw-r-- 1 pbw pbw 1900 Jan 25 17:31 <a href="Properties.dia">Properties.dia</a>
+-rw-rw-r-- 1 pbw pbw 32437 Jan 25 17:31 <a href="Properties.png">Properties.png</a>
+-rw-r--r-- 1 pbw pbw 556 Jan 25 23:48 <a href="Properties.png.xml">Properties.png.xml</a>
+-rw-rw-r-- 1 pbw pbw 2180 Jan 25 17:31 <a href="PropertyClasses.dia">PropertyClasses.dia</a>
+-rw-rw-r-- 1 pbw pbw 17581 Jan 25 17:31 <a href="PropertyClasses.png">PropertyClasses.png</a>
+-rw-rw-r-- 1 pbw pbw 1573 Jan 25 17:31 <a href="PropertyConsts.dia">PropertyConsts.dia</a>
+-rw-rw-r-- 1 pbw pbw 20379 Jan 25 17:31 <a href="PropertyConsts.png">PropertyConsts.png</a>
+-rw-r--r-- 1 pbw pbw 575 Jan 25 23:47 <a href="PropertyConsts.png.xml">PropertyConsts.png.xml</a>
+-rw-rw-r-- 1 pbw pbw 1333 Jan 25 17:31 <a href="PropertyStaticsOverview.dia">PropertyStaticsOverview.dia</a>
+-rw-rw-r-- 1 pbw pbw 7503 Jan 25 17:31 <a href="PropertyStaticsOverview.png">PropertyStaticsOverview.png</a>
+-rw-rw-r-- 1 pbw pbw 3068 Jan 25 17:31 <a href="SAXParsing.dia">SAXParsing.dia</a>
+-rw-rw-r-- 1 pbw pbw 24482 Jan 25 17:31 <a href="SAXParsing.png">SAXParsing.png</a>
+-rw-rw-r-- 1 pbw pbw 964 Jan 25 17:31 <a href="VerticalAlign.dia">VerticalAlign.dia</a>
+-rw-rw-r-- 1 pbw pbw 7091 Jan 25 17:31 <a href="VerticalAlign.png">VerticalAlign.png</a>
+-rw-r--r-- 1 pbw pbw 565 Jan 25 23:48 <a href="VerticalAlign.png.xml">VerticalAlign.png.xml</a>
+-rw-rw-r-- 1 pbw pbw 2004 Jan 25 17:31 <a href="XML-event-buffer.dia">XML-event-buffer.dia</a>
+-rw-rw-r-- 1 pbw pbw 20415 Jan 25 17:31 <a href="XML-event-buffer.png">XML-event-buffer.png</a>
+-rw-rw-r-- 1 pbw pbw 2322 Jan 25 17:31 <a href="XMLEventQueue.dia">XMLEventQueue.dia</a>
+-rw-rw-r-- 1 pbw pbw 11643 Jan 25 17:31 <a href="XMLEventQueue.png">XMLEventQueue.png</a>
+-rw-r--r-- 1 pbw pbw 6584 Jan 26 11:56 <a href="alt.properties.xml">alt.properties.xml</a>
+-rw-rw-r-- 1 pbw pbw 1152 Jan 25 17:31 <a href="book.xml">book.xml</a>
+-rw-r--r-- 1 pbw pbw 7834 Jan 26 13:07 <a href="classes-overview.xml">classes-overview.xml</a>
+-rw-rw-r-- 1 pbw pbw 8330 Jan 25 17:31 <a href="parserPersistence.png">parserPersistence.png</a>
+-rw-rw-r-- 1 pbw pbw 1974 Jan 25 17:31 <a href="processPlumbing.dia">processPlumbing.dia</a>
+-rw-rw-r-- 1 pbw pbw 8689 Jan 25 17:31 <a href="processPlumbing.png">processPlumbing.png</a>
+-rw-r--r-- 1 pbw pbw 5123 Jan 26 11:58 <a href="properties-classes.xml">properties-classes.xml</a>
+-rw-rw-r-- 1 pbw pbw 3115 Jan 25 17:31 <a href="property-super-classes-full.dia">property-super-classes-full.dia</a>
+-rw-rw-r-- 1 pbw pbw 89360 Jan 25 17:31 <a href="property-super-classes-full.png">property-super-classes-full.png</a>
+-rw-r--r-- 1 pbw pbw 10221 Jan 25 23:49 <a href="propertyExpressions.xml">propertyExpressions.xml</a>
+-rw-r--r-- 1 pbw pbw 9361 Jan 26 11:59 <a href="xml-parsing.xml">xml-parsing.xml</a>
+-rw-rw-r-- 1 pbw pbw 2655 Jan 25 17:31 <a href="xmlevent-queue.dia">xmlevent-queue.dia</a>
+-rw-rw-r-- 1 pbw pbw 12326 Jan 25 17:31 <a href="xmlevent-queue.png">xmlevent-queue.png</a>
+ </code></pre>
+ <hr>
+ </body>
+</html>
--- /dev/null
+<?xml version="1.0" encoding="ISO-8859-1"?>
+<!-- $Id$ -->
+<!--
+<!DOCTYPE document SYSTEM "../xml-docs/dtd/document-v10.dtd">
+-->
+
+<document>
+ <header>
+ <title>Properties$classes</title>
+ <authors>
+ <person name="Peter B. West" email="pbwest@powerup.com.au"/>
+ </authors>
+ </header>
+ <body>
+ <!-- one of (anchor s1) -->
+ <s1 title="fo.Properties and the nested properties classes">
+ <figure src="PropertyClasses.png" alt="Nested property and
+ top-level classes"/>
+ <s2 title="Nested property classes">
+ <p>
+ Given the intention that individual properties have only a
+ <em>virtual</em> instantiation in the arrays of
+ <code>PropertyConsts</code>, these classes are intended to
+ remain as repositories of static data and methods. The name
+ of each property is entered in the
+ <code>PropNames.propertyNames</code> array of
+ <code>String</code>s, and each has a unique integer constant
+ defined, corresponding to the offset of the property name in
+ that array.
+ </p>
+ <s3 title="Fields common to all classes">
+ <dl>
+ <dt><code>final int dataTypes</code></dt>
+ <dd>
+ This field defines the allowable data types which may be
+ assigned to the property. The value is chosen from the
+ data type constants defined in <code>Properties</code>, and
+ may consist of more than one of those constants,
+ bit-ORed together.
+ </dd>
+ <dt><code>final int traitMapping</code></dt>
+ <dd>
+ This field defines the mapping of properties to traits
+ in the <code>Area tree</code>. The value is chosen from the
+ trait mapping constants defined in <code>Properties</code>,
+ and may consist of more than one of those constants,
+ bit-ORed together.
+ </dd>
+ <dt><code>final int initialValueType</code></dt>
+ <dd>
+ This field defines the data type of the initial value
+ assigned to the property. The value is chosen from the
+ initial value type constants defined in
+ <code>Properties</code>.
+ </dd>
+ <dt><code>final int inherited</code></dt>
+ <dd>
+ This field defines the kind of inheritance applicable to
+ the property. The value is chosen from the inheritance
+ constants defined in <code>Properties</code>.
+ </dd>
+ </dl>
+ </s3>
+ <s3 title="Datatype dependent fields">
+ <dl>
+ <dt>Enumeration types</dt>
+ <dd>
+ <strong><code>final String[] enums</code></strong><br/>
+ This array contains the <code>NCName</code> text
+ values of the enumeration. In the current
+ implementation, it always contains a null value at
+ <code>enums[0]</code>.<br/> <br/>
+
+ <strong><code>final String[]
+ enumValues</code></strong><br/> When the number of
+ enumeration values is small,
+ <code>enumValues</code> is a reference to the
+ <code>enums</code> array.<br/> <br/>
+
+ <strong><code>final HashMap
+ enumValues</code></strong><br/> When the number of
+ enumeration values is larger,
+ <code>enumValues</code> is a
+ <code>HashMap</code> statically initialized to
+ contain the integer constant values corresponding to
+ each text value, indexed by the text
+ value.<br/> <br/>
+
+ <strong><code>final int</code></strong>
+ <em><code>enumeration-constants</code></em><br/> A
+ unique integer constant is defined for each of the
+ possible enumeration values.<br/> <br/>
+ </dd>
+ <dt>Many types:
+ <code>final</code> <em>datatype</em>
+ <code>initialValue</code></dt>
+ <dd>
+ When the initial datatype does not have an implicit
+ initial value (as, for example, does type
+ <code>AUTO</code>) the initial value for the property is
+ assigned to this field. The type of this field will
+ vary according to the <code>initialValueType</code>
+ field.
+ </dd>
+ <dt>AUTO: <code>PropertyValueList auto(property,
+ list)</code></dt>
+ <dd>
+ When <em>AUTO</em> is a legal value type, the
+ <code>auto()</code> method must be defined in the property
+ class.<br/>
+ <em>NOT YET IMPLEMENTED.</em>
+ </dd>
+ <dt>COMPLEX: <code>PropertyValueList complex(property,
+ list)</code></dt>
+ <dd>
+ <em>COMPLEX</em> is specified as a value type when complex
+ conditions apply to the selection of a value type, or
+ when lists of values are acceptable. To process and
+ validate such a property value assignment, the
+ <code>complex()</code> method must be defined in the
+ property class.
+ </dd>
+ </dl>
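+ <p>
+ The two forms of enumeration lookup might look like the
+ following sketch, which uses the values of
+ <em>absolute-position</em> as sample data. The class and
+ method names are hypothetical, and treating index 0 as "no
+ such value" is an assumption about why <code>enums[0]</code>
+ is null. Because one class cannot declare
+ <code>enumValues</code> with two types, the sketch reaches the
+ small case through <code>enums</code> directly.
+ </p>
+ <source><![CDATA[
```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of enumeration lookup for one property class. */
class EnumLookupSketch {
    // Index 0 is null, so valid enumeration constants start at 1.
    static final String[] enums = { null, "auto", "absolute", "fixed" };

    // One unique integer constant per enumeration value.
    static final int AUTO = 1;
    static final int ABSOLUTE = 2;
    static final int FIXED = 3;

    /** Small enumerations: scan the array. Returns 0 if the name is unknown. */
    static int lookupLinear(String name) {
        for (int i = 1; i < enums.length; i++) {
            if (enums[i].equals(name)) {
                return i;
            }
        }
        return 0;
    }

    // Larger enumerations: a statically initialized map from text to constant.
    static final Map<String, Integer> enumValues = new HashMap<>();
    static {
        for (int i = 1; i < enums.length; i++) {
            enumValues.put(enums[i], i);
        }
    }

    /** Map-based lookup. Returns 0 if the name is unknown. */
    static int lookupMapped(String name) {
        return enumValues.getOrDefault(name, 0);
    }
}
```
+ ]]></source>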
+ </s3>
+ </s2>
+ <s2 title="Nested property pseudo-classes">
+ <p>
+ The property pseudo-classes are classes, like
+ <code>ColorCommon</code> which contain values, particularly
+ <em>enums</em>, which are common to a number of actual
+ properties.
+ </p>
+ </s2>
+ <p>
+ <strong>Previous:</strong> <link href= "classes-overview.html"
+ >property classes overview.</link>
+ </p>
+ </s1>
+ </body>
+</document>
--- /dev/null
+<?xml version="1.0" encoding="ISO-8859-1"?>
+<!-- $Id$ -->
+<!--
+<!DOCTYPE document SYSTEM "../xml-docs/dtd/document-v10.dtd">
+-->
+
+<document>
+ <header>
+ <title>Property Expression Parsing</title>
+ <authors>
+ <person id="pbw" name="Peter B. West" email="pbwest@powerup.com.au"/>
+ </authors>
+ </header>
+ <body>
+ <!-- one of (anchor s1) -->
+ <s1 title="Property expression parsing">
+ <note>
+ The following discussion of the experiments with alternate
+ property expression parsing is very much a work in progress,
+ and subject to sudden changes.
+ </note>
+ <p>
+ The parsing of property value expressions is handled by two
+ closely related classes: <code>PropertyTokenizer</code> and its
+ subclass, <code>PropertyParser</code>.
+ <code>PropertyTokenizer</code>, as the name suggests, handles
+ the tokenizing of the expression, handing <em>tokens</em>
+ back to its subclass,
+ <code>PropertyParser</code>. <code>PropertyParser</code>, in
+ turn, returns a <code>PropertyValueList</code>, a list of
+ <code>PropertyValue</code>s.
+ </p>
+ <p>
+ The tokenizer and parser rely in turn on the datatype
+ definition from the <code>org.apache.fop.datatypes</code>
+ package and the datatype <code>static final int</code>
+ constants from <code>PropertyConsts</code>.
+ </p>
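+ <p>
+ The division of labour between the two classes can be
+ illustrated with a heavily simplified sketch. It is not the
+ FOP code: the class names, fields and methods are
+ hypothetical, only integers, names and <code>+</code> are
+ recognized, and the result is a list of strings rather than
+ a <code>PropertyValueList</code>. The token numbers follow the
+ tokenizer constants listed in this document.
+ </p>
+ <source><![CDATA[
```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical, heavily simplified tokenizer: integers, names and '+'. */
class TokenizerSketch {
    static final int EOF = 0, NCNAME = 1, PLUS = 7, INTEGER = 15;

    protected final String expr;
    protected int pos = 0;
    protected int currentToken;
    protected String currentTokenValue;

    TokenizerSketch(String expr) {
        this.expr = expr;
    }

    /** Advance to the next token, setting currentToken and its text. */
    protected void next() {
        while (pos < expr.length() && Character.isWhitespace(expr.charAt(pos))) {
            pos++;
        }
        if (pos >= expr.length()) {
            currentToken = EOF;
            currentTokenValue = null;
            return;
        }
        int start = pos;
        char c = expr.charAt(pos);
        if (c == '+') {
            pos++;
            currentToken = PLUS;
        } else if (Character.isDigit(c)) {
            while (pos < expr.length() && Character.isDigit(expr.charAt(pos))) {
                pos++;
            }
            currentToken = INTEGER;
        } else if (Character.isLetter(c)) {
            while (pos < expr.length()
                    && (Character.isLetterOrDigit(expr.charAt(pos)) || expr.charAt(pos) == '-')) {
                pos++;
            }
            currentToken = NCNAME;
        } else {
            throw new IllegalArgumentException("Unexpected character: " + c);
        }
        currentTokenValue = expr.substring(start, pos);
    }
}

/** The parser subclass pulls tokens from its superclass and builds a list. */
class ParserSketch extends TokenizerSketch {
    ParserSketch(String expr) {
        super(expr);
    }

    /** Returns one "tokenType:text" entry per token, in order. */
    List<String> parse() {
        List<String> values = new ArrayList<>();
        for (next(); currentToken != EOF; next()) {
            values.add(currentToken + ":" + currentTokenValue);
        }
        return values;
    }
}
```
+ ]]></source>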
+ <s2 title="Data types">
+ <p>
+ The data types currently defined in
+ <code>org.apache.fop.datatypes</code> include:
+ </p>
+ <table>
+ <tr><th colspan="2">Numbers and lengths</th></tr>
+ <tr>
+ <th>Numeric</th>
+ <td colspan="3">
+ The fundamental numeric data type. <em>Numerics</em> of
+ various types are constructed by the classes listed
+ below.
+ </td>
+ </tr>
+ <tr>
+ <td/>
+ <th colspan="3">Constructor classes for <em>Numeric</em></th>
+ </tr>
+ <tr>
+ <td/><td>Angle</td>
+ <td colspan="2">In degrees(deg), gradians(grad) or
+ radians(rad)</td>
+ </tr>
+ <tr>
+ <td/><td>Ems</td>
+ <td colspan="2">Relative length in <em>ems</em></td>
+ </tr>
+ <tr>
+ <td/><td>Frequency</td>
+ <td colspan="2">In hertz(Hz) or kilohertz(kHz)</td>
+ </tr>
+ <tr>
+ <td/><td>IntegerType</td><td/>
+ </tr>
+ <tr>
+ <td/><td>Length</td>
+ <td colspan="2">In centimetres(cm), millimetres(mm),
+ inches(in), points(pt), picas(pc) or pixels(px)</td>
+ </tr>
+ <tr>
+ <td/><td>Percentage</td><td/>
+ </tr>
+ <tr>
+ <td/><td>Time</td>
+ <td>In seconds(s) or milliseconds(ms)</td>
+ </tr>
+ <tr><th colspan="2">Strings</th></tr>
+ <tr>
+ <th>StringType</th>
+ <td colspan="3">
+ Base class for data types which result in a <em>String</em>.
+ </td>
+ </tr>
+ <tr>
+ <td/><th>Literal</th>
+ <td colspan="2">
+ A subclass of <em>StringType</em> for literals which
+ exceed the constraints of an <em>NCName</em>.
+ </td>
+ </tr>
+ <tr>
+ <td/><th>MimeType</th>
+ <td colspan="2">
+ A subclass of <em>StringType</em> for literals which
+ represent a mime type.
+ </td>
+ </tr>
+ <tr>
+ <td/><th>UriType</th>
+ <td colspan="2">
+ A subclass of <em>StringType</em> for literals which
+ represent a URI, as specified by the argument to
+ <em>url()</em>.
+ </td>
+ </tr>
+ <tr>
+ <td/><th>NCName</th>
+ <td colspan="2">
+ A subclass of <em>StringType</em> for literals which
+ meet the constraints of an <em>NCName</em>.
+ </td>
+ </tr>
+ <tr>
+ <td/><td/><th>Country</th>
+ <td>An RFC 3066/ISO 3166 country code.</td>
+ </tr>
+ <tr>
+ <td/><td/><th>Language</th>
+ <td>An RFC 3066/ISO 639 language code.</td>
+ </tr>
+ <tr>
+ <td/><td/><th>Script</th>
+ <td>An ISO 15924 script code.</td>
+ </tr>
+ <tr><th colspan="2">Enumerated types</th></tr>
+ <tr>
+ <th>EnumType</th>
+ <td colspan="3">
+ An integer representing one of the tokens in a set of
+ enumeration values.
+ </td>
+ </tr>
+ <tr>
+ <td/><th>MappedEnumType</th>
+ <td colspan="2">
+ A subclass of <em>EnumType</em>. Maintains a
+ <em>String</em> with the value to which the associated
+ "raw" enumeration token maps. E.g., the
+ <em>font-size</em> enumeration value "medium" maps to
+ the <em>String</em> "12pt".
+ </td>
+ </tr>
+ <tr><th colspan="2">Colors</th></tr>
+ <tr>
+ <th>ColorType</th>
+ <td colspan="3">
+ Maintains a four-element array of float, derived from
+ the name of a standard colour, the name returned by a
+ call to <em>system-color()</em>, or an RGB
+ specification.
+ </td>
+ </tr>
+ <tr><th colspan="2">Fonts</th></tr>
+ <tr>
+ <th>FontFamilySet</th>
+ <td colspan="3">
+ Maintains an array of <em>String</em>s containing a
+ prioritized list of possibly generic font family names.
+ </td>
+ </tr>
+ <tr><th colspan="2">Pseudo-types</th></tr>
+ <tr>
+ <td colspan="4">
+ A variety of pseudo-types have been defined as
+ convenience types for frequently appearing enumeration
+ token values, or for other special purposes.
+ </td>
+ </tr>
+ <tr>
+ <th>Inherit</th>
+ <td colspan="3">
+ For values of <em>inherit</em>.
+ </td>
+ </tr>
+ <tr>
+ <th>Auto</th>
+ <td colspan="3">
+ For values of <em>auto</em>.
+ </td>
+ </tr>
+ <tr>
+ <th>None</th>
+ <td colspan="3">
+ For values of <em>none</em>.
+ </td>
+ </tr>
+ <tr>
+ <th>Bool</th>
+ <td colspan="3">
+ For values of <em>true/false</em>.
+ </td>
+ </tr>
+ <tr>
+ <th>FromNearestSpecified</th>
+ <td colspan="3">
+ Created to ensure that, when associated with
+ a shorthand, the <em>from-nearest-specified-value()</em>
+ core function is the sole component of the expression.
+ </td>
+ </tr>
+ <tr>
+ <th>FromParent</th>
+ <td colspan="3">
+ Created to ensure that, when associated with
+ a shorthand, the <em>from-parent()</em>
+ core function is the sole component of the expression.
+ </td>
+ </tr>
+ </table>
+ </s2>
+ <s2 title="Tokenizer">
+ <p>
+ The tokenizer returns one of the following token
+ values:
+ </p>
+ <source>
+ static final int
+ EOF = 0
+ ,NCNAME = 1
+ ,MULTIPLY = 2
+ ,LPAR = 3
+ ,RPAR = 4
+ ,LITERAL = 5
+ ,FUNCTION_LPAR = 6
+ ,PLUS = 7
+ ,MINUS = 8
+ ,MOD = 9
+ ,DIV = 10
+ ,COMMA = 11
+ ,PERCENT = 12
+ ,COLORSPEC = 13
+ ,FLOAT = 14
+ ,INTEGER = 15
+ ,ABSOLUTE_LENGTH = 16
+ ,RELATIVE_LENGTH = 17
+ ,TIME = 18
+ ,FREQ = 19
+ ,ANGLE = 20
+ ,INHERIT = 21
+ ,AUTO = 22
+ ,NONE = 23
+ ,BOOL = 24
+ ,URI = 25
+ ,MIMETYPE = 26
+ // NO_UNIT is a transient token for internal use only. It is
+ // never set as the end result of parsing a token.
+ ,NO_UNIT = 27
+ ;
+ </source>
+ <p>
+ Most of these tokens are self-explanatory, but a few need
+ further comment.
+ </p>
+ <dl>
+ <dt>AUTO</dt>
+ <dd>
+ Because of its frequency of occurrence, and the fact that
+ it is always the <em>initial value</em> for any property
+ which supports it, AUTO has been promoted into a
+ pseudo-type with its own datatype class. Therefore, it is
+ also reported as a token.
+ </dd>
+ <dt>NONE</dt>
+ <dd>
+ Similarly to AUTO, NONE has been promoted to a pseudo-type
+ because of its frequency.
+ </dd>
+ <dt>BOOL</dt>
+ <dd>
+ There is a <em>de facto</em> boolean type buried in the
+ enumeration types for many of the properties. It has been
+ specified as a type in its own right in this code.
+ </dd>
+ <dt>MIMETYPE</dt>
+ <dd>
+ The property <code>content-type</code> introduces this
+ complication. It can have two values of the form
+ <strong>content-type:</strong><em>mime-type</em>
+ (e.g. <code>content-type="content-type:xml/svg"</code>) or
+ <strong>namespace-prefix:</strong><em>prefix</em>
+ (e.g. <code>content-type="namespace-prefix:svg"</code>). The
+ experimental code reduces these options to the payload
+ in each case: an <code>NCName</code> in the case of a
+ namespace prefix, and a MIMETYPE in the case of a
+ content-type specification. <code>NCName</code>s cannot
+ contain a "/".
+ </dd>
+ </dl>
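+ <p>
+ The reduction of a <code>content-type</code> value to its
+ payload might be sketched as follows. This is an illustration
+ only, not the experimental code: the class
+ <code>ContentTypeSketch</code> and its <code>reduce()</code>
+ method are hypothetical names, and the token is reported as
+ a string rather than as one of the integer constants above.
+ </p>
+ <source><![CDATA[
```java
/** Hypothetical sketch: reduce a content-type value to a token name and payload. */
class ContentTypeSketch {
    private static final String MIME_PREFIX = "content-type:";
    private static final String NS_PREFIX = "namespace-prefix:";

    /** Returns a two-element array: the token name, then the payload. */
    static String[] reduce(String value) {
        if (value.startsWith(MIME_PREFIX)) {
            // The payload is a mime type, which may contain a "/".
            return new String[] {"MIMETYPE", value.substring(MIME_PREFIX.length())};
        }
        if (value.startsWith(NS_PREFIX)) {
            // The payload is a namespace prefix, i.e. an NCName.
            return new String[] {"NCNAME", value.substring(NS_PREFIX.length())};
        }
        // Other values are outside the scope of this sketch.
        throw new IllegalArgumentException("unrecognised content-type: " + value);
    }
}
```
+ ]]></source>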
+ </s2>
+ <s2 title="Parser">
+ <p>
+ The parser returns a <code>PropertyValueList</code>,
+ necessary because of the possibility that a list of
+ <code>PropertyValue</code> elements may be returned from the
+ expressions of some properties.
+ </p>
+ <p>
+ <code>PropertyValueList</code>s may contain
+ <code>PropertyValue</code>s or other
+ <code>PropertyValueList</code>s. This latter provision is
+ necessitated by the peculiar case of
+ <em>text-shadow</em>, which may contain whitespace separated
+ sublists of either two or three elements, separated from one
+ another by commas. To accommodate this peculiarity, comma
+ separated elements are added to the top-level list, while
+ whitespace separated values are always collected into
+ sublists to be added to the top-level list.
+ </p>
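+ <p>
+ The list structure described above can be illustrated with a
+ small sketch. It works on raw strings rather than on the
+ real <code>PropertyValue</code> classes, and the class name
+ <code>ValueListSketch</code> is hypothetical; it is meant
+ only to show how commas and whitespace map onto the
+ top-level list and its sublists.
+ </p>
+ <source><![CDATA[
```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch of the top-level list / sublist structure. */
class ValueListSketch {
    /**
     * Comma-separated elements go into the top-level list. An element made
     * of several whitespace-separated values is collected into a sublist.
     */
    static List<Object> parse(String expr) {
        List<Object> top = new ArrayList<Object>();
        for (String part : expr.split(",")) {
            String[] values = part.trim().split("\\s+");
            if (values.length == 1) {
                top.add(values[0]);  // a single value is added directly
            } else {
                top.add(new ArrayList<String>(Arrays.asList(values)));  // a sublist
            }
        }
        return top;
    }
}
```
+ ]]></source>
+ <p>
+ With this sketch, a <em>text-shadow</em>-like value such as
+ "2pt 2pt red, 4pt 4pt" yields a top-level list of two
+ sublists, of three and two elements respectively.
+ </p>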
+ <p>
+ Other special cases include the processing of the core
+ functions <code>from-parent()</code> and
+ <code>from-nearest-specified-value()</code> when these
+ function calls are assigned to a shorthand property, or used
+ with a shorthand property name as an argument. In these
+ cases, the function call must be the sole component of the
+ expression. The pseudo-type classes
+ <code>FromParent</code> and
+ <code>FromNearestSpecified</code> are generated in these
+ circumstances so that an exception will be thrown if they
+ are involved in expression evaluation with other
+ components. (See Rec. Section 5.10.4 Property Value
+ Functions.)
+ </p>
+ <p>
+ The experimental code is a simple extension of the existing
+ parser code, which itself borrowed heavily from James
+ Clark's XT processor.
+ </p>
+ </s2>
+ </s1>
+ </body>
+</document>
--- /dev/null
+<?xml version="1.0" encoding="ISO-8859-1"?>
+<!-- $Id$ -->
+<!--
+<!DOCTYPE document SYSTEM "../xml-docs/dtd/document-v10.dtd">
+-->
+
+<document>
+ <header>
+ <title>Integrating XML Parsing</title>
+ <authors>
+ <person name="Peter B. West" email="pbwest@powerup.com.au"/>
+ </authors>
+ </header>
+ <body>
+ <!-- one of (anchor s1) -->
+ <s1 title="An alternative parser integration">
+ <p>
+ This note proposes an alternative method of integrating the
+ output of the SAX parsing of the Flow Object (FO) tree into
+ FOP processing. The purpose of the proposed changes is to
+ provide for better decomposition of the process of analysing
+ and rendering an fo tree such as is represented in the output
+ from initial (XSLT) processing of an XML source document.
+ </p>
+ <s2 title="Structure of SAX parsing">
+ <p>
+ Figure 1 is a schematic representation of the process of SAX
+ parsing of an input source. SAX parsing involves the
+ registration, with an object implementing the
+ <code>XMLReader</code> interface, of a
+ <code>ContentHandler</code> which contains a callback
+ routine for each of the event types encountered by the
+ parser, e.g., <code>startDocument()</code>,
+ <code>startElement()</code>, <code>characters()</code>,
+ <code>endElement()</code> and <code>endDocument()</code>.
+ Parsing is initiated by a call to the <code>parse()</code>
+ method of the <code>XMLReader</code>. Note that the call to
+ <code>parse()</code> and the calls to individual callback
+ methods are synchronous: <code>parse()</code> will only
+ return when the last callback method returns, and each
+ callback must complete before the next is called.<br/><br/>
+ <strong>Figure 1</strong>
+ </p>
+ <figure src="SAXParsing.png" alt="SAX parsing schematic"/>
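+ <p>
+ The registration and the synchronous calling sequence can be
+ seen in a minimal sketch using the standard JAXP and SAX
+ classes. The classes <code>RecordingHandler</code> and
+ <code>SaxDemo</code> are hypothetical; they are not part of
+ FOP, and the handler does nothing but record the order in
+ which the callbacks arrive.
+ </p>
+ <source><![CDATA[
```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical handler that only records the order of the SAX callbacks. */
class RecordingHandler extends DefaultHandler {
    final List<String> log = new ArrayList<String>();

    public void startDocument() { log.add("startDocument"); }

    public void startElement(String uri, String localName, String qName, Attributes atts) {
        log.add("start:" + qName);
    }

    public void characters(char[] ch, int start, int length) {
        log.add("chars:" + new String(ch, start, length));
    }

    public void endElement(String uri, String localName, String qName) {
        log.add("end:" + qName);
    }

    public void endDocument() { log.add("endDocument"); }
}

class SaxDemo {
    /** Parses the given XML text and returns the recorded callback sequence. */
    static List<String> events(String xml) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            XMLReader reader = factory.newSAXParser().getXMLReader();
            RecordingHandler handler = new RecordingHandler();
            reader.setContentHandler(handler);
            // parse() returns only after the last callback has returned.
            reader.parse(new InputSource(new StringReader(xml)));
            return handler.log;
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }
}
```
+ ]]></source>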
+ <p>
+ In the process of parsing, the hierarchical structure of the
+ original FO tree is flattened into a number of streams of
+ events of the same type which are reported in the sequence
+ in which they are encountered. Apart from that, the API
+ imposes no structure or constraint which expresses the
+ relationship between, e.g., a startElement event and the
+ endElement event for the same element. To the extent that
+ such relationship information is required, it must be
+ managed by the callback routines.
+ </p>
+ <p>
+ The most direct approach here is to build the tree
+ "invisibly"; to bury within the callback routines the
+ necessary code to construct the tree. In the simplest case,
+ the whole of the FO tree is built within the call to
+ <code>parse()</code>, and that in-memory tree is subsequently
+ processed to (a) validate the FO structure, and (b)
+ construct the Area tree. The problem with this approach is
+ the potential size of the FO tree in memory. FOP has
+ suffered from this problem in the past.
+ </p>
+ </s2>
+ <s2 title="Cluttered callbacks">
+ <p>
+ On the other hand, the callback code may become increasingly
+ complex as tree validation and the triggering of the Area
+ tree processing and subsequent rendering is moved into the
+ callbacks, typically the <code>endElement()</code> method.
+ In order to overcome acute memory problems, the FOP code was
+ recently modified in this way, to trigger Area tree building
+ and rendering in the <code>endElement()</code> method, when
+ the end of a page-sequence was detected.
+ </p>
+ <p>
+ The drawback with such a method is that it becomes difficult
+ to determine the order of events and the circumstances in
+ which any particular processing events are triggered. When
+ the processing events are inherently self-contained, this is
+ irrelevant. But the more complex and context-dependent the
+ relationships are among the processing elements, the more
+ obscurity is engendered in the code by such "side-effect"
+ processing.
+ </p>
+ </s2>
+ <s2 title="From passive to active parsing">
+ <p>
+ In order to solve the simultaneous problems of exposing the
+ structure of the processing and minimising in-memory
+ requirements, the experimental code separates the parsing of
+ the input source from the building of the FO tree and all
+ downstream processing. The callback routines become
+ minimal, consisting of the creation and buffering of
+ <code>XMLEvent</code> objects as a <em>producer</em>. All
+ of these objects are effectively merged into a single event
+ stream, in strict event order, for subsequent access by the
+ FO tree building process, acting as a
+ <em>consumer</em>. In itself, this does not reduce the
+ footprint. The reduction comes when the approach is generalised
+ to modularise FOP processing.<br/><br/> <strong>Figure 2</strong>
+ </p>
+ <figure src="XML-event-buffer.png" alt="XML event buffer"/>
+ <p>
+ The most useful change that this brings about is the switch
+ from <em>passive</em> to <em>active</em> XML element
+ processing. The process of parsing now becomes visible to
+ the controlling process. All local validation requirements,
+ all object and data structure building, is initiated by the
+ process(es) <em>get</em>ting from the queue - in the case
+ above, the FO tree builder.
+ </p>
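+ <p>
+ A minimal sketch of such a producer-consumer pair follows.
+ It is not the experimental code: a
+ <code>java.util.concurrent.ArrayBlockingQueue</code> stands
+ in for the synchronized circular buffer, and
+ <code>SketchEvent</code> and <code>EventPipeline</code> are
+ hypothetical, much simplified classes.
+ </p>
+ <source><![CDATA[
```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

/** Simplified, hypothetical stand-in for an XML event object. */
class SketchEvent {
    static final int STARTELEMENT = 1, ENDELEMENT = 2, ENDDOCUMENT = 3;
    final int type;
    final String qName;

    SketchEvent(int type, String qName) {
        this.type = type;
        this.qName = qName;
    }
}

class EventPipeline {
    /** Starts a producer thread and consumes its events on the calling thread. */
    static List<String> pump(final String[] elementNames) {
        // A small bounded buffer: the producer blocks when it is full, so the
        // processing rates of producer and consumer stay coupled.
        final BlockingQueue<SketchEvent> buffer = new ArrayBlockingQueue<SketchEvent>(4);
        Thread producer = new Thread(new Runnable() {
            public void run() {
                try {
                    for (String name : elementNames) {
                        buffer.put(new SketchEvent(SketchEvent.STARTELEMENT, name));
                        buffer.put(new SketchEvent(SketchEvent.ENDELEMENT, name));
                    }
                    buffer.put(new SketchEvent(SketchEvent.ENDDOCUMENT, null));
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        });
        producer.start();
        List<String> seen = new ArrayList<String>();
        try {
            // The consumer actively gets events, in strict event order.
            for (SketchEvent ev = buffer.take();
                    ev.type != SketchEvent.ENDDOCUMENT;
                    ev = buffer.take()) {
                seen.add((ev.type == SketchEvent.STARTELEMENT ? "start:" : "end:") + ev.qName);
            }
            producer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while consuming", e);
        }
        return seen;
    }
}
```
+ ]]></source>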
+ </s2>
+ <s2 title="XMLEvent methods">
+ <anchor id="XMLEvent-methods"/>
+ <p>
+ The experimental code uses a class <strong>XMLEvent</strong>
+ to provide the objects which are placed in the queue.
+ <em>XMLEvent</em> includes a variety of methods to access
+ elements in the queue. Namespace URIs encountered in
+ parsing are maintained in a <code>static</code>
+ <code>HashMap</code> where they are associated with a unique
+ integer index. This integer value is used in the signature
+ of some of the access methods.
+ </p>
+ <dl>
+ <dt>XMLEvent getEvent(SyncedCircularBuffer events)</dt>
+ <dd>
+ This is the basis of all of the queue access methods. It
+ returns the next element from the queue, which may be a
+ pushback element.
+ </dd>
+ <dt>XMLEvent getEndDocument(events)</dt>
+ <dd>
+ <em>get</em> and discard elements from the queue
+ until an ENDDOCUMENT element is found and returned.
+ </dd>
+ <dt> XMLEvent expectEndDocument(events)</dt>
+ <dd>
+ If the next element on the queue is an ENDDOCUMENT event,
+ return it. Otherwise, push the element back and throw an
+ exception. Each of the <em>get</em> methods (except
+ <em>getEvent()</em> itself) has a corresponding
+ <em>expect</em> method.
+ </dd>
+ <dt>XMLEvent get/expectStartElement(events)</dt>
+ <dd> Return the next STARTELEMENT event from the queue.</dd>
+ <dt>XMLEvent get/expectStartElement(events, String
+ qName)</dt>
+ <dd>
+ Return the next STARTELEMENT with a QName matching
+ <em>qName</em>.
+ </dd>
+ <dt>
+ XMLEvent get/expectStartElement(events, int uriIndex,
+ String localName)
+ </dt>
+ <dd>
+ Return the next STARTELEMENT with a URI indicated by the
+ <em>uriIndex</em> and a local name matching <em>localName</em>.
+ </dd>
+ <dt>
+ XMLEvent get/expectStartElement(events, LinkedList list)
+ </dt>
+ <dd>
+ <em>list</em> contains instances of the nested class
+ <code>UriLocalName</code>, which hold a
+ <em>uriIndex</em> and a <em>localName</em>. Return
+ the next STARTELEMENT with a URI indicated by the
+ <em>uriIndex</em> and a local name matching
+ <em>localName</em> from any element of
+ <em>list</em>.
+ </dd>
+ <dt>XMLEvent get/expectEndElement(events)</dt>
+ <dd>Return the next ENDELEMENT.</dd>
+ <dt>XMLEvent get/expectEndElement(events, qName)</dt>
+ <dd>Return the next ENDELEMENT with QName
+ <em>qName</em>.</dd>
+ <dt>XMLEvent get/expectEndElement(events, uriIndex, localName)</dt>
+ <dd>
+ Return the next ENDELEMENT with a URI indicated by the
+ <em>uriIndex</em> and a local name matching
+ <em>localName</em>.
+ </dd>
+ <dt>
+ XMLEvent get/expectEndElement(events, XMLEvent event)
+ </dt>
+ <dd>
+ Return the next ENDELEMENT whose <em>uriIndex</em> and
+ <em>localName</em> match those of the <em>event</em>
+ argument. This
+ is intended as a quick way to find the ENDELEMENT matching
+ a previously returned STARTELEMENT.
+ </dd>
+ <dt>XMLEvent get/expectCharacters(events)</dt>
+ <dd>Return the next CHARACTERS event.</dd>
+ </dl>
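+ <p>
+ The difference between the <em>get</em> and
+ <em>expect</em> families can be sketched as follows. The
+ classes <code>Ev</code> and <code>EventSource</code> are
+ hypothetical simplifications: they match on QName only, hold
+ the queue internally instead of taking a buffer argument,
+ and allow a single pushback element, which is an assumption
+ of this sketch.
+ </p>
+ <source><![CDATA[
```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical, simplified event: a type code plus a QName. */
class Ev {
    static final int STARTELEMENT = 1, ENDELEMENT = 2, CHARACTERS = 3;
    final int type;
    final String qName;

    Ev(int type, String qName) {
        this.type = type;
        this.qName = qName;
    }
}

/** Sketch of get/expect access over a queue with a one-element pushback. */
class EventSource {
    private final Deque<Ev> queue = new ArrayDeque<Ev>();
    private Ev pushedBack;  // at most one event is held here

    void add(Ev ev) { queue.addLast(ev); }

    /** Basis of all access: the pushback element if present, else the next queued event. */
    Ev getEvent() {
        if (pushedBack != null) {
            Ev ev = pushedBack;
            pushedBack = null;
            return ev;
        }
        return queue.removeFirst();  // throws NoSuchElementException when empty
    }

    /** get: discard events until a STARTELEMENT with the given QName is found. */
    Ev getStartElement(String qName) {
        Ev ev = getEvent();
        while (!(ev.type == Ev.STARTELEMENT && ev.qName.equals(qName))) {
            ev = getEvent();
        }
        return ev;
    }

    /** expect: the very next event must match; otherwise push it back and fail. */
    Ev expectStartElement(String qName) {
        Ev ev = getEvent();
        if (ev.type == Ev.STARTELEMENT && ev.qName.equals(qName)) {
            return ev;
        }
        pushedBack = ev;
        throw new IllegalStateException("expected start of " + qName);
    }
}
```
+ ]]></source>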
+ </s2>
+ <s2 title="FOP modularisation">
+ <p>
+ This same principle can be extended to the other major
+ sub-systems of FOP processing. In each case, while it is
+ possible to hold a complete intermediate result in memory,
+ the memory costs of that approach are too high. The
+ sub-systems - xml parsing, FO tree construction, Area tree
+ construction and rendering - must run in parallel if the
+ footprint is to be kept manageable. By creating a series of
+ producer-consumer pairs linked by synchronized buffers,
+ logical isolation can be achieved while rates of processing
+ remain coupled. By introducing feedback loops conveying
+ information about the completion of processing of the
+ elements, sub-systems can dispose of or precis those
+ elements without having to be tightly coupled to downstream
+ processes.<br/><br/>
+ <strong>Figure 3</strong>
+ </p>
+ <figure src="processPlumbing.png" alt="FOP modularisation"/>
+ </s2>
+ </s1>
+ </body>
+</document>
--- /dev/null
+<?xml version="1.0" standalone="no"?>
+<!-- Overview -->
+<document>
+ <header>
+ <title>Area Tree</title>
+ <subtitle>All you wanted to know about the Area Tree !</subtitle>
+ <authors> <person name="Keiron Liddle" email="keiron@aftexsw.com"/>
+ </authors>
+ </header>
+ <body><s1 title="Area Tree">
+ <p>Yet to come :))</p>
+ <note>The series of notes for developers has started but it has not yet gone so far ! Keep watching</note></s1>
+ </body></document>
\ No newline at end of file
--- /dev/null
+<?xml version="1.0"?>
+
+<book title="FOP Design" copyright="1999-2002 The Apache Software Foundation">
+ <external href="http://xml.apache.org/fop/" label="About FOP"/>
+ <separator/>
+ <external href="../index.html" label="NEW DESIGN" />
+ <page id="index" label="Understanding" source="understanding.xml"/>
+ <separator/>
+ <page id="xml_parsing" label="XML Parsing" source="xml_parsing.xml"/>
+ <page id="fo_tree" label="FO Tree" source="fo_tree.xml"/>
+ <page id="properties" label="Properties" source="properties.xml"/>
+ <page id="layout_managers" label="Layout Managers" source="layout_process.xml"/>
+ <page id="layout_process" label="Layout Process" source="layout_process.xml"/>
+ <page id="handling_attributes" label="Handling Attributes" source="handling_attributes.xml"/>
+ <page id="area_tree" label="Area Tree" source="area_tree.xml"/>
+ <page id="renderers" label="Renderers" source="renderers.xml"/>
+ <separator/>
+ <page id="images" label="Images" source="images.xml"/>
+ <page id="pdf_library" label="PDF Library" source="pdf_library.xml"/>
+ <page id="svg" label="SVG" source="svg.xml"/>
+ <separator/>
+ <page id="status" label="Status" source="status.xml"/>
+</book>
\ No newline at end of file
--- /dev/null
+<?xml version="1.0"?>
+<document>
+ <header>
+ <title>FO Tree</title>
+ <subtitle>All you wanted to know about FO Tree !</subtitle>
+ <authors> <person name="Keiron Liddle" email="keiron@aftexsw.com"/>
+ </authors>
+ </header>
+<body><s1 title="FO Tree">
+ <p>
+ The FO Tree is a representation of the XSL:FO document. This
+ represents the <strong>Objectify</strong> step from the
+ spec. The <strong>Refinement</strong> step is part of reading
+ and using the properties which may happen immediately or
+ during the layout process.
+ </p>
+
+
+
+<p>Each xml element is represented by a java object. For pagination the
+classes are in <code>org.apache.fop.fo.pagination.*</code>, for elements in the flow
+they are in <code>org.apache.fop.fo.flow.*</code> and some others are in
+<code>org.apache.fop.fo.*.</code></p>
+
+
+
+<p>The base class for all objects in the tree is FONode. The base class for
+all FO Objects is FObj.</p>
+
+
+
+<p>(insert diagram here)</p>
+
+
+
+<p>There is a class for each element in the FO set. An object is created for
+each element in the FO Tree. This object holds the properties for the FO
+Object.</p>
+
+
+
+ <p>
+ When the object is created it is set up. It is given its
+ element name, the FOUserAgent - for resolving properties
+ etc. - the logger and the attributes. The methods
+ <code>handleAttributes()</code> and
+ <code>setUserAgent()</code>, common to <code>FONode</code>,
+ are used in this process. The object will then be given any
+ text data or child elements. Then the <code>end()</code>
+ method is called. The end method is used by a number of
+ elements to indicate that they can do certain processing since
+ all the children have been added.
+ </p>
+
+
+
+<p>Some validity checking is done during these steps. The user can be warned of the error and processing can continue if possible.
+</p>
+
+
+ <p>
+ The FO Tree is simply a hierarchy of java objects that
+ represent the fo elements from xml. The traversal is done by
+ the layout or structure process only in the flow elements.
+ </p>
+
+
+
+<s2 title="Properties">
+
+
+
+<p>The XML attributes on each element are passed to the object. The objects
+that represent FO objects then convert the attributes into properties.
+</p>
+
+
+<p>Since properties can be inherited the PropertyList class handles resolving
+properties for a particular element.
+All properties are specified in an XML file. Classes are created
+automatically during the build process.
+</p>
+
+
+<p>(insert diagram here)</p>
+
+
+
+<p>In some cases the element may be moved to have a different parent, for
+example markers, or the inheritance could be different, for example
+initial property set.</p></s2>
+
+
+
+
+<s2 title="Foreign XML">
+
+
+<p>The base class for foreign XML is XMLObj. This class handles creating a
+DOM Element and the setting of attributes. It can also create a DOM
+Document if it is a top level element, class XMLElement.
+This class must be extended for the namespace of the XML elements. For
+unknown namespaces the class is UnknownXMLObj.</p>
+
+
+
+<p>(insert diagram here)</p>
+
+
+
+<p>If some special processing is needed then the top level element can extend
+the XMLObj. For example the SVGElement makes the special DOM required for
+batik and gets the size of the svg.
+</p>
+
+
+<p>Foreign XML will usually be in an fo:instream-foreign-object. The XML will
+be passed to the renderer as a DOM, where the renderer will be able to handle
+it. Other XML from an unknown namespace will be ignored.
+</p>
+
+
+<p>By using element mappings it is possible to read other XML and either</p>
+<ul><li>set information on the area tree</li>
+<li>create pseudo FO Objects that create areas in the area tree</li>
+<li>create FO Objects</li></ul>
+</s2>
+
+
+
+<s2 title="Unknown Elements">
+<p>If an element is in a known namespace but the element is unknown then an
+Unknown object is created. This is mainly to provide information to the
+user.
+This could happen if the fo document contains an element from a different
+version or the element is misspelt.</p>
+</s2>
+
+
+<s2 title="Page Masters">
+ <p>
+ The first elements in a document are the elements for the
+ page master setup. This is usually only a small number and
+ will be used throughout the document to create new pages.
+ These elements are kept as a factory to create the page and
+ appropriate regions whenever a new page is requested by the
+ layout. The objects in the FO Tree that represent these
+ elements are themselves the factory. The root element keeps
+ these objects as a factory for the page sequences.
+ </p>
+</s2>
+
+
+<s2 title="Flow">
+<p>The elements that are in the flow of the document are a set of elements
+that is needed for the layout process. Each element is important in the
+creation of areas.</p>
+</s2>
+
+
+
+<s2 title="Other Elements">
+
+
+
+ <p>
+ The remaining FO Objects are things like page-sequence,
+ title and color-profile. These are handled by their parent
+ element; i.e. the root looks after the declarations and the
+ declarations maintains a list of colour profiles. The
+ page-sequences are direct descendants of root.
+ </p>
+ </s2>
+
+
+
+<s2 title="Associated Tasks">
+
+
+
+<ul><li>Create diagrams</li>
+<li>Setup all properties and elements for XSL:FO</li>
+<li>Setup user agent for property resolution</li>
+<li>Verify all XML is handled appropriately</li></ul></s2></s1></body></document>
\ No newline at end of file
--- /dev/null
+<?xml version="1.0" standalone="no"?>
+<!-- Overview -->
+<document>
+ <header>
+ <title>Handling Attributes</title>
+ <subtitle>All you wanted to know about FOP Handling Attributes !</subtitle>
+ <authors> <person name="Keiron Liddle" email="keiron@aftexsw.com"/>
+ </authors>
+ </header>
+ <body><s1 title="Handling Attributes">
+ <p>Yet to come :))</p>
+ <note>The series of notes for developers has started but it has not yet gone so far ! Keep watching</note></s1>
+ </body></document>
\ No newline at end of file
--- /dev/null
+<?xml version="1.0" standalone="no"?>
+<!-- Overview -->
+<document>
+ <header>
+ <title>Images</title>
+ <subtitle>All you wanted to know about Images in FOP !</subtitle>
+ <authors> <person name="Keiron Liddle" email="keiron@aftexsw.com"/>
+ </authors>
+ </header>
+ <body>
+
+
+ <s1 title="Images in FOP"> <note> this is still in progress, input in the code is welcome. Needs documenting formats, testing.
+ So all those people interested in images should get involved.</note>
+ <p>An image may only need to be loaded when it is rendered to the
+output or when its dimensions are required.<br/>
+An image url may be invalid. This can be costly to find out, so we need to
+keep a list of invalid image urls.</p>
+<p>We have a number of different caching schemes that are possible.</p>
+<p>All images are referred to using the url given in the XSL:FO after
+removing "url('')" wrapping. This does
+not include any sort of resolving such as relative -> absolute. The
+external graphic in the FO Tree and the image area in the Area Tree only
+have the url as a reference.
+The images are handled through a static interface in ImageFactory.<br/></p>
+
+
+<p>(insert image)</p>
+
+
+<s2 title="Threading">
+
+
+
+<p>In a single threaded case with one document the image should be released
+as soon as the renderer caches it. If there are multiple documents then
+the images could be held in a weak cache in case another document needs to
+load the same image.</p>
+
+
+<p>In a multi threaded case many threads could be attempting to get the same
+image. We need to make sure an image will only be loaded once at a
+particular time. Once a particular document is finished then we can move
+all the images to a common weak cache.</p>
+</s2>
+
+<s2 title="Caches">
+<s3 title="LRU">
+<p>All images are in a common cache regardless of context. To limit the size
+of the cache the LRU image is removed to keep the amount of memory used
+low. Each image can supply the amount of data held in memory.</p>
+</s3>
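+<p>As a rough illustration of the LRU scheme, the sketch below keeps image
+data in an access-ordered <code>java.util.LinkedHashMap</code> and evicts the
+least recently used entries once a byte limit is exceeded. The class
+<code>LruImageCache</code> is hypothetical and stores plain byte arrays; it
+is not the FOP image cache.</p>
+<source><![CDATA[
```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical LRU cache keyed by image url, limited by total bytes held. */
class LruImageCache {
    private final long maxBytes;
    private long usedBytes;
    // accessOrder = true: iteration runs from least to most recently used.
    private final LinkedHashMap<String, byte[]> images =
            new LinkedHashMap<String, byte[]>(16, 0.75f, true);

    LruImageCache(long maxBytes) { this.maxBytes = maxBytes; }

    synchronized byte[] get(String url) { return images.get(url); }

    synchronized boolean contains(String url) { return images.containsKey(url); }

    synchronized void put(String url, byte[] data) {
        byte[] old = images.put(url, data);
        if (old != null) {
            usedBytes -= old.length;
        }
        usedBytes += data.length;
        // Evict least recently used images until the cache fits again.
        Iterator<Map.Entry<String, byte[]>> it = images.entrySet().iterator();
        while (usedBytes > maxBytes && it.hasNext()) {
            Map.Entry<String, byte[]> eldest = it.next();
            usedBytes -= eldest.getValue().length;
            it.remove();
        }
    }
}
```
+]]></source>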
+
+<s3 title="Context">
+<p>Images are cached according to the context, using the FOUserAgent as a key.
+Once the context is finished the images are added to a common weak hashmap
+so that other contexts can load these images or the data will be garbage
+collected if required.</p>
+<p>If images are to be used commonly then we cannot dispose of data in the
+FopImage when cached by the renderer. Also if different contexts have
+different base directories for resolving relative url's then the loading
+and caching must be separate. We can have a cache that shares images among
+all contexts or only loads an image for a context.</p>
+</s3>
+
+<p>The cache uses an image loader so that it can synchronize the image
+loading on an image by image basis. Finding and adding an image loader to
+the cache is also synchronized to prevent thread problems.</p>
+</s2>
+
+<s2 title="Invalid Images">
+
+
+<p>
+If an image cannot be loaded for some reason, for example because the url is
+invalid, the image data is corrupt or the type is unknown, then only one
+attempt should be made to load the image. All other attempts to get the image
+should return null so that it can be easily handled.<br/>
+This will prevent any extra processing or waiting.</p>
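+<p>A minimal sketch of this behaviour follows. The interface
+<code>ImageLoaderSketch</code> and the class <code>OnceOnlyImageSource</code>
+are hypothetical names and are not part of the FOP image code; the
+<code>attempts</code> counter exists only to make the single load attempt
+visible.</p>
+<source><![CDATA[
```java
import java.io.IOException;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical loader: returns the image data or throws if it cannot. */
interface ImageLoaderSketch {
    byte[] load(String url) throws IOException;
}

/** Sketch: each url is loaded at most once; failed urls return null afterwards. */
class OnceOnlyImageSource {
    private final ImageLoaderSketch loader;
    private final Map<String, byte[]> loaded = new HashMap<String, byte[]>();
    private final Set<String> invalid = new HashSet<String>();
    int attempts;  // number of real load attempts, for illustration

    OnceOnlyImageSource(ImageLoaderSketch loader) { this.loader = loader; }

    synchronized byte[] get(String url) {
        if (invalid.contains(url)) {
            return null;  // known to be bad: no second attempt
        }
        byte[] data = loaded.get(url);
        if (data == null) {
            attempts++;
            try {
                data = loader.load(url);
            } catch (IOException e) {
                data = null;
            }
            if (data == null) {
                invalid.add(url);
                return null;
            }
            loaded.put(url, data);
        }
        return data;
    }
}
```
+]]></source>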
+</s2>
+
+
+<s2 title="Reading">
+<p>Once a stream is opened for the image url then a set of image readers is
+used to determine what type of image it is. The reader can peek at the
+image header or if necessary load the image. The reader can also get the
+image size at this stage.
+The reader then can provide the mime type to create the image object to
+load the rest of the information.<br/></p></s2>
+
+
+
+<s2 title="Data">
+
+
+
+<p>The data usually needed for an image is the size and either a bitmap or the
+original data. Images such as jpeg and eps can be embedded into the
+document with the original data. SVG images are converted into a DOM which
+needs to be rendered to the PDF. Other images such as gif, tiff etc. are
+converted into a bitmap.
+Data is loaded by the FopImage by calling load(type) where type is the type of data to load.<br/></p></s2>
+
+
+<s2 title="Rendering">
+
+<p>Different renderers need to have the information in different forms.</p>
+
+
+<s3 title="PDF">
+<dl><dt>original data</dt> <dd>JPG, EPS</dd>
+<dt>bitmap</dt> <dd>gif, tiff, bmp, png</dd>
+<dt>other</dt> <dd>SVG</dd></dl>
+</s3>
+
+<s3 title="PS">
+<dl><dt>bitmap</dt> <dd>JPG, gif, tiff, bmp, png</dd>
+<dt>other</dt> <dd>SVG</dd></dl>
+</s3>
+
+<s3 title="awt">
+<dl><dt>bitmap</dt> <dd>JPG, gif, tiff, bmp, png</dd>
+<dt>other</dt> <dd>SVG</dd></dl></s3>
+
+
+
+<p>The renderer uses the url to retrieve the image from the ImageFactory and
+then load the required data depending on the image mime type. If the
+renderer can insert the image into the document and use that data for all
+future references of the same image then it can cache the reference in the
+renderer and the image can be released from the image cache.</p></s2>
+</s1>
+ </body></document>
+
+
+
+
+
+
+
+
+
+
+
+
+
--- /dev/null
+<?xml version="1.0" standalone="no"?>
+<!-- Overview -->
+<document>
+ <header>
+ <title>Layout Managers</title>
+ <subtitle>All you wanted to know about Layout Managers !</subtitle>
+ <authors> <person name="Keiron Liddle" email="keiron@aftexsw.com"/>
+ </authors>
+ </header>
+ <body><s1 title="Layout Managers">
+ <p>Yet to come :))</p>
+ <note>The series of notes for developers has started but it has not yet gone so far ! Keep watching</note></s1>
+ </body></document>
\ No newline at end of file
--- /dev/null
+<?xml version="1.0" standalone="no"?>
+<!-- Overview -->
+<document>
+ <header>
+ <title>Layout Process</title>
+ <subtitle>All you wanted to know about the Layout Process !</subtitle>
+ <authors> <person name="Keiron Liddle" email="keiron@aftexsw.com"/>
+ </authors>
+ </header>
+ <body><s1 title="Layout Process">
+ <p>Yet to come :))</p>
+ <note>The series of notes for developers has started but it has not yet gone so far ! Keep watching</note></s1>
+ </body></document>
\ No newline at end of file
--- /dev/null
+<?xml version="1.0" standalone="no"?>
+<!-- Overview -->
+<document>
+ <header>
+ <title>PDF Library</title>
+ <subtitle>All you wanted to know about the PDF Library !</subtitle>
+ <authors> <person name="Keiron Liddle" email="keiron@aftexsw.com"/>
+ </authors>
+ </header>
+ <body><s1 title="PDF Library">
+
+<p>The PDF Library is an independent package of classes in FOP. These classes
+provide a simple way to construct documents and add the contents. The
+classes are found in <code>org.apache.fop.pdf.*</code>.</p>
+
+
+
+
+<s2 title="PDF Document">
+<p>This is where most of the document is created and put together.</p>
+<p>It sets up the header, trailer and resources. Each page is made and added to the document.
+There are a number of methods that can be used to create/add certain PDF objects to the document.</p>
+</s2>
+
+<s2 title="Building PDF">
+<p>The PDF Document is built by creating a page for each page in the Area Tree.</p>
+<p> This page then has all the contents added.
+ The page is then added to the document and available objects can be written to the output stream.</p>
+<p>The contents of the page are things such as text, lines, images etc.
+The PDFRenderer inserts the text directly into a pdf stream.
+The text consists of markup to set fonts, set text position and add text.</p>
+<p>Most of the simple pdf markup is inserted directly into a pdf stream.
+Other more complex objects or commonly used objects are added through java classes.
+Some pdf objects, such as an image, consist of two parts.</p>
+<p>An image has a separate object for the image data and another bit of markup to display the image in a certain position on the page.
+</p><p>The java objects that represent a pdf object implement a method that returns the markup for inserting into a stream.
+The method is: byte[] toPDF().</p>
+
+</s2>
+<s2 title="Features">
+
+
+
+<s3 title="Fonts">
+<p>Support for embedding fonts and using the default Acrobat fonts.
+</p></s3>
+
+<s3 title="Images">
+<p>Images can be inserted into a page. An image can be inserted either as a pixel map or, in the case of a jpeg image, directly.
+</p></s3>
+
+<s3 title="Stream Filters">
+<p>A number of filters are available to encode the pdf streams. These filters can compress the data or change it such as converting to hex.
+</p></s3>
+
+<s3 title="Links">
+<p>A pdf link can be added for an area on the page. This link can then point to an external destination or a position on any page in the document.
+</p></s3>
+
+<s3 title="Patterns">
+<p>The fill and stroke of graphical objects can be set with a colour, pattern or gradient.
+</p></s3>
+
+
+<p>There are a number of other features for handling pdf markup relevant to creating PDF files for FOP.</p>
+</s2>
+
+
+<s2 title="Associated Tasks">
+<p>There are a large number of additional features that can be added to pdf.</p>
+<p>Many of these can be handled with extensions or post processing.</p>
+
+</s2>
+
+
+
+ </s1>
+ </body></document>
\ No newline at end of file
--- /dev/null
+<?xml version="1.0" standalone="no"?>
+<!-- Overview -->
+<document>
+ <header>
+ <title>Properties</title>
+ <subtitle>All you wanted to know about the Properties !</subtitle>
+ <authors> <person name="Keiron Liddle" email="keiron@aftexsw.com"/>
+ </authors>
+ </header>
+ <body><s1 title="Property Handling">
+<p>During XML Parsing, the FO tree is constructed. For each FO object (some
+subclass of FObj), the tree builder then passes the list of all
+attributes specified on the FO element to the handleAttrs method. This
+method converts the attribute specifications into a PropertyList.</p>
+<p>The actual work is done by a PropertyListBuilder (PLB for short). The
+basic idea of the PLB is to handle each attribute in the list in turn,
+find an appropriate "Maker" for it, call the Maker to convert the
+attribute value into a Property object of the correct type, and store
+that Property in the PropertyList.</p>
+
+
+<s2 title="Finding a Maker">
+<p>
+The PLB finds a "Maker" for the property based on the attribute name and
+the element name. Most Makers are generic and handle the attribute on
+any element, but it's possible to set up an element-specific property
+Maker. The attribute name to Maker mappings are automatically created
+during the code generation phase by processing the XML property
+description files.</p>
+</s2>
+
+<s2 title="Processing the attribute list">
+<p>The PLB first looks to see if the font-size property is specified, since
+it sets up relative units which can be used in other property
+specifications. Each attribute is then handled in turn. If the attribute
+specifies part of a compound property such as space-before.optimum, the
+PLB looks to see if the attribute list also contains the "base" property
+(space-before in this case) and processes that first.</p></s2>
+<s2 title="How the Property Maker works"><p>There is a family of Maker objects for each of the property datatypes,
+such as Length, Number, Enumerated, Space, etc. But since each Property
+has specific aspects, such as whether it's inherited, its default value
+and its corresponding properties, there is usually a specific Maker for
+each Property. All these Maker classes are created during the code
+generation phase by processing (using XSLT) the XML property description
+files to create Java classes.</p>
+
+
+<p>The Maker first checks for "keyword" values for a property. These are
+things like "thin, medium, thick" for the border-width property. The
+datatype is really a Length but it can be specified using these keywords
+whose actual value is determined by the "User Agent" rather than being
+specified in the XSL standard. For FOP, these values are currently
+defined in foproperties.xml. The keyword value is just a string, so it
+still needs to be parsed as described next.</p>
+
+
+<p>The Maker also checks to see if the property is an Enumerated type and
+then checks whether the value matches one of the specified enumeration
+values.</p>
+
+
+<p>Otherwise the Maker uses the property parser in the fo.expr package to
+evaluate the attribute value and return a Property object. The parser
+interprets the expression language and performs numeric operations and
+function call evaluations.</p>
+
+
+<p>If the returned Property value is of the correct type (specified in
+foproperties.xml, where else?), the Maker returns it. Otherwise, it may
+be able to convert the returned type into the correct type.</p>
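+<p>The keyword and enumeration steps above can be sketched as follows.
+This is only an illustration under stated assumptions: the class and method
+names are hypothetical, the keyword values are placeholders rather than
+the ones in foproperties.xml, lengths are assumed to be held in
+millipoints, and the one-line parser handles only plain "pt" values,
+whereas the real parser in fo.expr evaluates full expressions.</p>
+<source><![CDATA[
```java
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch of the checks a Maker performs; not FOP's generated code. */
public class MakerStepsSketch {

    // Placeholder keyword values; in FOP they are defined in foproperties.xml.
    static final Map<String, String> KEYWORDS =
            Map.of("thin", "0.5pt", "medium", "1pt", "thick", "2pt");

    /** Keyword step: a keyword is replaced by a string that still needs parsing. */
    static String expandKeyword(String value) {
        return KEYWORDS.getOrDefault(value, value);
    }

    /** Enumeration step: the value must be one of the allowed enumeration values. */
    static String checkEnumerated(String value, Set<String> allowed) {
        if (!allowed.contains(value)) {
            throw new IllegalArgumentException("not an allowed value: " + value);
        }
        return value;
    }

    /** Parsing stand-in: accepts only "<number>pt" and returns millipoints. */
    static int parseLengthMillipoints(String value) {
        if (!value.endsWith("pt")) {
            throw new IllegalArgumentException("unsupported length: " + value);
        }
        String number = value.substring(0, value.length() - 2);
        return (int) Math.round(Double.parseDouble(number) * 1000);
    }

    /** A border-width style Maker: keyword expansion first, then parsing. */
    static int makeBorderWidth(String attributeValue) {
        return parseLengthMillipoints(expandKeyword(attributeValue));
    }

    public static void main(String[] args) {
        System.out.println(makeBorderWidth("thick"));
        System.out.println(makeBorderWidth("1.5pt"));
        System.out.println(checkEnumerated("solid", Set.of("none", "solid", "dashed")));
    }
}
```
+]]></source>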
+
+
+<p>Some kinds of property values can't be fully resolved during FO tree
+building because they depend on layout information. This is the case for
+length values specified as percentages and for the special
+proportional-column-width(x) specification for table-column widths.
+These are stored as special kinds of Length objects which are evaluated
+during layout. Expressions involving "em" units, which are relative to
+font-size, <em>are</em> resolved during FO tree building, however.</p></s2>
+
+
+<s2 title="Structure of the PropertyList">
+<p>The PropertyList extends HashMap and its basic function is to associate
+Property value objects with Property names. The Property objects are all
+subclasses of the base Property class. Each one simply contains a
+reference to one of the property datatype objects. Property provides
+accessors for all known datatypes and various subclasses override the
+accessor(s) which are reasonable for the datatype they store.</p>
+
+
+<p>The PropertyList itself provides various ways of looking up Property
+values to handle such issues as inheritance and corresponding
+properties. </p>
+
+
+<p>The main logic is:<br/>If the property is a writing-mode relative property (using start, end,
+before or after in its name), the corresponding absolute property value
+is returned if it's explicitly set on this FO. <br/>Otherwise, the
+writing-mode relative value is returned if it's explicitly set. If the
+property is inherited, the process repeats using the PropertyList of the
+FO's parent object. (This is easy because each PropertyList points to
+the PropertyList of the nearest ancestor FO.) If the property isn't
+inherited or no value is found at any level, the initial value is
+returned.</p></s2>
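+<p>The lookup logic described under "Structure of the PropertyList" can be
+sketched as follows. This is a minimal illustration with hypothetical
+names: values are plain strings instead of Property objects, the caller
+passes in the corresponding absolute name, the inheritance flag and the
+initial value (which FOP would take from the Maker), and the sketch
+assumes that the absolute-then-relative check is repeated at each
+ancestor.</p>
+<source><![CDATA[
```java
import java.util.HashMap;

/** Hypothetical sketch of PropertyList lookup; not the FOP implementation. */
public class PropertyLookupSketch {

    /** Like the PropertyList in the text, this extends HashMap and points to its parent. */
    static final class PropertyList extends HashMap<String, String> {
        private static final long serialVersionUID = 1L;
        private final PropertyList parent;

        PropertyList(PropertyList parent) {
            this.parent = parent;
        }

        /**
         * @param name         property name, possibly writing-mode relative
         * @param absoluteName corresponding absolute property, or null if there is none
         * @param inherited    whether the property is inherited
         * @param initialValue value returned when nothing is found
         */
        String lookup(String name, String absoluteName, boolean inherited, String initialValue) {
            PropertyList list = this;
            while (list != null) {
                // An explicitly set absolute value wins over the relative one.
                if (absoluteName != null && list.containsKey(absoluteName)) {
                    return list.get(absoluteName);
                }
                if (list.containsKey(name)) {
                    return list.get(name);
                }
                // Only inherited properties continue with the parent's list.
                list = inherited ? list.parent : null;
            }
            return initialValue;
        }
    }

    public static void main(String[] args) {
        PropertyList parent = new PropertyList(null);
        parent.put("color", "red");
        PropertyList child = new PropertyList(parent);
        child.put("padding-start", "2pt");
        child.put("padding-left", "5pt");

        System.out.println(child.lookup("padding-start", "padding-left", false, "0pt"));
        System.out.println(child.lookup("color", null, true, "black"));
        System.out.println(child.lookup("padding-end", "padding-right", false, "0pt"));
    }
}
```
+]]></source>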
+
+
+<s2 title="References">
+
+<dl><dt>docs/design/properties.xml</dt> <dd>a more detailed version of this (generated
+html in docs/html-docs/design/properties.html)</dd>
+
+
+<dt>src/codegen/properties.dtd</dt> <dd>heavily commented DTD for foproperties.xml,
+but may not be completely up-to-date</dd></dl></s2>
+
+
+<s2 title="To Do"> <s3 title="documentation">
+
+<ul><li>Explain PropertyManager vs. direct access</li>
+<li>Explain corresponding properties</li></ul></s3>
+
+
+<s3 title="development">
+
+<p>Many properties are incompletely handled, especially unusual kinds of
+keyword values and shorthand values (one attribute which sets several
+properties).</p></s3></s2>
+
+</s1>
+ </body></document>
\ No newline at end of file
--- /dev/null
+<?xml version="1.0" standalone="no"?>
+<!-- Overview -->
+<document>
+ <header>
+ <title>Renderers</title>
+ <subtitle>All you wanted to know about the Renderers!</subtitle>
+ <authors> <person name="Keiron Liddle" email="keiron@aftexsw.com"/>
+ </authors>
+ </header>
+ <body><s1 title="Renderers">
+ <p>Yet to come.</p>
+ <note>The series of notes for developers has started but has not yet reached this topic. Keep watching.</note></s1>
+ </body></document>
\ No newline at end of file
--- /dev/null
+<?xml version="1.0" standalone="no"?>
+<!-- Overview -->
+<document>
+ <header>
+ <title>Tutorial series Status</title>
+ <subtitle>Current Status of tutorial about FOP and Design</subtitle>
+ <authors> <person name="Keiron Liddle" email="keiron@aftexsw.com"/>
+ </authors>
+ </header>
+ <body><s1 title="Tutorial series Status"> <p>Peter asked: Do we have a volunteer to track
+ Keiron's tutorials and turn them into web page documentation?</p> <p><strong>The answer is yes,
+ we have, but the work is in progress!</strong></p> <note>Keiron has recently extended
+ the documentation generation on the CVS trunk to make this process a bit
+ easier. Keiron tells Peter that Apache is readying a major overhaul of its web
+ site and xml->html generation, but that should not deter us from proceeding
+ with documentation.</note></s1>
+ </body></document>
\ No newline at end of file
--- /dev/null
+<?xml version="1.0" standalone="no"?>
+<!-- Overview -->
+<document>
+ <header>
+ <title>SVG</title>
+ <subtitle>All you wanted to know about SVG and FOP!</subtitle>
+ <authors> <person name="Keiron Liddle" email="keiron@aftexsw.com"/>
+ </authors>
+ </header>
+ <body><s1 title="SVG">
+ <p>SVG is rendered through Batik.</p>
+ <p>The XML from the XSL:FO document is converted into an SVG DOM with
+ Batik. This DOM is then set as the Document on the Foreign Object area in
+ the Area Tree.</p>
+ <p>This DOM is then available to be rendered by the renderer.</p>
+ <p>The renderers render SVG via an XMLHandler in the FOUserAgent. This
+ handler uses Batik, which converts the SVG DOM into an internal structure
+ that can be drawn into a Graphics2D. So for PDF we use a PDFGraphics2D to
+ draw into.</p>
+ <p>This creates the necessary PDF information to create the SVG image in
+ the PDF document.</p>
+ <p>Most of the work is done in the PDFGraphics2D class. There are also a
+ few bridges that are plugged into Batik to provide different behaviour for
+ some SVG elements.</p>
+ <s2 title="Text Drawing">
+ <p>Normally Batik converts text into a set of curved shapes.</p>
+ <p>These are handled like any other shapes when rendering to the output.
+ This is not always desirable as the shapes have very fine curves. This can
+ cause the output to look a bit bad in PDF and PS (it can be drawn properly
+ but is not by default). These curves also require much more data than the
+ original text.</p>
+ <p>To handle this there is a PDFTextElementBridge that is set when using
+ the bridge in Batik. If the text is simple enough to be drawn in the PDF
+ in the same way as all other text, the bridge sets the TextPainter to use the
+ PDFTextPainter. This inserts the text directly into the PDF using the
+ drawString method on the PDFGraphics2D.</p>
+ <p>Text is considered simple if the font is available, the font size is
+ usable and there are no tspans or other complications. Drawing text this
+ way can make the resulting PDF significantly smaller.</p>
+ </s2>
+ <s2 title="PDF Links">
+ <p>To support links in PDF another Batik element bridge is used. The
+ PDFAElementBridge creates a PDFANode which inserts a link into the PDF
+ document via the PDFGraphics2D.</p>
+ <p>Since links are positioned on the page without any transforms, we need
+ to transform the coordinates of the link area so that they match the
+ current position of the area of the SVG <code>a</code> element. This
+ transform may also need to account for the SVG being positioned on the
+ page.</p>
+ </s2>
+ <s2 title="Images">
+ <p>Images are normally drawn into the PDFGraphics2D. This then creates a
+ bitmap of the image data that can be inserted into the PDF document.</p>
+ <p>As PDF can support JPEG images, another element bridge is used so that
+ the JPEG can be directly inserted into the PDF.</p>
+ </s2>
+ <s2 title="PDF Transcoder">
+ <p>Batik provides a mechanism to convert SVG into various formats. Through
+ FOP we can convert an SVG document into a single-page PDF document. The
+ page contains the SVG drawn as best as possible on the page. There is a
+ PDFDocumentGraphics2D that creates a standalone PDF document with a single
+ page. This is then drawn into by Batik in the same way as with the
+ PDFGraphics2D.</p>
+ </s2>
+ <s2 title="Other Outputs">
+ <p>When rendering to AWT the SVG is simply drawn onto the AWT canvas using
+ Batik.</p>
+ <p>The PS Renderer uses a similar technique to the PDF Renderer.</p>
+ <p>The SVG Renderer simply embeds the SVG inside an svg element.</p>
+ </s2>
+ <s2 title="Associated Tasks">
+ <ul>
+ <li>To get accurate drawing, PDF transparency is needed.</li>
+ <li>The drawRenderedImage methods need implementing.</li>
+ <li>Handle colour space better.</li>
+ <li>Improve link handling with PDF.</li>
+ <li>Improve image handling.</li>
+ </ul>
+ </s2></s1>
+ </body></document>
\ No newline at end of file
--- /dev/null
+<?xml version="1.0"?>
+<!-- Overview -->
+<document>
+ <header>
+ <title>Understanding FOP Design</title>
+ <subtitle>Tutorial series about Design Approach to FOP</subtitle>
+ <authors> <person name="Keiron Liddle" email="keiron@aftexsw.com"/>
+ </authors>
+ </header>
+ <body>
+<s1 title="Understanding">
+ <note>
+ The content of this <strong>Understanding series</strong>
+ was all taken from discussion on the interactive fop development
+ mailing list. <br/> We strongly advise you to join this
+ mailing list and ask questions about this series there. <br/>
+ You can subscribe to fop-dev@xml.apache.org by sending an
+ email to <link href=
+ "mailto:fop-dev-subscribe@xml.apache.org"
+ >fop-dev-subscribe@xml.apache.org</link>. <br/> You will
+ find more information about how to get involved <link href=
+ "http://xml.apache.org/fop/involved.html"
+ >there</link>.<br/> You can also read the <link href=
+ "http://marc.theaimsgroup.com/?l=fop-dev&amp;r=1&amp;w=2"
+ >archive</link> of the discussion list fop-dev to get an
+ idea of the issues being discussed.
+ </note>
+ <s2 title="Introduction">
+ <p>
+ Welcome to the Understanding series. This will be
+ a series of notes for developers to understand how FOP
+ works. We will
+ attempt to clarify the processes involved in going from XML (FO)
+ to PDF or other formats. Some areas will get more
+ complicated as we proceed.
+ </p>
+ </s2>
+
+
+ <s2 title="Overview">
+ <p>FOP takes an XML file, does its magic, and then writes a document to a
+ stream.</p>
+ <p>xml -> [FOP] -> document</p>
+ <p>The document could be PDF, PS etc., or be directed to a printer or the
+ screen. The principle remains the same. The XML document must be in the XSL:FO
+ format.</p>
+ <p>For convenience we provide a mechanism to handle XML+XSL as
+ input.</p>
+ <p>The xml document is always handled internally as SAX. The SAX events
+ are used to read the elements, attributes and text data of the FO document.
+ After the manipulation of the data the renderer writes out the pages in the
+ appropriate format. It may write as it goes, a page at a time or the whole
+ document at once. Once finished the document should contain all the data in the
+ chosen format ready for whatever use.</p></s2>
+ <s2 title="Stages"><p>The FO data goes through a few stages. Each piece
+ of data will generally go through the process in the same way, but some
+ information may be used a number of times or in a different order. To reduce
+ memory use, one stage will start before the previous one has completed.</p>
+ <p>SAX Handler -> FO Tree -> Layout Managers -> Area Tree
+ -> Render -> document</p>
+ <p>In the case of rtf, mif etc. <br/>SAX Handler -> FO Tree ->
+ Structure Renderer -> document</p>
+ <p>The FO Tree is constructed from the XML document. It is an internal
+ representation of the XML document and it is like a DOM with some differences.
+ The Layout Managers use the FO Tree to do their layout and create an Area
+ Tree. The Area Tree is a representation of the final result. It is a
+ representation of a set of pages containing the text and other graphics. The
+ Area Tree is then given to a Renderer. The Renderer can read the Area Tree and
+ convert the information into the render format. For example, the PDF Renderer
+ creates a PDF Document. For each page in the Area Tree the renderer creates a
+ PDF Page and places the contents of the page into the PDF Page. Once a PDF Page
+ is complete it can be written to the output stream.</p>
+ <p>For the structure documents the Structure listener will read
+ directly from the FO Tree and create the document. These documents do not need
+ the layout process or the Area Tree.</p></s2>
+ <s2 title="Associated Tasks"><p>Verify Structure Listener
+ concept.</p></s2>
+ <s2 title="Further Topics">
+ <ul><li>XML parsing</li>
+ <li>FO Tree</li>
+ <li>Properties</li>
+ <li>Layout Managers</li>
+ <li>Layout Process</li>
+ <li>Handling Attributes</li>
+ <li>Area Tree</li>
+ <li>Renderers</li>
+ <li>Images</li>
+ <li>PDF Library</li>
+ <li>SVG</li>
+ </ul>
+ </s2>
+
+ </s1> </body></document>
+
--- /dev/null
+<?xml version="1.0"?>
+<document>
+ <header>
+ <title>XML Parsing</title>
+ <subtitle>All you wanted to know about XML Parsing!</subtitle>
+ <authors> <person name="Keiron Liddle" email="keiron@aftexsw.com"/>
+ </authors>
+ </header>
+ <body>
+
+<s1 title="XML Parsing"><p>Since everyone knows the basics we can get
+ into the various stages starting with the XML handling.</p>
+ <s2 title="XML Input"><p>FOP can take the input XML in a number of ways:
+ </p>
+ <ul>
+ <li>SAX Events through SAX Handler
+ <ul>
+ <li>
+ <code>FOTreeBuilder</code> is the SAX Handler which is
+ obtained through <code>getContentHandler</code> on
+ <code>Driver</code>.
+ </li>
+ </ul>
+ </li>
+ <li>
+ DOM which is converted into SAX Events
+ <ul>
+ <li>
+ The conversion of a DOM tree is done via the
+ <code>render(Document)</code> method on
+ <code>Driver</code>.
+ </li>
+ </ul>
+ </li>
+ <li>
+ data source which is parsed and converted into SAX Events
+ <ul>
+ <li>
+ The <code>Driver</code> can take an
+ <code>InputSource</code> as input. This can use a
+ <code>Stream</code>, <code>String</code> etc.
+ </li>
+ </ul>
+ </li>
+ <li>
+ XML+XSLT which is transformed using an XSLT Processor and
+ the result is fired as SAX Events
+ <ul>
+ <li>
+ <code>XSLTInputHandler</code> is used as an
+ <code>InputSource</code> in the
+ render(<code>XMLReader</code>,
+ <code>InputSource</code>) method on
+ <code>Driver</code>
+ </li>
+ </ul>
+ </li>
+ </ul>
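+ <p>As a rough illustration of the third route in the list above (a data
+ source that is parsed and converted into SAX Events), the following sketch
+ uses only the standard JAXP and SAX classes. It does not use any FOP class:
+ the <code>RecordingHandler</code> is a hypothetical stand-in for the SAX
+ Handler that <code>Driver</code> would supply, and it merely records the
+ elements it sees in the XSL:FO namespace.</p>
+<source><![CDATA[
```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical sketch: feeding an InputSource to a SAX handler. */
public class SaxInputSketch {

    static final String FO_NS = "http://www.w3.org/1999/XSL/Format";

    /** Stand-in for the tree builder: records local names of FO elements. */
    static final class RecordingHandler extends DefaultHandler {
        final List<String> foElements = new ArrayList<>();

        @Override
        public void startElement(String uri, String localName, String qName,
                                 Attributes attributes) {
            if (FO_NS.equals(uri)) {
                foElements.add(localName);
            }
        }
    }

    /** Parses the XML text and returns the FO element names in document order. */
    static List<String> parse(String xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true); // needed to receive namespace URIs
        XMLReader reader = factory.newSAXParser().getXMLReader();
        RecordingHandler handler = new RecordingHandler();
        reader.setContentHandler(handler);
        // Malformed XML makes parse() throw a SAXException here.
        reader.parse(new InputSource(new StringReader(xml)));
        return handler.foElements;
    }

    public static void main(String[] args) throws Exception {
        String fo = "<fo:root xmlns:fo=\"" + FO_NS + "\">"
                + "<fo:layout-master-set/></fo:root>";
        System.out.println(parse(fo));
    }
}
```
+]]></source>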
+
+ <p>The SAX Events which are fired on the SAX Handler, class
+ <code>FOTreeBuilder</code>, must represent an XSL:FO document. If they do not, there will be an
+ error. Any problems with the XML not being well formed are also handled here.</p></s2>
+ <s2 title="Element Mappings"><p>The element mapping is a hashmap of all
+ the elements in a particular namespace. This makes it easy to create a
+ different object for each element. Element mappings are static to save on
+ memory.</p><p>To add an extension, a developer can put a jar on the classpath
+ that contains the file <code>/META-INF/services/org.apache.fop.fo.ElementMapping</code>.
+ This file must contain a line with the fully qualified name of a class that
+ implements the <em>org.apache.fop.fo.ElementMapping</em> interface. The class will then be
+ loaded automatically at the start. The internal mappings are FO, SVG and Extension
+ (PDF bookmarks).</p></s2>
+ <s2 title="Tree Building"><p>The SAX Events will fire all the information
+ for the document with start element, end element, text data etc. This
+ information is used to build up a representation of the FO document. To do this,
+ there is a set of element mappings for each namespace. When a mapping is found
+ for an element and its namespace, it is used to create an object for that
+ element. If the element is not found, a dummy object is created, or a generic
+ DOM for unknown namespaces.</p>
+ <p>The object is then set up and given the attributes for the element.
+ For the FO Tree the attributes are converted into properties. The FO objects
+ use a property list mapping to convert the attributes into a list of properties
+ for the element. For other XML, for example SVG, a DOM of the XML is
+ constructed. This DOM can then be passed through to the renderer. Other element
+ mappings can be used in different ways, for example to create elements that
+ create areas during the layout process or set up information for the renderer
+ etc.</p>
+ <p>
+ While the tree building is mainly about creating the FO Tree,
+ there are some stages that can propagate to the renderer. At
+ the end of a page sequence we know that all pages in the
+ page sequence can be laid out without being affected by any
+ further XML. The significance of this is that the FO Tree
+ for the page sequence may be able to be disposed of. The
+ end of the XML document also tells us that we can finalise
+ the output document. (The layout of individual pages is
+ accomplished by the layout managers a page at a time;
+ i.e. they do not need to wait for the end of the page
+ sequence. A page may not yet be complete, however, if it
+ contains forward page number references, for example.)
+ </p>
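+ <p>The element lookup described in this section can be sketched as follows.
+ This is a minimal illustration, not the FOTreeBuilder code: the class
+ <code>ElementMappingSketch</code>, its <code>Node</code> stand-in and the
+ method names are hypothetical. It assumes that an unknown element in a
+ known namespace yields a dummy object, and that an unknown namespace
+ yields a generic DOM node.</p>
+<source><![CDATA[
```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical sketch of namespace + element lookup during tree building. */
public class ElementMappingSketch {

    /** Stand-in for a tree node; "kind" records which path created it. */
    static final class Node {
        final String kind;
        final String name;

        Node(String kind, String name) {
            this.kind = kind;
            this.name = name;
        }
    }

    // namespace URI -> (local element name -> factory for that element's node)
    private final Map<String, Map<String, Function<String, Node>>> mappings = new HashMap<>();

    void addMapping(String namespace, String localName, Function<String, Node> factory) {
        mappings.computeIfAbsent(namespace, k -> new HashMap<>()).put(localName, factory);
    }

    Node createNode(String namespace, String localName) {
        Map<String, Function<String, Node>> table = mappings.get(namespace);
        if (table == null) {
            // Unknown namespace: fall back to a generic DOM-like node.
            return new Node("generic-dom", localName);
        }
        Function<String, Node> factory = table.get(localName);
        if (factory == null) {
            // Known namespace, unknown element: create a dummy object.
            return new Node("dummy", localName);
        }
        return factory.apply(localName);
    }

    public static void main(String[] args) {
        String foNs = "http://www.w3.org/1999/XSL/Format";
        ElementMappingSketch builder = new ElementMappingSketch();
        builder.addMapping(foNs, "block", name -> new Node("fo-object", name));

        System.out.println(builder.createNode(foNs, "block").kind);
        System.out.println(builder.createNode(foNs, "no-such-element").kind);
        System.out.println(builder.createNode("urn:example:other", "thing").kind);
    }
}
```
+]]></source>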
+ </s2>
+ <s2 title="Associated Tasks">
+ <ul><li>Error handling for XML that is not well formed.</li>
+ <li>Error handling for other XML parsing errors.</li><li>Developer
+ info for adding namespace handlers.</li></ul></s2></s1>
+ </body></document>
\ No newline at end of file