Diffstat (limited to 'docs/design')
55 files changed, 2173 insertions, 0 deletions
diff --git a/docs/design/alt.design/AbsolutePosition.dia b/docs/design/alt.design/AbsolutePosition.dia Binary files differnew file mode 100644 index 000000000..a2d6421a3 --- /dev/null +++ b/docs/design/alt.design/AbsolutePosition.dia diff --git a/docs/design/alt.design/AbsolutePosition.png b/docs/design/alt.design/AbsolutePosition.png Binary files differnew file mode 100644 index 000000000..ed8a3691b --- /dev/null +++ b/docs/design/alt.design/AbsolutePosition.png diff --git a/docs/design/alt.design/AbsolutePosition.png.xml b/docs/design/alt.design/AbsolutePosition.png.xml new file mode 100644 index 000000000..7b2cde0bc --- /dev/null +++ b/docs/design/alt.design/AbsolutePosition.png.xml @@ -0,0 +1,21 @@ +<?xml version="1.0" encoding="ISO-8859-1"?> +<!-- $Id: AbsolutePosition.png.xml,v 1.1 2002-01-05 14:46:32+10 pbw +Exp pbw $ --> +<!-- +<!DOCTYPE document SYSTEM "../xml-docs/dtd/document-v10.dtd"> +--> +<document> + <header> + <title>AbsolutePosition diagram</title> + <authors> + <person id="pbw" name="Peter B. 
West" + email="pbwest@powerup.com.au"/> + </authors> + </header> + <body> + <!-- one of (anchor s1) --> + <s1 title="Properties$AbsolutePosition"> + <figure src="AbsolutePosition.png" alt="AbsolutePosition diagram"/> + </s1> + </body> +</document> diff --git a/docs/design/alt.design/BorderCommonStyle.png b/docs/design/alt.design/BorderCommonStyle.png Binary files differnew file mode 100644 index 000000000..67cc9f8ee --- /dev/null +++ b/docs/design/alt.design/BorderCommonStyle.png diff --git a/docs/design/alt.design/BorderCommonStyle.png.xml b/docs/design/alt.design/BorderCommonStyle.png.xml new file mode 100644 index 000000000..f57865bc2 --- /dev/null +++ b/docs/design/alt.design/BorderCommonStyle.png.xml @@ -0,0 +1,20 @@ +<?xml version="1.0" encoding="ISO-8859-1"?> +<!-- $Id$ --> +<!-- +<!DOCTYPE document SYSTEM "../xml-docs/dtd/document-v10.dtd"> +--> +<document> + <header> + <title>BorderCommonStyle diagram</title> + <authors> + <person id="pbw" name="Peter B. West" + email="pbwest@powerup.com.au"/> + </authors> + </header> + <body> + <!-- one of (anchor s1) --> + <s1 title="Properties$BorderCommonStyle"> + <figure src="BorderCommonStyle.png" alt="BorderCommonStyle diagram"/> + </s1> + </body> +</document> diff --git a/docs/design/alt.design/PropNames.dia b/docs/design/alt.design/PropNames.dia Binary files differnew file mode 100644 index 000000000..81db6a0c9 --- /dev/null +++ b/docs/design/alt.design/PropNames.dia diff --git a/docs/design/alt.design/PropNames.png b/docs/design/alt.design/PropNames.png Binary files differnew file mode 100644 index 000000000..8287e0875 --- /dev/null +++ b/docs/design/alt.design/PropNames.png diff --git a/docs/design/alt.design/PropNames.png.xml b/docs/design/alt.design/PropNames.png.xml new file mode 100644 index 000000000..829509d8b --- /dev/null +++ b/docs/design/alt.design/PropNames.png.xml @@ -0,0 +1,21 @@ +<?xml version="1.0" encoding="ISO-8859-1"?> +<!-- $Id$ --> +<!-- +<!DOCTYPE document SYSTEM 
"../xml-docs/dtd/document-v10.dtd"> +--> + +<document> + <header> + <title>..fo.PropNames diagram</title> + <authors> + <person id="pbw" name="Peter B. West" + email="pbwest@powerup.com.au"/> + </authors> + </header> + <body> + <!-- one of (anchor s1) --> + <s1 title="PropNames.class"> + <figure src="PropNames.png" alt="PropNames.class diagram"/> + </s1> + </body> +</document> diff --git a/docs/design/alt.design/Properties.dia b/docs/design/alt.design/Properties.dia Binary files differnew file mode 100644 index 000000000..25d482d5d --- /dev/null +++ b/docs/design/alt.design/Properties.dia diff --git a/docs/design/alt.design/Properties.png b/docs/design/alt.design/Properties.png Binary files differnew file mode 100644 index 000000000..10da5f23c --- /dev/null +++ b/docs/design/alt.design/Properties.png diff --git a/docs/design/alt.design/Properties.png.xml b/docs/design/alt.design/Properties.png.xml new file mode 100644 index 000000000..f2a53578c --- /dev/null +++ b/docs/design/alt.design/Properties.png.xml @@ -0,0 +1,21 @@ +<?xml version="1.0" encoding="ISO-8859-1"?> +<!-- $Id$ --> +<!-- +<!DOCTYPE document SYSTEM "../xml-docs/dtd/document-v10.dtd"> +--> + +<document> + <header> + <title>..fo.Properties diagram</title> + <authors> + <person id="pbw" name="Peter B. 
West" + email="pbwest@powerup.com.au"/> + </authors> + </header> + <body> + <!-- one of (anchor s1) --> + <s1 title="Properties.class"> + <figure src="Properties.png" alt="Properties.class diagram"/> + </s1> + </body> +</document> diff --git a/docs/design/alt.design/PropertyClasses.dia b/docs/design/alt.design/PropertyClasses.dia Binary files differnew file mode 100644 index 000000000..5a02f6780 --- /dev/null +++ b/docs/design/alt.design/PropertyClasses.dia diff --git a/docs/design/alt.design/PropertyClasses.png b/docs/design/alt.design/PropertyClasses.png Binary files differnew file mode 100644 index 000000000..e58ca94bf --- /dev/null +++ b/docs/design/alt.design/PropertyClasses.png diff --git a/docs/design/alt.design/PropertyConsts.dia b/docs/design/alt.design/PropertyConsts.dia Binary files differnew file mode 100644 index 000000000..ed30cc6bc --- /dev/null +++ b/docs/design/alt.design/PropertyConsts.dia diff --git a/docs/design/alt.design/PropertyConsts.png b/docs/design/alt.design/PropertyConsts.png Binary files differnew file mode 100644 index 000000000..b6df72f84 --- /dev/null +++ b/docs/design/alt.design/PropertyConsts.png diff --git a/docs/design/alt.design/PropertyConsts.png.xml b/docs/design/alt.design/PropertyConsts.png.xml new file mode 100644 index 000000000..73d509cae --- /dev/null +++ b/docs/design/alt.design/PropertyConsts.png.xml @@ -0,0 +1,20 @@ +<?xml version="1.0" encoding="ISO-8859-1"?> +<!-- $Id$ --> +<!-- +<!DOCTYPE document SYSTEM "../xml-docs/dtd/document-v10.dtd"> +--> +<document> + <header> + <title>..fo.PropertyConsts diagram</title> + <authors> + <person id="pbw" name="Peter B. 
West" + email="pbwest@powerup.com.au"/> + </authors> + </header> + <body> + <!-- one of (anchor s1) --> + <s1 title="PropertyConsts.class"> + <figure src="PropertyConsts.png" alt="PropertyConsts.class diagram"/> + </s1> + </body> +</document> diff --git a/docs/design/alt.design/PropertyStaticsOverview.dia b/docs/design/alt.design/PropertyStaticsOverview.dia Binary files differnew file mode 100644 index 000000000..2ef800725 --- /dev/null +++ b/docs/design/alt.design/PropertyStaticsOverview.dia diff --git a/docs/design/alt.design/PropertyStaticsOverview.png b/docs/design/alt.design/PropertyStaticsOverview.png Binary files differnew file mode 100644 index 000000000..fdda19e74 --- /dev/null +++ b/docs/design/alt.design/PropertyStaticsOverview.png diff --git a/docs/design/alt.design/SAXParsing.dia b/docs/design/alt.design/SAXParsing.dia Binary files differnew file mode 100644 index 000000000..74a525ecf --- /dev/null +++ b/docs/design/alt.design/SAXParsing.dia diff --git a/docs/design/alt.design/SAXParsing.png b/docs/design/alt.design/SAXParsing.png Binary files differnew file mode 100644 index 000000000..f2652e1f7 --- /dev/null +++ b/docs/design/alt.design/SAXParsing.png diff --git a/docs/design/alt.design/VerticalAlign.dia b/docs/design/alt.design/VerticalAlign.dia Binary files differnew file mode 100644 index 000000000..519715a1a --- /dev/null +++ b/docs/design/alt.design/VerticalAlign.dia diff --git a/docs/design/alt.design/VerticalAlign.png b/docs/design/alt.design/VerticalAlign.png Binary files differnew file mode 100644 index 000000000..860d5bdff --- /dev/null +++ b/docs/design/alt.design/VerticalAlign.png diff --git a/docs/design/alt.design/VerticalAlign.png.xml b/docs/design/alt.design/VerticalAlign.png.xml new file mode 100644 index 000000000..6ff21bb00 --- /dev/null +++ b/docs/design/alt.design/VerticalAlign.png.xml @@ -0,0 +1,21 @@ +<?xml version="1.0" encoding="ISO-8859-1"?> +<!-- $Id$ --> +<!-- +<!DOCTYPE document SYSTEM "../xml-docs/dtd/document-v10.dtd"> 
+--> + +<document> + <header> + <title>VerticalAlign diagram</title> + <authors> + <person id="pbw" name="Peter B. West" + email="pbwest@powerup.com.au"/> + </authors> + </header> + <body> + <!-- one of (anchor s1) --> + <s1 title="Properties$VerticalAlign"> + <figure src="VerticalAlign.png" alt="VerticalAlign diagram"/> + </s1> + </body> +</document> diff --git a/docs/design/alt.design/XML-event-buffer.dia b/docs/design/alt.design/XML-event-buffer.dia Binary files differnew file mode 100644 index 000000000..ec8b131f6 --- /dev/null +++ b/docs/design/alt.design/XML-event-buffer.dia diff --git a/docs/design/alt.design/XML-event-buffer.png b/docs/design/alt.design/XML-event-buffer.png Binary files differnew file mode 100644 index 000000000..4ee16e913 --- /dev/null +++ b/docs/design/alt.design/XML-event-buffer.png diff --git a/docs/design/alt.design/XMLEventQueue.dia b/docs/design/alt.design/XMLEventQueue.dia Binary files differnew file mode 100644 index 000000000..6a39a3734 --- /dev/null +++ b/docs/design/alt.design/XMLEventQueue.dia diff --git a/docs/design/alt.design/XMLEventQueue.png b/docs/design/alt.design/XMLEventQueue.png Binary files differnew file mode 100644 index 000000000..477abd79a --- /dev/null +++ b/docs/design/alt.design/XMLEventQueue.png diff --git a/docs/design/alt.design/alt.properties.xml b/docs/design/alt.design/alt.properties.xml new file mode 100644 index 000000000..a0bb5ef6c --- /dev/null +++ b/docs/design/alt.design/alt.properties.xml @@ -0,0 +1,167 @@ +<?xml version="1.0" encoding="ISO-8859-1"?> +<!-- $Id$ --> +<!-- +<!DOCTYPE document SYSTEM "../xml-docs/dtd/document-v10.dtd"> +--> +<document> + <header> + <title>Implementing Properties</title> + <authors> + <person id="pbw" name="Peter B. 
West" email="pbwest@powerup.com.au"/> + </authors> + </header> + <body> + <!-- one of (anchor s1) --> + <s1 title="An alternative properties implementation"> + <note> + The following discussion focusses on the relationship between + Flow Objects in the Flow Object tree, and properties. There + is no (or only passing) discussion of the relationship between + properties and traits, and by extension, between properties + and the Area tree. The discussion is illustrated with some + pseudo-UML diagrams. + </note> + <p> + Property handling is complex and expensive. Varying numbers of + properties apply to individual Flow Objects + <strong>(FOs)</strong> in the <strong>FO + tree </strong> but any property may effectively be + assigned a value on any element of the tree. If that property + is inheritable, its defined value will then be available to + any children of the defining FO. + </p> + <note> + <em>(XSL 1.0 Rec)</em> <strong>5.1.4 Inheritance</strong> + ...The inheritable properties can be placed on any formatting + object. + </note> + <p> + Even if the value is not inheritable, it may be accessed by + its children through the <code>inherit</code> keyword or the + <code>from-parent()</code> core function, and potentially by + any of its descendants through the + <code>from-nearest-specified-value()</code> core function. + </p> + <p> + In addition to the assigned values of properties, almost every + property has an <strong>initial value</strong> which is used + when no value has been assigned. + </p> + <s2 title="The history problem"> + </s2> + <p> + The difficulty and expense of handling properties comes from + this universal inheritance possibility. The list of properties + which are assigned values on any particular <em>FO</em> + element will not generally be large, but a current value is + required for each property which applies to the <em>FO</em> + being processed. 
+ </p> + <p> + The environment from which these values may be selected + includes, for each <em>FO</em>, for each applicable property, + the value assigned on this <em>FO</em>, the value which + applied to the parent of this <em>FO</em>, the nearest value + specified on an ancestor of this element, and the initial + value of the property. + </p> + <s2 title="Data requirement and structure"> + <p> + This determines the minimum set of properties and associated + property value assignments that is necessary for the + processing of any individual <em>FO</em>. Implicit in this + set is the set of properties and associated values, + effective on the current <em>FO</em>, that were assigned on + that <em>FO</em>. + </p> + <p> + This minimum requirement - the initial value, the + nearest ancestor specified value, the parent computed value + and the value assigned to the current element - + suggests a stack implementation. + </p> + </s2> + <s2 title="Stack considerations"> + <p> + One possibility is to push to the stack only a minimal set + of required elements. When a value is assigned, the + relevant form or forms of that value (specified, computed, + actual) are pushed onto the stack. As long as each + <em>FO</em> maintains a list of the properties which were + assigned from it, the value can be popped when the focus of + FO processing retreats back up the <em>FO</em> tree. + </p> + <p> + The complication is that, for elements which are not + automatically inherited, when an <em>FO</em> is encountered + which does <strong>not</strong> assign a value to the + property, the initial value must either be already at the + top of the stack or be pushed onto the stack. + </p> + <p> + As a first approach, the simplest procedure may be to push a + current value onto the stack for every element - initial + values for non-inherited properties and the parental value + otherwise. Then perform any processing of assigned values. 
+ This simplifies program logic at what is hopefully a small + cost in memory and processing time. It may be tuned in a + later iteration. + </p> + <s3 title="Stack implementation"> + <p> + Initial attempts at this implementation have used + <code>LinkedList</code>s as the stacks, on the assumption + that + </p> + <sl> + <!-- one of (dl sl ul ol li) --> + <li>random access would not be required</li> + <li> + pushing and popping of list elements requires nearly + constant (low) time + </li> + <li> no penalty for first addition to an empty list</li> + <li>efficient access to both bottom and top of stack</li> + </sl> + <p> + However, it may be required to perform stack access + operations from an arbitrary place on the stack, in which + case it would probably be more efficient to use + <code>ArrayList</code>s instead. + </p> + </s3> + </s2> + <s2 title="Class vs instance"> + <p> + An individual stack would contain values for a particular + property, and the context of the stack is the property class + as a whole. The property instances would be represented by + the individual values on the stack. If properties are to be + represented as instantiations of the class, the stack + entries would presumably be references to, or at least + referenced from, individual property objects. However, the + most important information about individual property + instances is the value assigned, and the relationship of + this property object to its ancestors and its descendents. + Other information would include the ownership of a property + instance by a particular <em>FO</em>, and, in the other + direction, the membership of the property in the set of + properties for which an <em>FO</em> has defined values. + </p> + <p> + In the presence of a stack, however, none of this required + information mandates the instantiation of properties. All + of the information mentioned so far can be effectively + represented by a stack position and a link to an + <em>FO</em>. 
If the property stack is maintained in + parallel with a stack of <em>FOs</em>, even that link is + implicit in the stack position. + </p> + </s2> + <p> + <strong>Next:</strong> <link href= "classes-overview.html" + >property classes overview.</link> + </p> + </s1> + </body> +</document> diff --git a/docs/design/alt.design/book.xml b/docs/design/alt.design/book.xml new file mode 100644 index 000000000..5ae28c1f9 --- /dev/null +++ b/docs/design/alt.design/book.xml @@ -0,0 +1,21 @@ +<?xml version="1.0"?> + +<book title="FOP New Design Notes" copyright="2001-2002 The Apache Software Foundation"> + <external href="http://xml.apache.org/fop/" label="About FOP"/> + <separator/> + <external href="../index.html" label="NEW DESIGN" /> + <separator/> + <page id="index" label="alt.properties" source="alt.properties.xml"/> + <page id="classes-overview" label="Classes overview" source="classes-overview.xml"/> + <page id="properties-classes" label="Properties classes" source="properties-classes.xml"/> + <page id="Properties" label="Properties" source="Properties.png.xml"/> + <page id="PropertyConsts" label="PropertyConsts" source="PropertyConsts.png.xml"/> + <page id="PropNames" label="PropNames" source="PropNames.png.xml"/> + <page id="AbsolutePosition" label="AbsolutePosition" source="AbsolutePosition.png.xml"/> + <page id="VerticalAlign" label="VerticalAlign" source="VerticalAlign.png.xml"/> + <page id="BorderCommonStyle" label="BorderCommonStyle" source="BorderCommonStyle.png.xml"/> + <separator/> + <page id="xml-parsing" label="XML parsing" source="xml-parsing.xml"/> + <separator/> + <page id="property-parsing" label="Property parsing" source="propertyExpressions.xml"/> +</book> diff --git a/docs/design/alt.design/classes-overview.xml b/docs/design/alt.design/classes-overview.xml new file mode 100644 index 000000000..fab8e921d --- /dev/null +++ b/docs/design/alt.design/classes-overview.xml @@ -0,0 +1,201 @@ +<?xml version="1.0" encoding="ISO-8859-1"?> +<!-- $Id$ --> +<!-- 
+<!DOCTYPE document SYSTEM "../xml-docs/dtd/document-v10.dtd"> +--> + +<document> + <header> + <title>Property classes overview</title> + <authors> + <person id="pbw" name="Peter B. West" + email="pbwest@powerup.com.au"/> + </authors> + </header> + <body> + <!-- one of (anchor s1) --> + <s1 title="Classes overview"> + <s2 title="The class of all properties"> + <p> + If individual properties can have a "virtual reality" on the + stack, where is the stack itself to be instantiated? One + possibility is to have the stacks as <code>static</code> + data structures within the individual property classes. + However, the reduction of individual property instances to + stack entries allows the possibility of further + virtualization of property classes. If the individual + properties can be represented by an integer, i.e. a + <code>static final int</code>, the set of individual + property stacks can be collected together into one array. + Where to put such an overall collection? Creating an + über-class to accommodate everything that applies to + property classes as a whole allows this array to be defined + as a <em><code>static final</code> something[]</em>. + </p> + </s2> + <s2 title="The overall property classes"> + <p> + This approach has been taken for the experimental code. + Rather than simply creating an overall class containing + common elements of properties and acting as a superclass, + advantage has been taken of the facility for nesting of + top-level classes. All of the individual property classes + are nested within the <code>Properties</code> class. + This has advantages and disadvantages. + </p> + <dl> + <dt>Disadvantages</dt> + <dd> + The file becomes extremely cumbersome. This can cause + problems with "intelligent" editors. E.g. + <em>XEmacs</em> syntax highlighting virtually grinds to a + halt with the current version of this file.<br/> <br/> + + Possible problems with IDEs. There may be speed problems + or even overflow problems with various IDEs. 
The current + version of this and related files had only been tried with + the <em>[X]Emacs JDE</em> environment, without difficulties + apart from the editor speed problems mentioned + above.<br/> <br/> + + Retro look and feel. Not the done Java thing.<br/> <br/> + </dd> + <dt>Advantages</dt> + <dd> + Everything to do with properties in the one place (more or + less.)<br/> <br/> + + Eliminates the need for a large part of the (sometimes) + necessary evil of code generation. The One Big File of + <code>foproperties.xml</code>, with its ancillary xsl, is + absorbed into the One Bigger File of + <code>Properties.java</code>. The huge advantage of this + is that it <strong>is</strong> Java. + </dd> + </dl> + </s2> + <s2 title="The property information classes"> + <p> + In fact, in order to keep the size of the file down to + a more manageable level, the property information classes of + static data and methods have been split tentatively into + three: + </p> + <figure src="PropertyStaticsOverview.png" alt="Top level + property classes"/> + <dl> + <dt><link href="PropNames.html">PropNames</link></dt> + <dd> + Contains an array, <code>propertyNames</code>, of the names of + all properties, and a set of enumeration constants, one + for each property name in the <code>propertyNames</code> + array. These constants index the name of the properties + in <code>propertyNames</code>, and must be manually kept in + sync with the entries in the array. (This was the last of + the classes split off from the original single class; + hence the naming tiredness.) + <br/> <br/> + </dd> + <dt><link href="PropertyConsts.html">PropertyConsts</link></dt> + <dd> + Contains two basic sets of data:<br/> + Property-indexed arrays and property set + definitions.<br/> <br/> + + <strong>Property-indexed arrays</strong> are elaborations + of the property indexing idea discussed in relation to the + arrays of property stacks. 
One of the arrays is<br/> <br/> + + <code>public static final LinkedList[] + propertyStacks</code><br/> <br/> + + This is an array of stacks, implemented as + <code>LinkedList</code>s, one for each property.<br/> <br/> + + The other arrays provide indexed access to fields which + are, in most cases, common to all of the properties. An + exception is<br/> <br/> + + <code>public static final Method[] + complexMethods</code><br/> <br/> + + which contains a reference to the method + <code>complex()</code> which is only defined for + properties which have complex value parsing requirements. + It is likely that a similar array will be defined for + properties which allow a value of <em>auto</em>.<br/> <br/> + + The property-indexed arrays are initialized by + <code>static</code> initializers in this class. The + <code>PropNames</code> class and + <code>Properties</code> + nested classes are scanned in order to obtain or derive + the data necessary for initialization.<br/> <br/> + + <strong>Property set definitions</strong> are + <code>HashSet</code>s of properties (represented by + integer constants) which belong to each of the categories + of properties defined. They are used to simplify the + assignment of property sets to individual FOs. + Representative <code>HashSet</code>s include + <em>backgroundProps</em> and + <em>tableProps</em>.<br/> <br/> + </dd> + <dt><link href="Properties.html">Properties</link></dt> + <dd> + <br/> + This class contains only sets of constants for use by the + individual property classes, but it also importantly + serves as a container for all of the property classes, and + some convenience pseudo-property classes.<br/> <br/> + + <strong>Constants sets</strong> include:<br/> <br/> + + <em>Datatype constants</em>. A bitmap set of + integer constants over a possible range of 2^0 to 2^31 + (represented as -2147483648). 
E.g.<br/> + INTEGER = 1<br/> + ENUM = 524288<br/> <br/> + Some of the definitions are bit-ORed + combinations of the basic values. Used to set the + <em>dataTypes</em> field of the property + classes.<br/> <br/> + + <em>Trait mapping constants</em>. A bitmap set of + integer constants over a possible range of 2^0 to 2^31 + (represented as -2147483648), representing the manner in + which a property maps into a <em>trait</em>. Used to set + the <code>traitMapping</code> field of the property + classes.<br/> <br/> + + <em>Initial value constants</em>. A sequence of + integer constants representing the datatype of the initial + value of a property. Used to set the + <code>initialValueType</code> field of the property + classes.<br/> <br/> + + <em>Inheritance value constants</em>. A sequence + of integer constants representing the way in which the + property is normally inherited. Used to set the + <code>inherited</code> field of the property + classes.<br/> <br/> + + <strong>Nested property classes</strong>. The + <em>Properties</em> class serves as the holding pen for + all of the individual property classes, and for property + pseudo-classes which contain data common to a number of + actual properties, e.g. <em>ColorCommon</em>. + </dd> + </dl> + </s2> + <p> + <strong>Previous:</strong> <link href= + "alt.properties.html" >alt.properties</link> + </p> + <p> + <strong>Next:</strong> <link href= + "properties-classes.html" >Properties classes</link> + </p> + </s1> + </body> +</document> diff --git a/docs/design/alt.design/dirlist.html b/docs/design/alt.design/dirlist.html new file mode 100644 index 000000000..6a4cf9ddd --- /dev/null +++ b/docs/design/alt.design/dirlist.html @@ -0,0 +1,55 @@ +<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN"> +<html> + <head> + <title>alt.design</title> + </head> + <body> + <h3>Directory listing of alt.design</h3> + <hr> + <pre><code> +drwxrwxr-x 2 pbw pbw 4096 Jan 31 17:58 . 
+drwxrwxr-x 5 pbw pbw 4096 Jan 31 17:57 <a href="../dirlist.html">..</a> +-rw-rw-r-- 1 pbw pbw 949 Jan 25 17:31 <a href="AbsolutePosition.dia">AbsolutePosition.dia</a> +-rw-rw-r-- 1 pbw pbw 4890 Jan 25 17:31 <a href="AbsolutePosition.png">AbsolutePosition.png</a> +-rw-r--r-- 1 pbw pbw 579 Jan 25 23:47 <a href="AbsolutePosition.png.xml">AbsolutePosition.png.xml</a> +-rw-rw-r-- 1 pbw pbw 4140 Jan 25 17:31 <a href="BorderCommonStyle.png">BorderCommonStyle.png</a> +-rw-r--r-- 1 pbw pbw 584 Jan 26 12:29 <a href="BorderCommonStyle.png.xml">BorderCommonStyle.png.xml</a> +-rw-rw-r-- 1 pbw pbw 807 Jan 25 17:31 <a href="PropNames.dia">PropNames.dia</a> +-rw-rw-r-- 1 pbw pbw 3428 Jan 25 17:31 <a href="PropNames.png">PropNames.png</a> +-rw-r--r-- 1 pbw pbw 551 Jan 25 23:48 <a href="PropNames.png.xml">PropNames.png.xml</a> +-rw-rw-r-- 1 pbw pbw 1900 Jan 25 17:31 <a href="Properties.dia">Properties.dia</a> +-rw-rw-r-- 1 pbw pbw 32437 Jan 25 17:31 <a href="Properties.png">Properties.png</a> +-rw-r--r-- 1 pbw pbw 556 Jan 25 23:48 <a href="Properties.png.xml">Properties.png.xml</a> +-rw-rw-r-- 1 pbw pbw 2180 Jan 25 17:31 <a href="PropertyClasses.dia">PropertyClasses.dia</a> +-rw-rw-r-- 1 pbw pbw 17581 Jan 25 17:31 <a href="PropertyClasses.png">PropertyClasses.png</a> +-rw-rw-r-- 1 pbw pbw 1573 Jan 25 17:31 <a href="PropertyConsts.dia">PropertyConsts.dia</a> +-rw-rw-r-- 1 pbw pbw 20379 Jan 25 17:31 <a href="PropertyConsts.png">PropertyConsts.png</a> +-rw-r--r-- 1 pbw pbw 575 Jan 25 23:47 <a href="PropertyConsts.png.xml">PropertyConsts.png.xml</a> +-rw-rw-r-- 1 pbw pbw 1333 Jan 25 17:31 <a href="PropertyStaticsOverview.dia">PropertyStaticsOverview.dia</a> +-rw-rw-r-- 1 pbw pbw 7503 Jan 25 17:31 <a href="PropertyStaticsOverview.png">PropertyStaticsOverview.png</a> +-rw-rw-r-- 1 pbw pbw 3068 Jan 25 17:31 <a href="SAXParsing.dia">SAXParsing.dia</a> +-rw-rw-r-- 1 pbw pbw 24482 Jan 25 17:31 <a href="SAXParsing.png">SAXParsing.png</a> +-rw-rw-r-- 1 pbw pbw 964 Jan 25 17:31 <a 
href="VerticalAlign.dia">VerticalAlign.dia</a> +-rw-rw-r-- 1 pbw pbw 7091 Jan 25 17:31 <a href="VerticalAlign.png">VerticalAlign.png</a> +-rw-r--r-- 1 pbw pbw 565 Jan 25 23:48 <a href="VerticalAlign.png.xml">VerticalAlign.png.xml</a> +-rw-rw-r-- 1 pbw pbw 2004 Jan 25 17:31 <a href="XML-event-buffer.dia">XML-event-buffer.dia</a> +-rw-rw-r-- 1 pbw pbw 20415 Jan 25 17:31 <a href="XML-event-buffer.png">XML-event-buffer.png</a> +-rw-rw-r-- 1 pbw pbw 2322 Jan 25 17:31 <a href="XMLEventQueue.dia">XMLEventQueue.dia</a> +-rw-rw-r-- 1 pbw pbw 11643 Jan 25 17:31 <a href="XMLEventQueue.png">XMLEventQueue.png</a> +-rw-r--r-- 1 pbw pbw 6584 Jan 26 11:56 <a href="alt.properties.xml">alt.properties.xml</a> +-rw-rw-r-- 1 pbw pbw 1152 Jan 25 17:31 <a href="book.xml">book.xml</a> +-rw-r--r-- 1 pbw pbw 7834 Jan 26 13:07 <a href="classes-overview.xml">classes-overview.xml</a> +-rw-rw-r-- 1 pbw pbw 8330 Jan 25 17:31 <a href="parserPersistence.png">parserPersistence.png</a> +-rw-rw-r-- 1 pbw pbw 1974 Jan 25 17:31 <a href="processPlumbing.dia">processPlumbing.dia</a> +-rw-rw-r-- 1 pbw pbw 8689 Jan 25 17:31 <a href="processPlumbing.png">processPlumbing.png</a> +-rw-r--r-- 1 pbw pbw 5123 Jan 26 11:58 <a href="properties-classes.xml">properties-classes.xml</a> +-rw-rw-r-- 1 pbw pbw 3115 Jan 25 17:31 <a href="property-super-classes-full.dia">property-super-classes-full.dia</a> +-rw-rw-r-- 1 pbw pbw 89360 Jan 25 17:31 <a href="property-super-classes-full.png">property-super-classes-full.png</a> +-rw-r--r-- 1 pbw pbw 10221 Jan 25 23:49 <a href="propertyExpressions.xml">propertyExpressions.xml</a> +-rw-r--r-- 1 pbw pbw 9361 Jan 26 11:59 <a href="xml-parsing.xml">xml-parsing.xml</a> +-rw-rw-r-- 1 pbw pbw 2655 Jan 25 17:31 <a href="xmlevent-queue.dia">xmlevent-queue.dia</a> +-rw-rw-r-- 1 pbw pbw 12326 Jan 25 17:31 <a href="xmlevent-queue.png">xmlevent-queue.png</a> + </code></pre> + <hr> + </body> +</html> diff --git a/docs/design/alt.design/parserPersistence.png 
b/docs/design/alt.design/parserPersistence.png Binary files differnew file mode 100644 index 000000000..342a933b4 --- /dev/null +++ b/docs/design/alt.design/parserPersistence.png diff --git a/docs/design/alt.design/processPlumbing.dia b/docs/design/alt.design/processPlumbing.dia Binary files differnew file mode 100644 index 000000000..184e51524 --- /dev/null +++ b/docs/design/alt.design/processPlumbing.dia diff --git a/docs/design/alt.design/processPlumbing.png b/docs/design/alt.design/processPlumbing.png Binary files differnew file mode 100644 index 000000000..182d3c68e --- /dev/null +++ b/docs/design/alt.design/processPlumbing.png diff --git a/docs/design/alt.design/properties-classes.xml b/docs/design/alt.design/properties-classes.xml new file mode 100644 index 000000000..216f2b9e0 --- /dev/null +++ b/docs/design/alt.design/properties-classes.xml @@ -0,0 +1,140 @@ +<?xml version="1.0" encoding="ISO-8859-1"?> +<!-- $Id$ --> +<!-- +<!DOCTYPE document SYSTEM "../xml-docs/dtd/document-v10.dtd"> +--> + +<document> + <header> + <title>Properties$classes</title> + <authors> + <person name="Peter B. West" email="pbwest@powerup.com.au"/> + </authors> + </header> + <body> + <!-- one of (anchor s1) --> + <s1 title="fo.Properties and the nested properties classes"> + <figure src="PropertyClasses.png" alt="Nested property and + top-level classes"/> + <s2 title="Nested property classes"> + <p> + Given the intention that individual properties have only a + <em>virtual</em> instantiation in the arrays of + <code>PropertyConsts</code>, these classes are intended to + remain as repositories of static data and methods. The name + of each property is entered in the + <code>PropNames.propertyNames</code> array of + <code>String</code>s, and each has a unique integer constant + defined, corresponding to the offset of the property name in + that array. 
+ </p> + <s3 title="Fields common to all classes"> + <dl> + <dt><code>final int dataTypes</code></dt> + <dd> + This field defines the allowable data types which may be + assigned to the property. The value is chosen from the + data type constants defined in <code>Properties</code>, and + may consist of more than one of those constants, + bit-ORed together. + </dd> + <dt><code>final int traitMapping</code></dt> + <dd> + This field defines the mapping of properties to traits + in the <code>Area tree</code>. The value is chosen from the + trait mapping constants defined in <code>Properties</code>, + and may consist of more than one of those constants, + bit-ORed together. + </dd> + <dt><code>final int initialValueType</code></dt> + <dd> + This field defines the data type of the initial value + assigned to the property. The value is chosen from the + initial value type constants defined in + <code>Properties</code>. + </dd> + <dt><code>final int inherited</code></dt> + <dd> + This field defines the kind of inheritance applicable to + the property. The value is chosen from the inheritance + constants defined in <code>Properties</code>. + </dd> + </dl> + </s3> + <s3 title="Datatype dependent fields"> + <dl> + <dt>Enumeration types</dt> + <dd> + <strong><code>final String[] enums</code></strong><br/> + This array contains the <code>NCName</code> text + values of the enumeration. 
In the current + implementation, it always contains a null value at + <code>enum[0]</code>.<br/> <br/> + + <strong><code>final String[] + enumValues</code></strong><br/> When the number of + enumeration values is small, + <code>enumValues</code> is a reference to the + <code>enums</code> array.<br/> <br/> + + <strong><code>final HashMap + enumValues</code></strong><br/> When the number of + enumeration values is larger, + <code>enumValues</code> is a + <code>HashMap</code> statically initialized to + contain the integer constant values corresponding to + each text value, indexed by the text + value.<br/> <br/> + + <strong><code>final int</code></strong> + <em><code>enumeration-constants</code></em><br/> A + unique integer constant is defined for each of the + possible enumeration values.<br/> <br/> + </dd> + <dt>Many types: + <code>final</code> <em>datatype</em> + <code>initialValue</code></dt> + <dd> + When the initial datatype does not have an implicit + initial value (as, for example, does type + <code>AUTO</code>) the initial value for the property is + assigned to this field. The type of this field will + vary according to the <code>initialValueType</code> + field. + </dd> + <dt>AUTO: <code>PropertyValueList auto(property, + list)></code></dt> + <dd> + When <em>AUTO</em> is a legal value type, the + <code>auto()</code> method must be defined in the property + class.<br/> + <em>NOT YET IMPLEMENTED.</em> + </dd> + <dt>COMPLEX: <code>PropertyValueList complex(property, + list)></code></dt> + <dd> + <em>COMPLEX</em> is specified as a value type when complex + conditions apply to the selection of a value type, or + when lists of values are acceptable. To process and + validate such a property value assignment, the + <code>complex()</code> method must be defined in the + property class. 
+ </dd> + </dl> + </s3> + </s2> + <s2 title="Nested property pseudo-classes"> + <p> + The property pseudo-classes are classes, like + <code>ColorCommon</code> which contain values, particularly + <em>enums</em>, which are common to a number of actual + properties. + </p> + </s2> + <p> + <strong>Previous:</strong> <link href= "classes-overview.html" + >property classes overview.</link> + </p> + </s1> + </body> +</document> diff --git a/docs/design/alt.design/property-super-classes-full.dia b/docs/design/alt.design/property-super-classes-full.dia Binary files differnew file mode 100644 index 000000000..4fe8f750a --- /dev/null +++ b/docs/design/alt.design/property-super-classes-full.dia diff --git a/docs/design/alt.design/property-super-classes-full.png b/docs/design/alt.design/property-super-classes-full.png Binary files differnew file mode 100644 index 000000000..dea871d3c --- /dev/null +++ b/docs/design/alt.design/property-super-classes-full.png diff --git a/docs/design/alt.design/propertyExpressions.xml b/docs/design/alt.design/propertyExpressions.xml new file mode 100644 index 000000000..0900a323a --- /dev/null +++ b/docs/design/alt.design/propertyExpressions.xml @@ -0,0 +1,341 @@ +<?xml version="1.0" encoding="ISO-8859-1"?> +<!-- $Id$ --> +<!-- +<!DOCTYPE document SYSTEM "../xml-docs/dtd/document-v10.dtd"> +--> + +<document> + <header> + <title>Property Expression Parsing</title> + <authors> + <person id="pbw" name="Peter B. West" email="pbwest@powerup.com.au"/> + </authors> + </header> + <body> + <!-- one of (anchor s1) --> + <s1 title="Property expression parsing"> + <note> + The following discussion of the experiments with alternate + property expression parsing is very much a work in progress, + and subject to sudden changes. + </note> + <p> + The parsing of property value expressions is handled by two + closely related classes: <code>PropertyTokenizer</code> and its + subclass, <code>PropertyParser</code>. 
+ <code>PropertyTokenizer</code>, as the name suggests, handles + the tokenizing of the expression, handing <em>tokens</em> + back to its subclass, + <code>PropertyParser</code>. <code>PropertyParser</code>, in + turn, returns a <code>PropertyValueList</code>, a list of + <code>PropertyValue</code>s. + </p> + <p> + The tokenizer and parser rely in turn on the datatype + definitions from the <code>org.apache.fop.datatypes</code> + package and the datatype <code>static final int</code> + constants from <code>PropertyConsts</code>. + </p> + <s2 title="Data types"> + <p> + The data types currently defined in + <code>org.apache.fop.datatypes</code> include: + </p> + <table> + <tr><th colspan="2">Numbers and lengths</th></tr> + <tr> + <th>Numeric</th> + <td colspan="3"> + The fundamental numeric data type. <em>Numerics</em> of + various types are constructed by the classes listed + below. + </td> + </tr> + <tr> + <td/> + <th colspan="3">Constructor classes for <em>Numeric</em></th> + </tr> + <tr> + <td/><td>Angle</td> + <td colspan="2">In degrees(deg), gradians(grad) or + radians(rad)</td> + </tr> + <tr> + <td/><td>Ems</td> + <td colspan="2">Relative length in <em>ems</em></td> + </tr> + <tr> + <td/><td>Frequency</td> + <td colspan="2">In hertz(Hz) or kilohertz(kHz)</td> + </tr> + <tr> + <td/><td>IntegerType</td><td/> + </tr> + <tr> + <td/><td>Length</td> + <td colspan="2">In centimetres(cm), millimetres(mm), + inches(in), points(pt), picas(pc) or pixels(px)</td> + </tr> + <tr> + <td/><td>Percentage</td><td/> + </tr> + <tr> + <td/><td>Time</td> + <td>In seconds(s) or milliseconds(ms)</td> + </tr> + <tr><th colspan="2">Strings</th></tr> + <tr> + <th>StringType</th> + <td colspan="3"> + Base class for data types which result in a <em>String</em>. + </td> + </tr> + <tr> + <td/><th>Literal</th> + <td colspan="2"> + A subclass of <em>StringType</em> for literals which + exceed the constraints of an <em>NCName</em>. 
+ </td> + </tr> + <tr> + <td/><th>MimeType</th> + <td colspan="2"> + A subclass of <em>StringType</em> for literals which + represent a mime type. + </td> + </tr> + <tr> + <td/><th>UriType</th> + <td colspan="2"> + A subclass of <em>StringType</em> for literals which + represent a URI, as specified by the argument to + <em>url()</em>. + </td> + </tr> + <tr> + <td/><th>NCName</th> + <td colspan="2"> + A subclass of <em>StringType</em> for literals which + meet the constraints of an <em>NCName</em>. + </td> + </tr> + <tr> + <td/><td/><th>Country</th> + <td>An RFC 3066/ISO 3166 country code.</td> + </tr> + <tr> + <td/><td/><th>Language</th> + <td>An RFC 3066/ISO 639 language code.</td> + </tr> + <tr> + <td/><td/><th>Script</th> + <td>An ISO 15924 script code.</td> + </tr> + <tr><th colspan="2">Enumerated types</th></tr> + <tr> + <th>EnumType</th> + <td colspan="3"> + An integer representing one of the tokens in a set of + enumeration values. + </td> + </tr> + <tr> + <td/><th>MappedEnumType</th> + <td colspan="2"> + A subclass of <em>EnumType</em>. Maintains a + <em>String</em> with the value to which the associated + "raw" enumeration token maps. E.g., the + <em>font-size</em> enumeration value "medium" maps to + the <em>String</em> "12pt". + </td> + </tr> + <tr><th colspan="2">Colors</th></tr> + <tr> + <th>ColorType</th> + <td colspan="3"> + Maintains a four-element array of float, derived from + the name of a standard colour, the name returned by a + call to <em>system-color()</em>, or an RGB + specification. + </td> + </tr> + <tr><th colspan="2">Fonts</th></tr> + <tr> + <th>FontFamilySet</th> + <td colspan="3"> + Maintains an array of <em>String</em>s containing a + prioritized list of possibly generic font family names. + </td> + </tr> + <tr><th colspan="2">Pseudo-types</th></tr> + <tr> + <td colspan="4"> + A variety of pseudo-types have been defined as + convenience types for frequently appearing enumeration + token values, or for other special purposes. 
+ </td> + </tr> + <tr> + <th>Inherit</th> + <td colspan="3"> + For values of <em>inherit</em>. + </td> + </tr> + <tr> + <th>Auto</th> + <td colspan="3"> + For values of <em>auto</em>. + </td> + </tr> + <tr> + <th>None</th> + <td colspan="3"> + For values of <em>none</em>. + </td> + </tr> + <tr> + <th>Bool</th> + <td colspan="3"> + For values of <em>true/false</em>. + </td> + </tr> + <tr> + <th>FromNearestSpecified</th> + <td colspan="3"> + Created to ensure that, when associated with + a shorthand, the <em>from-nearest-specified-value()</em> + core function is the sole component of the expression. + </td> + </tr> + <tr> + <th>FromParent</th> + <td colspan="3"> + Created to ensure that, when associated with + a shorthand, the <em>from-parent()</em> + core function is the sole component of the expression. + </td> + </tr> + </table> + </s2> + <s2 title="Tokenizer"> + <p> + The tokenizer returns one of the following token + values: + </p> + <source> + static final int + EOF = 0 + ,NCNAME = 1 + ,MULTIPLY = 2 + ,LPAR = 3 + ,RPAR = 4 + ,LITERAL = 5 + ,FUNCTION_LPAR = 6 + ,PLUS = 7 + ,MINUS = 8 + ,MOD = 9 + ,DIV = 10 + ,COMMA = 11 + ,PERCENT = 12 + ,COLORSPEC = 13 + ,FLOAT = 14 + ,INTEGER = 15 + ,ABSOLUTE_LENGTH = 16 + ,RELATIVE_LENGTH = 17 + ,TIME = 18 + ,FREQ = 19 + ,ANGLE = 20 + ,INHERIT = 21 + ,AUTO = 22 + ,NONE = 23 + ,BOOL = 24 + ,URI = 25 + ,MIMETYPE = 26 + // NO_UNIT is a transient token for internal use only. It is + // never set as the end result of parsing a token. + ,NO_UNIT = 27 + ; + </source> + <p> + Most of these tokens are self-explanatory, but a few need + further comment. + </p> + <dl> + <dt>AUTO</dt> + <dd> + Because of its frequency of occurrence, and the fact that + it is always the <em>initial value</em> for any property + which supports it, AUTO has been promoted into a + pseudo-type with its own datatype class. Therefore, it is + also reported as a token. 
+ </dd> + <dt>NONE</dt> + <dd> + Similarly to AUTO, NONE has been promoted to a pseudo-type + because of its frequency. + </dd> + <dt>BOOL</dt> + <dd> + There is a <em>de facto</em> boolean type buried in the + enumeration types for many of the properties. It has been + specified as a type in its own right in this code. + </dd> + <dt>MIMETYPE</dt> + <dd> + The property <code>content-type</code> introduces this + complication. It can have two values of the form + <strong>content-type:</strong><em>mime-type</em> + (e.g. <code>content-type="content-type:xml/svg"</code>) or + <strong>namespace-prefix:</strong><em>prefix</em> + (e.g. <code>content-type="namespace-prefix:svg"</code>). The + experimental code reduces these options to the payload + in each case: an <code>NCName</code> in the case of a + namespace prefix, and a MIMETYPE in the case of a + content-type specification. <code>NCName</code>s cannot + contain a "/". + </dd> + </dl> + </s2> + <s2 title="Parser"> + <p> + The parser returns a <code>PropertyValueList</code>, + necessary because of the possibility that a list of + <code>PropertyValue</code> elements may be returned from the + expressions of some properties. + </p> + <p> + <code>PropertyValueList</code>s may contain + <code>PropertyValue</code>s or other + <code>PropertyValueList</code>s. This latter provision is + necessitated by the peculiar case of + <em>text-shadow</em>, which may contain whitespace separated + sublists of either two or three elements, separated from one + another by commas. To accommodate this peculiarity, comma + separated elements are added to the top-level list, while + whitespace separated values are always collected into + sublists to be added to the top-level list. 
+ </p> + <p> + Other special cases include the processing of the core + functions <code>from-parent()</code> and + <code>from-nearest-specified-value()</code> when these + function calls are assigned to a shorthand property, or used + with a shorthand property name as an argument. In these + cases, the function call must be the sole component of the + expression. The pseudo-type classes + <code>FromParent</code> and + <code>FromNearestSpecified</code> are generated in these + circumstances so that an exception will be thrown if they + are involved in expression evaluation with other + components. (See Rec. Section 5.10.4 Property Value + Functions.) + </p> + <p> + The experimental code is a simple extension of the existing + parser code, which itself borrowed heavily from James + Clark's XT processor. + </p> + </s2> + </s1> + </body> +</document> diff --git a/docs/design/alt.design/xml-parsing.xml b/docs/design/alt.design/xml-parsing.xml new file mode 100644 index 000000000..3d83802ef --- /dev/null +++ b/docs/design/alt.design/xml-parsing.xml @@ -0,0 +1,224 @@ +<?xml version="1.0" encoding="ISO-8859-1"?> +<!-- $Id$ --> +<!-- +<!DOCTYPE document SYSTEM "../xml-docs/dtd/document-v10.dtd"> +--> + +<document> + <header> + <title>Integrating XML Parsing</title> + <authors> + <person name="Peter B. West" email="pbwest@powerup.com.au"/> + </authors> + </header> + <body> + <!-- one of (anchor s1) --> + <s1 title="An alternative parser integration"> + <p> + This note proposes an alternative method of integrating the + output of the SAX parsing of the Flow Object (FO) tree into + FOP processing. The purpose of the proposed changes is to + provide for better decomposition of the process of analysing + and rendering an fo tree such as is represented in the output + from initial (XSLT) processing of an XML source document. + </p> + <s2 title="Structure of SAX parsing"> + <p> + Figure 1 is a schematic representation of the process of SAX + parsing of an input source. 
SAX parsing involves the + registration, with an object implementing the + <code>XMLReader</code> interface, of a + <code>ContentHandler</code> which contains a callback + routine for each of the event types encountered by the + parser, e.g., <code>startDocument()</code>, + <code>startElement()</code>, <code>characters()</code>, + <code>endElement()</code> and <code>endDocument()</code>. + Parsing is initiated by a call to the <code>parse()</code> + method of the <code>XMLReader</code>. Note that the call to + <code>parse()</code> and the calls to individual callback + methods are synchronous: <code>parse()</code> will only + return when the last callback method returns, and each + callback must complete before the next is called.<br/><br/> + <strong>Figure 1</strong> + </p> + <figure src="SAXParsing.png" alt="SAX parsing schematic"/> + <p> + In the process of parsing, the hierarchical structure of the + original FO tree is flattened into a number of streams of + events of the same type which are reported in the sequence + in which they are encountered. Apart from that, the API + imposes no structure or constraint which expresses the + relationship between, e.g., a startElement event and the + endElement event for the same element. To the extent that + such relationship information is required, it must be + managed by the callback routines. + </p> + <p> + The most direct approach here is to build the tree + "invisibly"; to bury within the callback routines the + necessary code to construct the tree. In the simplest case, + the whole of the FO tree is built within the call to + <code>parse()</code>, and that in-memory tree is subsequently + processed to (a) validate the FO structure, and (b) + construct the Area tree. The problem with this approach is + the potential size of the FO tree in memory. FOP has + suffered from this problem in the past. 
+ </p> + </s2> + <s2 title="Cluttered callbacks"> + <p> + On the other hand, the callback code may become increasingly + complex as tree validation and the triggering of the Area + tree processing and subsequent rendering are moved into the + callbacks, typically the <code>endElement()</code> method. + In order to overcome acute memory problems, the FOP code was + recently modified in this way, to trigger Area tree building + and rendering in the <code>endElement()</code> method, when + the end of a page-sequence was detected. + </p> + <p> + The drawback with such a method is that it becomes difficult + to determine the order of events and the circumstances in + which any particular processing events are triggered. When + the processing events are inherently self-contained, this is + irrelevant. But the more complex and context-dependent the + relationships are among the processing elements, the more + obscurity is engendered in the code by such "side-effect" + processing. + </p> + </s2> + <s2 title="From passive to active parsing"> + <p> + In order to solve the simultaneous problems of exposing the + structure of the processing and minimising in-memory + requirements, the experimental code separates the parsing of + the input source from the building of the FO tree and all + downstream processing. The callback routines become + minimal, consisting of the creation and buffering of + <code>XMLEvent</code> objects as a <em>producer</em>. All + of these objects are effectively merged into a single event + stream, in strict event order, for subsequent access by the + FO tree building process, acting as a + <em>consumer</em>. In itself, this does not reduce the + footprint. 
The reduction occurs when the approach is generalised to + modularise FOP processing.<br/><br/> <strong>Figure 2</strong> + </p> + <figure src="XML-event-buffer.png" alt="XML event buffer"/> + <p> + The most useful change that this brings about is the switch + from <em>passive</em> to <em>active</em> XML element + processing. The process of parsing now becomes visible to + the controlling process. All local validation requirements, + all object and data structure building, is initiated by the + process(es) <em>get</em>ting from the queue - in the case + above, the FO tree builder. + </p> + </s2> + <s2 title="XMLEvent methods"> + <anchor id="XMLEvent-methods"/> + <p> + The experimental code uses a class <strong>XMLEvent</strong> + to provide the objects which are placed in the queue. + <em>XMLEvent</em> includes a variety of methods to access + elements in the queue. Namespace URIs encountered in + parsing are maintained in a <code>static</code> + <code>HashMap</code> where they are associated with a unique + integer index. This integer value is used in the signature + of some of the access methods. + </p> + <dl> + <dt>XMLEvent getEvent(SyncedCircularBuffer events)</dt> + <dd> + This is the basis of all of the queue access methods. It + returns the next element from the queue, which may be a + pushback element. + </dd> + <dt>XMLEvent getEndDocument(events)</dt> + <dd> + <em>get</em> and discard elements from the queue + until an ENDDOCUMENT element is found and returned. + </dd> + <dt> XMLEvent expectEndDocument(events)</dt> + <dd> + If the next element on the queue is an ENDDOCUMENT event, + return it. Otherwise, push the element back and throw an + exception. Each of the <em>get</em> methods (except + <em>getEvent()</em> itself) has a corresponding + <em>expect</em> method. 
+ </dd> + <dt>XMLEvent get/expectStartElement(events)</dt> + <dd> Return the next STARTELEMENT event from the queue.</dd> + <dt>XMLEvent get/expectStartElement(events, String + qName)</dt> + <dd> + Return the next STARTELEMENT with a QName matching + <em>qName</em>. + </dd> + <dt> + XMLEvent get/expectStartElement(events, int uriIndex, + String localName) + </dt> + <dd> + Return the next STARTELEMENT with a URI indicated by the + <em>uriIndex</em> and a local name matching <em>localName</em>. + </dd> + <dt> + XMLEvent get/expectStartElement(events, LinkedList list) + </dt> + <dd> + <em>list</em> contains instances of the nested class + <code>UriLocalName</code>, which hold a + <em>uriIndex</em> and a <em>localName</em>. Return + the next STARTELEMENT with a URI indicated by the + <em>uriIndex</em> and a local name matching + <em>localName</em> from any element of + <em>list</em>. + </dd> + <dt>XMLEvent get/expectEndElement(events)</dt> + <dd>Return the next ENDELEMENT.</dd> + <dt>XMLEvent get/expectEndElement(events, qName)</dt> + <dd>Return the next ENDELEMENT with QName + <em>qname</em>.</dd> + <dt>XMLEvent get/expectEndElement(events, uriIndex, localName)</dt> + <dd> + Return the next ENDELEMENT with a URI indicated by the + <em>uriIndex</em> and a local name matching + <em>localName</em>. + </dd> + <dt> + XMLEvent get/expectEndElement(events, XMLEvent event) + </dt> + <dd> + Return the next ENDELEMENT with a URI matching the + <em>uriIndex</em> and <em>localName</em> + matching those in the <em>event</em> argument. This + is intended as a quick way to find the ENDELEMENT matching + a previously returned STARTELEMENT. + </dd> + <dt>XMLEvent get/expectCharacters(events)</dt> + <dd>Return the next CHARACTERS event.</dd> + </dl> + </s2> + <s2 title="FOP modularisation"> + <p> + This same principle can be extended to the other major + sub-systems of FOP processing. 
In each case, while it is + possible to hold a complete intermediate result in memory, + the memory costs of that approach are too high. The + sub-systems - xml parsing, FO tree construction, Area tree + construction and rendering - must run in parallel if the + footprint is to be kept manageable. By creating a series of + producer-consumer pairs linked by synchronized buffers, + logical isolation can be achieved while rates of processing + remain coupled. By introducing feedback loops conveying + information about the completion of processing of the + elements, sub-systems can dispose of or precis those + elements without having to be tightly coupled to downstream + processes.<br/><br/> + <strong>Figure 3</strong> + </p> + <figure src="processPlumbing.png" alt="FOP modularisation"/> + </s2> + </s1> + </body> +</document> diff --git a/docs/design/alt.design/xmlevent-queue.dia b/docs/design/alt.design/xmlevent-queue.dia Binary files differnew file mode 100644 index 000000000..91e752473 --- /dev/null +++ b/docs/design/alt.design/xmlevent-queue.dia diff --git a/docs/design/alt.design/xmlevent-queue.png b/docs/design/alt.design/xmlevent-queue.png Binary files differnew file mode 100644 index 000000000..0bb019c65 --- /dev/null +++ b/docs/design/alt.design/xmlevent-queue.png diff --git a/docs/design/understanding/area_tree.xml b/docs/design/understanding/area_tree.xml new file mode 100644 index 000000000..bf8c63794 --- /dev/null +++ b/docs/design/understanding/area_tree.xml @@ -0,0 +1,13 @@ +<?xml version="1.0" standalone="no"?> +<!-- Overview --> +<document> + <header> + <title>Area Tree</title> + <subtitle>All you wanted to know about the Area Tree !</subtitle> + <authors> <person name="Keiron Liddle" email="keiron@aftexsw.com"/> + </authors> + </header> + <body><s1 title="Area Tree"> + <p>Yet to come :))</p> + <note>The series of notes for developers has started but it has not yet gone so far ! Keep watching</note></s1> + </body></document>
\ No newline at end of file diff --git a/docs/design/understanding/book.xml b/docs/design/understanding/book.xml new file mode 100644 index 000000000..505d3c06c --- /dev/null +++ b/docs/design/understanding/book.xml @@ -0,0 +1,23 @@ +<?xml version="1.0"?> + +<book title="FOP Design" copyright="1999-2002 The Apache Software Foundation"> + <external href="http://xml.apache.org/fop/" label="About FOP"/> + <separator/> + <external href="../index.html" label="NEW DESIGN" /> + <page id="index" label="Understanding" source="understanding.xml"/> + <separator/> + <page id="xml_parsing" label="XML Parsing" source="xml_parsing.xml"/> + <page id="fo_tree" label="FO Tree" source="fo_tree.xml"/> + <page id="properties" label="Properties" source="properties.xml"/> + <page id="layout_managers" label="Layout Managers" source="layout_process.xml"/> + <page id="layout_process" label="Layout Process" source="layout_process.xml"/> + <page id="handling_attributes" label="Handling Attributes" source="handling_attributes.xml"/> + <page id="area_tree" label="Area Tree" source="area_tree.xml"/> + <page id="renderers" label="Renderers" source="renderers.xml"/> + <separator/> + <page id="images" label="Images" source="images.xml"/> + <page id="pdf_library" label="PDF Library" source="pdf_library.xml"/> + <page id="svg" label="SVG" source="svg.xml"/> + <separator/> + <page id="status" label="Status" source="status.xml"/> +</book>
\ No newline at end of file diff --git a/docs/design/understanding/fo_tree.xml b/docs/design/understanding/fo_tree.xml new file mode 100644 index 000000000..83e58ef8c --- /dev/null +++ b/docs/design/understanding/fo_tree.xml @@ -0,0 +1,184 @@ +<?xml version="1.0"?> +<document> + <header> + <title>FO Tree</title> + <subtitle>All you wanted to know about FO Tree !</subtitle> + <authors> <person name="Keiron Liddle" email="keiron@aftexsw.com"/> + </authors> + </header> +<body><s1 title="FO Tree"> + <p> + The FO Tree is a representation of the XSL:FO document. This + represents the <strong>Objectify</strong> step from the + spec. The <strong>Refinement</strong> step is part of reading + and using the properties which may happen immediately or + during the layout process. + </p> + + + +<p>Each xml element is represented by a java object. For pagination the +classes are in <code>org.apache.fop.fo.pagination.*</code>, for elements in the flow +they are in <code>org.apache.fop.fo.flow.*</code> and some others are in +<code>org.apache.fop.fo.*.</code></p> + + + +<p>The base class for all objects in the tree is FONode. The base class for +all FO Objects is FObj.</p> + + + +<p>(insert diagram here)</p> + + + +<p>There is a class for each element in the FO set. An object is created for +each element in the FO Tree. This object holds the properties for the FO +Object.</p> + + + + <p> + When the object is created it is setup. It is given its + element name, the FOUserAgent - for resolving properties + etc. - the logger and the attributes. The methods + <code>handleAttributes()</code> and + <code>setuserAgent()</code>, common to <code>FONode</code>, + are used in this process. The object will then be given any + text data or child elements. Then the <code>end()</code> + method is called. The end method is used by a number of + elements to indicate that it can do certain processing since + all the children have been added. 
+ </p> + + + +<p>Some validity checking is done during these steps. The user can be warned of any error and processing can continue if possible. +</p> + + + <p> + The FO Tree is simply a hierarchy of java objects that + represent the fo elements from xml. The traversal is done by + the layout or structure process only in the flow elements. + </p> + + + +<s2 title="Properties"> + + + +<p>The XML attributes on each element are passed to the object. The objects +that represent FO objects then convert the attributes into properties. +</p> + + +<p>Since properties can be inherited the PropertyList class handles resolving +properties for a particular element. +All properties are specified in an XML file. Classes are created +automatically during the build process. +</p> + + +<p>(insert diagram here)</p> + + + +<p>In some cases the element may be moved to have a different parent, for +example markers, or the inheritance could be different, for example +initial property set.</p></s2> + + + + +<s2 title="Foreign XML"> + + +<p>The base class for foreign XML is XMLObj. This class handles creating a +DOM Element and the setting of attributes. It also can create a DOM +Document if it is a top level element, class XMLElement. +This class must be extended for the namespace of the XML elements. For +unknown namespaces the class is UnknownXMLObj.</p> + + + +<p>(insert diagram here)</p> + + + +<p>If some special processing is needed then the top level element can extend +the XMLObj. For example the SVGElement makes the special DOM required for +batik and gets the size of the svg. +</p> + + +<p>Foreign XML will usually be in an fo:instream-foreign-object; the XML will +be passed to the renderer as a DOM where the renderer will be able to handle +it. Other XML from an unknown namespace will be ignored. 
+</p> + + +<p>By using element mappings it is possible to read other XML and either</p> +<ul><li>set information on the area tree</li> +<li>create pseudo FO Objects that create areas in the area tree</li> +<li>create FO Objects</li></ul> +</s2> + + + +<s2 title="Unknown Elements"> +<p>If an element is in a known namespace but the element is unknown then an +Unknown object is created. This is mainly to provide information to the +user. +This could happen if the fo document contains an element from a different +version or the element is misspelt.</p> +</s2> + + +<s2 title="Page Masters"> + <p> + The first elements in a document are the elements for the + page master setup. This is usually only a small number and + will be used throughout the document to create new pages. + These elements are kept as a factory to create the page and + appropriate regions whenever a new page is requested by the + layout. The objects in the FO Tree that represent these + elements are themselves the factory. The root element keeps + these objects as a factory for the page sequences. + </p> +</s2> + + +<s2 title="Flow"> +<p>The elements that are in the flow of the document are a set of elements +that is needed for the layout process. Each element is important in the +creation of areas.</p> +</s2> + + + +<s2 title="Other Elements"> + + + + <p> + The remaining FO Objects are things like page-sequence, + title and color-profile. These are handled by their parent + element; i.e. the root looks after the declarations and the + declarations maintains a list of colour profiles. The + page-sequences are direct descendents of root. + </p> + </s2> + + + +<s2 title="Associated Tasks"> + + + +<ul><li>Create diagrams</li> +<li>Setup all properties and elements for XSL:FO</li> +<li>Setup user agent for property resolution</li> +<li>Verify all XML is handled appropriately</li></ul></s2></s1></body></document>
\ No newline at end of file diff --git a/docs/design/understanding/handling_attributes.xml b/docs/design/understanding/handling_attributes.xml new file mode 100644 index 000000000..1ae043059 --- /dev/null +++ b/docs/design/understanding/handling_attributes.xml @@ -0,0 +1,13 @@ +<?xml version="1.0" standalone="no"?> +<!-- Overview --> +<document> + <header> + <title>Handling Attributes</title> + <subtitle>All you wanted to know about FOP Handling Attributes !</subtitle> + <authors> <person name="Keiron Liddle" email="keiron@aftexsw.com"/> + </authors> + </header> + <body><s1 title="Handling Attributes"> + <p>Yet to come :))</p> + <note>The series of notes for developers has started but it has not yet gone so far ! Keep watching</note></s1> + </body></document>
\ No newline at end of file diff --git a/docs/design/understanding/images.xml b/docs/design/understanding/images.xml new file mode 100644 index 000000000..6aaa82bc8 --- /dev/null +++ b/docs/design/understanding/images.xml @@ -0,0 +1,146 @@ +<?xml version="1.0" standalone="no"?> +<!-- Overview --> +<document> + <header> + <title>Images</title> + <subtitle>All you wanted to know about Images in FOP !</subtitle> + <authors> <person name="Keiron Liddle" email="keiron@aftexsw.com"/> + </authors> + </header> + <body> + + + <s1 title="Images in FOP"> <note> this is still in progress, input in the code is welcome. Needs documenting formats, testing. + So all those people interested in images should get involved.</note> + <p>Images may only be needed to be loaded when the image is rendered to the +output or to find the dimensions.<br/> +An image url may be invalid, this can be costly to find out so we need to +keep a list of invalid image urls.</p> +<p>We have a number of different caching schemes that are possible.</p> +<p>All images are referred to using the url given in the XSL:FO after +removing "url('')" wrapping. This does +not include any sort of resolving such as relative -> absolute. The +external graphic in the FO Tree and the image area in the Area Tree only +have the url as a reference. +The images are handled through a static interface in ImageFactory.<br/></p> + + +<p>(insert image)</p> + + +<s2 title="Threading"> + + + +<p>In a single threaded case with one document the image should be released +as soon as the renderer caches it. If there are multiple documents then +the images could be held in a weak cache in case another document needs to +load the same image.</p> + + +<p>In a multi threaded case many threads could be attempting to get the same +image. We need to make sure an image will only be loaded once at a +particular time. 
Once a particular document is finished then we can move +all the images to a common weak cache.</p> +</s2> + +<s2 title="Caches"> +<s3 title="LRU"> +<p>All images are in a common cache regardless of context. To limit the size +of the cache, the least recently used (LRU) image is removed to keep the amount of memory used +low. Each image can supply the amount of data held in memory.</p> +</s3> + +<s3 title="Context"> +<p>Images are cached according to the context, using the FOUserAgent as a key. +Once the context is finished the images are added to a common weak hashmap +so that other contexts can load these images or the data will be garbage +collected if required.</p> +<p>If images are to be used commonly then we cannot dispose of data in the +FopImage when cached by the renderer. Also if different contexts have +different base directories for resolving relative urls then the loading +and caching must be separate. We can have a cache that shares images among +all contexts or only loads an image for a context.</p> +</s3> + +<p>The cache uses an image loader so that it can synchronize the image +loading on an image by image basis. Finding and adding an image loader to +the cache is also synchronized to prevent thread problems.</p> +</s2> + +<s2 title="Invalid Images"> + + +<p> +If an image cannot be loaded for some reason, for example the url is +invalid or the image data is corrupt or of an unknown type, then the loader should +only attempt to load the image once. All other attempts to get the image +should return null so that it can be easily handled.<br/> +This will prevent any extra processing or waiting.</p> +</s2> + + +<s2 title="Reading"> +<p>Once a stream is opened for the image url then a set of image readers is +used to determine what type of image it is. The reader can peek at the +image header or if necessary load the image. The reader can also get the +image size at this stage. 
+The reader then can provide the mime type to create the image object to +load the rest of the information.<br/></p></s2> + + + +<s2 title="Data"> + + + +<p>The data usually needed for an image is the size and either a bitmap or the +original data. Images such as jpeg and eps can be embedded into the +document with the original data. SVG images are converted into a DOM which +needs to be rendered to the PDF. Other images such as gif, tiff etc. are +converted into a bitmap. +Data is loaded by the FopImage by calling load(type) where type is the type of data to load.<br/></p></s2> + + +<s2 title="Rendering"> + +<p>Different renderers need to have the information in different forms.</p> + + +<s3 title="PDF"> +<dl><dt>original data</dt> <dd>JPG, EPS</dd> +<dt>bitmap</dt> <dd>gif, tiff, bmp, png</dd> +<dt>other</dt> <dd>SVG</dd></dl> +</s3> + +<s3 title="PS"> +<dl><dt>bitmap</dt> <dd>JPG, gif, tiff, bmp, png</dd> +<dt>other</dt> <dd>SVG</dd></dl> +</s3> + +<s3 title="awt"> +<dl><dt>bitmap</dt> <dd>JPG, gif, tiff, bmp, png</dd> +<dt>other</dt> <dd>SVG</dd></dl></s3> + + + +<p>The renderer uses the url to retrieve the image from the ImageFactory and +then load the required data depending on the image mime type. 
If the +renderer can insert the image into the document and use that data for all +future references of the same image then it can cache the reference in the +renderer and the image can be released from the image cache.</p></s2> +</s1> + </body></document> + + + + + + + + + + + + + diff --git a/docs/design/understanding/layout_managers.xml b/docs/design/understanding/layout_managers.xml new file mode 100644 index 000000000..6630ac64b --- /dev/null +++ b/docs/design/understanding/layout_managers.xml @@ -0,0 +1,13 @@ +<?xml version="1.0" standalone="no"?> +<!-- Overview --> +<document> + <header> + <title>Layout Managers</title> + <subtitle>All you wanted to know about Layout Managers !</subtitle> + <authors> <person name="Keiron Liddle" email="keiron@aftexsw.com"/> + </authors> + </header> + <body><s1 title="Layout Managers"> + <p>Yet to come :))</p> + <note>The series of notes for developers has started but it has not yet gone so far ! Keep watching</note></s1> + </body></document>
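Note on images.xml above: the "LRU" and "Invalid Images" sections can be illustrated with a minimal sketch. This is not FOP's ImageFactory; the class name `ImageCacheSketch`, the `loader` function and the entry-count limit are assumptions made for the example (the notes describe limiting by memory held, which is simplified here to a fixed number of entries).

```java
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

/** Hypothetical sketch of an LRU image cache that remembers invalid urls. */
final class ImageCacheSketch {
    private final int maxEntries;
    // Assumed loader: returns the image data, or null if the url cannot be loaded.
    private final Function<String, byte[]> loader;
    private final Set<String> invalidUrls = new HashSet<>();
    private final LinkedHashMap<String, byte[]> cache;
    int loadAttempts = 0;

    ImageCacheSketch(int maxEntries, Function<String, byte[]> loader) {
        this.maxEntries = maxEntries;
        this.loader = loader;
        // accessOrder = true keeps the least recently used entry first.
        this.cache = new LinkedHashMap<String, byte[]>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, byte[]> eldest) {
                return size() > ImageCacheSketch.this.maxEntries;
            }
        };
    }

    /** Returns the image data, or null if the url is known to be invalid. */
    synchronized byte[] get(String url) {
        if (invalidUrls.contains(url)) {
            return null; // a failed url is never loaded a second time
        }
        byte[] data = cache.get(url);
        if (data != null) {
            return data;
        }
        loadAttempts++;
        data = loader.apply(url);
        if (data == null) {
            invalidUrls.add(url);
            return null;
        }
        cache.put(url, data); // may evict the least recently used image
        return data;
    }

    synchronized boolean isCached(String url) {
        return cache.containsKey(url);
    }

    public static void main(String[] args) {
        ImageCacheSketch cache = new ImageCacheSketch(2,
                url -> url.startsWith("bad") ? null : new byte[] {1});
        cache.get("bad.png");
        cache.get("bad.png");
        System.out.println("load attempts: " + cache.loadAttempts);
    }
}
```

Synchronizing the whole `get` call is the simplest way to guarantee that an image is loaded only once at a time; the per-image loader described in the notes would allow different images to load in parallel.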
\ No newline at end of file diff --git a/docs/design/understanding/layout_process.xml b/docs/design/understanding/layout_process.xml new file mode 100644 index 000000000..4c426d8eb --- /dev/null +++ b/docs/design/understanding/layout_process.xml @@ -0,0 +1,13 @@ +<?xml version="1.0" standalone="no"?> +<!-- Overview --> +<document> + <header> + <title>Layout Process</title> + <subtitle>All you wanted to know about the Layout Process !</subtitle> + <authors> <person name="Keiron Liddle" email="keiron@aftexsw.com"/> + </authors> + </header> + <body><s1 title="Layout Process"> + <p>Yet to come :))</p> + <note>The series of notes for developers has started but it has not yet gone so far ! Keep watching</note></s1> + </body></document>
\ No newline at end of file diff --git a/docs/design/understanding/pdf_library.xml b/docs/design/understanding/pdf_library.xml new file mode 100644 index 000000000..434cc03f8 --- /dev/null +++ b/docs/design/understanding/pdf_library.xml @@ -0,0 +1,78 @@ +<?xml version="1.0" standalone="no"?> +<!-- Overview --> +<document> + <header> + <title>PDF Library</title> + <subtitle>All you wanted to know about the PDF Library !</subtitle> + <authors> <person name="Keiron Liddle" email="keiron@aftexsw.com"/> + </authors> + </header> + <body><s1 title="PDF Library"> + +<p>The PDF Library is an independent package of classes in FOP. These classes +provide a simple way to construct documents and add the contents. The +classes are found in <code>org.apache.fop.pdf.*</code>.</p> + + + + +<s2 title="PDF Document"> +<p>This is where most of the document is created and put together.</p> +<p>It sets up the header, trailer and resources. Each page is made and added to the document. +There are a number of methods that can be used to create/add certain PDF objects to the document.</p> +</s2> + +<s2 title="Building PDF"> +<p>The PDF Document is built by creating a page for each page in the Area Tree.</p> +<p> This page then has all the contents added. + The page is then added to the document and available objects can be written to the output stream.</p> +<p>The contents of the page are things such as text, lines, images etc. +The PDFRenderer inserts the text directly into a pdf stream. +The text consists of markup to set fonts, set text position and add text.</p> +<p>Most of the simple pdf markup is inserted directly into a pdf stream. +Other more complex objects or commonly used objects are added through java classes. +Some pdf objects such as an image consist of two parts.</p> +<p>It has a separate object for the image data and another bit of markup to display the image in a certain position on the page. 
+</p><p>The java objects that represent a pdf object implement a method that returns the markup for inserting into a stream. +The method is: byte[] toPDF().</p> + +</s2> +<s2 title="Features"> + + + +<s3 title="Fonts"> +<p>Support for embedding fonts and using the default Acrobat fonts. +</p></s3> + +<s3 title="Images"> +<p>Images can be inserted into a page. An image can either be inserted as a pixel map or, for a jpeg image, inserted directly. +</p></s3> + +<s3 title="Stream Filters"> +<p>A number of filters are available to encode the pdf streams. These filters can compress the data or change it such as converting to hex. +</p></s3> + +<s3 title="Links"> +<p>A pdf link can be added for an area on the page. This link can then point to an external destination or a position on any page in the document. +</p></s3> + +<s3 title="Patterns"> +<p>The fill and stroke of graphical objects can be set with a colour, pattern or gradient. +</p></s3> + + +<p>There are a number of other features for handling pdf markup relevant to creating PDF files for FOP.</p> +</s2> + + +<s2 title="Associated Tasks"> +<p>There are a large number of additional features that can be added to pdf.</p> +<p>Many of these can be handled with extensions or post processing.</p> + +</s2> + + + + </s1> + </body></document>
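Note on pdf_library.xml above: the idea that each java object representing a pdf object returns its own markup through `byte[] toPDF()` can be sketched as follows. Only the method name `toPDF` comes from the notes; the `PdfObject` interface, the `DictObject` class and the `writeAll` helper are hypothetical names for this example and are not classes from `org.apache.fop.pdf`.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch: objects that serialise themselves as pdf markup. */
final class PdfObjectSketch {

    /** Anything that can return its markup for insertion into a stream. */
    interface PdfObject {
        byte[] toPDF();
    }

    /** A numbered dictionary object, written as "n 0 obj << ... >> endobj". */
    static final class DictObject implements PdfObject {
        private final int number;
        private final Map<String, String> entries = new LinkedHashMap<>();

        DictObject(int number) {
            this.number = number;
        }

        DictObject put(String key, String value) {
            entries.put(key, value);
            return this;
        }

        @Override
        public byte[] toPDF() {
            StringBuilder sb = new StringBuilder();
            sb.append(number).append(" 0 obj\n<<");
            for (Map.Entry<String, String> e : entries.entrySet()) {
                sb.append(" /").append(e.getKey()).append(' ').append(e.getValue());
            }
            sb.append(" >>\nendobj\n");
            return sb.toString().getBytes(StandardCharsets.ISO_8859_1);
        }
    }

    /** Concatenates the markup of several objects, in order. */
    static byte[] writeAll(PdfObject... objects) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (PdfObject o : objects) {
            byte[] bytes = o.toPDF();
            out.write(bytes, 0, bytes.length);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        PdfObject page = new DictObject(3).put("Type", "/Page");
        System.out.print(new String(page.toPDF(), StandardCharsets.ISO_8859_1));
    }
}
```

Because every object only has to produce bytes, the document can write finished objects to the output stream as soon as they are available, which matches the page-at-a-time building described in the notes.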
\ No newline at end of file diff --git a/docs/design/understanding/properties.xml b/docs/design/understanding/properties.xml new file mode 100644 index 000000000..529ec8673 --- /dev/null +++ b/docs/design/understanding/properties.xml @@ -0,0 +1,130 @@ +<?xml version="1.0" standalone="no"?> +<!-- Overview --> +<document> + <header> + <title>Properties</title> + <subtitle>All you wanted to know about the Properties !</subtitle> + <authors> <person name="Keiron Liddle" email="keiron@aftexsw.com"/> + </authors> + </header> + <body><s1 title="Property Handling"> +<p>During XML Parsing, the FO tree is constructed. For each FO object (some +subclass of FObj), the tree builder then passes the list of all +attributes specified on the FO element to the handleAttrs method. This +method converts the attribute specifications into a PropertyList.</p> +<p>The actual work is done by a PropertyListBuilder (PLB for short). The +basic idea of the PLB is to handle each attribute in the list in turn, +find an appropriate "Maker" for it, call the Maker to convert the +attribute value into a Property object of the correct type, and store +that Property in the PropertyList.</p> + + +<s2 title="Finding a Maker"> +<p> +The PLB finds a "Maker" for the property based on the attribute name and +the element name. Most Makers are generic and handle the attribute on +any element, but it's possible to set up an element-specific property +Maker. The attribute name to Maker mappings are automatically created +during the code generation phase by processing the XML property +description files.</p> +</s2> + +<s2 title="Processing the attribute list"> +<p>The PLB first looks to see if the font-size property is specified, since +it sets up relative units which can be used in other property +specifications. Each attribute is then handled in turn. 
If the attribute +specifies part of a compound property such as space-before.optimum, the +PLB looks to see if the attribute list also contains the "base" property +(space-before in this case) and processes that first.</p></s2> +<s2 title="How the Property Maker works"><p>There is a family of Maker objects for each of the property datatypes, +such as Length, Number, Enumerated, Space, etc. But since each Property +has specific aspects such as whether it's inherited, its default value, +its corresponding properties, etc. there is usually a specific Maker for +each Property. All these Maker classes are created during the code +generation phase by processing (using XSLT) the XML property description +files to create Java classes.</p> + + +<p>The Maker first checks for "keyword" values for a property. These are +things like "thin, medium, thick" for the border-width property. The +datatype is really a Length but it can be specified using these keywords +whose actual value is determined by the "User Agent" rather than being +specified in the XSL standard. For FOP, these values are currently +defined in foproperties.xml. The keyword value is just a string, so it +still needs to be parsed as described next.</p> + + +<p>The Maker also checks to see if the property is an Enumerated type and +then checks whether the value matches one of the specified enumeration +values.</p> + + +<p>Otherwise the Maker uses the property parser in the fo.expr package to +evaluate the attribute value and return a Property object. The parser +interprets the expression language and performs numeric operations and +function call evaluations.</p> + + +<p>If the returned Property value is of the correct type (specified in +foproperties.xml, where else?), the Maker returns it. Otherwise, it may +be able to convert the returned type into the correct type.</p> + + +<p>Some kinds of property values can't be fully resolved during FO tree +building because they depend on layout information. 
This is the case of +length values specified as percentages and of the special +proportional-column-width(x) specification for table-column widths. +These are stored as special kinds of Length objects which are evaluated +during layout. Expressions involving "em" units which are relative to +font-size _are_ resolved during the FO tree building however.</p></s2> + + +<s2 title="Structure of the PropertyList"> +<p>The PropertyList extends HashMap and its basic function is to associate +Property value objects with Property names. The Property objects are all +subclasses of the base Property class. Each one simply contains a +reference to one of the property datatype objects. Property provides +accessors for all known datatypes and various subclasses override the +accessor(s) which are reasonable for the datatype they store.</p> + + +<p>The PropertyList itself provides various ways of looking up Property +values to handle such issues as inheritance and corresponding +properties. </p> + + +<p>The main logic is:<br/>If the property is a writing-mode relative property (using start, end, +before or after in its name), the corresponding absolute property value +is returned if it's explicitly set on this FO. <br/>Otherwise, the +writing-mode relative value is returned if it's explicitly set. If the +property is inherited, the process repeats using the PropertyList of the +FO's parent object. (This is easy because each PropertyList points to +the PropertyList of the nearest ancestor FO.) 
If the property isn't +inherited or no value is found at any level, the initial value is +returned.</p></s2> + + +<s2 title="References"> + +<dl><dt>docs/design/properties.xml</dt> <dd>a more detailed version of this (generated +html in docs/html-docs/design/properties.html)</dd> + + +<dt>src/codegen/properties.dtd</dt> <dd>heavily commented DTD for foproperties.xml, +but may not be completely up-to-date</dd></dl></s2> + + +<s2 title="To Do"> <s3 title="documentation"> + +<ul><li>explain PropertyManager vs. direct access</li> +<li>Explain corresponding properties</li></ul></s3> + + +<s3 title="development"> + +<p>Lots of properties are incompletely handled, especially funny kinds of +keyword values and shorthand values (one attribute which sets several +properties)</p></s3></s2> + +</s1> + </body></document>
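Note on properties.xml above: the "main logic" of the PropertyList lookup can be sketched roughly as below. This is a simplified illustration, not FOP's PropertyList: values are plain strings rather than Property objects, and the caller is assumed to pass in the corresponding absolute property name (in FOP that mapping depends on the writing mode). The class name `PropertyListSketch` and the `lookup` signature are hypothetical.

```java
import java.util.HashMap;

/** Hypothetical sketch of a property list that points to its parent's list. */
final class PropertyListSketch extends HashMap<String, String> {
    private static final long serialVersionUID = 1L;

    // PropertyList of the nearest ancestor FO, or null at the root.
    private final PropertyListSketch parent;

    PropertyListSketch(PropertyListSketch parent) {
        this.parent = parent;
    }

    /**
     * @param name         property name, possibly writing-mode relative
     * @param absoluteName corresponding absolute property, or null if none
     * @param inherited    whether the property is inherited
     * @param initial      initial value used when nothing is specified
     */
    String lookup(String name, String absoluteName, boolean inherited, String initial) {
        PropertyListSketch pl = this;
        while (pl != null) {
            // An explicitly set absolute value wins over the relative one.
            if (absoluteName != null && pl.containsKey(absoluteName)) {
                return pl.get(absoluteName);
            }
            if (pl.containsKey(name)) {
                return pl.get(name);
            }
            // Only inherited properties continue the search in the parent.
            pl = inherited ? pl.parent : null;
        }
        return initial;
    }

    public static void main(String[] args) {
        PropertyListSketch root = new PropertyListSketch(null);
        root.put("font-size", "12pt");
        PropertyListSketch child = new PropertyListSketch(root);
        System.out.println(child.lookup("font-size", null, true, "medium"));
    }
}
```

Keeping a direct reference to the parent list is what makes the inheritance step cheap: the lookup walks a chain of maps instead of walking the FO tree.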
\ No newline at end of file diff --git a/docs/design/understanding/renderers.xml b/docs/design/understanding/renderers.xml new file mode 100644 index 000000000..cf66e2b8e --- /dev/null +++ b/docs/design/understanding/renderers.xml @@ -0,0 +1,13 @@ +<?xml version="1.0" standalone="no"?> +<!-- Overview --> +<document> + <header> + <title>Renderers</title> + <subtitle>All you wanted to know about the Renderers !</subtitle> + <authors> <person name="Keiron Liddle" email="keiron@aftexsw.com"/> + </authors> + </header> + <body><s1 title="Renderers"> + <p>Yet to come :))</p> + <note>The series of notes for developers has started but it has not yet gone so far ! Keep watching</note></s1> + </body></document>
\ No newline at end of file diff --git a/docs/design/understanding/status.xml b/docs/design/understanding/status.xml new file mode 100644 index 000000000..c1e7fb1ea --- /dev/null +++ b/docs/design/understanding/status.xml @@ -0,0 +1,17 @@ +<?xml version="1.0" standalone="no"?> +<!-- Overview --> +<document> + <header> + <title>Tutorial series Status</title> + <subtitle>Current Status of tutorial about FOP and Design</subtitle> + <authors> <person name="Keiron Liddle" email="keiron@aftexsw.com"/> + </authors> + </header> + <body><s1 title="Tutorial series Status"> <p>Peter said : Do we have a volunteer to track + Keiron's tutorials and turn them into web page documentation?</p> <p><strong>The answer is yes + we have, but the work is in progress !</strong></p> <note>Keiron has recently extended + the documentation generation on the CVS trunk to make this process a bit + easier. Keiron tells Peter that Apache is readying a major overhaul of its web + site and xml->html generation, but that should not deter us from proceeding + with documentation.</note></s1> + </body></document>
\ No newline at end of file diff --git a/docs/design/understanding/svg.xml b/docs/design/understanding/svg.xml new file mode 100644 index 000000000..7fd19f369 --- /dev/null +++ b/docs/design/understanding/svg.xml @@ -0,0 +1,57 @@ +<?xml version="1.0" standalone="no"?> +<!-- Overview --> +<document> + <header> + <title>SVG</title> + <subtitle>All you wanted to know about SVG and FOP !</subtitle> + <authors> <person name="Keiron Liddle" email="keiron@aftexsw.com"/> + </authors> + </header> + <body><s1 title="SVG"> + <p>SVG is rendered through Batik.</p><p>The XML from the XSL:FO document + is converted into an SVG DOM with batik. This DOM is then set as the Document + on the Foreign Object area in the Area Tree.</p><p>This DOM is then available to + be rendered by the renderer.</p><p>SVG is rendered in the renderers via an + XMLHandler in the FOUserAgent. This XML handler is used to render the SVG. The + SVG is rendered by using batik. Batik converts the SVG DOM into an internal + structure that can be drawn into a Graphics2D. So for PDF we use a + PDFGraphics2D to draw into.</p><p>This creates the necessary PDF information to + create the SVG image in the PDF document.</p><p>Most of the work is done in the + PDFGraphics2D class. There are also a few bridges that are plugged into batik + to provide different behaviour for some SVG elements.</p><s2 + title="Text Drawing"><p>Normally batik converts text into a set of curved + shapes. </p><p>This is handled as any other shapes when rendering to the output. This + is not always desirable as the shapes have very fine curves. This can cause the + output to look a bit bad in PDF and PS (it can be drawn properly but is not by + default). These curves also require much more data than the original + text.</p><p>To handle this there is a PDFTextElementBridge that is set when + using the bridge in batik. 
If the text is simple enough for the text to be + drawn in the PDF as with all other text then this sets the TextPainter to use + the PDFTextPainter. This inserts the text directly into the PDF using the + drawString method on the PDFGraphics2D.</p><p>Text is considered simple if the + font is available, the font size is useable and there are no tspans or other + complications. This can make the resulting PDF significantly + smaller.</p></s2><s2 title="PDF Links"><p>To support links in PDF another batik + element bridge is used. The PDFAElementBridge creates a PDFANode which inserts + a link into the PDF document via the PDFGraphics2D.</p><p>Since links are + positioned on the page without any transforms then we need to transform the + coordinates of the link area so that they match the current position of the a + element area. This transform may also need to account for the svg being + positioned on the page.</p></s2><s2 title="Images"><p>Images are normally drawn + into the PDFGraphics2D. This then creates a bitmap of the image data that can + be inserted into the PDF document. </p><p>As PDF can support jpeg images then another + element bridge is used so that the jpeg can be directly inserted into the + PDF.</p></s2><s2 title="PDF Transcoder"><p>Batik provides a mechanism to + convert SVG into various formats. Through FOP we can convert an SVG document + into a single paged PDF document. The page contains the SVG drawn as best as + possible on the page. There is a PDFDocumentGraphics2D that creates a + standalone PDF document with a single page. 
This is then drawn into by batik in + the same way as with the PDFGraphics2D.</p></s2><s2 + title="Other Outputs"><p>When rendering to AWT the SVG is simply drawn onto the + awt canvas using batik.</p><p>The PS Renderer uses a similar technique as the + PDF Renderer.</p><p>The SVG Renderer simply embeds the SVG inside an svg + element.</p></s2><s2 title="Associated Tasks"><ul><li>To get accurate drawing + pdf transparency is needed.</li><li>The drawRenderedImage methods need + implementing.</li><li>Handle colour space better.</li><li>Improve link handling + with pdf.</li><li>Improve image handling.</li></ul></s2></s1> + </body></document>
\ No newline at end of file diff --git a/docs/design/understanding/understanding.xml b/docs/design/understanding/understanding.xml new file mode 100644 index 000000000..c34ec730b --- /dev/null +++ b/docs/design/understanding/understanding.xml @@ -0,0 +1,94 @@ +<?xml version="1.0"?> +<!-- Overview --> +<document> + <header> + <title>Understanding FOP Design</title> + <subtitle>Tutorial series about Design Approach to FOP</subtitle> + <authors> <person name="Keiron Liddle" email="keiron@aftexsw.com"/> + </authors> + </header> + <body> +<s1 title="Understanding"> + <note> + The content of this <strong>Understanding series</strong> + was all taken from the interactive fop development mailing + list discussion. <br/> We strongly advise you to join this + mailing list and ask questions about this series there. <br/> + You can subscribe to fop-dev@xml.apache.org by sending an + email to <link href= + "mailto:fop-dev-subscribe@xml.apache.org" + >fop-dev-subscribe@xml.apache.org</link>. <br/> You will + find more information about how to get involved <link href= + "http://xml.apache.org/fop/involved.html" + >there</link>.<br/> You can also read the <link href= + "http://marc.theaimsgroup.com/?l=fop-dev&r=1&w=2" + >archive</link> of the discussion list fop-dev to get an + idea of the issues being discussed. + </note> + <s2 title="Introduction"> + <p> + Welcome to the understanding series. This will be + a series of notes for developers to understand how FOP + works. We will + attempt to clarify the processes involved to go from xml(fo) + to pdf or other formats. Some areas will get more + complicated as we proceed. + </p> + </s2> + + + <s2 title="Overview"> + <p>FOP takes an xml file, does its magic and then writes a document to a + stream.</p> + <p>xml -> [FOP] -> document</p> + <p>The document could be pdf, ps etc. or directed to a printer or the + screen. The principle remains the same. 
The xml document must be in the XSL:FO +format.</p> + <p>For convenience we provide a mechanism to handle XML+XSL as + input.</p> + <p>The xml document is always handled internally as SAX. The SAX events + are used to read the elements, attributes and text data of the FO document. + After the manipulation of the data the renderer writes out the pages in the + appropriate format. It may write as it goes, a page at a time or the whole + document at once. Once finished the document should contain all the data in the + chosen format ready for whatever use.</p></s2> + <s2 title="Stages"><p>The fo data goes through a few stages. Each piece + of data will generally go through the process in the same way but some + information may be used a number of times or in a different order. To reduce + memory one stage will start before the previous is completed.</p> + <p>SAX Handler -> FO Tree -> Layout Managers -> Area Tree + -> Render -> document</p> + <p>In the case of rtf, mif etc. <br/>SAX Handler -> FO Tree -> + Structure Renderer -> document</p> + <p>The FO Tree is constructed from the xml document. It is an internal + representation of the xml document and it is like a DOM with some differences. + The Layout Managers use the FO Tree to do their layout and create an Area + Tree. The Area Tree is a representation of the final result. It is a + representation of a set of pages containing the text and other graphics. The + Area Tree is then given to a Renderer. The Renderer can read the Area Tree and + convert the information into the render format. For example the PDF Renderer + creates a PDF Document. For each page in the Area Tree the renderer creates a + PDF Page and places the contents of the page into the PDF Page. Once a PDF Page + is complete then it can be written to the output stream.</p> + <p>For the structure documents the Structure listener will read + directly from the FO Tree and create the document. 
These documents do not need + the layout process or the Area Tree.</p></s2> + <s2 title="Associated Tasks"><p>Verify Structure Listener + concept.</p></s2> + <s2 title="Further Topics"> + <ul><li>XML parsing</li> + <li>FO Tree</li> + <li>Properties</li> + <li>Layout Managers</li> + <li>Layout Process</li> + <li>Handling Attributes</li> + <li>Area Tree</li> + <li>Renderers</li> + <li>Images</li> + <li>PDF Library</li> + <li>SVG</li> + </ul> + </s2> + + </s1> </body></document> + diff --git a/docs/design/understanding/xml_parsing.xml b/docs/design/understanding/xml_parsing.xml new file mode 100644 index 000000000..a7c8d4a85 --- /dev/null +++ b/docs/design/understanding/xml_parsing.xml @@ -0,0 +1,106 @@ +<?xml version="1.0"?> +<document> + <header> + <title>XML Parsing</title> + <subtitle>All you wanted to know about XML Parsing !</subtitle> + <authors> <person name="Keiron Liddle" email="keiron@aftexsw.com"/> + </authors> + </header> + <body> + +<s1 title="XML Parsing"><p>Since everyone knows the basics we can get + into the various stages starting with the XML handling.</p> + <s2 title="XML Input"><p>FOP can take the input XML in a number of ways: + </p> + <ul> + <li>SAX Events through SAX Handler + <ul> + <li> + <code>FOTreeBuilder</code> is the SAX Handler which is + obtained through <code>getContentHandler</code> on + <code>Driver</code>. + </li> + </ul> + </li> + <li> + DOM which is converted into SAX Events + <ul> + <li> + The conversion of a DOM tree is done via the + <code>render(Document)</code> method on + <code>Driver</code>. + </li> + </ul> + </li> + <li> + data source which is parsed and converted into SAX Events + <ul> + <li> + The <code>Driver</code> can take an + <code>InputSource</code> as input. This can use a + <code>Stream</code>, <code>String</code> etc. 
+ </li> + </ul> + </li> + <li> + XML+XSLT which is transformed using an XSLT Processor and + the result is fired as SAX Events + <ul> + <li> + <code>XSLTInputHandler</code> is used as an + <code>InputSource</code> in the + render(<code>XMLReader</code>, + <code>InputSource</code>) method on + <code>Driver</code> + </li> + </ul> + </li> + </ul> + + <p>The SAX Events which are fired on the SAX Handler, class + <code>FOTreeBuilder</code>, must represent an XSL:FO document. If not there will be an + error. Any problems with the XML being well formed are handled here.</p></s2> + <s2 title="Element Mappings"><p> The element mapping is a hashmap of all + the elements in a particular namespace. This makes it easy to create a + different object for each element. Element mappings are static to save on + memory. </p><p>To add an extension a developer can put in the classpath a jar + that contains the file <code>/META-INF/services/org.apache.fop.fo.ElementMapping</code>. + This must contain a line with the fully qualified name of a class that + implements the <em>org.apache.fop.fo.ElementMapping</em> interface. This will then be + loaded automatically at the start. Internal mappings are: FO, SVG and Extension + (pdf bookmarks)</p></s2> + <s2 title="Tree Building"><p>The SAX Events will fire all the information + for the document with start element, end element, text data etc. This + information is used to build up a representation of the FO document. To do this + for a namespace there is a set of element mappings. When an element + namespace + mapping is found then it can create an object for that element. If the element + is not found then it creates a dummy object or a generic DOM for unknown + namespaces.</p> + <p>The object is then set up and then given attributes for the element. + For the FO Tree the attributes are converted into properties. The FO objects + use a property list mapping to convert the attributes into a list of properties + for the element. 
For other XML, for example SVG, a DOM of the XML is + constructed. This DOM can then be passed through to the renderer. Other element + mappings can be used in different ways, for example to create elements that + create areas during the layout process or set up information for the renderer + etc.</p> + <p> + While the tree building is mainly about creating the FO Tree + there are some stages that can propagate to the renderer. At + the end of a page sequence we know that all pages in the + page sequence can be laid out without being affected by any + further XML. The significance of this is that the FO Tree + for the page sequence may be able to be disposed of. The + end of the XML document also tells us that we can finalise + the output document. (The layout of individual pages is + accomplished by the layout managers page at a time; + i.e. they do not need to wait for the end of the page + sequence. The page may not yet be complete, however, + containing forward page number references, for example.) + </p> + </s2> + <s2 title="Associated Tasks"> + <ul><li>Error handling for xml not well formed.</li> + <li>Error handling for other XML parsing errors.</li><li>Developer + info for adding namespace handlers.</li></ul></s2></s1> + </body></document>
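Note on xml_parsing.xml above: the "Element Mappings" and "Tree Building" sections can be illustrated with a small SAX handler that looks up each element by namespace and local name and falls back to a dummy node when no mapping exists. This is a sketch under assumptions, not FOTreeBuilder: `TreeBuilderSketch`, `Node` and `addMapping` are hypothetical names, and attributes and text data are ignored. The SAX classes and the XSL:FO namespace URI are the standard ones.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical sketch of a tree builder driven by per-namespace element mappings. */
final class TreeBuilderSketch extends DefaultHandler {
    static final String FO_NS = "http://www.w3.org/1999/XSL/Format";

    static final class Node {
        final String kind;
        final List<Node> children = new ArrayList<>();

        Node(String kind) {
            this.kind = kind;
        }
    }

    // namespace URI -> (local name -> maker of the object for that element)
    private final Map<String, Map<String, Supplier<Node>>> mappings = new HashMap<>();
    private final Deque<Node> stack = new ArrayDeque<>();
    Node root;

    void addMapping(String namespace, String localName, Supplier<Node> maker) {
        mappings.computeIfAbsent(namespace, ns -> new HashMap<>()).put(localName, maker);
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        Map<String, Supplier<Node>> table = mappings.get(uri);
        Supplier<Node> maker = table == null ? null : table.get(localName);
        // Unknown elements get a dummy node instead of causing a failure.
        Node node = maker != null ? maker.get() : new Node("unknown:" + localName);
        if (stack.isEmpty()) {
            root = node;
        } else {
            stack.peek().children.add(node);
        }
        stack.push(node);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        stack.pop();
    }

    /** Parses the xml with a namespace-aware SAX parser and returns the tree root. */
    static Node parse(String xml, TreeBuilderSketch builder) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            factory.newSAXParser().parse(new InputSource(new StringReader(xml)), builder);
            return builder.root;
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        TreeBuilderSketch builder = new TreeBuilderSketch();
        builder.addMapping(FO_NS, "root", () -> new Node("fo:root"));
        Node tree = parse("<fo:root xmlns:fo='" + FO_NS + "'/>", builder);
        System.out.println(tree.kind);
    }
}
```

Well-formedness errors surface from the parser itself, which is why the notes say such problems are handled at this stage.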
\ No newline at end of file |