From: Keiron Liddle Date: Mon, 2 Dec 2002 10:19:43 +0000 (+0000) Subject: initial conversion of alt.design docs X-Git-Tag: Alt-Design-integration-base~284 X-Git-Url: https://source.dussan.org/?a=commitdiff_plain;h=eb328349433beebe9441d6b89c7a554e11f0f198;p=xmlgraphics-fop.git initial conversion of alt.design docs git-svn-id: https://svn.apache.org/repos/asf/xmlgraphics/fop/trunk@195701 13f79535-47bb-0310-9956-ffa450edef68 --- diff --git a/src/documentation/content/xdocs/design/alt.design/AbsolutePosition.png.xml b/src/documentation/content/xdocs/design/alt.design/AbsolutePosition.png.xml new file mode 100644 index 000000000..46a6999a3 --- /dev/null +++ b/src/documentation/content/xdocs/design/alt.design/AbsolutePosition.png.xml @@ -0,0 +1,21 @@ + + + + +
+ AbsolutePosition diagram + + + +
+ +
+ Properties$AbsolutePosition +
+
+ + +
+ diff --git a/src/documentation/content/xdocs/design/alt.design/BorderCommonStyle.png.xml b/src/documentation/content/xdocs/design/alt.design/BorderCommonStyle.png.xml new file mode 100644 index 000000000..e9fad3d77 --- /dev/null +++ b/src/documentation/content/xdocs/design/alt.design/BorderCommonStyle.png.xml @@ -0,0 +1,20 @@ + + + + +
+ BorderCommonStyle diagram + + + +
+ +
+ Properties$BorderCommonStyle +
+
+ +
+ diff --git a/src/documentation/content/xdocs/design/alt.design/PropNames.png.xml b/src/documentation/content/xdocs/design/alt.design/PropNames.png.xml new file mode 100644 index 000000000..6f552009b --- /dev/null +++ b/src/documentation/content/xdocs/design/alt.design/PropNames.png.xml @@ -0,0 +1,19 @@ + + + + +
+ ..fo.PropNames diagram + + +
+ +
+ PropNames.class +
+
+ +
+ diff --git a/src/documentation/content/xdocs/design/alt.design/Properties.png.xml b/src/documentation/content/xdocs/design/alt.design/Properties.png.xml new file mode 100644 index 000000000..8bcf88943 --- /dev/null +++ b/src/documentation/content/xdocs/design/alt.design/Properties.png.xml @@ -0,0 +1,20 @@ + + + + +
+ ..fo.Properties diagram + + + +
+ +
+ Properties.class +
+
+ +
+ diff --git a/src/documentation/content/xdocs/design/alt.design/PropertyConsts.png.xml b/src/documentation/content/xdocs/design/alt.design/PropertyConsts.png.xml new file mode 100644 index 000000000..2e2763026 --- /dev/null +++ b/src/documentation/content/xdocs/design/alt.design/PropertyConsts.png.xml @@ -0,0 +1,20 @@ + + + + +
+ ..fo.PropertyConsts diagram + + + +
+ +
+ PropertyConsts.class +
+
+ +
+ diff --git a/src/documentation/content/xdocs/design/alt.design/VerticalAlign.png.xml b/src/documentation/content/xdocs/design/alt.design/VerticalAlign.png.xml new file mode 100644 index 000000000..400d43cf7 --- /dev/null +++ b/src/documentation/content/xdocs/design/alt.design/VerticalAlign.png.xml @@ -0,0 +1,20 @@ + + + + +
+ VerticalAlign diagram + + + +
+ +
+ Properties$VerticalAlign +
+
+ +
+ diff --git a/src/documentation/content/xdocs/design/alt.design/alt.properties.xml b/src/documentation/content/xdocs/design/alt.design/alt.properties.xml new file mode 100644 index 000000000..a1e7fcf09 --- /dev/null +++ b/src/documentation/content/xdocs/design/alt.design/alt.properties.xml @@ -0,0 +1,172 @@ + + + + +
+ Implementing Properties + + + +
+ +
+ An alternative properties implementation + + The following discussion focusses on the relationship between + Flow Objects in the Flow Object tree, and properties. There + is no (or only passing) discussion of the relationship between + properties and traits, and by extension, between properties + and the Area tree. The discussion is illustrated with some + pseudo-UML diagrams. + +

+ Property handling is complex and expensive. Varying numbers of + properties apply to individual Flow Objects + (FOs) in the FO + tree but any property may effectively be + assigned a value on any element of the tree. If that property + is inheritable, its defined value will then be available to + any children of the defining FO. +

+ + (XSL 1.0 Rec) 5.1.4 Inheritance + ...The inheritable properties can be placed on any formatting + object. + +

+ Even if the value is not inheritable, it may be accessed by + its children through the inherit keyword or the + from-parent() core function, and potentially by + any of its descendants through the + from-nearest-specified-value() core function.

+

+ In addition to the assigned values of properties, almost every + property has an initial value which is used + when no value has been assigned. +

+
+ The history problem +

+ The difficulty and expense of handling properties come from + this universal possibility of inheritance. The list of properties + which are assigned values on any particular FO + element will not generally be large, but a current value is + required for each property which applies to the FO + being processed.

+

+ The environment from which these values may be selected + includes, for each FO, for each applicable property, + the value assigned on this FO, the value which + applied to the parent of this FO, the nearest value + specified on an ancestor of this element, and the initial + value of the property. +

+
+
+ Data requirement and structure +

+ This determines the minimum set of properties and associated + property value assignments that is necessary for the + processing of any individual FO. Implicit in this + set is the set of properties and associated values, + effective on the current FO, that were assigned on + that FO. +

+

+ This minimum requirement - the initial value, the + nearest ancestor specified value, the parent computed value + and the value assigned to the current element - + suggests a stack implementation. +

+
+
+ Stack considerations +

+ One possibility is to push to the stack only a minimal set + of required elements. When a value is assigned, the + relevant form or forms of that value (specified, computed, + actual) are pushed onto the stack. As long as each + FO maintains a list of the properties which were + assigned from it, the value can be popped when the focus of + FO processing retreats back up the FO tree. +

+

+ The complication is that, for properties which are not + automatically inherited, when an FO is encountered + which does not assign a value to the + property, the initial value must either be already at the + top of the stack or be pushed onto the stack.

+

+ As a first approach, the simplest procedure may be to push a + current value onto the stack for every element - initial + values for non-inherited properties and the parental value + otherwise. Then perform any processing of assigned values. + This simplifies program logic at what is hopefully a small + cost in memory and processing time. It may be tuned in a + later iteration. +
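+ The "push a current value for every element" procedure described above might be sketched as follows. This is an illustration only, not the experimental code: the class name PropertyStackSketch, the methods enterFo(), leaveFo() and currentValue(), and the use of plain strings as property values are all assumptions made for the example, and the inherit keyword is not handled.

```java
import java.util.Deque;
import java.util.LinkedList;

// Hypothetical sketch: one stack for one property, with a value pushed for
// every FO that is entered. Class and method names are invented.
class PropertyStackSketch {
    private final Deque<String> stack = new LinkedList<>();
    private final String initialValue;
    private final boolean inherited;

    PropertyStackSketch(String initialValue, boolean inherited) {
        this.initialValue = initialValue;
        this.inherited = inherited;
    }

    // Called when processing descends into an FO. 'assigned' is the value
    // specified on that FO, or null if the FO does not assign the property.
    void enterFo(String assigned) {
        String current;
        if (assigned != null) {
            current = assigned;
        } else if (inherited && !stack.isEmpty()) {
            current = stack.peek();   // parental value
        } else {
            current = initialValue;   // non-inherited property, or root FO
        }
        stack.push(current);
    }

    // Called when processing retreats back up the FO tree.
    void leaveFo() {
        stack.pop();
    }

    String currentValue() {
        return stack.peek();
    }
}
```

+ Because every FO pushes exactly one entry, leaveFo() can pop unconditionally; no per-FO list of assigned properties is needed in this variant.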

+
+ Stack implementation +

+ Initial attempts at this implementation have used + LinkedLists as the stacks, on the assumption + that +

+ + +
  • random access would not be required
  • pushing and popping of list elements requires nearly constant (low) time
  • no penalty for first addition to an empty list
  • efficient access to both bottom and top of stack
    +

    + However, it may be required to perform stack access + operations from an arbitrary place on the stack, in which + case it would probably be more efficient to use + ArrayLists instead. +

    +
    +
    +
    + Class vs instance +

    + An individual stack would contain values for a particular + property, and the context of the stack is the property class + as a whole. The property instances would be represented by + the individual values on the stack. If properties are to be + represented as instantiations of the class, the stack + entries would presumably be references to, or at least + referenced from, individual property objects. However, the + most important information about individual property + instances is the value assigned, and the relationship of + this property object to its ancestors and its descendants. + Other information would include the ownership of a property + instance by a particular FO, and, in the other + direction, the membership of the property in the set of + properties for which an FO has defined values.

    +

    + In the presence of a stack, however, none of this required + information mandates the instantiation of properties. All + of the information mentioned so far can be effectively + represented by a stack position and a link to an + FO. If the property stack is maintained in + parallel with a stack of FOs, even that link is + implicit in the stack position. +
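    + As an illustration of a property stack maintained in parallel with a stack of FOs, the following sketch keeps three index-aligned lists, so that the owning FO of any entry is implied by its position. The names (ParallelStackSketch, enterFo(), fromParent(), nearestSpecifyingFo()) are invented for the example, the property is assumed to be inherited, and ArrayLists are used because entries below the top of the stack are read.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: a property stack kept in parallel with the FO stack.
// Index i in each list refers to the same FO. All names are invented.
class ParallelStackSketch {
    private final List<String> foStack = new ArrayList<>();
    private final List<String> values = new ArrayList<>();
    private final List<Boolean> specified = new ArrayList<>();
    private final String initialValue;

    ParallelStackSketch(String initialValue) {
        this.initialValue = initialValue;
    }

    // 'assigned' is null when the FO does not assign the property.
    void enterFo(String foName, String assigned) {
        String inheritedValue =
                values.isEmpty() ? initialValue : values.get(values.size() - 1);
        foStack.add(foName);
        values.add(assigned != null ? assigned : inheritedValue);
        specified.add(assigned != null);
    }

    void leaveFo() {
        int top = foStack.size() - 1;
        foStack.remove(top);
        values.remove(top);
        specified.remove(top);
    }

    String currentValue() {
        return values.get(values.size() - 1);
    }

    // Value that applied to the parent of the current FO.
    String fromParent() {
        int parent = values.size() - 2;
        return parent >= 0 ? values.get(parent) : initialValue;
    }

    // Name of the nearest FO, starting from the current one, on which the
    // property was assigned; null if no FO on the stack assigned it.
    String nearestSpecifyingFo() {
        for (int i = specified.size() - 1; i >= 0; i--) {
            if (specified.get(i)) {
                return foStack.get(i);
            }
        }
        return null;
    }
}
```

    + No property object is instantiated here; a stack position carries the value, the fact of assignment, and the link to the FO.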

    +
    +

    + Next: property classes overview. +

    +
    + +
    + diff --git a/src/documentation/content/xdocs/design/alt.design/book.xml b/src/documentation/content/xdocs/design/alt.design/book.xml new file mode 100644 index 000000000..2324a6d33 --- /dev/null +++ b/src/documentation/content/xdocs/design/alt.design/book.xml @@ -0,0 +1,37 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + diff --git a/src/documentation/content/xdocs/design/alt.design/classes-overview.xml b/src/documentation/content/xdocs/design/alt.design/classes-overview.xml new file mode 100644 index 000000000..b5e2be845 --- /dev/null +++ b/src/documentation/content/xdocs/design/alt.design/classes-overview.xml @@ -0,0 +1,203 @@ + + + + +
    + Property classes overview + + + +
    + +
    + Classes overview +
    + The class of all properties +

    + If individual properties can have a "virtual reality" on the + stack, where is the stack itself to be instantiated? One + possibility is to have the stacks as static + data structures within the individual property classes. + However, the reduction of individual property instances to + stack entries allows the possibility of further + virtualization of property classes. If the individual + properties can be represented by an integer, i.e. a + static final int, the set of individual + property stacks can be collected together into one array. + Where to put such an overall collection? Creating an + über-class to accommodate everything that applies to + property classes as a whole allows this array to be defined + as a static final something[]. +
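    + A minimal sketch of this arrangement follows. The class name PropertyIndexSketch, the property constants and their values, and the String element type are assumptions made for the example; the text above leaves the array type open as "something[]".

```java
import java.util.LinkedList;

// Hypothetical sketch of an overall class holding one stack per property.
// The property constants and their values are invented for the example.
class PropertyIndexSketch {
    static final int NO_PROPERTY = 0;
    static final int COLOR = 1;
    static final int FONT_SIZE = 2;
    static final int LAST_PROPERTY_INDEX = FONT_SIZE;

    // One stack per property, indexed by the property constant.
    static final LinkedList<String>[] propertyStacks =
            newStacks(LAST_PROPERTY_INDEX + 1);

    @SuppressWarnings("unchecked")
    private static LinkedList<String>[] newStacks(int count) {
        LinkedList<String>[] stacks = new LinkedList[count];
        for (int i = 0; i < count; i++) {
            stacks[i] = new LinkedList<>();
        }
        return stacks;
    }
}
```

    + With this layout, a property is referred to everywhere by its integer, e.g. PropertyIndexSketch.propertyStacks[PropertyIndexSketch.COLOR].push("red").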

    +
    +
    + The overall property classes +

    + This approach has been taken for the experimental code. + Rather than simply creating an overall class containing + common elements of properties and acting as a superclass, + advantage has been taken of the facility for nesting of + top-level classes. All of the individual property classes + are nested within the Properties class. + This has advantages and disadvantages.

    +
    +
    Disadvantages
    +
    + The file becomes extremely cumbersome. This can cause + problems with "intelligent" editors. E.g. + XEmacs syntax highlighting virtually grinds to a + halt with the current version of this file.

    + + Possible problems with IDEs. There may be speed problems + or even overflow problems with various IDEs. The current + version of this and related files has only been tried with + the [X]Emacs JDE environment, without difficulties + apart from the editor speed problems mentioned + above.

    + + Retro look and feel. Not the done Java thing.

    +
    +
    Advantages
    +
    + Everything to do with properties in the one place (more or + less.)

    + + Eliminates the need for a large part of the (sometimes) + necessary evil of code generation. The One Big File of + foproperties.xml, with its ancillary xsl, is + absorbed into the One Bigger File of + Properties.java. The huge advantage of this + is that it is Java. +
    +
    +
    +
    + The property information classes +

    + In fact, in order to keep the size of the file down to + a more manageable level, the property information classes of + static data and methods have been split tentatively into + three:

    +
    +
    +
    PropNames
    +
    + Contains an array, propertyNames, of the names of + all properties, and a set of enumeration constants, one + for each property name in the propertyNames + array. These constants index the names of the properties + in propertyNames, and must be manually kept in + sync with the entries in the array. (This was the last of + the classes split off from the original single class; + hence the naming tiredness.)

    +
    +
    PropertyConsts
    +
    + Contains two basic sets of data:
    + Property-indexed arrays and property set + definitions.

    + + Property-indexed arrays are elaborations + of the property indexing idea discussed in relation to the + arrays of property stacks. One of the arrays is

    + + public static final LinkedList[] + propertyStacks

    + + This is an array of stacks, implemented as + LinkedLists, one for each property.

    + + The other arrays provide indexed access to fields which + are, in most cases, common to all of the properties. An + exception is

    + + public static final Method[] + complexMethods

    + + which contains a reference to the method + complex() which is only defined for + properties which have complex value parsing requirements. + It is likely that a similar array will be defined for + properties which allow a value of auto.

    + + The property-indexed arrays are initialized by + static initializers in this class. The + PropNames class and + Properties + nested classes are scanned in order to obtain or derive + the data necessary for initialization.

    + + Property set definitions are + HashSets of properties (represented by + integer constants) which belong to each of the categories + of properties defined. They are used to simplify the + assignment of property sets to individual FOs. + Representative HashSets include + backgroundProps and + tableProps.

    +
    +
    Properties
    +
    +
    + This class contains only sets of constants for use by the + individual property classes, but it also importantly + serves as a container for all of the property classes, and + some convenience pseudo-property classes.

    + + Constant sets include:

    + + Datatype constants. A bitmap set of + integer constants over a possible range of 2^0 to 2^31 + (represented as -2147483648). E.g.
    + INTEGER = 1
    + ENUM = 524288

    + Some of the definitions are bit-ORed + combinations of the basic values. Used to set the + dataTypes field of the property + classes.

    + + Trait mapping constants. A bitmap set of + integer constants over a possible range of 2^0 to 2^31 + (represented as -2147483648), representing the manner in + which a property maps into a trait. Used to set + the traitMapping field of the property + classes.

    + + Initial value constants. A sequence of + integer constants representing the datatype of the initial + value of a property. Used to set the + initialValueType field of the property + classes.

    + + Inheritance value constants. A sequence + of integer constants representing the way in which the + property is normally inherited. Used to set the + inherited field of the property + classes.

    + + Nested property classes. The + Properties class serves as the holding pen for + all of the individual property classes, and for property + pseudo-classes which contain data common to a number of + actual properties, e.g. ColorCommon. +
    +
    +
    +
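    + The three kinds of data described above (hand-indexed names, bitmap datatype constants, and property sets) can be pictured together in one small sketch. Only the values INTEGER = 1 and ENUM = 524288 and the set name backgroundProps come from the text; the class name PropertyInfoSketch, the LENGTH constant, the allows() helper and the particular property indices are assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch; only INTEGER = 1, ENUM = 524288 and the name
// backgroundProps are taken from the text. Everything else is invented.
class PropertyInfoSketch {
    // PropNames-style data: names indexed by hand-maintained constants.
    static final String[] propertyNames = {
        "no-property", "background-color", "background-image", "table-layout"
    };
    static final int NO_PROPERTY = 0;
    static final int BACKGROUND_COLOR = 1;
    static final int BACKGROUND_IMAGE = 2;
    static final int TABLE_LAYOUT = 3;

    // Datatype constants: one bit per basic datatype.
    static final int INTEGER = 1;                      // 2^0
    static final int LENGTH = 1 << 1;                  // assumed value
    static final int ENUM = 524288;                    // 2^19
    static final int LENGTH_OR_ENUM = LENGTH | ENUM;   // bit-ORed combination

    // True if the dataTypes bitmap of a property includes the given type.
    static boolean allows(int dataTypes, int type) {
        return (dataTypes & type) != 0;
    }

    // Property set definition: properties represented by integer constants.
    static final Set<Integer> backgroundProps =
            new HashSet<>(Arrays.asList(BACKGROUND_COLOR, BACKGROUND_IMAGE));
}
```

    + The top bit of the range, 1 << 31, is the value that appears as -2147483648 when held in a Java int.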

    + Previous: alt.properties +

    +

    + Next: Properties classes +

    +
    + +
    + diff --git a/src/documentation/content/xdocs/design/alt.design/compound-properties.xml b/src/documentation/content/xdocs/design/alt.design/compound-properties.xml new file mode 100644 index 000000000..bd326a4da --- /dev/null +++ b/src/documentation/content/xdocs/design/alt.design/compound-properties.xml @@ -0,0 +1,218 @@ + + + + +
    + Compound properties + + + +
    + +
    + Compound properties in XSLFO
    +
    + Property type                     Section   Inherited   'inherit'
    +
    + <length-range> (components: minimum, optimum, maximum)
    +   block-progression-dimension     7.14.1    no          yes
    +   inline-progression-dimension    7.14.5    no          yes
    +   leader-length                   7.21.4    yes         yes
    +
    + <length-conditional> (components: length, conditionality)
    +   border-after-width              7.7.12    no          yes
    +   border-before-width             7.7.9     no          yes
    +   border-end-width                7.7.18    no          yes
    +   border-start-width              7.7.15    no          yes
    +   padding-after                   7.7.32    no          yes
    +   padding-before                  7.7.31    no          yes
    +   padding-end                     7.7.34    no          yes
    +   padding-start                   7.7.33    no          yes
    +
    + <length-bp-ip-direction> (components: block-progression-direction, inline-progression-direction)
    +   border-separation               7.26.5    yes         yes
    +
    + <space> (components: minimum, optimum, maximum, precedence, conditionality)
    +   letter-spacing                  7.16.2    yes         yes
    +   line-height                     7.15.4    yes         yes
    +   space-after                     7.10.6    no          yes
    +   space-before                    7.10.5    no          yes
    +   space-end                       7.11.1    no          yes
    +   space-start                     7.11.2    no          yes
    +   word-spacing                    7.16.8    yes         yes
    +
    + <keep> (components: within-line, within-column, within-page)
    +   keep-together                   7.19.3    yes         yes
    +   keep-with-next                  7.19.4    no          yes
    +   keep-with-previous              7.19.5    no          yes
    +
    + +
    + diff --git a/src/documentation/content/xdocs/design/alt.design/coroutines.xml b/src/documentation/content/xdocs/design/alt.design/coroutines.xml new file mode 100644 index 000000000..31613ef52 --- /dev/null +++ b/src/documentation/content/xdocs/design/alt.design/coroutines.xml @@ -0,0 +1,118 @@ + + + + +
    + Implementing co-routines + + + +
    + +
    + Implementing Co-routines in FOP +

    + All general page layout systems have to solve the same + fundamental problem: expressing a flow of text with its own + natural structure as a series of pages corresponding to the + physical and logical structure of the output medium. This + simple description disguises many complexities. Version 1.0 + of the Recommendation, in Section 3, Introduction to + Formatting, includes the following comments.

    + + [Formatting] comprises several steps, some of which depend on + others in a non-sequential way.
    ...and...
    + [R]efinement is not necessarily a straightforward, sequential + procedure, but may involve look-ahead, back-tracking, or + control-splicing with other processes in the formatter. +
    +

    Section 3.1, Conceptual Procedure, includes:

    + + The procedure works by processing formatting objects. Each + object, while being processed, may initiate processing in + other objects. While the objects are hierarchically + structured, the processing is not; processing of a given + object is rather like a co-routine which may pass control to + other processes, but pick up again later where it left off. + +
    + Application of co-routines +

    + If one looks only at the flow side of the equation, it's + difficult to see what the problem might be. The ordering of + the elements of the flow is preserved in the area tree, and + where elements are in a hierarchical relationship in the + flow, they will generally be in a hierarchical relationship + in the area tree. In such circumstances, the recursive + processing of the flow seems quite natural.

    +

    + The problem becomes more obvious when one thinks about the + imposition of an unrelated page structure over the + hierarchical structure of the document content. Take, e.g., + the processing of a nested flow structure which, at a certain + point, is scanning text and generating line-areas, nested + within other block areas and possibly other line areas. The + page fills in the middle of this process. Processing at the + lowest level in the tree must now suspend, immediately + following the production of the line-area which filled the + page. This same event, however, must also trigger the closing + and flushing to the area tree of every open area of which the last + line-area was a descendant. +

    +

    + Once all of these areas have been closed, some dormant process + or processes must wake up, flush the area sub-tree + representing the page, and open a new page sub-tree in the + area tree. Then the whole nested structure of flow objects + and area production must be re-activated, at the point in + processing at which the areas of the previous page were + finalised, but with the new page environment. The most + natural way of expressing the temporal relationship of these + processes is by means of co-routines. +

    +

    + Normal sub-routines (methods) display a hierarchical + relationship where process A suspends on invoking process B, + which on termination returns control to A which resumes from + the point of suspension. Co-routines instead have a parallel + relationship. Process A suspends on invoking process B, but + process B also suspends on returning control to process A. To + process B, this return of control appears to be an invocation + of process A. When process A subsequently invokes B and + suspends, B behaves as though its previous invocation of A has + returned, and it resumes from the point of that invocation. + So control bounces between the two, each one resuming where it + left off.

    + Figure 1 +

    +
    +

    + For example, think of a page-production method working on a + complex page-sequence-master. +

    +
    + void makePages(...) {
    +     ...
    +     while (pageSequence.hasNext()) {
    +         ...
    +         page = generateNextPage(...);
    +         boolean over = flow.fillPage(page);
    +         if (over) return;
    +     }
    + }
    +

    + The fillPage() method, when it fills a page, will + have unfinished business with the flow, which it will want to + resume at the next call; hence co-routines. One way to + implement them in Java is by threads synchronised on some + common argument-passing object. +
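    + One way of picturing the thread-based approach is sketched below. It is an illustration under stated assumptions, not a proposed FOP design: the class FlowCoroutineSketch is invented, a "page" is reduced to a list of strings holding a fixed number of lines, and a pair of java.util.concurrent.SynchronousQueue objects plays the part of the common argument-passing object.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.SynchronousQueue;

// Hypothetical sketch of a co-routine pair built from two threads that hand
// control back and forth through SynchronousQueues. All names are invented.
class FlowCoroutineSketch {
    private final SynchronousQueue<List<String>> toFlow = new SynchronousQueue<>();
    private final SynchronousQueue<Boolean> toPager = new SynchronousQueue<>();

    FlowCoroutineSketch(List<String> lines, int linesPerPage) {
        if (linesPerPage < 1) {
            throw new IllegalArgumentException("linesPerPage must be positive");
        }
        Thread flowThread = new Thread(() -> {
            try {
                List<String> page = toFlow.take();    // wait for the first page
                for (String line : lines) {
                    if (page.size() == linesPerPage) {
                        toPager.put(Boolean.FALSE);   // page full: hand control back
                        page = toFlow.take();         // resume here on the next call
                    }
                    page.add(line);
                }
                toPager.put(Boolean.TRUE);            // flow exhausted
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        flowThread.setDaemon(true);
        flowThread.start();
    }

    // Fills the given page; returns true when the flow is finished.
    boolean fillPage(List<String> page) {
        try {
            toFlow.put(page);
            return toPager.take();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while filling page", e);
        }
    }

    static List<List<String>> makePages(List<String> lines, int linesPerPage) {
        FlowCoroutineSketch flow = new FlowCoroutineSketch(lines, linesPerPage);
        List<List<String>> pages = new ArrayList<>();
        boolean over = false;
        while (!over) {
            List<String> page = new ArrayList<>();
            pages.add(page);
            over = flow.fillPage(page);
        }
        return pages;
    }
}
```

    + The flow thread never returns from its loop between pages; it is suspended inside take() and picks up at exactly the line it was about to place.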

    +
    +
    + +
    + diff --git a/src/documentation/content/xdocs/design/alt.design/footnotes.xml b/src/documentation/content/xdocs/design/alt.design/footnotes.xml new file mode 100644 index 000000000..16179c5f9 --- /dev/null +++ b/src/documentation/content/xdocs/design/alt.design/footnotes.xml @@ -0,0 +1,140 @@ + + + + +
    + Implementing footnotes + + + +
    + +
    + Implementing footnotes in FOP +

    + Footnotes present difficulties for page layout primarily + because their point of invocation in the flow is different + from their point of appearance in the area tree. All of the + content lines of a footnote may appear on the same page as its + invocation point, all may appear on a following page, or the + lines may be split over a page or pages. (This characteristic + leads to another problem when a footnote overflows the last + page of flow content, but that difficulty will not be + discussed here.) This note considers some aspects of the + implementation of footnotes in a galley-based design. +

    +
    + Footnotes and galleys +

    + In the structure described in the introduction to FOP galleys, + footnotes would be pre-processed as galleys themselves, but + they would remain attached as subtrees to their points of + invocation in the main text. Allocation to a + footnote-reference-area would only occur in the resolution + to Area nodes. +

    +

    + When footnotes are introduced, the communication between + galleys and layout manager, as mentioned above, would be + affected. The returned information would be two b-p-d values: + the primary line-area b-p-d impact and the footnote b-p-d + impact. The distinction is necessary for two reasons: to + alert the layout manager to the first footnote of the page, + and because the footnote b-p-d will always impact the + main-reference-area b-p-d, whereas the primary inline-area + may not, e.g. in the case of multiple span-areas.

    +
    +
    + Multiple columns and footnotes + + A possible method for multi-column layout and balancing + with footnotes, using a galley-based approach. + +

    + This note assumes a galley, as discussed elsewhere, flowing text with + footnotes and possibly other blocks into a possibly + multi-column area. The logic of flowing into multiple + columns is trivially applied to a single column. The galley + is manipulated within the context of the layout + tree. +

    +

    + Associated with the galley are two sets of data. + One contains the maps of all "natural" break-points and + of all hyphenation break-points. This set is + constructed at the time of construction of the galley and + is a constant for a given galley. The second contains + dynamic data which represents one possible attempt to lay + out the galley. There may be multiple sets of such data + to reflect varying attempts. The data of this set are, + essentially, representations of line-areas, with the supporting + information necessary to determine these line-areas.

    +

    + The line-area data includes the boundaries within the + galley of each line-area, the boundaries of each column + and the boundaries of the "page", or main area. When a + line-area boundary occurs at a hyphenation point, a + "virtual hyphen" is assumed and accounted for in the + i-p-d. As mentioned, individual footnote galleys will + hang from the parent galley. The associated data of the + footnote galleys is similar: a once-only break-points map, + and one or more line-area maps. No column boundaries are + required, but a page boundary is required at the end of + the last footnote or where a footnote breaks across a page + boundary. +

    +

    + A number of b-p-d values are also maintained. For each + line-area, the b-p-d, the main area b-p-d increment, the + footnote b-p-d increment and the footnote's page-related + b-p-d increment are required. The main-area b-p-d + increments for any particular line-area are dependent on + the column position of the line-area. Total b-p-d's are + also kept: total footnote b-p-d, total main area b-p-d, + and totals for each column.
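    + The bookkeeping described above might be pictured as follows. This is a sketch only: the class BpdLedgerSketch and its members are invented, dimensions are plain integers in an unspecified unit, and it is assumed for the example that the footnote area spans all columns, so that the main-area total is the tallest column plus the footnote total.

```java
// Hypothetical sketch of per-line-area b-p-d bookkeeping. Units are
// arbitrary integers; all names are invented for the example.
class BpdLedgerSketch {
    static final class LineImpact {
        final int column;        // column the line-area is currently assigned to
        final int lineBpd;       // b-p-d of the line-area itself
        final int footnoteBpd;   // b-p-d of footnote lines it brings onto the page

        LineImpact(int column, int lineBpd, int footnoteBpd) {
            this.column = column;
            this.lineBpd = lineBpd;
            this.footnoteBpd = footnoteBpd;
        }
    }

    private final int[] columnTotals;
    private int footnoteTotal;

    BpdLedgerSketch(int columns) {
        columnTotals = new int[columns];
    }

    void add(LineImpact impact) {
        columnTotals[impact.column] += impact.lineBpd;
        footnoteTotal += impact.footnoteBpd;
    }

    int columnTotal(int column) {
        return columnTotals[column];
    }

    int footnoteTotal() {
        return footnoteTotal;
    }

    // Assumption: footnotes span all columns, so the main area consumed is
    // the tallest column plus the accumulated footnote b-p-d.
    int mainAreaTotal() {
        int tallest = 0;
        for (int total : columnTotals) {
            tallest = Math.max(tallest, total);
        }
        return tallest + footnoteTotal;
    }
}
```

    + Under that assumption, a line-area added to a column which is not the tallest contributes nothing to the main-area total through its own b-p-d, whereas any footnote b-p-d it carries always does.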

    + Figure 1 Columns before first footnote. +

    +
    +
    +
    + Balancing columns +

    + Figure 2 Adding a line area with first + footnote. +

    +
    +

    + Columns are balanced dynamically in the galley preliminary + layout. While the galley retains its basic linear + structure, the accompanying data structures accomplish + column distribution and balancing. As each line-area is + added, the columns are re-balanced. N.B. + This re-balancing involves only some of the dynamic data + associated with the participating galley(s). The data + structures associating breakpoints with the beginning and + end of individual line areas do not change in + re-balancing; only the association of line-area with column, + and, possibly, the various impact values for each line-area.
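    + To make the point concrete, the sketch below re-computes only the association of line-area with column, leaving the line-areas themselves untouched. It uses a simple greedy heuristic (fill each column up to the mean column height), which is an assumption of the example and not necessarily the balancing method intended here; footnotes are ignored, and the class and method names are invented.

```java
// Hypothetical sketch: greedy assignment of line-areas to columns.
// Only the line-to-column association is computed; the line-areas
// (represented here by their b-p-d values) are never modified.
class ColumnBalanceSketch {
    // lineBpds holds the b-p-d of each line-area in galley order.
    // Returns, for each line-area, the index of the column it falls in.
    static int[] balance(int[] lineBpds, int columns) {
        if (columns < 1) {
            throw new IllegalArgumentException("columns must be positive");
        }
        int total = 0;
        for (int bpd : lineBpds) {
            total += bpd;
        }
        int target = (total + columns - 1) / columns;  // mean height, rounded up
        int[] columnOf = new int[lineBpds.length];
        int column = 0;
        int height = 0;
        for (int i = 0; i < lineBpds.length; i++) {
            boolean overflows = height > 0 && height + lineBpds[i] > target;
            if (overflows && column < columns - 1) {
                column++;       // start the next column
                height = 0;
            }
            columnOf[i] = column;
            height += lineBpds[i];
        }
        return columnOf;
    }
}
```

    + With four equal line-areas in two columns, the third line-area falls in the second column; after two more line-areas are added and balance() is called again, it falls in the first. Its boundaries in the galley are the same in both cases.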

    + Figure 3 Adding a line area with next + footnote. +

    +
    +
    +
    + Layout managers in the flow of control + To be developed. +
    +
    + +
    + diff --git a/src/documentation/content/xdocs/design/alt.design/galleys.xml b/src/documentation/content/xdocs/design/alt.design/galleys.xml new file mode 100644 index 000000000..f7ae3cbb9 --- /dev/null +++ b/src/documentation/content/xdocs/design/alt.design/galleys.xml @@ -0,0 +1,216 @@ + + + + +
    + Galleys + + + +
    + +
    + Layout galleys in FOP +
    + Galleys in Lout +

    + Jeffrey H. Kingston, in The + Design and Implementation of the Lout Document Formatting + Language Section 5, describes the + galley abstraction which he implemented in + Lout. A document to be formatted is a stream of + text and symbols, some of which are receptive + symbols. The output file is the first receptive + symbol; the formatting document is the first galley. The + archetypical example of a receptive symbol is + @FootPlace and its corresponding galley + definition, @FootNote. +

    +

    + Each galley should be thought of as a concurrent process, and + each is associated with a semaphore (or synchronisation + object.) Galleys are free to "promote" components into + receptive targets as long as

    +
      +
    • an appropriate target has been encountered in the file,
    • the component being promoted contains no unresolved galley targets itself, and
    • there is sufficient room for the galley component at the target.
    +

    + If these conditions are not met, the galley blocks on its + semaphore. When conditions change so that further progress + may be possible, the semaphore is signalled. Note that the + galleys are a hierarchy, and that the processing and + promotion of galley contents happens bottom-up. +
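    + The blocking behaviour can be illustrated in Java with java.util.concurrent.Semaphore. The sketch below is not taken from Lout or from FOP: the class GalleySketch is invented, a receptive area is reduced to a list with a fixed capacity, only the "sufficient room" condition is modelled, and a second semaphore is added so that the layout side can tell when the galley has stalled.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Semaphore;

// Hypothetical sketch of a galley that blocks on a semaphore whenever it
// has no receptive area with room. All names are invented.
class GalleySketch implements Runnable {
    private final List<String> components;
    private final Semaphore resume = new Semaphore(0);   // the galley waits here
    private final Semaphore stalled = new Semaphore(0);  // the layout side waits here
    private List<String> target;                         // current receptive area
    private int capacity;
    private volatile boolean finished;

    GalleySketch(List<String> components) {
        this.components = components;
    }

    @Override
    public void run() {
        for (String component : components) {
            while (target == null || target.size() >= capacity) {
                stalled.release();                // no progress is possible
                resume.acquireUninterruptibly();  // block until conditions change
            }
            target.add(component);                // promote into the target
        }
        finished = true;
        stalled.release();
    }

    // Lays the components out into areas of at most 'capacity' components.
    static List<List<String>> layOut(List<String> components, int capacity) {
        if (capacity < 1) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        GalleySketch galley = new GalleySketch(components);
        new Thread(galley).start();
        List<List<String>> pages = new ArrayList<>();
        while (true) {
            galley.stalled.acquireUninterruptibly();
            if (galley.finished) {
                return pages;
            }
            List<String> page = new ArrayList<>();  // a new receptive area appears
            pages.add(page);
            galley.target = page;
            galley.capacity = capacity;
            galley.resume.release();                // signal the galley's semaphore
        }
    }
}
```

    + The semaphore hand-offs also order the memory accesses: the galley writes to the page before releasing stalled, and the layout side reads it only after acquiring.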

    +
    +
    + Some features of galleys +

    + It is essential to note that galleys are self-managing; they + are effectively layout bots which require only a + receptive area. If a galley fills a receptive area (say, at + the completion of a page), the galley will wait on its + semaphore, and will remain stalled until a new receptive + area is uncovered in the continued processing (say, as the + filled page is flushed to output and a new empty page is + generated.) +

    +

    + Difficulties with this approach become evident when there + are mutual dependencies between receptive areas which + require negotiation between the respective galleys, and, in + some cases, arbitrary deadlock breaking when there is no + clear-cut resolution to conflicting demands. Footnote + processing and side floats are examples. A thornier example + is table column layout in auto mode, where the + column widths are determined by the contents. In + implementing galleys in FOP, these difficulties must be + taken into account, and some solutions proposed. +

    +

    + Galleys model the whole of the process of creating the final + formatted output; the document as a whole is regarded as a + galley which flushes into the output file.

    +
    +
    + The layout tree + +

    + This proposal for implementing galleys in FOP makes use of a + layout tree. As with the layout managers already + proposed, the layout tree acts as a bridge between the FO Tree and the Area Tree. If the elements of + the FO Tree are FO nodes, and the elements of the Area Tree + are Area nodes, representing areas to be drawn on the output + medium, the elements of the layout tree are galley + nodes and area tree fragments. + The area tree fragments are the final stages of the + resolution of the galleys; the output of the galleys will be + inserted directly into the Area Tree. The tree structure + makes it clear that the whole of the formatting process in + FOP, under this model, is a hierarchical series of galleys. + The dynamic data comes from fo:flow and fo:static-content, + and the higher-level receptive areas are derived from the + layout-master-set. +

    +
    +
    + Processing galleys +

    + Galleys are processed in two basic processing environments: +

    +
    + Inline- and block-progression dimensions known +

    + The galley at set-up is provided with both an + inline-progression-dimension (i-p-d) and + a block-progression-dimension (b-p-d). + In this case, no further intervention is necessary to lay + out the galley. The galley has the possibility of laying + itself out, creating all necessary area nodes. This does + not preclude the possibility that some children of this + galley will not be able to be so directly laid out, and + will fall into the second category. +

    +

    + While the option of "automatic" layout exists, to use + such a method would relinquish the possibility of + monitoring the results of such layout and performing + fine-tuning. +

    +
    +
    + Inline- and/or block-progression-dimensions unknown

    + The galley cannot immediately be provided with an i-p-d, + a b-p-d, or both. This will occur in some of the difficult + cases mentioned earlier. In these cases, the parent + galley acts as a layout manager, similar to the sense used + in another + discussion. The children, lacking full receptive + area dimensions, will proceed with galley pre-processing, + a procedure which will, of necessity, be followed + recursively by all of their children down to the atomic + elements of the galley. These atomic elements are the + individual fo:character nodes and images of fixed + dimensions.

    +
    +
    +
    + Galley pre-processing + +

    + Galley pre-processing involves the spatial resolution of + objects from the flows to the greatest extent possible + without information on the dimensions of the target area. + Line-areas have a block progression dimension which is + determined by their contents. To achieve full generality in + layouts of indeterminate dimensions, the contents of + line-areas should be laid out as though their inline + progression dimension were limited only by their content. + In terms of inline-areas, galleys would process text and + resolve the dimensions of included images. Text would be + collected into runs with the same alignment + characteristics. In the process, all possible "natural" and + hyphenation break-points can be determined. Where a + line-area contains mixed fonts or embedded images, the b-p-d + of the individual line-areas which are eventually stacked + will, in general, depend on the line break points, but the + advantage of this approach is that such actual selections + can be backed out and new break points selected with a + minimum of re-calculation. This can potentially occur + whenever a first attempt at page layout is backed out. +

    + Figure 1 +

    +
    +

    + Once this pre-processing has been achieved, it is + envisaged that a layout manager might make requests to the + galley of its ability to fill an area of a given + inline-progression-dimension. A positive response would + be accompanied by the block-progression-dimension. The + other possibilities are a partial fill, which would also + require b-p-d data, and a failure due to insufficient + i-p-d, in which case the minimum i-p-d requirement would + be returned. Note that decisions about the + actual dimensions of line-areas to be filled can be + deferred until all options have been tested. +
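The three possible responses described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the names FillResponse, PreprocessedGalley and requestFill are hypothetical and not part of the FOP code, the galley is reduced to a list of unbreakable word widths, every line is assumed to have the same positive height, and a simple greedy line fill stands in for real line breaking.

```java
/** Hypothetical answer to "can you fill an area of this i-p-d?". */
final class FillResponse {
    enum Kind { FULL, PARTIAL, INSUFFICIENT_IPD }

    final Kind kind;
    final int bpd;     // b-p-d used; meaningful for FULL and PARTIAL
    final int minIpd;  // minimum i-p-d needed; meaningful for INSUFFICIENT_IPD

    FillResponse(Kind kind, int bpd, int minIpd) {
        this.kind = kind;
        this.bpd = bpd;
        this.minIpd = minIpd;
    }
}

/** Hypothetical pre-processed galley, reduced to unbreakable word widths. */
final class PreprocessedGalley {
    private final int[] wordWidths;
    private final int spaceWidth;
    private final int lineHeight; // assumed positive and uniform

    PreprocessedGalley(int[] wordWidths, int spaceWidth, int lineHeight) {
        this.wordWidths = wordWidths.clone();
        this.spaceWidth = spaceWidth;
        this.lineHeight = lineHeight;
    }

    /** The widest unbreakable item sets the minimum i-p-d. */
    int minIpd() {
        int min = 0;
        for (int w : wordWidths) min = Math.max(min, w);
        return min;
    }

    /** Everything on one line sets the maximum i-p-d. */
    int maxIpd() {
        int total = 0;
        for (int w : wordWidths) total += w;
        return total + spaceWidth * Math.max(0, wordWidths.length - 1);
    }

    /** Answers a fill request without creating any areas. */
    FillResponse requestFill(int ipd, int availableBpd) {
        if (ipd < minIpd()) {
            return new FillResponse(FillResponse.Kind.INSUFFICIENT_IPD, 0, minIpd());
        }
        int lines = 0;
        int lineWidth = 0;
        for (int w : wordWidths) {
            if (lines == 0) {
                lines = 1;
                lineWidth = w;
            } else if (lineWidth + spaceWidth + w <= ipd) {
                lineWidth += spaceWidth + w;
            } else {
                lines++;
                lineWidth = w;
            }
        }
        int needed = lines * lineHeight;
        if (needed <= availableBpd) {
            return new FillResponse(FillResponse.Kind.FULL, needed, 0);
        }
        int fittingLines = availableBpd / lineHeight;
        return new FillResponse(FillResponse.Kind.PARTIAL, fittingLines * lineHeight, 0);
    }
}
```

Because requestFill only reports dimensions, a layout manager could try several candidate i-p-d values before committing to any of them.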

    +

    + The other primary form of information provided by a + pre-processed galley is its minimum and maximum i-p-d, so + that decisions can be made by the parent on the spacing of + table columns. Apart from information requests, + higher-level processes can either make requests of the + galleys for chunks of nominated sizes, or simply provide the + galley with an i-p-d and b-p-d, which will trigger the + flushing of the galley components into Area nodes. Until + they have flushed, the galleys must be able to respond to a + sequence of information requests, more or less in the manner + of a request iterator, and separately manage the flushing of + objects into the area tree. The purpose of the "request + iterator" would be to support "incremental" information + requests like getNextBreakPosition. +
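One possible shape for such a "request iterator" is sketched below. Only the name getNextBreakPosition comes from the text above; the class name, the backOut method and the representation of break positions as integer offsets are assumptions made for illustration.

```java
import java.util.NoSuchElementException;

/** Hypothetical request iterator over break positions found in pre-processing. */
final class BreakRequestIterator {
    private final int[] breakPositions; // offsets into the galley content, ascending
    private int cursor = 0;

    BreakRequestIterator(int[] breakPositions) {
        this.breakPositions = breakPositions.clone();
    }

    boolean hasNextBreakPosition() {
        return cursor < breakPositions.length;
    }

    /** Incremental information request; nothing is flushed to the area tree. */
    int getNextBreakPosition() {
        if (!hasNextBreakPosition()) {
            throw new NoSuchElementException("no further break positions");
        }
        return breakPositions[cursor++];
    }

    /** Backs out the most recent requests so other break points can be tried. */
    void backOut(int requests) {
        cursor = Math.max(0, cursor - requests);
    }
}
```

Keeping the cursor separate from the galley content is what would let a first attempt at page layout be backed out cheaply.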

    +
    +
    + +
    + diff --git a/src/documentation/content/xdocs/design/alt.design/index.xml b/src/documentation/content/xdocs/design/alt.design/index.xml new file mode 100644 index 000000000..08e60f059 --- /dev/null +++ b/src/documentation/content/xdocs/design/alt.design/index.xml @@ -0,0 +1,84 @@ + + + + +
    + FOP Alternative Design + Alternative Design Approach to FOP + $Revision$ $Name$ + + + +
    + +
    + Alternative Design +

    + This section of the FOP web site contains notes on approaches + to an alternative design for FOP. The individual documents + here are fragmentary, being notes of particular issues, + without an overall framework as yet. +

    +

    + The main aims of this redesign effort are: +

    +
      +
    • full conformance with the Recommendation
    • +
    • increased performance
    • +
    • reduced memory footprint
    • +
    • no limitation on the size of files
    • +
    +

    + In order to achieve these aims, the primary areas + of design interest are: +

    +
      +
    • + Representing properties, for most purposes, as integers. +
    • +
    • + Distributing FOP processing over a number of threads with + single-point downstream communication and flow control by + means of traditional producer/consumer queues. The threads + so far under consideration are: +
        +
      • XML parser
      • +
      • FO tree builder
      • +
      • layout engine
      • +
      • Area tree builder
      • +
      +
    • +
    • + Representing trees with explicit Tree objects, rather than + as implicit relationships among other objects. +
    • +
    • + Caching integrated into the tree node access methods. +
    • +
    +
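As an illustration of the producer/consumer idea only, the following sketch connects a stand-in "parser" thread to a stand-in "FO tree builder" thread through a bounded queue. It uses java.util.concurrent, which is an assumption of this sketch and not what the Alt-Design code itself uses; the class name, the string events and the end-of-stream marker are likewise hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

final class PipelineSketch {
    // Distinct object used as an end-of-stream marker; compared by identity.
    private static final String END = new String("END-OF-STREAM");

    /** Runs a two-stage pipeline and returns what the downstream stage built. */
    static List<String> run(List<String> xmlEvents) {
        // The small capacity provides flow control: put() blocks when the
        // consumer falls behind.
        BlockingQueue<String> parserToBuilder = new ArrayBlockingQueue<>(4);
        List<String> foNodes = new ArrayList<>();

        Thread parser = new Thread(() -> {
            try {
                for (String event : xmlEvents) parserToBuilder.put(event);
                parserToBuilder.put(END);
            } catch (InterruptedException ex) {
                Thread.currentThread().interrupt();
            }
        });

        Thread builder = new Thread(() -> {
            try {
                for (String e = parserToBuilder.take(); e != END; e = parserToBuilder.take()) {
                    foNodes.add("FONode(" + e + ")");
                }
            } catch (InterruptedException ex) {
                Thread.currentThread().interrupt();
            }
        });

        parser.start();
        builder.start();
        try {
            parser.join();
            builder.join(); // join() makes the builder's writes visible here
        } catch (InterruptedException ex) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("pipeline interrupted", ex);
        }
        return foNodes;
    }
}
```

Further stages (layout engine, Area tree builder) would be chained in the same way, each with a single downstream queue.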
    + Status and availability +

    + The ALT DESIGN effort is not taking place on the + main line of development, represented by the HEAD + tag on the CVS trunk. The source is available via the + FOP_0-20-0_Alt-Design tag. This code has only a crude, + non-Ant build environment, and is expected only to + compile at this stage. Only the parser stage and the first + stage of FO tree building are present. However, the first + example of producer/consumer binding is working, and the Tree + class with inner Tree.Node and inner + Tree.Node.iterators classes is available and + working. Property handling is quite advanced, and is likely + to be almost complete some time in July, 2002. +

    +

    + Only Peter + West is working on the ALT DESIGN sub-project. +

    +
    +
    + +
    + diff --git a/src/documentation/content/xdocs/design/alt.design/keeps.xml b/src/documentation/content/xdocs/design/alt.design/keeps.xml new file mode 100644 index 000000000..ffb29f5f1 --- /dev/null +++ b/src/documentation/content/xdocs/design/alt.design/keeps.xml @@ -0,0 +1,110 @@ + + + + +
    + Keeps and breaks + + + +
    + +
    + Keeps and breaks in layout galleys +

    + The layout galleys and the + layout tree + which is their context have been discussed elsewhere. Here we + discuss a possible method of implementing keeps and breaks + within the context of layout galleys and the layout tree. +

    +
    + Breaks +

    + Breaks may be handled by inserting a column- or page-break + pseudo-object into the galley stream. For break-before, the + object would be inserted before the area in which the flow + object, to which the property is attached, is leading. If + the flow object is leading in no ancestor context, the + pseudo-object is inserted before the object itself. + Corresponding considerations apply for break-after. + Selection of the position for these objects will be further + examined in the discussion on keeps. +

    +
    +
    + Keeps +

    + Conceptually, all keeps can be represented by a + keep-together pseudo-area. The keep-together property + itself is expressed during layout by wrapping all of the + generated areas in a keep-together area. Keep-with-previous + on formatting object A becomes a keep-together area spanning + the first non-blank normal area leaf node, L, generated by A + or its offspring, and the last non-blank normal area leaf + node preceding L in the area tree. Likewise, keep-with-next + on formatting object A becomes a keep-together area spanning + the last non-blank normal area leaf node, L, generated by A + or its offspring, and the first non-blank normal area leaf + node following L in the area tree. +
    TODO REWORK THIS for block vs inline +

    +

    + The obvious problem with this arrangement is that the + keep-together areas violate the hierarchical arrangement of + the layout tree. They form a concurrent structure focussed + on the leaf nodes. This seems to be the essential problem + of handling keep-with-(previous/next): it cuts across + the otherwise tree-structured flow of processing. Such + problems are endemic in page layout. +

    +

    + In any case, it seems that the relationships between areas + that are of interest in keep processing need some form of + direct expression, parallel to the layout tree itself. + Restricting ourselves to block-level elements, and looking + only at the simple block stacking cases, we get a diagram + like the attached PNG. In order to track the relationships + through the tree, we need four sets of links. +

    +

    + Figure 1 +

    + +
    +

    + The three basic links are: +

    +
      + +
    • Leading edge to leading edge of first normal child.
    • +
    • Trailing edge to leading edge of next normal + sibling.
    • +
    • Trailing edge to trailing edge of parent.
    • +
    +

    + Superimposed on the basic links are bridging links which + span adjacent sets of links. These spanning links are the + tree violators, and give direct access to the areas which + are of interest in keep processing. They could be + implemented as doubly linked lists, either within the layout + tree nodes or as separate structures. Gaps in the spanning + links are joined by simply reproducing the single links, as + in the diagram. The whole layout tree for a page is + effectively threaded in order of interest, as far as keeps + are concerned. +

    +

    + The bonus of this structure is that it looks like a superset + of the stacking constraints. It gives direct access to all + sets of adjacent edges and sets of edges whose space + specifiers need to be resolved. Fences can be easily enough + detected during the process of space resolution. +

    +
    +
    + +
    + diff --git a/src/documentation/content/xdocs/design/alt.design/properties-classes.xml b/src/documentation/content/xdocs/design/alt.design/properties-classes.xml new file mode 100644 index 000000000..3345cadfa --- /dev/null +++ b/src/documentation/content/xdocs/design/alt.design/properties-classes.xml @@ -0,0 +1,143 @@ + + + + +
    + Properties$classes + + + +
    + +
    + fo.Properties and the nested properties classes +
    +
    + Nested property classes +

    + Given the intention that individual properties have only a + virtual instantiation in the arrays of + PropertyConsts, these classes are intended to + remain as repositories of static data and methods. The name + of each property is entered in the + PropNames.propertyNames array of + Strings, and each has a unique integer constant + defined, corresponding to the offset of the property name in + that array. +

    +
    + Fields common to all classes +
    +
    final int dataTypes
    +
    + This field defines the allowable data types which may be + assigned to the property. The value is chosen from the + data type constants defined in Properties, and + may consist of more than one of those constants, + bit-ORed together. +
    +
    final int traitMapping
    +
    + This field defines the mapping of properties to traits + in the Area tree. The value is chosen from the + trait mapping constants defined in Properties, + and may consist of more than one of those constants, + bit-ORed together. +
    +
    final int initialValueType
    +
    + This field defines the data type of the initial value + assigned to the property. The value is chosen from the + initial value type constants defined in + Properties. +
    +
    final int inherited
    +
    + This field defines the kind of inheritance applicable to + the property. The value is chosen from the inheritance + constants defined in Properties. +
    +
    +
    +
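A minimal sketch of how such static fields might look is given below. The constant names and values, and the Width class, are hypothetical stand-ins chosen for illustration; the real constants are those defined in Properties.

```java
final class PropertyFieldsSketch {
    // Hypothetical data type constants, one bit each so they can be bit-ORed.
    static final int LENGTH = 1;
    static final int PERCENTAGE = 2;
    static final int ENUM = 4;
    static final int AUTO = 8;
    static final int INHERIT = 16;

    // Hypothetical inheritance constants.
    static final int NO = 0;
    static final int COMPUTED = 1;

    /** Hypothetical nested property class: a repository of static data only. */
    static final class Width {
        static final int dataTypes = LENGTH | PERCENTAGE | AUTO | INHERIT;
        static final int initialValueType = AUTO;
        static final int inherited = NO;
    }

    /** Tests whether a data type is among those allowed for a property. */
    static boolean allows(int dataTypes, int type) {
        return (dataTypes & type) != 0;
    }
}
```

With one bit per data type, checking whether an assigned value is legal for a property reduces to a single AND.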
    + Datatype dependent fields +
    +
    Enumeration types
    +
    + final String[] enums
    + This array contains the NCName text + values of the enumeration. In the current + implementation, it always contains a null value at + enums[0].

    + + final String[] + enumValues
    When the number of + enumeration values is small, + enumValues is a reference to the + enums array.

    + + final HashMap + enumValues
    When the number of + enumeration values is larger, + enumValues is a + HashMap statically initialized to + contain the integer constant values corresponding to + each text value, indexed by the text + value.

    + + final int + enumeration-constants
    A + unique integer constant is defined for each of the + possible enumeration values.

    +
    +
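The two lookup strategies for enumeration values can be sketched as follows. The class name, the helper methods and the three sample tokens are assumptions for illustration; only the null entry at index 0, the enums array, the enumValues map and the per-value integer constants follow the description above.

```java
import java.util.HashMap;
import java.util.Map;

final class EnumLookupSketch {
    // Index 0 is deliberately null, so valid constants start at 1.
    static final String[] enums = { null, "visible", "hidden", "scroll" };
    static final int VISIBLE = 1;
    static final int HIDDEN = 2;
    static final int SCROLL = 3;

    /** Small sets: scan the array; 0 signals "not an enumeration token". */
    static int indexByScan(String token) {
        for (int i = 1; i < enums.length; i++) {
            if (enums[i].equals(token)) return i;
        }
        return 0;
    }

    /** Larger sets: a statically initialized map from token text to constant. */
    static final Map<String, Integer> enumValues = new HashMap<>();
    static {
        for (int i = 1; i < enums.length; i++) enumValues.put(enums[i], i);
    }

    static int indexByMap(String token) {
        Integer value = enumValues.get(token);
        return value == null ? 0 : value;
    }
}
```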
    Many types: + final datatype + initialValue
    +
    + When the initial datatype does not have an implicit + initial value (as, for example, does type + AUTO) the initial value for the property is + assigned to this field. The type of this field will + vary according to the initialValueType + field. +
    +
    AUTO: PropertyValueList auto(property, + list)
    +
    + When AUTO is a legal value type, the + auto() method must be defined in the property + class.
    + NOT YET IMPLEMENTED. +
    +
    COMPLEX: PropertyValueList complex(property, + list)
    +
    + COMPLEX is specified as a value type when complex + conditions apply to the selection of a value type, or + when lists of values are acceptable. To process and + validate such a property value assignment, the + complex() method must be defined in the + property class. +
    +
    +
    +
    +
    + Nested property pseudo-classes +

    + The property pseudo-classes are classes, like + ColorCommon which contain values, particularly + enums, which are common to a number of actual + properties. +

    +
    +

    + Previous: property classes overview. +

    +
    + +
    + diff --git a/src/documentation/content/xdocs/design/alt.design/propertyExpressions.xml b/src/documentation/content/xdocs/design/alt.design/propertyExpressions.xml new file mode 100644 index 000000000..90d67349f --- /dev/null +++ b/src/documentation/content/xdocs/design/alt.design/propertyExpressions.xml @@ -0,0 +1,343 @@ + + + + +
    + Property Expression Parsing + + + +
    + +
    + Property expression parsing + + The following discussion of the experiments with alternate + property expression parsing is very much a work in progress, + and subject to sudden changes. + +

    + The parsing of property value expressions is handled by two + closely related classes: PropertyTokenizer and its + subclass, PropertyParser. + PropertyTokenizer, as the name suggests, handles + the tokenizing of the expression, handing tokens + back to its subclass, + PropertyParser. PropertyParser, in + turn, returns a PropertyValueList, a list of + PropertyValues. +

    +

    + The tokenizer and parser rely in turn on the datatype + definition from the org.apache.fop.datatypes + package and the datatype static final int + constants from PropertyConsts. +

    +
    + Data types +

    + The data types currently defined in + org.apache.fop.datatypes include: +

    + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    Numbers and lengths
    Numeric + The fundamental numeric data type. Numerics of + various types are constructed by the classes listed + below. +
    + Constructor classes for Numeric
    Angle: In degrees (deg), gradians (grad) or + radians (rad)
    Ems: Relative length in ems
    Frequency: In hertz (Hz) or kilohertz (kHz)
    IntegerType +
    Length: In centimetres (cm), millimetres (mm), + inches (in), points (pt), picas (pc) or pixels (px)
    Percentage +
    Time: In seconds (s) or milliseconds (ms)
    Strings
    StringType + Base class for data types which result in a String. +
    Literal + A subclass of StringType for literals which + exceed the constraints of an NCName. +
    MimeType + A subclass of StringType for literals which + represent a mime type. +
    UriType + A subclass of StringType for literals which + represent a URI, as specified by the argument to + url(). +
    NCName + A subclass of StringType for literals which + meet the constraints of an NCName. +
    Country: An RFC 3066/ISO 3166 country code.
    Language: An RFC 3066/ISO 639 language code.
    Script: An ISO 15924 script code.
    Enumerated types
    EnumType + An integer representing one of the tokens in a set of + enumeration values. +
    MappedEnumType + A subclass of EnumType. Maintains a + String with the value to which the associated + "raw" enumeration token maps. E.g., the + font-size enumeration value "medium" maps to + the String "12pt". +
    Colors
    ColorType + Maintains a four-element array of float, derived from + the name of a standard colour, the name returned by a + call to system-color(), or an RGB + specification. +
    Fonts
    FontFamilySet + Maintains an array of Strings containing a + prioritized list of possibly generic font family names. +
    Pseudo-types
    + A variety of pseudo-types have been defined as + convenience types for frequently appearing enumeration + token values, or for other special purposes. +
    Inherit + For values of inherit. +
    Auto + For values of auto. +
    None + For values of none. +
    Bool + For values of true/false. +
    FromNearestSpecified + Created to ensure that, when associated with + a shorthand, the from-nearest-specified-value() + core function is the sole component of the expression. +
    FromParent + Created to ensure that, when associated with + a shorthand, the from-parent() + core function is the sole component of the expression. +
    +
    +
    + Tokenizer +

    + The tokenizer returns one of the following token + values: +

    + + static final int + EOF = 0 + ,NCNAME = 1 + ,MULTIPLY = 2 + ,LPAR = 3 + ,RPAR = 4 + ,LITERAL = 5 + ,FUNCTION_LPAR = 6 + ,PLUS = 7 + ,MINUS = 8 + ,MOD = 9 + ,DIV = 10 + ,COMMA = 11 + ,PERCENT = 12 + ,COLORSPEC = 13 + ,FLOAT = 14 + ,INTEGER = 15 + ,ABSOLUTE_LENGTH = 16 + ,RELATIVE_LENGTH = 17 + ,TIME = 18 + ,FREQ = 19 + ,ANGLE = 20 + ,INHERIT = 21 + ,AUTO = 22 + ,NONE = 23 + ,BOOL = 24 + ,URI = 25 + ,MIMETYPE = 26 + // NO_UNIT is a transient token for internal use only. It is + // never set as the end result of parsing a token. + ,NO_UNIT = 27 + ; + +

    + Most of these tokens are self-explanatory, but a few need + further comment. +

    +
    +
    AUTO
    +
    + Because of its frequency of occurrence, and the fact that + it is always the initial value for any property + which supports it, AUTO has been promoted into a + pseudo-type with its own datatype class. Therefore, it is + also reported as a token. +
    +
    NONE
    +
    + Similarly to AUTO, NONE has been promoted to a pseudo-type + because of its frequency. +
    +
    BOOL
    +
    + There is a de facto boolean type buried in the + enumeration types for many of the properties. It has been + specified as a type in its own right in this code. +
    +
    MIMETYPE
    +
    + The property content-type introduces this + complication. It can have two values of the form + content-type:mime-type + (e.g. content-type="content-type:xml/svg") or + namespace-prefix:prefix + (e.g. content-type="namespace-prefix:svg"). The + experimental code reduces these options to the payload + in each case: an NCName in the case of a + namespace prefix, and a MIMETYPE in the case of a + content-type specification. NCNames cannot + contain a "/". +
    +
    +
    +
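To make the numeric token values more concrete, the sketch below classifies a numeric literal by its unit suffix. The token constants are copied from the list above and the units from the data types table; the class, the method and the classification rules themselves are assumptions for illustration, not the logic of PropertyTokenizer, and signs and exponents are ignored.

```java
final class UnitTokenSketch {
    // Token values as listed for the tokenizer.
    static final int PERCENT = 12;
    static final int FLOAT = 14;
    static final int INTEGER = 15;
    static final int ABSOLUTE_LENGTH = 16;
    static final int RELATIVE_LENGTH = 17;
    static final int TIME = 18;
    static final int FREQ = 19;
    static final int ANGLE = 20;

    /** Classifies an unsigned numeric literal such as "12pt" or "1.5em". */
    static int classify(String token) {
        int i = 0;
        boolean sawDot = false;
        while (i < token.length()) {
            char c = token.charAt(i);
            if (c == '.' && !sawDot) {
                sawDot = true;
            } else if (c < '0' || c > '9') {
                break;
            }
            i++;
        }
        if (i == 0 || (i == 1 && sawDot)) {
            throw new IllegalArgumentException("not numeric: " + token);
        }
        String unit = token.substring(i);
        switch (unit) {
            case "":
                return sawDot ? FLOAT : INTEGER;
            case "%":
                return PERCENT;
            case "cm": case "mm": case "in": case "pt": case "pc": case "px":
                return ABSOLUTE_LENGTH;
            case "em":
                return RELATIVE_LENGTH;
            case "s": case "ms":
                return TIME;
            case "Hz": case "kHz":
                return FREQ;
            case "deg": case "grad": case "rad":
                return ANGLE;
            default:
                throw new IllegalArgumentException("unknown unit: " + unit);
        }
    }
}
```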
    + Parser +

    + The parser returns a PropertyValueList, + necessary because of the possibility that a list of + PropertyValue elements may be returned from the + expressions of some properties. +

    +

    + PropertyValueLists may contain + PropertyValues or other + PropertyValueLists. This latter provision is + necessitated by the peculiar case of + text-shadow, which may contain whitespace separated + sublists of either two or three elements, separated from one + another by commas. To accommodate this peculiarity, comma + separated elements are added to the top-level list, while + whitespace separated values are always collected into + sublists to be added to the top-level list. +
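The nesting rule can be illustrated with plain strings. This is a sketch under assumptions: the class and method names are hypothetical, values are kept as raw strings in place of PropertyValue objects, and it takes the reading that a comma separated element consisting of a single value is added to the top-level list directly, while a whitespace separated group becomes a sublist.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class ValueListSketch {
    /** Splits on commas at the top level and on whitespace into sublists. */
    static List<Object> parse(String expression) {
        List<Object> topLevel = new ArrayList<>();
        for (String part : expression.split(",")) {
            String trimmed = part.trim();
            if (trimmed.isEmpty()) continue; // ignore empty elements
            String[] items = trimmed.split("\\s+");
            if (items.length == 1) {
                topLevel.add(items[0]); // single value
            } else {
                topLevel.add(new ArrayList<>(Arrays.asList(items))); // sublist
            }
        }
        return topLevel;
    }
}
```

For a text-shadow-like value such as "1pt 2pt red, 3pt 4pt", this yields a top-level list of two sublists, of three and two elements respectively.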

    +

    + Other special cases include the processing of the core + functions from-parent() and + from-nearest-specified-value() when these + function calls are assigned to a shorthand property, or used + with a shorthand property name as an argument. In these + cases, the function call must be the sole component of the + expression. The pseudo-type classes + FromParent and + FromNearestSpecified are generated in these + circumstances so that an exception will be thrown if they + are involved in expression evaluation with other + components. (See Rec. Section 5.10.4 Property Value + Functions.) +

    +

    + The experimental code is a simple extension of the existing + parser code, which itself borrowed heavily from James + Clark's XT processor. +

    +
    +
    + +
    + diff --git a/src/documentation/content/xdocs/design/alt.design/spaces.xml b/src/documentation/content/xdocs/design/alt.design/spaces.xml new file mode 100644 index 000000000..f2426347b --- /dev/null +++ b/src/documentation/content/xdocs/design/alt.design/spaces.xml @@ -0,0 +1,179 @@ + + + + +
    + Keeps and space-specifiers + + + +
    + +
    + Keeps and space-specifiers in layout galleys +

    + The layout galleys and the + layout tree + which is the context of this discussion have been discussed + elsewhere. A previous document + discussed data structures which might facilitate the linking of + blocks necessary to implement keeps. Here we discuss the + similarities between the keep data structures and those + required to implement space-specifier resolution. +

    +
    + Space-specifiers + + 4.3 Spaces and Conditionality + ... Space-specifiers occurring in sequence may interact with + each other. The constraint imposed by a sequence of + space-specifiers is computed by calculating for each + space-specifier its associated resolved space-specifier in + accordance with their conditionality and precedence. + + + 4.2.5 Stacking Constraints ... The intention of the + definitions is to identify areas at any level of the tree + which have only space between them. + +

    + The quotations above are pivotal to understanding the + complex discussion of spaces with which they are associated, + all of which exists to enable the resolution of adjacent + <space>s. It may be helpful to think of stacking + constraints as <space>s interaction or + <space>s stacking interaction. +

    +
    +
    + Block stacking constraints +

    + In the discussion of block stacking constraints in Section + 4.2.5, the notion of fence is introduced. For + block stacking constraints, a fence is defined as either a + reference-area boundary or a non-zero padding or border + specification. Fences, however, do not come into play + when determining the constraint between siblings. (See + Figure 1.) +

    +

    Figure 1

    +
    + + Figure 1 assumes a block-progression-direction of top to + bottom. + +

    + In Diagram a), block A has + non-zero padding and borders, in addition to non-zero + spaces. Note, however, that the space-after of A is + adjacent to the space-before of block P, so borders and + padding on these siblings have no impact on the interaction + of their <space>s. The stacking constraint A,P is + indicated by the red rectangle enclosing the space-after of + A and the space-before of P. +

    +

    + In Diagram b), block B is the + first block child of P. The stacking constraint A,P is as + before; the stacking constraint P,B is the space-before of + B, as indicated by the enclosing magenta rectangle. In this + case, however, the non-zero border of P prevents the + interaction of the A,P and P,B stacking constraints. There + is a fence-before P. The fence is notional; it has + no precise location, although the diagram may lead one to believe otherwise. +

    +

    + In Diagram c), because of the + zero-width borders and padding on block P, the fence-before + P is not present, and the adjacent <space>s of blocks + A, P and B are free to interact. In this case, the stacking + constraints A,P and P,B are as before, but now there is an + additional stacking constraint A,B, represented by the light + brown rectangle enclosing the other two stacking + constraints. +

    +

    + The other form of fence occurs when the parent block is a + reference area. Diagram b) of Figure + 2 illustrates this situation. Block C is a + reference-area, involving a 180 degree change of + block-progression-direction (BPD). In the diagram, the + inner edge of block C represents the content rectangle, with + its changed BPD. The thicker outer edge represents the + outer boundary of the padding, border and spaces of C. +

    +

    + While not every reference-area will change the + inline-progression-direction (IPD) and BPD of an area, no + attempt is made to discriminate these cases. A + reference-area is always a fence. The fence comes into play in + analogous circumstances to non-zero borders or padding. + Space resolution between a reference area and its siblings + is not affected. +

    +

    + In the case of Diagram b), + there are block stacking constraints B,C and C,A. Within + the reference-area, block stacking constraints C,D and E,C are + unaffected. However, the fence prevents block stacking + constraints such as B,E or D,A. When there is a change of + BPD, as Diagram b) makes + visually obvious, it is difficult to imagine which blocks + would have such a constraint, and what the ordering of the + constraint would be. +

    +

    Figure 2

    + +
    +
    +
    + Keep relationships between blocks +

    + As complicated as space-specifiers become when + reference-areas are involved, the keep relationships, as + described in the keeps document, are + unchanged. This is also illustrated in Figure 2. Diagram b) shows the + relative placement of blocks in the rendered output when a + 180 degree change of BPD occurs, with blocks D and E + stacking in the reverse direction to blocks B and C. + Diagram c) shows what happens when the page is too short to + accommodate the last block. D is still laid out, but E is + deferred to the next page. +

    +

    + Note that this rendering reality is expressed directly in + the area (and layout) tree view. Consequently, any keep + relationships expressed as links threading through the + layout tree will not need to be modified to account for + reference-area boundaries, as is the case with similar + space-specifier edge links. E.g., a keep-with-next + condition on block B can be resolved along the path of these + links (B->C->D) into a direct relationship of B->D, + irrespective of the reference-area boundary. +
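The resolution of B->C->D into B->D can be sketched by walking the basic links in area tree order. The node class below is hypothetical, and the tree shape (P containing B, C and A, with C containing D and E) is an assumption read off the stacking constraints described for Figure 2. The walk follows a trailing edge to the next sibling, or up to the parent's trailing edge, and then down through first children to a leaf, without regard to whether a node is a reference-area.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical layout tree node carrying only what the keep links need. */
final class KeepLinkNode {
    final String name;
    KeepLinkNode parent;
    final List<KeepLinkNode> children = new ArrayList<>();

    KeepLinkNode(String name) {
        this.name = name;
    }

    KeepLinkNode add(KeepLinkNode child) {
        child.parent = this;
        children.add(child);
        return this;
    }

    /** Trailing edge to leading edge of next sibling, or null if none. */
    KeepLinkNode nextSibling() {
        if (parent == null) return null;
        int i = parent.children.indexOf(this);
        return i + 1 < parent.children.size() ? parent.children.get(i + 1) : null;
    }

    /** The next leaf in area tree order, i.e. the target of a keep-with-next. */
    static KeepLinkNode nextLeaf(KeepLinkNode node) {
        KeepLinkNode n = node;
        // Trailing edge to trailing edge of parent, until a sibling follows.
        while (n != null && n.nextSibling() == null) {
            n = n.parent;
        }
        if (n == null) return null; // nothing follows on this page
        n = n.nextSibling();
        // Leading edge to leading edge of first child, down to a leaf.
        while (!n.children.isEmpty()) {
            n = n.children.get(0);
        }
        return n;
    }
}
```

A spanning link as described in the keeps document would amount to caching the result of nextLeaf on the node.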

    +

    + While the same relationships obviously hold when a reference + area induces no change of BPD, the situation for BPD changes + perpendicular to the parent's BPD may not be so clear. In + general, it probably does not make much sense to impose keep + conditions across such a boundary, but there seems to be + nothing preventing such conditions. They can be dealt with + in the same way, i.e., the next leaf block linked in area + tree order must be the next laid out. If a keep condition + is in place, an attempt must be made to meet it. A number + of unusual considerations would apply, e.g. the minimum + inline-progression-dimension of the first leaf block within + the reference-area as compared to the minimum IPD of + subsequent blocks, but prima facie, the essential + logic of the keeps links remains. +

    +
    +
    + +
    + diff --git a/src/documentation/content/xdocs/design/alt.design/traits.xml b/src/documentation/content/xdocs/design/alt.design/traits.xml new file mode 100644 index 000000000..365540060 --- /dev/null +++ b/src/documentation/content/xdocs/design/alt.design/traits.xml @@ -0,0 +1,369 @@ + + + + +
    + Traits + + + +
    + +
    + Traits + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    Trait | Applies to | Refs | Derived from
    Common Traits
    block-progression-direction | All areas + 4.2.2 Common Traits
    + 7.27.7 writing-mode +
    + 7.27.7 reference-orientation +
    inline-progression-direction | All areas + 4.2.2 Common Traits
    + 7.27.7 writing-mode +
    + 7.27.7 reference-orientation +
    shift-direction | Inline areas
    glyph-orientation | Glyph-areas + 4.2.2 Common Traits
    + 4.6.2 Glyph-areas
    + 4.7.2 Line-building
    + 4.9.5 Intrinsic Marks
    + 7.8.1 Fonts and Font Data
    + 7.27 Writing-mode-related Properties +
    + 7.27.2 glyph-orientation-horizontal
    + 7.27.3 glyph-orientation-vertical
    + 7.27.1 direction
    + 7.27.7 writing-mode +
    is-reference-area | All areas + 5.6 Non-property Based Trait Generation + + Set "true" on:
    + simple-page-master
    + title
    + region-body
    + region-before
    + region-after
    + region-start
    + region-end
    + block-container
    + inline-container
    + table
    + table-caption
    + table-cell +
    is-viewport-area + 4.2.2 Common Traits +
    top-position
    bottom-position
    left-position
    right-position
    left-offset
    top-offset
    is-first
    is-last
    generated-by
    returned-by
    nominal-font
    blink + 5.5.6 Text-decoration Property + + + 7.16.4 "text-decoration" + +
    underline-score + 5.5.6 Text-decoration Property + + + 7.16.4 "text-decoration" + +
    underline-score-color + 5.5.6 Text-decoration Property + + + 7.16.4 "text-decoration" + +
    overline-score + 5.5.6 Text-decoration Property + + + 7.16.4 "text-decoration" + +
    overline-score-color + 5.5.6 Text-decoration Property + + + 7.16.4 "text-decoration" + +
    through-score + 5.5.6 Text-decoration Property + + + 7.16.4 "text-decoration" + +
    through-score-color + 5.5.6 Text-decoration Property + + + 7.16.4 "text-decoration" + +
    Other Indirectly Derived Traits
    alignment-point + + 4.1 Introduction +
    alignment-baseline + + 4.1 Introduction +
    baseline-shift + + 4.1 Introduction +
    dominant-baseline-identifier + + 4.1 Introduction +
    actual-baseline-table + + 4.1 Introduction +
    start-intrusion-adjustment + + 4.1 Introduction +
    end-intrusion-adjustment + + 4.1 Introduction +
    page-number + + 4.1 Introduction +
    script + + 4.1 Introduction +
    +
    + +
    + diff --git a/src/documentation/content/xdocs/design/alt.design/user-agent-refs.xml b/src/documentation/content/xdocs/design/alt.design/user-agent-refs.xml new file mode 100644 index 000000000..47fe45f94 --- /dev/null +++ b/src/documentation/content/xdocs/design/alt.design/user-agent-refs.xml @@ -0,0 +1,909 @@ + + + + +
    + User agent refs + + + +
    + +
    + User Agent references in XSLFO +
    + 4.9.2 Viewport Geometry +

    + If the block-progression-dimension of the reference-area is + larger than that of the viewport-area and the overflow trait + for the reference-area is scroll, then the + inline-scroll-amount and block-scroll-amount are determined + by a scrolling mechanism, if any, provided by the + user agent. Otherwise, both are zero. +

    +
    +
    + 5.1.3 Actual Values +

    + A computed value is in principle ready to be used, but a + user agent may not be able to make use of the value in a + given environment. For example, a user + agent may only be able to render borders with + integer pixel widths and may, therefore, have to adjust the + computed width to an integral number of media pixels. +

    +
    +
    + 5.5.7 Font Properties +

    + There is no XSL mechanism to specify a particular font; + instead, a selected font is chosen from the fonts available + to the User Agent based on a set of + selection criteria. The selection criteria are the following + font properties: "font-family", "font-style", + "font-variant", "font-weight", "font-stretch", and + "font-size", plus, for some formatting objects, one or more + characters. +

    +
    +
    + 5.9.13.1 Pixels +

    + If the User Agent chooses a measurement for + a 'px' that does not match an integer number of device dots + in each axis it may produce undesirable effects... +

    +
    +
    + 5.10.4 Property Value Functions +
    + Function: object merge-property-values( NCName) +

    + The merge-property-values function returns a value of the + property whose name matches the argument, or if omitted + for the property for which the expression is being + evaluated. The value returned is the specified value on + the last fo:multi-property-set, of the parent + fo:multi-properties, that applies to the User + Agent state. If there is no such value, the + computed value of the parent fo:multi-properties is + returned... +

    +

    + The test for applicability of a User + Agent state is specified using the "active-state" + property. +

    +
    +
    +
    + 6.3 Formatting Objects Summary +
    + multi-property-set +

    + The fo:multi-property-set is used to specify an + alternative set of formatting properties that, dependent + on a User Agent state, are applied to the + content. +

    +
    +
    + title +

    + The fo:title formatting object is used to associate a + title with a given page-sequence. This title may be used + by an interactive User Agent to identify + the pages. For example, the content of the fo:title can be + formatted and displayed in a "title" window or in a "tool + tip". +

    +
    +
    +
    + 6.4.1.2 Page-masters +

    + ... When pages are used with a User Agent + such as a Web browser, it is common that the each document + has only one page. The viewport used to view the page + determines the size of the page. When pages are placed on + non-interactive media, such as sheets of paper, pages + correspond to one or more of the surfaces of the paper. +

    +
    +
    + 6.4.20 fo:title +
    + Common Usage: +

    + ... This title may be used by an interactive User + Agent to identify the pages. +

    +
    +
    +
    + 6.6.3 fo:character +
    + Constraints: +

    + The dimensions of the areas are determined by the font + metrics for the glyph. +

    +

    + When formatting an fo:character with a + "treat-as-word-space" value of "true", the User + Agent may use a different method for determining + the inline-progression-dimension of the area. +

    +
    +
    +
    + 6.9 Dynamic Effects: Link and Multi Formatting + Objects +
    + 6.9.1 Introduction +

    + Dynamic effects, whereby user actions (including + User Agent state) can influence the + behavior and/or representation of portions of a document, + can be achieved through the use of the formatting objects + included in this section: +

    +
      +
    • One-directional single-target links.
    • +
    • + The ability to switch between the display of two or more + formatting object subtrees. This can be used for, e.g., + expandable/collapsible table of contents, display of an + icon or a full table or graphic. +
    • +
    • + The ability to switch between different property values, + such as color or font-weight, depending on a + User Agent state, such as "hover". +
    • +
    +
    +
    +
    + 6.10 Out-of-Line Formatting Objects +
    + 6.10.1.3 Conditional Sub-Regions +

    + ... There may be limits on how much space conditionally + generated areas can borrow from the + region-reference-area. It is left to the user + agent to decide these limits. +

    +

    + ... An interactive user agent may choose + to create "hot links" to the footnotes from the + footnote-citation, or create "hot links" to the + before-floats from an implicit citation, instead of + realizing conditional sub-regions. +

    +
    +
    +
    + 6.10.2 fo:float +
    + Constraints: +

    + ... The user agent may make its own + determination, after taking into account the intrusion + adjustments caused by one or more overlapping side-floats, + that the remaining space in the + inline-progression-direction is insufficient for the next + side-float or normal block-area. The user + agent may address this by causing the next + side-float or normal block-area to "clear" one of the + relevant side-floats, as described in the "clear" property + description, so the intrusion adjustment is sufficiently + reduced. Of the side-floats that could be cleared to meet + this constraint, the side-float that is actually cleared + must be the one whose after-edge is closest to the + before-edge of the parent reference-area. +

    +

    + The user agent may determine sufficiency + of space by using a fixed length, or by some heuristic + such as whether an entire word fits into the available + space, or by some combination, in order to handle text and + images. +

    +
    +
    +
    + 6.10.3 fo:footnote +
    + Constraints: +

    + ... The second block-area and any additional block-areas + returned by an fo:footnote must be placed on the + immediately subsequent pages to the page containing the + first block-area returned by the fo:footnote, before any + other content is placed. If a subsequent page does not + contain a region-body, the user agent + must use the region-master of the last page that did + contain a region-body to hold the additional block-areas. +

    +
    +
    +
    + 7.3 Reference Rectangle for Percentage Computations +

    ...

    +
    + Exceptions ... +

    + 5. When the absolute-position is "fixed", the containing + block is defined by the nearest ancestor viewport area. If + there is no ancestor viewport area, the containing block + is defined by the user agent. +

    +
    +
    +
    + 7.6.5 "pause-after" 7.6.6 "pause-before" 7.6.17 "voice-family" +

    Initial: depends on user agent

    +
    +
    + 7.7.1 "background-attachment" +
    + fixed +

    + ... User agents may treat fixed as + scroll. However, it is recommended they interpret fixed + correctly, at least for the HTML and BODY elements, since + there is no way for an author to provide an image only for + those browsers that support fixed. +

    +
    +
    +
    + 7.7.9 "border-before-width" +
    + <length-conditional> +

    + ... If border-before-width is specified using one of the + width keywords the .conditional component is set to + "discard" and the .length component to a User + Agent dependent length. +

    +
    +
    +
    + 7.7.19 "border-top-color" +
    + <color> +

    + ... If an element's border color is not specified with a + "border" property, user agents must use + the value of the element's "color" property as the + computed value for the border color. +

    +
    +
    +
    + 7.7.20 "border-top-style" +

    + Conforming HTML user agents may interpret + 'dotted', 'dashed', 'double', 'groove', 'ridge', 'inset', + and 'outset' to be 'solid'. +

    +
    +
    + 7.7.21 "border-top-width" +
    + thin ... medium ... thick ... +

    + ... The interpretation of the first three values depends + on the user agent. +

    +
    +
    +
    + 7.8.2 "font-family" +

    Initial: depends on user agent

    +
    +
    + 7.8.3 "font-selection-strategy" +

    + There is no XSL mechanism to specify a particular font; + instead, a selected font is chosen from the fonts available + to the User Agent based on a set of + selection criteria. The selection criteria are the following + font properties: "font-family", "font-style", + "font-variant", "font-weight", "font-stretch", and + "font-size", plus, for some formatting objects, one or more + characters. +

    +

    + ... This fallback may be to seek a match using a + User Agent default "font-family", or it may + be a more elaborate fallback strategy where, for example, + "Helvetica" would be used as a fallback for "Univers". +

    +

    + If no match has been found for a particular character, there + is no selected font and the User Agent + should provide a visual indication that a character is not + being displayed (for example, using the 'missing character' + glyph). +

    +
    +
    + 7.8.4 "font-size" +
    + <absolute-size> +

    + An <absolute-size> keyword refers to an entry in a + table of font sizes computed and kept by the user + agent. Possible values are:
    [ xx-small | + x-small | small | medium | large | x-large | xx-large ] +

    +
    +
    + <relative-size> +

    + A <relative-size> keyword is interpreted relative to + the table of font sizes and the font size of the parent + element. Possible values are:
    [ larger | smaller + ]
    For example, if the parent element has a font size + of "medium", a value of "larger" will make the font size + of the current element be "large". If the parent element's + size is not close to a table entry, the user + agent is free to interpolate between table + entries or round off to the closest one. The user + agent may have to extrapolate table values if the + numerical value goes beyond the keywords. +

    +
    +
    + <length> +

    + A length value specifies an absolute font size (that is + independent of the user agent's font + table). +

    +
    +
    +
    + 7.8.8 "font-variant" +
    + small-caps +

    + ... If a genuine small-caps font is not available, + user agents should simulate a small-caps + font... +

    +
    +
    +
    + 7.8.9 "font-weight" +
    + XSL modifications to the CSS definition: +

    + ... The association of other weights within a family to + the numerical weight values is intended only to preserve + the ordering of weights within that family. User + agents must map names to values in a way that + preserves visual order; a face mapped to a value must not + be lighter than faces mapped to lower values. There is no + guarantee on how a user agent will map + fonts within a family to weight values. However, the + following heuristics... +

    +
    +
    +
    + 7.13.1 "alignment-adjust" +
    + auto +

    + ... If the baseline-identifier does not exist in the + baseline-table for the glyph or other inline-area, then + the User Agent may either use heuristics + to determine where that missing baseline would be or may + use the dominant-baseline as a fallback. +

    +
    +
    +
    + 7.13.3 "baseline-shift" +
    + sub/super +

    + ... Because in most fonts the subscript position is + normally given relative to the "alphabetic" baseline, the + User Agent may compute the effective + position for sub/superscripts [sub: spec typo!] + when some other baseline is dominant. ... If there is no + applicable font data the User Agent may + use heuristics to determine the offset. +

    +
    +
    +
    + 7.13.5 "dominant-baseline" +

    + ... If there is no baseline-table in the nominal font or if + the baseline-table lacks an entry for the desired baseline, + then the User Agent may use heuristics to + determine the position of the desired baseline. +

    +
    +
    + 7.14.11 "scaling-method" +
    + auto +

    + The User Agent is free to choose either + resampling, integer scaling, or any other scaling method. +

    +
    +
    + integer-pixels +

    + The User Agent should scale the image + such that each pixel in the original image is scaled to + the nearest integer number of device-pixels that yields an + image less-then-or-equal-to the image size derived from + the content-height, content-width, and scaling properties. +

    +
    +
    + resample-any-method +

    + The User Agent should resample the + supplied image to provide an image that fills the size + derived from the content-height, content-width, and + scaling properties. The user agent may + use any sampling method. +

    +
    +

    + ... This is defined as a preference to allow the + user agent the flexibility to adapt to + device limitations and to accommodate over-constrained + situations involving min/max dimensions and scale factors. +

    +
    +
    + 7.14.12 "width" +

    + ... The width of a replaced element's box is intrinsic and + may be scaled by the user agent if the + value of this property is different than 'auto'. +

    +
    +
    + 7.15.4 "line-height" +
    + normal +

    + Tells user agents to set the computed + value to a "reasonable" value based on the font size of + the element. +

    + +

    + ... When an element contains text that is rendered in more + than one font, user agents should determine + the "line-height" value according to the largest font size. +

    +
    +
    + 7.15.9 "text-align" +

    + ... The actual justification algorithm used is user + agent and written language dependent.
    + Conforming user agents may interpret the + value 'justify' as 'left' or 'right', depending on whether + the element's default writing direction is left-to-right or + right-to-left, respectively. +

    +
    +
    + 7.15.11 "text-indent" +

    + ... User agents should render this + indentation as blank space. +

    +
    +
    + 7.16.2 "letter-spacing" +
    + normal +

    + The spacing is the normal spacing for the current + font. This value allows the user agent to + alter the space between characters in order to justify + text. +

    +
    +
    + <length> +

    + This value indicates inter-character space in addition to + the default space between characters. Values may be + negative, but there may be implementation-specific + limits. User agents may not further + increase or decrease the inter-character space in order to + justify text. +

    +
    +

    + Character-spacing algorithms are user agent + dependent. Character spacing may also be influenced by + justification (see the "text-align" property).
    When the + resultant space between two characters is not the same as + the default space, user agents should not + use ligatures.
    Conforming user agents + may consider the value of the 'letter-spacing' property to + be 'normal'. +

    +
    + XSL modifications to the CSS definition: +

    + ... For "normal": .optimum = "the normal spacing for the + current font" / 2, .maximum = auto, .minimum = auto, + .precedence = force, and .conditionality = discard. A + value of auto for a component implies that the limits are + User Agent specific. +

    +

    + ... The CSS statement that "Conforming user + agents may consider the value of the + 'letter-spacing' property to be 'normal'." does not apply + in XSL, if the User Agent implements the + "Extended" property set. +

    +

    + ... The algorithm for resolving the adjusted values + between word spacing and letter spacing is User + Agent dependent. +

    +
    +
    +
    + 7.16.4 "text-decoration" +

    + ... If the element has no content or no text content (e.g., + the IMG element in HTML), user agents must + ignore this property. +

    +
    + blink +

    + ... Conforming user agents are not + required to support this value. +

    +
    +
    +
    + 7.16.6 "text-transform" +

    + ... Conforming user agents may consider the + value of "text-transform" to be "none" for characters that + are not from the ISO Latin-1 repertoire and for elements in + languages for which the transformation is different from + that specified by the case-conversion tables of Unicode or + ISO 10646. +

    +
    +
    + 7.16.8 "word-spacing" +

    + ... Word spacing algorithms are user + agent-dependent. +

    +
    + XSL modifications to the CSS definition: +

    + ... The algorithm for resolving the adjusted values + between word spacing and letter spacing is User + Agent dependent. +

    +
    +
    +
    + 7.17.1 "color" +

    Initial: depends on user agent

    +
    +
    + 7.17.3 "rendering-intent" +
    + auto +

    + This is the default behavior. The User + Agent determines the best intent based on the + content type. For image content containing an embedded + profile, it shall be assumed that the intent specified + within the profile is the desired intent. Otherwise, the + user agent shall use the current profile + and force the intent, overriding any intent that might be + stored in the profile itself. +

    +
    +
    +
    + 7.20.2 "overflow" +
    + scroll +

    + This value indicates that the content is clipped and that + if the user agent uses a scrolling + mechanism that is visible on the screen (such as a scroll + bar or a panner), that mechanism should be displayed for a + box whether or not any of its content is clipped. +

    +
    +
    + auto +

    + The behavior of the "auto" value is user + agent dependent, but should cause a scrolling + mechanism to be provided for overflowing boxes. +

    +
    +
    +
    + 7.21.2 "leader-pattern" +
    + dots +

    + ... The choice of dot character is dependent on the + user agent. +

    +
    +
    +
    + 7.21.4 "leader-length" +

    + ... User agents may choose to use the value + of "leader-length.optimum" to determine where to break the + line, then use the minimum and maximum values during line + justification. +

    +
    +
    + 7.25.11 "media-usage" +
    + auto +

    + The User Agent determines which value of + "media-usage" (other than the "auto" value) is used. The + User Agent may consider the type of media + on which the presentation is to be placed in making this + determination.
    NOTE:
    For example, the + User Agent could use the following + decision process. If the media is not continuous and is of + fixed bounded size, then the "paginate" (described below) + is used. Otherwise, the "bounded-in-one-dimension" is + used. +

    +
    +
    + bounded-in-one-dimension +

    + ... It is an error if more or less than one of + "page-height" or "page-width" is specified on the first + page master that is used. The User Agent + may recover as follows:... +

    +
    +
    + unbounded +

    + Only one page is generated per fo:page-sequence descendant + from the fo:root. Neither "page-height" nor "page-width" + may be specified on any page master that is used. If a + value is specified for either property, it is an error and + a User Agent may recover by ignoring the + specified value. ... +

    +
    +
    +
    + 7.25.13 "page-height" +
    + auto +

    + The "page-height" shall be determined, in the case of + continuous media, from the size of the User + Agent window... +

    +
    +
    + NOTE: +

    + A User Agent may provide a way to declare + the media for which formatting is to be done. This may be + different from the media on which the formatted result is + viewed. For example, a browser User Agent + may be used to preview pages that are formatted for sheet + media. In that case, the size calculation is based on the + media for which formatting is done rather than the media + being currently used. +

    +
    +
    +
    + 7.25.15 "page-width" +
    + auto +

    + The "page-width" shall be determined, in the case of + continuous media, from the size of the User + Agent window... +

    +
    +
    +
    + 7.26.5 "border-separation" + +

    + ... Rows, columns, row groups, and column groups cannot + have borders (i.e., user agents must + ignore the border properties for those elements). +

    +
    +
    +
    + 7.26.7 "caption-side" +

    + ... For a caption that is on the left or right side of a + table box, on the other hand, a value other than "auto" for + "width" sets the width explicitly, but "auto" tells the + user agent to chose a "reasonable width". +

    +
    +
    + 7.27.2 "glyph-orientation-horizontal" +
    + <angle> +

    + ... The User Agent shall round the value + of the angle to the closest of the permitted values. +

    +
    +
    +
    + 7.27.3 "glyph-orientation-vertical" +
    + auto +

    + ... The determination of which characters should be + auto-rotated may vary across User Agents. +

    +
    +
    + <angle> +

    + ... The User Agent shall round the value + of the angle to the closest of the permitted values. +

    +
    +
    +
    + 7.27.6 "unicode-bidi" +
    + XSL modifications to the CSS definition: +

    + ... Fallback:
    If it is not possible to present the + characters in the correct order, then the + UserAgent should display either a + 'missing character' glyph or display some indication that + the content cannot be correctly rendered. +

    +
    +
    +
    + 7.28.1 "content-type" +

    + ... This property specifies the content-type and may be used + by a User Agent to select a rendering + processor for the object. +

    +
    + auto +

    + No identification of the content-type. The User + Agent may determine it by "sniffing" or by other + means. +

    +
    +
    +
    + 7.29.5 "border-color" +

    + ... If an element's border color is not specified with a + "border" property, user agents must use the + value of the element's "color" property as the computed + value for the border color. +

    +
    +
    + 7.29.9 "border-spacing" +

    + ... Rows, columns, row groups, and column groups cannot have + borders (i.e., user agents must ignore the + border properties for those elements). +

    +
    +
    + 7.29.13 "font" +

    + ... If no font with the indicated characteristics exists on + a given platform, the user agent should + either intelligently substitute (e.g., a smaller version of + the "caption" font might be used for the "small-caption" + font), or substitute a user agent default + font. +

    +
    +
    + 7.29.19 "pause" +

    Initial: depends on user agent

    +
    +
    + 7.29.21 "size" +

    + ... Relative page boxes allow user agents + to scale a document and make optimal use of the target size. +

    +

    + ... User agents may allow users to control + the transfer of the page box to the sheet (e.g., rotating an + absolute page box that's being printed). +

    +
      +
    • + Rendering page boxes that do not fit a target sheet
      + If a page box does not fit the target sheet dimensions, + the user agent may choose to: +
        +
      • + Rotate the page box 90 degrees if this will make the + page box fit. +
      • +
      • Scale the page to fit the target.
      • +
      + The user agent should consult the user + before performing these operations. +
    • +
    • + Positioning the page box on the sheet
      When the page + box is smaller than the target size, the user + agent is free to place the page box anywhere on + the sheet. +
    • +
    +
    +
    + 7.29.23 "white-space" +
    + normal +

    + This value directs user agents to + collapse sequences of whitespace, and break lines as + necessary to fill line boxes. ... +

    +
    +
    + pre +

    + This value prevents user agents from + collapsing sequences of whitespace. ... +

    +
    +

    + ... Conforming user agents may ignore the + 'white-space' property in author and user style sheets but + must specify a value for it in the default style sheet. +

    +
    +
    + +
    + diff --git a/src/documentation/content/xdocs/design/alt.design/xml-parsing.xml b/src/documentation/content/xdocs/design/alt.design/xml-parsing.xml new file mode 100644 index 000000000..6151dae74 --- /dev/null +++ b/src/documentation/content/xdocs/design/alt.design/xml-parsing.xml @@ -0,0 +1,228 @@ + + + + +
    + Integrating XML Parsing + + + +
    + +
    + An alternative parser integration +

    + This note proposes an alternative method of integrating the + output of the SAX parsing of the Formatting Object (FO) tree into + FOP processing. The purpose of the proposed changes is to + provide for better decomposition of the process of analysing + and rendering an FO tree such as is represented in the output + from initial (XSLT) processing of an XML source document. +

    +
    + Structure of SAX parsing +

    + Figure 1 is a schematic representation of the process of SAX + parsing of an input source. SAX parsing involves the + registration, with an object implementing the + XMLReader interface, of a + ContentHandler which contains a callback + routine for each of the event types encountered by the + parser, e.g., startDocument(), + startElement(), characters(), + endElement() and endDocument(). + Parsing is initiated by a call to the parse() + method of the XMLReader. Note that the call to + parse() and the calls to individual callback + methods are synchronous: parse() will only + return when the last callback method returns, and each + callback must complete before the next is called.

    + Figure 1 +

    +
    +
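The registration and synchronous callback sequence described above can be sketched as follows. This is a minimal illustration, not FOP code: the class names `RecordingHandler` and `SaxCallbackDemo` are hypothetical, and the parser is obtained through the standard JAXP `SAXParserFactory` purely for convenience.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical handler that records each callback in arrival order. */
class RecordingHandler extends DefaultHandler {
    final List<String> log = new ArrayList<>();

    @Override public void startDocument() { log.add("startDocument"); }

    @Override public void startElement(String uri, String localName,
                                       String qName, Attributes atts) {
        log.add("start:" + localName);
    }

    @Override public void characters(char[] ch, int start, int length) {
        // The parser may deliver one run of text in several chunks.
        log.add("chars:" + new String(ch, start, length));
    }

    @Override public void endElement(String uri, String localName, String qName) {
        log.add("end:" + localName);
    }

    @Override public void endDocument() { log.add("endDocument"); }
}

/** Hypothetical driver: registers the handler and starts the parse. */
class SaxCallbackDemo {
    static List<String> parse(String xml) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true); // ensures localName is populated
            XMLReader reader = factory.newSAXParser().getXMLReader();
            RecordingHandler handler = new RecordingHandler();
            reader.setContentHandler(handler);
            // parse() returns only after endDocument() has returned.
            reader.parse(new InputSource(new StringReader(xml)));
            return handler.log;
        } catch (Exception e) {
            // Checked exceptions are wrapped only to keep the sketch short.
            throw new IllegalStateException("parse failed", e);
        }
    }
}
```

Because `parse()` is synchronous, the log is complete by the time the method returns; nothing outside the callbacks can observe the events while parsing is in progress.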

    + In the process of parsing, the hierarchical structure of the + original FO tree is flattened into a number of streams of + events of the same type which are reported in the sequence + in which they are encountered. Apart from that, the API + imposes no structure or constraint which expresses the + relationship between, e.g., a startElement event and the + endElement event for the same element. To the extent that + such relationship information is required, it must be + managed by the callback routines. +

    +

    + The most direct approach here is to build the tree + "invisibly"; to bury within the callback routines the + necessary code to construct the tree. In the simplest case, + the whole of the FO tree is built within the call to + parse(), and that in-memory tree is subsequently + processed to (a) validate the FO structure, and (b) + construct the Area tree. The problem with this approach is + the potential size of the FO tree in memory. FOP has + suffered from this problem in the past. +

    +
    +
    + Cluttered callbacks +

    + On the other hand, the callback code may become increasingly + complex as tree validation and the triggering of the Area + tree processing and subsequent rendering is moved into the + callbacks, typically the endElement() method. + In order to overcome acute memory problems, the FOP code was + recently modified in this way, to trigger Area tree building + and rendering in the endElement() method, when + the end of a page-sequence was detected. +

    +

    + The drawback with such a method is that it becomes difficult + to determine the order of events and the circumstances in + which any particular processing events are triggered. When + the processing events are inherently self-contained, this is + irrelevant. But the more complex and context-dependent the + relationships are among the processing elements, the more + obscurity is engendered in the code by such "side-effect" + processing. +

    +
    +
    + From passive to active parsing +

    + In order to solve the simultaneous problems of exposing the + structure of the processing and minimising in-memory + requirements, the experimental code separates the parsing of + the input source from the building of the FO tree and all + downstream processing. The callback routines become + minimal, consisting of the creation and buffering of + XMLEvent objects as a producer. All + of these objects are effectively merged into a single event + stream, in strict event order, for subsequent access by the + FO tree building process, acting as a + consumer. In itself, this does not reduce the + footprint. That reduction comes when the approach is generalised to + modularise FOP processing.

    Figure 2 +

    +
    +
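A minimal sketch of this producer/consumer split, under stated assumptions: `SketchEvent`, `QueueingHandler` and `ActiveParsingDemo` are hypothetical names, and `java.util.concurrent.ArrayBlockingQueue` stands in for the experimental code's own buffer class, whose API is not shown here. Character events are omitted for brevity.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical stand-in for an XMLEvent: an event type plus a name. */
final class SketchEvent {
    enum Type { STARTDOCUMENT, STARTELEMENT, ENDELEMENT, ENDDOCUMENT }
    final Type type;
    final String name;
    SketchEvent(Type type, String name) { this.type = type; this.name = name; }
}

/** Producer: each callback does nothing except buffer an event. */
class QueueingHandler extends DefaultHandler {
    private final BlockingQueue<SketchEvent> queue;
    QueueingHandler(BlockingQueue<SketchEvent> queue) { this.queue = queue; }

    private void emit(SketchEvent.Type type, String name) throws SAXException {
        try {
            queue.put(new SketchEvent(type, name)); // blocks while the buffer is full
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new SAXException(e);
        }
    }

    @Override public void startDocument() throws SAXException {
        emit(SketchEvent.Type.STARTDOCUMENT, null);
    }
    @Override public void startElement(String uri, String localName, String qName,
                                       Attributes atts) throws SAXException {
        emit(SketchEvent.Type.STARTELEMENT, localName);
    }
    @Override public void endElement(String uri, String localName, String qName)
            throws SAXException {
        emit(SketchEvent.Type.ENDELEMENT, localName);
    }
    @Override public void endDocument() throws SAXException {
        emit(SketchEvent.Type.ENDDOCUMENT, null);
    }
}

class ActiveParsingDemo {
    /** Parses on a producer thread; the caller consumes events in order. */
    static List<String> run(String xml) {
        BlockingQueue<SketchEvent> queue = new ArrayBlockingQueue<>(4);
        Thread producer = new Thread(() -> {
            try {
                SAXParserFactory factory = SAXParserFactory.newInstance();
                factory.setNamespaceAware(true);
                XMLReader reader = factory.newSAXParser().getXMLReader();
                reader.setContentHandler(new QueueingHandler(queue));
                reader.parse(new InputSource(new StringReader(xml)));
            } catch (Exception e) {
                throw new IllegalStateException("parse failed", e);
            }
        });
        producer.start();

        List<String> seen = new ArrayList<>();
        try {
            while (true) {
                // The timeout guards against a producer that died mid-parse.
                SketchEvent ev = queue.poll(5, TimeUnit.SECONDS);
                if (ev == null) throw new IllegalStateException("no event within 5 s");
                seen.add(ev.name == null ? ev.type.name() : ev.type + ":" + ev.name);
                if (ev.type == SketchEvent.Type.ENDDOCUMENT) break;
            }
            producer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen;
    }
}
```

With a buffer of four slots and a longer event stream, the parser thread stalls in `put()` until the consumer catches up, so the consumer, not the parser, sets the pace.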

    + The most useful change that this brings about is the switch + from passive to active XML element + processing. The process of parsing now becomes visible to + the controlling process. All local validation and + all object and data structure building are initiated by the + process(es) getting from the queue - in the case + above, the FO tree builder. +

    +
    +
    + XMLEvent methods + +

    + The experimental code uses a class XMLEvent + to provide the objects which are placed in the queue. + XMLEvent includes a variety of methods to access + elements in the queue. Namespace URIs encountered in + parsing are maintained in a static + HashMap where they are associated with a unique + integer index. This integer value is used in the signature + of some of the access methods. +

    +
    +
    XMLEvent getEvent(SyncedCircularBuffer events)
    +
    + This is the basis of all of the queue access methods. It + returns the next element from the queue, which may be a + pushback element. +
    +
    XMLEvent getEndDocument(events)
    +
    + Get and discard elements from the queue + until an ENDDOCUMENT element is found and returned. +
    +
    XMLEvent expectEndDocument(events)
    +
    + If the next element on the queue is an ENDDOCUMENT event, + return it. Otherwise, push the element back and throw an + exception. Each of the get methods (except + getEvent() itself) has a corresponding + expect method. +
    +
    XMLEvent get/expectStartElement(events)
    +
    Return the next STARTELEMENT event from the queue.
    +
    XMLEvent get/expectStartElement(events, String + qName)
    +
    + Return the next STARTELEMENT with a QName matching + qName. +
    +
    + XMLEvent get/expectStartElement(events, int uriIndex, + String localName) +
    +
    + Return the next STARTELEMENT with a URI indicated by the + uriIndex and a local name matching localName. +
    +
    + XMLEvent get/expectStartElement(events, LinkedList list) +
    +
    + list contains instances of the nested class + UriLocalName, which hold a + uriIndex and a localName. Return + the next STARTELEMENT with a URI indicated by the + uriIndex and a local name matching + localName from any element of + list. +
    +
    XMLEvent get/expectEndElement(events)
    +
    Return the next ENDELEMENT.
    +
    XMLEvent get/expectEndElement(events, qName)
    +
    Return the next ENDELEMENT with QName + qName.
    +
    XMLEvent get/expectEndElement(events, uriIndex, localName)
    +
    + Return the next ENDELEMENT with a URI indicated by the + uriIndex and a local name matching + localName. +
    +
    + XMLEvent get/expectEndElement(events, XMLEvent event) +
    +
    + Return the next ENDELEMENT whose + uriIndex and localName + match those in the event argument. This + is intended as a quick way to find the ENDELEMENT matching + a previously returned STARTELEMENT. +
    +
    XMLEvent get/expectCharacters(events)
    +
    Return the next CHARACTERS event.
    +
    +
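The difference between the get and expect families can be sketched as below. This is an illustration under assumptions, not the experimental code: `SketchXmlEvent` and `SketchEventQueue` are hypothetical, events are matched on a plain QName string rather than a URI index, and an in-memory `ArrayDeque` replaces the synchronized buffer.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;
import java.util.NoSuchElementException;

/** Hypothetical event: a type and an optional qualified name. */
final class SketchXmlEvent {
    enum Type { STARTDOCUMENT, STARTELEMENT, CHARACTERS, ENDELEMENT, ENDDOCUMENT }
    final Type type;
    final String qName;
    SketchXmlEvent(Type type, String qName) { this.type = type; this.qName = qName; }
}

/** Hypothetical queue wrapper offering get/expect access with pushback. */
final class SketchEventQueue {
    private final Deque<SketchXmlEvent> events = new ArrayDeque<>();

    SketchEventQueue(List<SketchXmlEvent> source) { events.addAll(source); }

    /** Basis of all access: the next event, which may be a pushed-back one. */
    SketchXmlEvent getEvent() {
        SketchXmlEvent ev = events.pollFirst();
        if (ev == null) throw new NoSuchElementException("event queue exhausted");
        return ev;
    }

    void pushBack(SketchXmlEvent ev) { events.addFirst(ev); }

    private static boolean matches(SketchXmlEvent ev, SketchXmlEvent.Type type,
                                   String qName) {
        // A null qName means "any event of this type".
        return ev.type == type && (qName == null || qName.equals(ev.qName));
    }

    /** get: discard events until one matches, then return it. */
    SketchXmlEvent get(SketchXmlEvent.Type type, String qName) {
        SketchXmlEvent ev;
        do {
            ev = getEvent();
        } while (!matches(ev, type, qName));
        return ev;
    }

    /** expect: the very next event must match; otherwise push it back and fail. */
    SketchXmlEvent expect(SketchXmlEvent.Type type, String qName) {
        SketchXmlEvent ev = getEvent();
        if (!matches(ev, type, qName)) {
            pushBack(ev);
            throw new IllegalStateException("expected " + type + " " + qName
                    + " but found " + ev.type + " " + ev.qName);
        }
        return ev;
    }

    int remaining() { return events.size(); }
}
```

A failed expect leaves the queue exactly as it was, so a caller can try an alternative element, whereas a get consumes everything up to and including its match.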
    +
    + FOP modularisation +

    + This same principle can be extended to the other major + sub-systems of FOP processing. In each case, while it is + possible to hold a complete intermediate result in memory, + the memory costs of that approach are too high. The + sub-systems - XML parsing, FO tree construction, Area tree + construction and rendering - must run in parallel if the + footprint is to be kept manageable. By creating a series of + producer-consumer pairs linked by synchronized buffers, + logical isolation can be achieved while rates of processing + remain coupled. By introducing feedback loops conveying + information about the completion of processing of the + elements, sub-systems can dispose of or precis those + elements without having to be tightly coupled to downstream + processes.

    + Figure 3 +

    +
    +
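A chain of producer-consumer pairs linked by bounded buffers might look like the following sketch. Everything here is an assumption for illustration: `StagePipeline` is a hypothetical class, strings stand in for parsed elements, FO nodes and areas, and the feedback loops mentioned above are not modelled.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.function.Function;

/** Hypothetical three-stage pipeline: source -> FO tree -> Area tree -> caller. */
class StagePipeline {
    /** Sentinel marking the end of the stream. */
    static final String END = "\u0000END";

    /** Starts a thread that moves items from in to out, applying work to each. */
    static Thread stage(BlockingQueue<String> in, BlockingQueue<String> out,
                        Function<String, String> work) {
        Thread t = new Thread(() -> {
            try {
                String item;
                while (!(item = in.take()).equals(END)) {
                    out.put(work.apply(item)); // blocks if downstream is slow
                }
                out.put(END); // pass the end marker along
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        t.start();
        return t;
    }

    static List<String> run(List<String> elements) {
        // Small capacities keep only a few items in flight per stage.
        BlockingQueue<String> parsed = new ArrayBlockingQueue<>(2);
        BlockingQueue<String> foNodes = new ArrayBlockingQueue<>(2);
        BlockingQueue<String> areas = new ArrayBlockingQueue<>(2);

        // The source runs on its own thread so that it can block safely.
        Thread source = new Thread(() -> {
            try {
                for (String e : elements) parsed.put(e);
                parsed.put(END);
            } catch (InterruptedException ex) {
                Thread.currentThread().interrupt();
            }
        });
        source.start();
        stage(parsed, foNodes, s -> "fo(" + s + ")");
        stage(foNodes, areas, s -> "area(" + s + ")");

        List<String> rendered = new ArrayList<>();
        try {
            String item;
            while (!(item = areas.take()).equals(END)) {
                rendered.add(item);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return rendered;
    }
}
```

Each stage knows only its input and output buffers, which gives the logical isolation described above, while the bounded capacities keep the rates of the stages coupled and cap the amount of intermediate state held at any moment.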
    +
    + +
    + diff --git a/src/documentation/content/xdocs/design/book.xml b/src/documentation/content/xdocs/design/book.xml index 43e5e4834..4e38265b4 100644 --- a/src/documentation/content/xdocs/design/book.xml +++ b/src/documentation/content/xdocs/design/book.xml @@ -12,6 +12,7 @@ + @@ -33,5 +34,8 @@ + + +