From: William Victor Mote
Date: Mon, 21 Apr 2003 08:01:01 +0000 (+0000)
Subject: Add missing ids.
X-Git-Tag: Root_Temp_KnuthStylePageBreaking~1599
X-Git-Url: https://source.dussan.org/?a=commitdiff_plain;h=c2eaa58ff0e9e1f63c846737ab95c962e238f110;p=xmlgraphics-fop.git

Add missing ids. Move primary design goals up to tags. Add some content on area tree recycling.

git-svn-id: https://svn.apache.org/repos/asf/xmlgraphics/fop/trunk@196283 13f79535-47bb-0310-9956-ffa450edef68
---
diff --git a/src/documentation/content/xdocs/design/index.xml b/src/documentation/content/xdocs/design/index.xml
index c202e9e86..4c0cf687c 100644
--- a/src/documentation/content/xdocs/design/index.xml
+++ b/src/documentation/content/xdocs/design/index.xml
@@ -9,23 +9,24 @@
-
- Introduction -

The articles in this section describe the design and architecture details for FOP.

- The articles in this section pertain to the redesign or trunk line of development.
+ The articles in this section pertain to the redesign or trunk line of development. The redesign is mainly focusing on parts of the layout process (converting the FO tree into the Area Tree).
-
-
+
Primary Design Goals -

The primary design goals for FOP are:

-
    -
  • Comply with the spec.
  • -
  • Process files of arbitrary size (limited only by storage).
  • -
+

A discussion of project design properly begins with a list of the goals of the project. Out of these goals will flow the design issues and details, and eventually, the implementation.

+
+ Conformance to the XSL-FO Specification +

The current design goal is to reach the "basic" level of conformance, and to have enough flexibility in the design to reach "complete" conformance without major rewriting.
+After "basic" conformance is achieved, it is probable that higher levels of conformance will be sought.

+
+
+ Process Files of Arbitrary Size +

Except for user storage limitations, the design goal is to be able to process files of any size.

+
-
+
Secondary Design Goals -
+
Keep Memory Minimal

Many FOP design decisions revolve around trying to minimize the use of memory. The primary purpose here is to reduce the amount of data that must be serialized to storage during processing.
@@ -41,7 +42,7 @@ To the extent that it can be done so without jeopardizing the primary design goa
To achieve our design goals, we have identified and attempted to resolve some design issues. Since they are in support of the primary and secondary goals, they are not necessarily written in stone. However, most of them have been discussed at length among the developers, and are reasonably well settled.

-
+
Use SAX as Input

The two standard ways of dealing with XML input are SAX and DOM. SAX basically creates events as it parses an XML document in a serial fashion; a program using SAX (and not storing anything internally) will only see a small window of the document at any point in time, and can never look forward in the document. DOM creates and stores a tree representation of the document, allowing a view of the entire document as an integrated whole. One issue that may seem counter-intuitive to some new FOP developers, and which has from time to time been contentious, is that FOP uses SAX for input. (DOM can be used as input as well, but it is converted into SAX events before entering FOP, effectively negating its advantages).

Since FOP essentially needs a tree representation of the FO input, at first glance it seems to make sense to use DOM. Instead, FOP takes SAX events and builds its own tree-like structure. Why?

@@ -50,35 +51,39 @@ However, most of them have been discussed at length among the developers, and ar
  • DOM contains an entire document. FOP is able to process individual fo:page-sequence objects discretely, without the need to have the entire document in memory. For documents that have only one fo:page-sequence object, FOP's approach is no advantage, but in other cases it is a huge advantage. A 500-page book that is broken into 100 5-page chapters, each in its own fo:page-sequence, essentially needs only 1% of the document memory that would be required if using DOM as input.
  • -
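The SAX argument above can be illustrated with a minimal sketch. It is not FOP's actual tree builder: the class `PageSequenceSketch`, its fields, and the element-counting stand-in for "building a tree fragment" are all hypothetical. It only shows that a SAX handler can treat each fo:page-sequence as a unit and discard its state when the element closes, so the peak amount held tracks the largest page-sequence rather than the whole document.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical illustration only; names here are not FOP classes.
class PageSequenceSketch extends DefaultHandler {
    static final String FO_NS = "http://www.w3.org/1999/XSL/Format";

    int sequencesProcessed = 0;     // page-sequences handed off so far
    int peakBufferedElements = 0;   // largest fragment held at any one time
    private int buffered = 0;       // stand-in for the current tree fragment
    private boolean inSequence = false;

    @Override
    public void startElement(String uri, String local, String qName, Attributes atts) {
        if (FO_NS.equals(uri) && "page-sequence".equals(local)) {
            inSequence = true;
            buffered = 0;           // start a fresh fragment
        }
        if (inSequence) {
            buffered++;
            peakBufferedElements = Math.max(peakBufferedElements, buffered);
        }
    }

    @Override
    public void endElement(String uri, String local, String qName) {
        if (FO_NS.equals(uri) && "page-sequence".equals(local)) {
            sequencesProcessed++;   // a real processor would lay out the fragment here
            inSequence = false;
            buffered = 0;           // the fragment can now be recycled
        }
    }

    static PageSequenceSketch run(String xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true); // needed so uri/local are populated
        PageSequenceSketch handler = new PageSequenceSketch();
        factory.newSAXParser().parse(new InputSource(new StringReader(xml)), handler);
        return handler;
    }

    public static void main(String[] args) throws Exception {
        // Simplified input: not a valid FO document, just enough structure to parse.
        String seq = "<fo:page-sequence><fo:block/><fo:block/></fo:page-sequence>";
        String xml = "<fo:root xmlns:fo=\"" + FO_NS + "\">" + seq + seq + seq + "</fo:root>";
        PageSequenceSketch h = run(xml);
        System.out.println(h.sequencesProcessed + " sequences, peak " + h.peakBufferedElements);
    }
}
```

With a DOM, all elements of every page-sequence would be resident at once; in this sketch the counter never exceeds the size of a single page-sequence.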
    +
    Process FO Elements ASAP

    The issue here is that we wish to recycle FO Tree memory as much as possible. There are at least three possible places that FO Tree fragments can be passed to the Layout process, and their memory recycled:

      -
    • fo:block It might be tempting to start laying out pages as soon as the first fo:block object is finished. However, there are many downstream things that can affect the placement of that block on a page, such as graphics and footnotes. So, in order to maintain conformance to the XSL-FO specification, and create high-quality output, we must see more of the document.
    • -
    • fo:root The other extreme is to wait until the entire document is read in before processing any of it. This essentially means that there is no memory recycling. Processing the document correctly is more important than saving memory, so this option would be used if there were no better alternative.
    • -
    • fo:page-sequence The page-sequence object provides a nice clean break in the document. Content from one page-sequence will never interfere with nor affect the placement of the content of another. FOP uses this option as the optimum way to maintain compliance with the standard and to minimize memory consumption.
    • +
    • + fo:block It might be tempting to start laying out pages as soon as the first fo:block object is finished. However, there are many downstream things that can affect the placement of that block on a page, such as graphics and footnotes. So, in order to maintain conformance to the XSL-FO specification, and create high-quality output, we must see more of the document.
    • +
    • + fo:root The other extreme is to wait until the entire document is read in before processing any of it. This essentially means that there is no memory recycling. Processing the document correctly is more important than saving memory, so this option would be used if there were no better alternative.
    • +
    • + fo:page-sequence The page-sequence object provides a nice clean break in the document. Content from one page-sequence will never interfere with nor affect the placement of the content of another. FOP uses this option as the optimum way to maintain compliance with the standard and to minimize memory consumption.
    -
    +
    Serialize FO Tree as Necessary

    This issue is implied by the requirement to process documents of arbitrary size. Unless some arbitrary limit is placed on the size of page-sequence objects, FOP must be able to serialize FO tree fragments as necessary.

    -
    +
    Keep Layouts Simple

    Layout should handle floats, footnotes and keeps in a simple, straightforward way.

    -
    +
    Keep ID References Simple
    -
    +
    Render Pages ASAP

    The issue here is that we wish to recycle the Area Tree memory as much as possible. The problem is that forward references prevent pages from being resolved until the forward references are resolved. If memory is insufficient to store unresolved pages, Area Tree fragments must be serialized until resolved.

    +

    FOP developers have discussed adding the capability of using an Area Tree to render to more than one output target in the same run, which would be a complicating factor in disposal of pages as they are rendered.
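The forward-reference problem described above can be sketched as a queue of pages waiting on unresolved ids. This is a hypothetical illustration, not FOP's Area Tree code: the class `PageQueueSketch`, its methods, and the assumption that pages are emitted strictly in document order are all introduced here for the example. Serializing held pages to storage when memory runs short is not modeled.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; names here are not FOP classes.
class PageQueueSketch {
    private static final class Page {
        final int number;
        final Set<String> unresolvedIds;
        Page(int number, Set<String> forwardRefs) {
            this.number = number;
            this.unresolvedIds = new HashSet<>(forwardRefs); // private mutable copy
        }
    }

    private final Deque<Page> pending = new ArrayDeque<>();
    final List<Integer> rendered = new ArrayList<>(); // page numbers in output order

    // Queue a finished page along with the ids it refers to but has not yet seen.
    void addPage(int number, Set<String> forwardRefs) {
        pending.addLast(new Page(number, forwardRefs));
        flush();
    }

    // Called when the target of a reference has been laid out.
    void idResolved(String id) {
        for (Page p : pending) {
            p.unresolvedIds.remove(id);
        }
        flush();
    }

    // Render (and release) pages from the front while they are fully resolved.
    // Assumption: output order must match page order, so one unresolved page
    // holds back every page queued behind it.
    private void flush() {
        while (!pending.isEmpty() && pending.peekFirst().unresolvedIds.isEmpty()) {
            rendered.add(pending.removeFirst().number);
        }
    }

    int pendingCount() {
        return pending.size();
    }

    public static void main(String[] args) {
        PageQueueSketch q = new PageQueueSketch();
        q.addPage(1, Collections.<String>emptySet());
        q.addPage(2, Collections.singleton("ch3")); // e.g. a page-number citation
        q.addPage(3, Collections.<String>emptySet());
        System.out.println("held before resolution: " + q.pendingCount());
        q.idResolved("ch3");
        System.out.println("rendered: " + q.rendered);
    }
}
```

Every page that sits in `pending` is Area Tree memory that cannot yet be recycled, which is why unresolved forward references put pressure on memory.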

    -
    +
    Renderers are Responsible

    Each renderer is totally responsible for its output format.

    -
    +
    Send Output to a Stream