Area Tree

From: William Victor Mote
Date: Sat, 30 Nov 2002 06:52:16 +0000 (+0000)
Subject: white-space and line-ending fixes
X-Git-Tag: Alt-Design-integration-base~288
X-Git-Url: https://source.dussan.org/?a=commitdiff_plain;h=8d3f23d1924ceb5822daa3c280864fd58815f8c5;p=xmlgraphics-fop.git

white-space and line-ending fixes

git-svn-id: https://svn.apache.org/repos/asf/xmlgraphics/fop/trunk@195683 13f79535-47bb-0310-9956-ffa450edef68
---

diff --git a/docs/design/understanding/area_tree.xml b/docs/design/understanding/area_tree.xml
index 9013f647a..4dbe9e6e4 100644
--- a/docs/design/understanding/area_tree.xml
+++ b/docs/design/understanding/area_tree.xml
@@ -1,13 +1,13 @@
- -

- Area Tree - All you wanted to know about the Area Tree ! - - -

- + +

+ Area Tree + All you wanted to know about the Area Tree ! + + +

The Area Tree is an internal representation of the result document. This is a set of java classes that can put together a set of objects that @@ -16,7 +16,7 @@ represent the pages and their contents.

output using a renderer.

The Area Tree follows the description of the area tree in the XSL:FO specification.

The Area Tree consists of a set of pages, the actual implemenation places +

The Area Tree consists of a set of pages, the actual implemenation places these in a set of page sequences.

@@ -25,7 +25,7 @@ a set of page sequences.

The PageViewPort and Page with the regions is created by the LayoutMasterSet. The contents are then placed by the layout managers. Once the layout of a page is complete then it is added to the Area Tree.

Inside the page is a set of RegionViewport+Region pairs for each region on +

Inside the page is a set of RegionViewport+Region pairs for each region on the page.

@@ -42,7 +42,7 @@ area Word is also used for a group of consecutive characters.

The image and instream foreign object areas are placed inside a viewport. The leader (with use content) and unresolved page number areas are resolved to other inline areas.

Once a LineArea is filled with inline areas then the inline areas need to +

Once a LineArea is filled with inline areas then the inline areas need to be aligned and adjusted to fill the line properly.

@@ -69,8 +69,8 @@ line area or from the block area.

so that if a reference is resolved during layout the page can be easily found and then fixed. Once all the forward references are resolved then the page is ready to be rendered.

To layout a page any areas that cannot be resolved need to reserve space. -Once the inline area is resolved then the complete line should be adjusted +

To layout a page any areas that cannot be resolved need to reserve space. +Once the inline area is resolved then the complete line should be adjusted to accommodate any change in space used by the area.

@@ -103,7 +103,7 @@ The StorePagesModel stores all the pages so that any page can be later accessed.

The Area Tree retains the concept of page sequences (this is not in the area tree in the spec) so that this information can be passed to the -renderer. This is useful for setting the title and organising the groups +renderer. This is useful for setting the title and organising the groups of page sequences.

@@ -113,5 +113,5 @@ of page sequences.

Caching implementation.

- + \ No newline at end of file diff --git a/docs/design/understanding/fo_tree.xml b/docs/design/understanding/fo_tree.xml index 83e58ef8c..cbe5e5f35 100644 --- a/docs/design/understanding/fo_tree.xml +++ b/docs/design/understanding/fo_tree.xml @@ -1,11 +1,11 @@ -

- FO Tree - All you wanted to know about FO Tree ! - - -

+ FO Tree + All you wanted to know about FO Tree ! + + +

The FO Tree is a representation of the XSL:FO document. This diff --git a/docs/design/understanding/handling_attributes.xml b/docs/design/understanding/handling_attributes.xml index 1ae043059..81af0b40f 100644 --- a/docs/design/understanding/handling_attributes.xml +++ b/docs/design/understanding/handling_attributes.xml @@ -1,13 +1,13 @@ - -

- Handling Attributes - All you wanted to know about FOP Handling Attributes ! - - -

- -

Yet to come :))

- The series of notes for developers has started but it has not yet gone so far ! Keep watching + +

+ Handling Attributes + All you wanted to know about FOP Handling Attributes ! + + +

+ +

Yet to come :))

+ The series of notes for developers has started but it has not yet gone so far ! Keep watching \ No newline at end of file diff --git a/docs/design/understanding/images.xml b/docs/design/understanding/images.xml index 6aaa82bc8..68d71bcf2 100644 --- a/docs/design/understanding/images.xml +++ b/docs/design/understanding/images.xml @@ -1,21 +1,21 @@ - -

- Images - All you wanted to know about Images in FOP ! - - -

+ +

+ Images + All you wanted to know about Images in FOP ! + + +

- - this is still in progress, input in the code is welcome. Needs documenting formats, testing. + + this is still in progress, input in the code is welcome. Needs documenting formats, testing. So all those people interested in images should get involved. -

Images may only be needed to be loaded when the image is rendered to the +

Images may only be needed to be loaded when the image is rendered to the output or to find the dimensions.
An image url may be invalid, this can be costly to find out so we need to -keep a list of invalid image urls.

+keep a list of invalid image urls.

We have a number of different caching schemes that are possible.

All images are referred to using the url given in the XSL:FO after removing "url('')" wrapping. This does @@ -25,7 +25,7 @@ have the url as a reference. The images are handled through a static interface in ImageFactory.
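The url handling described above can be sketched as follows. This is only an illustration: the class and method names (ImageUrlSketch, unwrap, markInvalid, isKnownInvalid) are hypothetical and are not the ImageFactory interface.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper, not part of FOP: shows url('') unwrapping and a
// list of urls that are already known to be invalid.
class ImageUrlSketch {
    // Urls that failed to load once; checked before any costly retry.
    private static final Set<String> INVALID_URLS =
            Collections.synchronizedSet(new HashSet<String>());

    // Removes an optional url(...) wrapper and optional surrounding quotes.
    static String unwrap(String href) {
        String s = href.trim();
        if (s.startsWith("url(") && s.endsWith(")")) {
            s = s.substring(4, s.length() - 1).trim();
        }
        if (s.length() >= 2) {
            char first = s.charAt(0);
            char last = s.charAt(s.length() - 1);
            if ((first == '\'' || first == '"') && first == last) {
                s = s.substring(1, s.length() - 1);
            }
        }
        return s;
    }

    static void markInvalid(String url) {
        INVALID_URLS.add(url);
    }

    static boolean isKnownInvalid(String url) {
        return INVALID_URLS.contains(url);
    }
}
```

The unwrapped url then serves as the key for both the invalid list and any image cache, so the same image is always referred to in the same way.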

(insert image)

@@ -129,9 +129,8 @@ then load the required data depending on the image mime type. If the renderer can insert the image into the document and use that data for all future references of the same image then it can cache the reference in the renderer and the image can be released from the image cache.

- + - @@ -143,4 +142,5 @@ renderer and the image can be released from the image cache.

- + + diff --git a/docs/design/understanding/layout_managers.xml b/docs/design/understanding/layout_managers.xml index 48e85c94b..297531b2d 100644 --- a/docs/design/understanding/layout_managers.xml +++ b/docs/design/understanding/layout_managers.xml @@ -1,13 +1,13 @@ - -

- Layout Managers - All you wanted to know about Layout Managers ! - - -

- + +

+ Layout Managers + All you wanted to know about Layout Managers ! + + +

+ @@ -63,5 +63,5 @@ the flow.

(note: more info to follow) - + \ No newline at end of file diff --git a/docs/design/understanding/layout_process.xml b/docs/design/understanding/layout_process.xml index 4c426d8eb..cc9c7fc18 100644 --- a/docs/design/understanding/layout_process.xml +++ b/docs/design/understanding/layout_process.xml @@ -1,13 +1,13 @@ - -

- Layout Process - All you wanted to know about the Layout Process ! - - -

- -

Yet to come :))

- The series of notes for developers has started but it has not yet gone so far ! Keep watching + +

+ Layout Process + All you wanted to know about the Layout Process ! + + +

+ +

Yet to come :))

+ The series of notes for developers has started but it has not yet gone so far ! Keep watching \ No newline at end of file diff --git a/docs/design/understanding/pdf_library.xml b/docs/design/understanding/pdf_library.xml index 434cc03f8..642ca9ec3 100644 --- a/docs/design/understanding/pdf_library.xml +++ b/docs/design/understanding/pdf_library.xml @@ -1,13 +1,13 @@ - -

- PDF Library - All you wanted to know about the PDF Library ! - - -

- + +

+ PDF Library + All you wanted to know about the PDF Library ! + + +

The PDF Library is an independent package of classes in FOP. These classes provide a simple way to construct documents and add the contents. The @@ -26,12 +26,12 @@ There are a number of methods that can be used to create/add certain PDF objects

The PDF Document is built by creating a page for each page in the Area Tree.

This page then has all the contents added. The page is then added to the document and available objects can be written to the output stream.

The contents of the page are things such as text, lines, images etc. -The PDFRenderer inserts the text directly into a pdf stream. +

The contents of the page are things such as text, lines, images etc. +The PDFRenderer inserts the text directly into a pdf stream. The text consists of markup to set fonts, set text position and add text.

Most of the simple pdf markup is inserted directly into a pdf stream. +

Most of the simple pdf markup is inserted directly into a pdf stream. Other more complex objects or commonly used objects are added through java classes. -Some pdf objects such as an image consists of two parts.

+Some pdf objects such as an image consists of two parts.

It has a separate object for the image data and another bit of markup to display the image in a certain position on the page.

The java objects that represent a pdf object implement a method that returns the markup for inserting into a stream. The method is: byte[] toPDF().
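A minimal sketch of an object that returns its own markup through toPDF(). Only the byte[] toPDF() signature comes from the text; the interface and class names (PdfObjectSketch, PdfTextSketch) and the constructor parameters are hypothetical. The operators used (BT, Tf, Td, Tj, ET) are the standard PDF text operators for setting the font, setting the position and showing text.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical interface: anything that can emit its own PDF markup.
interface PdfObjectSketch {
    byte[] toPDF();
}

// Hypothetical text object: set font, set position, add text.
class PdfTextSketch implements PdfObjectSketch {
    private final String fontName;
    private final int fontSize;
    private final int x;
    private final int y;
    private final String text;

    PdfTextSketch(String fontName, int fontSize, int x, int y, String text) {
        this.fontName = fontName;
        this.fontSize = fontSize;
        this.x = x;
        this.y = y;
        this.text = text;
    }

    // Backslashes and parentheses must be escaped inside a PDF literal string.
    private static String escape(String s) {
        return s.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)");
    }

    @Override
    public byte[] toPDF() {
        String markup = "BT\n"
                + "/" + fontName + " " + fontSize + " Tf\n"
                + x + " " + y + " Td\n"
                + "(" + escape(text) + ") Tj\n"
                + "ET\n";
        return markup.getBytes(StandardCharsets.US_ASCII);
    }
}
```

A caller would write the returned bytes into the page's content stream; this sketch only handles ASCII text.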

@@ -74,5 +74,5 @@ The method is: byte[] toPDF().

- + \ No newline at end of file diff --git a/docs/design/understanding/properties.xml b/docs/design/understanding/properties.xml index 529ec8673..442e3ed64 100644 --- a/docs/design/understanding/properties.xml +++ b/docs/design/understanding/properties.xml @@ -1,13 +1,13 @@ - -

- Properties - All you wanted to know about the Properties ! - - -

- + +

+ Properties + All you wanted to know about the Properties ! + + +

During XML Parsing, the FO tree is constructed. For each FO object (some subclass of FObj), the tree builder then passes the list of all attributes specified on the FO element to the handleAttrs method. This @@ -115,7 +115,7 @@ but may not be completely up-to-date - +

explain PropertyManager vs. direct access
Explain corresponding properties

@@ -126,5 +126,5 @@ but may not be completely up-to-date keyword values and shorthand values (one attribute which sets several properties)

- + \ No newline at end of file diff --git a/docs/design/understanding/renderers.xml b/docs/design/understanding/renderers.xml index e597d118b..d3db3647c 100644 --- a/docs/design/understanding/renderers.xml +++ b/docs/design/understanding/renderers.xml @@ -1,13 +1,13 @@ - -

- Renderers - All you wanted to know about the Renderers ! - - -

- + +

+ Renderers + All you wanted to know about the Renderers ! + + +

+ @@ -18,21 +18,21 @@ in the order they appear in the document. In order to save memory it is possible to render the pages out of order. Any page that is not ready to be rendered is set up by the renderer first so that it can reserve a space or reference for when the page is ready to be rendered.

The AbstractRenderer does most of the work to iterate through the area -tree parts. This means that the most renderers simply need to implement -the specific parts with inserting text, images and lines. The methods can -easily be overridden to handle things in a different way or do some extra +

The AbstractRenderer does most of the work to iterate through the area +tree parts. This means that the most renderers simply need to implement +the specific parts with inserting text, images and lines. The methods can +easily be overridden to handle things in a different way or do some extra processing.
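The division of work described above, where a base class iterates and subclasses supply only the output-specific parts, can be sketched like this. All names here (RendererSketch, LoggingRendererSketch, renderPages) are hypothetical, and pages are reduced to lists of strings; this is not the AbstractRenderer API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical base class: owns the iteration over pages and their contents.
abstract class RendererSketch {
    // Walks every page in order; subclasses never repeat this loop.
    final void renderPages(List<List<String>> pages) {
        for (List<String> page : pages) {
            startPage();
            for (String text : page) {
                renderText(text);
            }
            endPage();
        }
    }

    // Default hooks do nothing; override to do extra processing.
    protected void startPage() { }

    protected void endPage() { }

    // The output-specific part every renderer must provide.
    protected abstract void renderText(String text);
}

// Hypothetical renderer that records what it was asked to draw.
class LoggingRendererSketch extends RendererSketch {
    final List<String> out = new ArrayList<String>();

    @Override
    protected void startPage() {
        out.add("<page>");
    }

    @Override
    protected void endPage() {
        out.add("</page>");
    }

    @Override
    protected void renderText(String text) {
        out.add("text:" + text);
    }
}
```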

The fonts are setup by the renderer being used. The font metrics are used +

The fonts are setup by the renderer being used. The font metrics are used during the layout process to determine the size of characters.

The render context is used by handlers. It contains information about the -current state of the renderer. Such as the page, the position and any +

The render context is used by handlers. It contains information about the +current state of the renderer. Such as the page, the position and any other miscellanous objects that are required to draw into the page.

diff --git a/docs/design/understanding/status.xml b/docs/design/understanding/status.xml index c1e7fb1ea..f0bbdce30 100644 --- a/docs/design/understanding/status.xml +++ b/docs/design/understanding/status.xml @@ -1,17 +1,17 @@ - -

- Tutorial series Status - Current Status of tutorial about FOP and Design - - -

+ +

+ Tutorial series Status + Current Status of tutorial about FOP and Design + + +

Peter said : Do we have a volunteer to track - Keiron's tutorials and turn them into web page documentation?

The answer is yes - we have, but the work is on progress !

Keiron has recently extended - the documentation generation on the CVS trunk to make this process a bit - easier. Keiron tells Peter that Apache is readying a major overhaul of its web - site and xml->html generation, but that should not deter us from proceeding - with documentation. + Keiron's tutorials and turn them into web page documentation?

The answer is yes + we have, but the work is on progress !

Keiron has recently extended + the documentation generation on the CVS trunk to make this process a bit + easier. Keiron tells Peter that Apache is readying a major overhaul of its web + site and xml->html generation, but that should not deter us from proceeding + with documentation. \ No newline at end of file diff --git a/docs/design/understanding/svg.xml b/docs/design/understanding/svg.xml index 7fd19f369..942a91358 100644 --- a/docs/design/understanding/svg.xml +++ b/docs/design/understanding/svg.xml @@ -1,57 +1,57 @@ - -

- SVG - All you wanted to know about SVG and FOP ! - - -

- -

SVG is rendered through Batik.

The XML from the XSL:FO document - is converted into an SVG DOM with batik. This DOM is then set as the Document - on the Foreign Object area in the Area Tree.

This DOM is then available to - be rendered by the renderer.

SVG is rendered in the renderers via an - XMLHandler in the FOUserAgent. This XML handler is used to render the SVG. The - SVG is rendered by using batik. Batik converts the SVG DOM into an internal - structure that can be drawn into a Graphics2D. So for PDF we use a - PDFGraphics2D to draw into.

This creates the necessary PDF information to - create the SVG image in the PDF document.

Most of the work is done in the - PDFGraphics2D class. There are also a few bridges that are plugged into batik - to provide different behaviour for some SVG elements.

Normally batik converts text into a set of curved - shapes.

This is handled as any other shapes when rendering to the output. This - is not always desirable as the shapes have very fine curves. This can cause the - output to look a bit bad in PDF and PS (it can be drawn properly but is not by - default). These curves also require much more data than the original - text.

To handle this there is a PDFTextElementBridge that is set when - using the bridge in batik. If the text is simple enough for the text to be - drawn in the PDF as with all other text then this sets the TextPainter to use - the PDFTextPainter. This inserts the text directly into the PDF using the - drawString method on the PDFGraphics2D.

Text is considered simple if the - font is available, the font size is useable and there are no tspans or other - complications. This can make the resulting PDF significantly - smaller.

To support links in PDF another batik - element bridge is used. The PDFAElementBridge creates a PDFANode which inserts - a link into the PDF document via the PDFGraphics2D.

Since links are - positioned on the page without any transforms then we need to transform the - coordinates of the link area so that they match the current position of the a - element area. This transform may also need to account for the svg being - positioned on the page.

Images are normally drawn - into the PDFGraphics2D. This then creates a bitmap of the image data that can - be inserted into the PDF document.

As PDF can support jpeg images then another - element bridge is used so that the jpeg can be directly inserted into the - PDF.

Batik provides a mechanism to - convert SVG into various formats. Through FOP we can convert an SVG document - into a single paged PDF document. The page contains the SVG drawn as best as - possible on the page. There is a PDFDocumentGraphics2D that creates a - standalone PDF document with a single page. This is then drawn into by batik in - the same way as with the PDFGraphics2D.

When rendering to AWT the SVG is simply drawn onto the - awt canvas using batik.

The PS Renderer uses a similar technique as the - PDF Renderer.

The SVG Renderer simply embeds the SVG inside an svg - element.

To get accurate drawing - pdf transparency is needed.
The drawRenderedImage methods need - implementing.
Handle colour space better.
Improve link handling - with pdf.
Improve image handling.

+ +

+ SVG + All you wanted to know about SVG and FOP ! + + +

+ +

SVG is rendered through Batik.

The XML from the XSL:FO document + is converted into an SVG DOM with batik. This DOM is then set as the Document + on the Foreign Object area in the Area Tree.

This DOM is then available to + be rendered by the renderer.

SVG is rendered in the renderers via an + XMLHandler in the FOUserAgent. This XML handler is used to render the SVG. The + SVG is rendered by using batik. Batik converts the SVG DOM into an internal + structure that can be drawn into a Graphics2D. So for PDF we use a + PDFGraphics2D to draw into.

This creates the necessary PDF information to + create the SVG image in the PDF document.

Most of the work is done in the + PDFGraphics2D class. There are also a few bridges that are plugged into batik + to provide different behaviour for some SVG elements.

Normally batik converts text into a set of curved + shapes.

This is handled as any other shapes when rendering to the output. This + is not always desirable as the shapes have very fine curves. This can cause the + output to look a bit bad in PDF and PS (it can be drawn properly but is not by + default). These curves also require much more data than the original + text.

To handle this there is a PDFTextElementBridge that is set when + using the bridge in batik. If the text is simple enough for the text to be + drawn in the PDF as with all other text then this sets the TextPainter to use + the PDFTextPainter. This inserts the text directly into the PDF using the + drawString method on the PDFGraphics2D.

Text is considered simple if the + font is available, the font size is useable and there are no tspans or other + complications. This can make the resulting PDF significantly + smaller.

To support links in PDF another batik + element bridge is used. The PDFAElementBridge creates a PDFANode which inserts + a link into the PDF document via the PDFGraphics2D.

Since links are + positioned on the page without any transforms then we need to transform the + coordinates of the link area so that they match the current position of the a + element area. This transform may also need to account for the svg being + positioned on the page.

Images are normally drawn + into the PDFGraphics2D. This then creates a bitmap of the image data that can + be inserted into the PDF document.

As PDF can support jpeg images then another + element bridge is used so that the jpeg can be directly inserted into the + PDF.

Batik provides a mechanism to + convert SVG into various formats. Through FOP we can convert an SVG document + into a single paged PDF document. The page contains the SVG drawn as best as + possible on the page. There is a PDFDocumentGraphics2D that creates a + standalone PDF document with a single page. This is then drawn into by batik in + the same way as with the PDFGraphics2D.

When rendering to AWT the SVG is simply drawn onto the + awt canvas using batik.

The PS Renderer uses a similar technique as the + PDF Renderer.

The SVG Renderer simply embeds the SVG inside an svg + element.

To get accurate drawing + pdf transparency is needed.
The drawRenderedImage methods need + implementing.
Handle colour space better.
Improve link handling + with pdf.
Improve image handling.

\ No newline at end of file diff --git a/docs/design/understanding/understanding.xml b/docs/design/understanding/understanding.xml index c34ec730b..b748ffca1 100644 --- a/docs/design/understanding/understanding.xml +++ b/docs/design/understanding/understanding.xml @@ -1,12 +1,12 @@ - -

- Understanding FOP Design - Tutorial series about Design Approach to FOP - - -

+ +

+ Understanding FOP Design + Tutorial series about Design Approach to FOP + + +

@@ -35,31 +35,31 @@ complicated as we proceed.

- - - + + +

FOP takes an xml file does its magic and then writes a document to a - stream.

xml -> [FOP] -> document

+ stream.

xml -> [FOP] -> document

The document could be pdf, ps etc. or directed to a printer or the screen. The principle remains the same. The xml document must be in the XSL:FO - format.

+ format.

For convenience we provide a mechanism to handle XML+XSL as - input.

+ input.

The xml document is always handled internally as SAX. The SAX events are used to read the elements, attributes and text data of the FO document. After the manipulation of the data the renderer writes out the pages in the appropriate format. It may write as it goes, a page at a time or the whole document at once. Once finished the document should contain all the data in the - chosen format ready for whatever use.

+ chosen format ready for whatever use.

The fo data goes through a few stages. Each piece of data will generally go through the process in the same way but some information may be used a number of times or in a different order. To reduce - memory one stage will start before the previous is completed.

+ memory one stage will start before the previous is completed.

SAX Handler -> FO Tree -> Layout Managers -> Area Tree - -> Render -> document

+ -> Render -> document

In the case of rtf, mif etc.
SAX Handler -> FO Tree -> - Structure Renderer -> document

+ Structure Renderer -> document

The FO Tree is constructed from the xml document. It is an internal representation of the xml document and it is like a DOM with some differences. The Layout Managers use the FO Tree do their layout stuff and create an Area @@ -69,26 +69,26 @@ convert the information into the render format. For example the PDF Renderer creates a PDF Document. For each page in the Area Tree the renderer creates a PDF Page and places the contents of the page into the PDF Page. Once a PDF Page - is complete then it can be written to the output stream.

+ is complete then it can be written to the output stream.

For the structure documents the Structure listener will read directly from the FO Tree and create the document. These documents do not need - the layout process or the Area Tree.

+ the layout process or the Area Tree.

Verify Structure Listener - concept.

- -

XML parsing
FO Tree
Properties
Layout Managers
Layout Process
Handling Attributes
Area Tree
Renderers
Images
PDF Library
SVG

- - + concept.

+ +

XML parsing
FO Tree
Properties
Layout Managers
Layout Process
Handling Attributes
Area Tree
Renderers
Images
PDF Library
SVG

+ + diff --git a/docs/design/understanding/xml_parsing.xml b/docs/design/understanding/xml_parsing.xml index a7c8d4a85..f249cbce9 100644 --- a/docs/design/understanding/xml_parsing.xml +++ b/docs/design/understanding/xml_parsing.xml @@ -1,15 +1,15 @@ - -

- XML Parsing - All you wanted to know about XML Parsing ! - - -

+ +

+ XML Parsing + All you wanted to know about XML Parsing ! + + +

- +

Since everyone knows the basics we can get - into the various stages starting with the XML handling.

+ into the various stages starting with the XML handling.

FOP can take the input XML in a number of ways:

Driver

- +

data source which is parsed and converted into SAX Events

Stream

String

XML+XSLT which is transformed using an XSLT Processor and the result is fired as SAX Events @@ -56,10 +56,10 @@

- +

The SAX Events which are fired on the SAX Handler, class FOTreeBuilder, must represent an XSL:FO document. If not there will be an - error. Any problems with the XML being well formed are handled here.

+ error. Any problems with the XML being well formed are handled here.

The element mapping is a hashmap of all the elements in a particular namespace. This makes it easy to create a different object for each element. Element mappings are static to save on @@ -68,14 +68,14 @@ This must contain a line with the fully qualified name of a class that implements the org.apache.fop.fo.ElementMapping interface. This will then be loaded automatically at the start. Internal mappings are: FO, SVG and Extension - (pdf bookmarks)

+ (pdf bookmarks)

The SAX Events will fire all the information for the document with start element, end element, text data etc. This information is used to build up a representation of the FO document. To do this for a namespace there is a set of element mappings. When an element + namespace mapping is found then it can create an object for that element. If the element is not found then it creates a dummy object or a generic DOM for unknown - namespaces.

+ namespaces.
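The lookup described above can be sketched as a map keyed by namespace and then by element name. The names (NodeMakerSketch, ElementMappingSketch, register, create) are hypothetical and this is not the org.apache.fop.fo.ElementMapping interface; the "unknown:" string merely stands in for the dummy object.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical factory for the object that represents one element.
interface NodeMakerSketch {
    Object make(String localName);
}

// Hypothetical registry: namespace URI -> (element name -> maker).
class ElementMappingSketch {
    private final Map<String, Map<String, NodeMakerSketch>> byNamespace =
            new HashMap<String, Map<String, NodeMakerSketch>>();

    void register(String namespace, String localName, NodeMakerSketch maker) {
        Map<String, NodeMakerSketch> table = byNamespace.get(namespace);
        if (table == null) {
            table = new HashMap<String, NodeMakerSketch>();
            byNamespace.put(namespace, table);
        }
        table.put(localName, maker);
    }

    // Called for each start-element event with the element's namespace and name.
    Object create(String namespace, String localName) {
        Map<String, NodeMakerSketch> table = byNamespace.get(namespace);
        NodeMakerSketch maker = (table == null) ? null : table.get(localName);
        if (maker == null) {
            // No mapping found: placeholder for the dummy object.
            return "unknown:" + localName;
        }
        return maker.make(localName);
    }
}
```

A SAX handler would call create() from its startElement callback, passing the namespace URI and local name it receives.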

The object is then setup and then given attributes for the element. For the FO Tree the attributes are converted into properties. The FO objects use a property list mapping to convert the attributes into a list of properties @@ -83,7 +83,7 @@ constructed. This DOM can then be passed through to the renderer. Other element mappings can be used in different ways, for example to create elements that create areas during the layout process or setup information for the renderer - etc.

+ etc.

While the tree building is mainly about creating the FO Tree there are some stages that can propagate to the renderer. At @@ -98,9 +98,9 @@ sequence. The page may not yet be complete, however, containing forward page number references, for example.)

- - -

Error handling for xml not well formed.

Error handling for xml not well formed.
Error handling for other XML parsing errors.
Developer - info for adding namespace handlers.

\ No newline at end of file