aboutsummaryrefslogtreecommitdiffstats
path: root/src/documentation/content/xdocs/dev/design/layout.xml
blob: 394fa4763efa3a84820590538594026609e03d0a (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!--
  Licensed to the Apache Software Foundation (ASF) under one or more
  contributor license agreements.  See the NOTICE file distributed with
  this work for additional information regarding copyright ownership.
  The ASF licenses this file to You under the Apache License, Version 2.0
  (the "License"); you may not use this file except in compliance with
  the License.  You may obtain a copy of the License at

       http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License.
-->
<!-- $Id$ -->
<!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V1.3//EN" "http://forrest.apache.org/dtd/document-v13.dtd">
<document>
  <header>
    <title>Apache™ FOP Design: Layout</title>
    <version>$Revision$</version>
    <authors>
      <person name="Keiron Liddle" email="keiron@aftexsw.com"/>
    </authors>
  </header>
  <body>
    <section id="into">
      <title>Introduction</title>
    <p>The role of the layout managers is to build the Area Tree by using the information from the FO Tree.
The layout managers decide where information is placed in the area tree.</p>
    <p>A layout manager is typically associated with an FO Object but not always.</p>
    <p>The layout managers are in between the FO Tree and the Area Tree.
They get information from the FO Tree and create areas and build the pages.
They hold the state of the layout process as it builds up the areas and pages.
They also manage the handling of breaks and spacing between areas.</p>
    <p>FO Objects can have two types of properties, ones that relate to the layout and ones that relate to the rendering.
The layout related properties area used by the layout managers to determine how and where to create the areas.
The render related properties should be passed through to the renderer in the most efficient way possible.</p>
    <p>The aim of the layout system is to be self contained and allow for easy changes or extensions for future development.
For example the line breaking should be decided at a particular point in the process that makes it easier to handle other languages.</p>
    <p>The layout begins once the hierarchy of FO objects has been constructed.
Note: it may be possible to start immediately after a block formatting object has been added to the flow but this is not currently in the scope of the layout.
It is also possible to layout all pages in a page sequence after each page sequence has been added from the xml.</p>
    <p>The layout process is handled by a set of layout managers.
The block level layout managers are used to create the block areas which are added to the region area of a page.</p>
      <p>The traversal is done by the layout or structure process only in the flow elements.</p>
    </section>
    <section id="issues">
      <title>Design Issues</title>
      <section id="issue-simple-layout">
        <title>Keep Layouts Simple</title>
        <p>Layout should handle floats, footnotes and keeps in a simple, straightforward way.</p>
      </section>
      <section id="issue-simple-id-refs">
        <title>Keep ID References Simple</title>
      </section>
      <section id="issue-area-recycle">
        <title>Render Pages ASAP</title>
        <p>The issue here is that we wish to recycle the Area Tree memory as much as possible. The problem is that forward references prevent pages from being resolved until the forward references are resolved. If memory is insufficient to store unresolved pages, Area Tree fragments must be serialized until resolved.</p>
        <p>FOP developers have discussed adding the capability of using an Area Tree to render to more than one output target in the same run, which would be a complicating factor in disposal of pages as they are rendered.</p>
      </section>
    </section>
    <section id="lm">
      <title>Layout Managers</title>
      <p>The layout managers are set up from the hierarchy of the formatting object tree.
A manager represents a hierachy of area producing objects.
A manager is able to handle the block area(s) that it creates and organise or split areas for page breaks.</p>
      <p>Normally any object that creates a block area will have an associated layout manager.
Other cases are tables and lists, these objects will also have layout managers that will manager the group of layout managers that make up the object.</p>
      <p>A layout manager is also able to determine height (min/max/optimum) and keep status.
This will be used when organising the layout on a page.
The manager will be able to determine the next place a break can be made and then be able to organise the height.</p>
      <p>A layout manager is essentially a bridge between the formatting objects and the area tree.
It will keep a list of line areas inside block areas.
Each line area will contain a list of inline areas that is able to be adjusted if the need arises.</p>
      <p>The objects in the area tree that are organised by the manager will mostly contain the information about there layout such as spacing and keeps, this information will be thrown away once the layout for a page is finalised.</p>
    </section>
    <section id="creating">
      <title>Creating Managers</title>
      <p>The managers are created by the page sequence.
The top level manager is the Page manager.
This asks the flow to add all managers in this page sequence.</p>
      <p>For block level objects they have a layout manager.
Neutral objects don't represent any areas but are used to contain a block level area and as such these objects will ask the appropriate child to create a layout manager.</p>
      <p>Any nested block areas or inline areas may be handled by the layout manager at a later stage.</p>
    </section>
    <section id="using">
      <title>Using Managers</title>
      <p>Block area layout managers are used to create a block area, other block level managers may ask their child layout managers to create block areas which are then added to the area tree(subset).</p>
      <p>A manager is used to add areas to a page until the page is full, then the manages contain all the information necessary to make the decision about page break and spacing.
A manager can split an area that it has created will keep a status about what has been added to the current area tree.</p>
    </section>
    <section id="page">
      <title>Page Layout</title>
      <p>Once the Page layout manager, belonging to the page sequence, is ready then we can start laying out each page.
The page sequence will create the current page to put the page data, the next page and if it exists
a last page.</p>
      <p>The current page will have the areas added to it from the block layout
managers.
The next page will be used when splitting a block that goes over the page break.
Note: any page break overrides the layout decided here.
The last page will be necessary if the last block area is added to this page.
The size of the last page will be considered and the areas will be added to the last page instead.</p>
      <p>The first step is to add areas to the current page until the area is full and the lines of the last block area contain at least n(orphans) and at least n(orphans) + n(widows) in total.
This will only be relevant for areas at the start or end of a particular reference area.</p>
      <!--<p><img src="page.svg" alt="Diagram of Page Layout"/></p>-->
      <p>The spacing between the areas (including spacing in block areas inside an inline-container) will be set to the minimum values.
This will allow the page to have at least all the information it needs to organise the page properly.</p>
      <p>This should handle the situation where there are keeps on some block areas that go over the end of the page better.
It is possible that fitting the blocks on the page using a spacing between min and optimum would give a closer value to the optimum than putting the blocks on the next page and the spacing being between optimum and max.
So if the objects are placed first at optimum then you will need to keep going to see if there is a lower keep further on that has a spacing that is closer to the optimum.</p>
      <p>The spacing and keep information is stored so that the area positions
and sizes can be adjusted.</p>
    </section>
    <section id="page-balance">
      <title>Balancing Page</title>
      <p>The page is vertically justified so that it distributes the areas on the page for the best result when considering keeps and spacing.</p>
    </section>
    <section id="finding-break">
      <title>Finding Break</title>
      <p>First the keeps are checked.
The available space on the page may have changed due to the presence of before floats or footnotes.
The page break will need to be at a height &lt;= the available space on the page.</p>
      <p>A page break should be made at the first available position that has the lowest keep value when searching from the bottom.
Once the first possible break is found then the next possible break, with equally low keep value, is considered.
If the height of the page is closer to the optimal spacing then this break will be used instead.</p>
      <p>Keep values include implicit and explicit values when trying to split a block area into more than one area.
Implicit keeps may be such things as widows/orphans.</p>
      <p>If the page contains before floats or footnotes then as each area or line area is removed the float/footnote should also be removed.
This will change the available space and is a one way operation.
The footnote should be removed first as a footnote may be placed on the next page.
The lowest keep value may need to be reassessed as each conditional area is removed.</p>
      <p>The before float and footnote regions are managed so that the separator
regions will be present if it contains at least one area.</p>
    </section>
    <section id="optimize">
      <title>Optimising</title>
      <p>Once the areas for the page are finalised then the spacing will need to be adjusted.
The available height on the page is compared with the min and max spacing.
All of the spacing in all the areas on the page is then adjusted by the appropriate percentage value.</p>
    </section>
    <section id="multi-column">
      <title>Multi-Column Pages</title>
      <p>In the case of multi-column pages the column breaks and eventually the page break must be found in a slightly different way.</p>
      <p>The columns need to be layed out completely from first to last but this can only be done after a rough estimate of all the elements on the page in case of before floats or footnotes.</p>
      <p>So first the complete page is layed out with all columns filled with areas and the spacing at a minimum.
Then if there are any before floats or footnotes then the availabe space is adjusted.
Then each the best break is found for each column starting from the first column.
If any before floats or footnotes are removed as a result of the new breaks and optimised spacing then all the columns should still be layed out for the same column height.</p>
    </section>
    <section id="page-complete">
      <title>Completing Page</title>
      <p>After the region body has been finished the static areas can be layed out.
The width of the static area is set and the height is inifinite, that is all block areas should be placed in the area and their visibility is controlled be other factors.</p>
      <p>The area tree for the region body will contain the information about markers that may be necessary for the retrieve marker.</p>
      <p>The ordering of the area tree must be adjusted so that the areas are before, start, body, end and after in that order.
The body region should be in the order before float, main then footnote.</p>
    </section>
    <section id="line-area">
      <title>Line Areas</title>
      <p>Creating a line areas uses a similair concept.
Each inline area is placed across the available space until there is no room left.
The line is then split by considering all keeps and spacing.</p>
      <p>Each word (group of adjacent character inline areas) will have keeps based on hyphenation.
The line break is at the lowest keep value starting from the end of the line.</p>
      <p>Once a line has been layed out for a particular width then that line is fixed for the page (except for unresolved page references).</p>
    </section>
    <section id="before-float-footnote">
      <title>Before Floats and Footnotes</title>
      <p>The before float region and footnote region are handled by the page layoutmanger.
These regions will handle the addition and removal of the separator regions when before floats/footnotes area added and removed.</p>
      <p>Footnotes and Before Floats are placed in special areas in the body region
of the page.
The size of these areas is determined by the content.
This in turn effects the available size of the main reference area that contains the flow.</p>
      <p>A layout manager handles the adding and removing of footnotes/floats, this in turn effects the available space in the main reference area.</p>
    </section>
    <section id="side-float">
      <title>Side Floats</title>
      <p>If a float anchor is present in a particular line area then the available space for that line (and other in the block) will be reduced.
The side float adds to the height of the block area and this height also depends on the clear value of subsequent blocks.
The keep status of the block is also effected as there must be enough space on the page to fit the
side float.</p>
      <p>Side floats alter the length of the inline progression dimension for the
current line and following lines for the size of the float.</p>
      <p>This means that the float needs to be handled by the block layout manager
so that it can adjust the available inline progression dimension for the
relevant line areas.</p>
      <!--<p><img src="float.svg" alt="Diagram of Float"/></p>-->
    </section>
    <section id="unresolved-area">
      <title>Unresolved Areas</title>
      <p>Once the layout of the page is complete there may be unresolved areas.</p>
      <p>Page number citations and links may require following pages to be layed out before they can be resolved.
These will remain in the area tree as unresolved areas.</p>
      <p>As each page is completed the list of unresolved id's will be checked and if the id can be resolved it will be.
Once all id's are resolved then the page can be rendered.</p>
      <p>Each page contains a map of all unresolved id's and the corresponding areas.</p>
      <p>In the case of page number citations, the areas reserves the equivalent of 3 number nines in the current font.
When the area is resolved then the area is adjusted to its proper size and the line area is
re-aligned to accomodate the change.</p>
    </section>
    <section id="id-link-area">
      <title>ID and Link Areas</title>
      <p>Any formatting object that has an ID or any inline link defines an area that will be required when rendering and resolving id references.</p>
      <p>This area is stored in the parent area and may be a shape that exists
in more than one page, for example over a page break.
This shape consists of the boundary of all inline (or block) areas that the shape is defined for.</p>
    </section>
    <section id="inline-area">
      <title>Inline Areas</title>
      <p>This is the definition of all inline areas that will exist in the area.</p>
    </section>
    <section id="fixed-area">
      <title>Fixed Areas</title>
      <p>instream-foreign-object, external-graphic, inline-container</p>
      <p>These areas have a fixed width and height. They also have a viewport.</p>
    </section>
    <section id="stretch-area">
      <title>Stretch Areas</title>
      <p>leader, inline space</p>
      <p>These areas have a fixed height but the width may vary.</p>
    </section>
    <section id="character-area">
      <title>Character Areas</title>
      <p>character</p>
      <p>This is an simple character that has fixed properties according to the current font.
There are implicit keeps with adjacent characters.</p>
    </section>
    <section id="anchor-area">
      <title>Anchor Areas</title>
      <p>float anchor, footnote anchor</p>
      <p>This area has no size.
It keeps the position for footnotes and floats and has a keep with the associated inline area.</p>
    </section>
    <section id="unresolved-page-num">
      <title>Unresolved Page Numbers</title>
      <p>page-number-citation</p>
      <p>A page number area that needs resolving, behaves as a character and has the space of 3 normal characters reserved.
The size will adjust when the value is resolved.</p>
    </section>
    <section id="block-area">
      <title>Block Areas</title>
      <p>When a block creating element is complete then it is possible to build the
block area and add it to the paprent.</p>
      <p>A block area will contain either more block areas or line areas, which are
special block areas.
The line areas are created by the LineLayoutManager in which the inline areas flow into.</p>
      <p>So a block area manager handles the lines or blocks as its children and determines things like spacing and breaks.</p>
      <p>In the case of tables and lists the blocks are stacked in a specific way that needs to be handled by the layout manager.</p>
      <p>The block area has info about the following:</p>
      <ul>
        <li>all anchors including which lines they are on</li>
        <li>unresolved page references with line info</li>
        <li>id and link areas</li>
        <li>height (min/max/optimum) or area including floats</li>
        <li>holds space before/after and keep information</li>
        <li>widows and orphans</li>
      </ul>
      <p>Once the layout has been finalised then this information can be discarded.</p>
    </section>
    <section id="page-area">
      <title>Page Areas</title>
      <p>Contains inforamtion about all the block areas in the body, before area and footer area.</p>
      <p>Has a list of the unresolved page references and a list of id refences that can be used to obtain the area associated with that id.</p>
    </section>
    <section id="test-cases">
      <title>Test Cases</title>
      <p>Here a few layout possibilities areas explored to determine how the layout process will handle these situations.</p>
      <section id="test-simple">
        <title>Simple Pages</title>
        <p>All blocks (including nested) are placed on the page with minimum spacing and the last block has the minimum number of lines past the page end.
The lowest keep value is then found within the body area limits.
Then the next equally low keep is found to determine if the spacing will be closer to the optimum values.</p>
      </section>
      <section id="test-before-float-footnote">
        <title>Before Floats/Footnotes</title>
        <p>After filling the page with the block areas then the new body height is used to find the best position to break.
Before each line area or block area is remove any associated before floats and footnotes are removed.
This will then adjust the available space on the page and may allow for a different breaking point.
Areas are removed towards the new breaking point until the areas fit on the page.
When finding the optimum spacing the removal of before floats and footnotes must also be  onsidered.</p>
      </section>
      <section id="test-multi-column">
        <title>Multicolumn</title>
        <p>First the page is filled with all columns for the intial page area.
Then each column is adjusted for the new height starting from the first column.
The best break for the column is found then the next column is considered, any left over areas a pre-pended to the next column.
Once all the columns are finished then all the columns are adjusted to fit in the same height columns.
This handles the situation where before floats or footnotes may have been removed.</p>
      </section>
      <section id="test-last-page">
        <title>Last Page</title>
        <p>If in the process of adding areas to a page it is found that there are no more areas in the flow then this page will need to be changed to the last page (if applicable).
The areas are then placed on a last page.</p>
      </section>
    </section>
    <section id="status">
      <title>Status</title>
      <section id="status-todo">
        <title>To Do</title>
        <table>
          <tr>
            <th>Task</th>
            <th>Priority</th>
            <th>Notes</th>
          </tr>
          <tr>
            <td>Keep Properties</td>
            <td>High</td>
            <td>Need to get keep-* properties working on block level constructs</td>
          </tr>
          <tr>
            <td>Justified Text</td>
            <td>High</td>
            <td>This has been completed, thanks largely to Luca Furini. Although there is still issue 28706
that requires further analysis.</td>
          </tr>
          <tr>
            <td>Multi-column layout</td>
            <td>High</td>
            <td/>
          </tr>
          <tr>
            <td>Get Markers Working</td>
            <td>High</td>
            <td>Main Problem is markers can be added to wrong page. LEWP is returning first on Next page!</td>
          </tr>
          <tr>
            <td>Absolutely positioned block containers</td>
            <td>High</td>
            <td/>
          </tr>
          <tr>
            <td>Background Images</td>
            <td>High</td>
            <td/>
          </tr>
          <tr>
            <td>Conditional space suppression</td>
            <td>High</td>
            <td/>
          </tr>
          <tr>
            <td>Fix break-* properties</td>
            <td>High</td>
            <td/>
          </tr>
          <tr>
            <td>Footnotes</td>
            <td>Medium</td>
            <td/>
          </tr>
          <tr>
            <td>Relative positioned block containers</td>
            <td>Medium</td>
            <td/>
          </tr>
          <tr>
            <td>Fine-tuning line breaking and hypenation</td>
            <td>Medium</td>
            <td/>
          </tr>
          <tr>
            <td>Last page layout</td>
            <td>Medium</td>
            <td/>
          </tr>
          <tr>
            <td>Aligned leaders, especially in justified text</td>
            <td>Medium</td>
            <td/>
          </tr>
          <tr>
            <td>Floats of all kind</td>
            <td>Low</td>
            <td/>
          </tr>
          <tr>
            <td>Border collapsing on tables</td>
            <td>Low</td>
            <td>RenderX hasnt implemented this (17/03/04)</td>
          </tr>
          <tr>
            <td>Fine-tuning all other borders</td>
            <td>Low</td>
            <td>Not sure what Joerg means by this, border collapse priorties?
Dashed and dotted borders have been implemented in PDF</td>
          </tr>
          <tr>
            <td>BIDI support</td>
            <td>Low</td>
            <td/>
          </tr>
        </table>
      </section>
      <section id="status-wip">
        <title>Work in Progress</title>
      </section>
      <section id="status-complete">
        <title>Completed</title>
      </section>
    </section>
  </body>
</document>