<?xml version="1.0" standalone="no"?>
<!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V1.1//EN"
"http://cvs.apache.org/viewcvs.cgi/*checkout*/xml-forrest/src/resources/schema/dtd/document-v11.dtd">
<document>
<header>
<title>Layout</title>
<subtitle>Layout Process in FOP</subtitle>
<authors>
<person name="Keiron Liddle" email="keiron@aftexsw.com"/>
</authors>
</header>
<body>
<section>
<title>FO Layout</title>
<p>
The aim of the layout system is to be self-contained and to allow for
easy changes or extensions in future development. For example, line
breaking should be decided at one particular point in the process,
which makes it easier to handle other languages.
</p>
<p>
The layout begins once the hierarchy of FO objects has been constructed.
Note: it may be possible to start immediately after a block formatting
object has been added to the flow, but this is not currently in the scope
of the layout. It is also possible to lay out all pages in a page sequence
after each page sequence has been added from the XML.
</p>
<p>
The layout process is handled by a set of layout managers. The block
level layout managers are used to create the block areas which are
added to the region area of a page.
</p>
<section>
<title>Layout Managers</title>
<p>
The layout managers are set up from the hierarchy of the formatting
object tree. A manager represents a hierarchy of area-producing objects.
A manager is able to handle the block area(s) that it creates and
organise or split areas for page breaks.
</p>
<p>
Normally any object that creates a block area will have an associated
layout manager. Tables and lists are special cases: these objects will
also have layout managers that manage the group of layout managers
that make up the object.
</p>
<p>
A layout manager is also able to determine height (min/max/optimum)
and keep status. This will be used when organising the layout on
a page. The manager will be able to determine the next place a break
can be made and then be able to organise the height.
</p>
<p>
A layout manager is essentially a bridge between the formatting objects
and the area tree. It will keep a list of line areas inside block areas.
Each line area will contain a list of inline areas that is able to be
adjusted if the need arises.
</p>
<p>
The objects in the area tree that are organised by the manager will mostly
contain the information about their layout, such as spacing and keeps. This
information will be thrown away once the layout for a page is finalised.
</p>
</section>
<section>
<title>Creating Managers</title>
<p>
The managers are created by the page sequence. The top level manager
is the Page manager. This asks the flow to add all managers in this
page sequence.
</p>
<p>
Block-level objects each have a layout manager. Neutral objects
don't represent any areas but are used to contain a block-level
area, and as such these objects will ask the appropriate child to
create a layout manager.
</p>
<p>
Any nested block areas or inline areas may be handled by the layout
manager at a later stage.
</p>
</section>
<section>
<title>Using Managers</title>
<p>
Block area layout managers are used to create a block area, other block
level managers may ask their child layout managers to create block areas
which are then added to the area tree (subset).
</p>
<p>
A manager is used to add areas to a page until the page is full;
the managers then contain all the information necessary to make
the decision about page break and spacing. A manager can split an
area that it has created and will keep a status about what has been
added to the current area tree.
</p>
</section>
<section>
<title>Page Layout</title>
<p>
Once the Page layout manager, belonging to the page sequence, is ready,
we can start laying out each page. The page sequence will create
the current page to hold the page data, the next page and, if one exists,
a last page.
</p>
<p>
The current page will have the areas added to it from the block layout
managers. The next page will be used when splitting a block that goes
over the page break. Note: any page break overrides the layout decided
here. The last page will be necessary if the last block area is added
to this page. The size of the last page will be considered and the
areas will be added to the last page instead.
</p>
<p>
The first step is to add areas to the current page until the area is full
and the lines of the last block area contain at least n(orphans) and at least
n(orphans) + n(widows) in total. This will only be relevant for areas at
the start or end of a particular reference area.
</p>
<p>
<!--img src="page.svg" alt="Diagram of Page Layout"/-->
</p>
<p>
The spacing between the areas (including spacing in block areas inside
an inline-container) will be set to the minimum values. This will allow
the page to have at least all the information it needs to organise the
page properly.
</p>
<p>
This should better handle the situation where keeps on some block areas
extend past the end of the page. Fitting the blocks on the page with
spacing between min and optimum may give a result closer to the optimum
than moving the blocks to the next page and using spacing between optimum
and max. So if the objects are first placed at optimum spacing, the search
must keep going to see whether there is a lower keep further on whose
spacing is closer to the optimum.
</p>
<p>
The spacing and keep information is stored so that the area positions
and sizes can be adjusted.
</p>
</section>
<section>
<title>Balancing Page</title>
<p>
The page is vertically justified so that it distributes the areas
on the page for the best result when considering keeps and spacing.
</p>
</section>
<section>
<title>Finding Break</title>
<p>
First the keeps are checked. The available space on the page may have
changed due to the presence of before floats or footnotes. The page break
will need to be at a height no greater than the available space on the page.
</p>
<p>
A page break should be made at the first available position that
has the lowest keep value when searching from the bottom. Once the first
possible break is found then the next possible break, with equally low
keep value, is considered. If the height of the page is closer to the
optimal spacing then this break will be used instead.
</p>
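<p>
The search described above could be sketched as follows. This is a minimal
illustration, not FOP code: the <code>Candidate</code> class, its
<code>height</code> and <code>keep</code> fields and the <code>findBreak</code>
method are hypothetical names, and the sketch assumes each possible break is
described by the content height above it and a numeric keep strength. As a
simplification it compares every equally low keep rather than only the next one.
</p>
<source><![CDATA[
```java
import java.util.List;

final class BreakFinderSketch {

    /** Hypothetical break candidate: content height above the break and its keep strength. */
    static final class Candidate {
        final int height;
        final int keep;

        Candidate(int height, int keep) {
            this.height = height;
            this.keep = keep;
        }
    }

    /**
     * Returns the index of the chosen break, or -1 if no candidate fits.
     * Candidates are ordered top to bottom; the search runs from the bottom.
     */
    static int findBreak(List<Candidate> candidates, int available, int optimum) {
        int best = -1;
        for (int i = candidates.size() - 1; i >= 0; i--) {
            Candidate c = candidates.get(i);
            if (c.height > available) {
                continue; // the break must fit in the available space
            }
            if (best < 0) {
                best = i; // first break that fits, seen from the bottom
                continue;
            }
            Candidate b = candidates.get(best);
            boolean lowerKeep = c.keep < b.keep;
            // Among equally low keeps, prefer the height closest to the optimum.
            boolean closer = c.keep == b.keep
                    && Math.abs(c.height - optimum) < Math.abs(b.height - optimum);
            if (lowerKeep || closer) {
                best = i;
            }
        }
        return best;
    }
}
```
]]></source>
<p>
For candidates at heights 100, 200, 300, 400 and 500 with keeps 0, 2, 0, 0
and 0, an available height of 450 and an optimum of 320, the sketch returns
index 2, the break at height 300.
</p>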
<p>
Keep values include implicit and explicit values when trying to
split a block area into more than one area. Implicit keeps may
be such things as widows/orphans.
</p>
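<p>
As an illustration of an implicit keep, the widows and orphans constraint on
splitting a block between two pages could be expressed as below. The class and
method names are hypothetical, and the sketch assumes the block is described
only by its number of lines.
</p>
<source><![CDATA[
```java
final class WidowOrphanSketch {

    /**
     * Tells whether a block of totalLines lines may be broken after
     * linesBefore lines, given the orphans and widows settings.
     */
    static boolean canBreakAfter(int linesBefore, int totalLines, int orphans, int widows) {
        int linesAfter = totalLines - linesBefore;
        if (linesBefore == 0 || linesAfter == 0) {
            return true; // the block is not split at all
        }
        // At least "orphans" lines stay on the first page and
        // at least "widows" lines move to the next page.
        return linesBefore >= orphans && linesAfter >= widows;
    }
}
```
]]></source>
<p>
With orphans and widows both set to 2, a five-line block may be split after
its second or third line but not after its first or fourth.
</p>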
<p>
If the page contains before floats or footnotes then as each area or line
area is removed the float/footnote should also be removed. This will
change the available space and is a one-way operation. The footnote
should be removed first as a footnote may be placed on the next page.
The lowest keep value may need to be reassessed as each conditional
area is removed.
</p>
<p>
The before float and footnote regions are managed so that the separator
region is present whenever its region contains at least one area.
</p>
</section>
<section>
<title>Optimising</title>
<p>
Once the areas for the page are finalised then the spacing will
need to be adjusted. The available height on the page is compared
with the min and max spacing. All of the spacing in all the areas
on the page is then adjusted by the appropriate percentage value.
</p>
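<p>
One possible reading of this adjustment is sketched below. It is an assumption,
not the FOP implementation: <code>leftover</code> stands for the available
height minus the height of the fixed content, every space is given by its min
and max value, and all spaces are stretched by the same fraction of their
min-to-max range. All names are hypothetical.
</p>
<source><![CDATA[
```java
final class SpacingSketch {

    /** Fraction (0 to 1) of the min-to-max range that every space should use. */
    static double stretchFraction(double leftover, double[] min, double[] max) {
        double sumMin = 0.0;
        double sumMax = 0.0;
        for (int i = 0; i < min.length; i++) {
            sumMin += min[i];
            sumMax += max[i];
        }
        if (sumMax <= sumMin) {
            return 0.0; // nothing can stretch
        }
        double f = (leftover - sumMin) / (sumMax - sumMin);
        return Math.max(0.0, Math.min(1.0, f)); // clamp to the allowed range
    }

    /** Resolved size of each space after applying the common fraction. */
    static double[] resolve(double leftover, double[] min, double[] max) {
        double f = stretchFraction(leftover, min, max);
        double[] out = new double[min.length];
        for (int i = 0; i < min.length; i++) {
            out[i] = min[i] + f * (max[i] - min[i]);
        }
        return out;
    }
}
```
]]></source>
<p>
For two spaces with min values 2 and 4, max values 6 and 12 and a leftover
height of 12, the fraction is 0.5 and the spaces resolve to 4 and 8.
</p>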
</section>
<section>
<title>Multi-Column Pages</title>
<p>
In the case of multi-column pages the column breaks and eventually
the page break must be found in a slightly different way.
</p>
<p>
The columns need to be laid out completely from first to last, but
this can only be done after a rough estimate of all the elements
on the page in case of before floats or footnotes.
</p>
<p>
So first the complete page is laid out with all columns filled
with areas and the spacing at a minimum. Then, if there are any
before floats or footnotes, the available space is adjusted.
Then the best break is found for each column, starting from
the first column. If any before floats or footnotes are removed
as a result of the new breaks and optimised spacing then all the
columns should still be laid out for the same column height.
</p>
</section>
<section>
<title>Completing Page</title>
<p>
After the region body has been finished the static areas can be
laid out. The width of the static area is set and the height is
infinite; that is, all block areas should be placed in the area
and their visibility is controlled by other factors.
</p>
<p>
The area tree for the region body will contain the information
about markers that may be necessary for the retrieve marker.
</p>
<p>
The ordering of the area tree must be adjusted so that the areas are
before, start, body, end and after in that order. The body region
should be in the order before float, main then footnote.
</p>
</section>
<section>
<title>Line Areas</title>
<p>
Creating a line area uses a similar concept. Each inline area
is placed across the available space until there is no room left.
The line is then split by considering all keeps and spacing.
</p>
<p>
Each word (group of adjacent character inline areas) will have keeps
based on hyphenation. The line break is at the lowest keep value
starting from the end of the line.
</p>
<p>
Once a line has been laid out for a particular width
then that line is fixed for the page (except for unresolved
page references).
</p>
</section>
<section>
<title>Before Floats and Footnotes</title>
<p>
The before float region and footnote region are handled by the page
layout manager. These regions will handle the addition and removal
of the separator regions when before floats/footnotes are added
and removed.
</p>
</section>
<section>
<title>Side Floats</title>
<p>
If a float anchor is present in a particular line area then the available
space for that line (and others in the block) will be reduced. The side float
adds to the height of the block area and this height also depends
on the clear value of subsequent blocks. The keep status of the block is
also affected, as there must be enough space on the page to fit the
side float.
</p>
<p>
<!--img src="float.svg" alt="Diagram of Float"/-->
</p>
</section>
<section>
<title>Unresolved Areas</title>
<p>
Once the layout of the page is complete there may be unresolved areas.
</p>
<p>
Page number citations and links may require following pages to be
laid out before they can be resolved. These will remain in the
area tree as unresolved areas.
</p>
<p>
As each page is completed the list of unresolved ids will be checked
and if an id can be resolved it will be. Once all ids are resolved
then the page can be rendered.
</p>
<p>
Each page contains a map of all unresolved ids and the corresponding
areas.
</p>
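<p>
A map of this kind could look like the following sketch. The
<code>Resolvable</code> interface and the method names are hypothetical and
only illustrate the bookkeeping; they are not taken from the FOP sources.
</p>
<source><![CDATA[
```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class UnresolvedIdsSketch {

    /** Hypothetical callback implemented by an area waiting for an id. */
    interface Resolvable {
        void resolve(String id, int pageNumber);
    }

    // id -> areas on this page that still wait for it
    private final Map<String, List<Resolvable>> unresolved = new HashMap<>();

    /** Records that the given area needs the id before it can be finalised. */
    void addUnresolved(String id, Resolvable area) {
        unresolved.computeIfAbsent(id, k -> new ArrayList<>()).add(area);
    }

    /** Called when a later page defines the id; notifies all waiting areas. */
    void idResolved(String id, int pageNumber) {
        List<Resolvable> waiting = unresolved.remove(id);
        if (waiting == null) {
            return; // nothing on this page refers to the id
        }
        for (Resolvable area : waiting) {
            area.resolve(id, pageNumber);
        }
    }

    /** The page can be rendered once no unresolved ids remain. */
    boolean isFullyResolved() {
        return unresolved.isEmpty();
    }
}
```
]]></source>
<p>
Removing the entry as soon as an id is resolved keeps the check for
a renderable page down to a test for an empty map.
</p>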
<p>
In the case of page number citations, the area reserves the equivalent
of three number nines in the current font. When the area is resolved,
it is adjusted to its proper size and the line area is
re-aligned to accommodate the change.
</p>
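<p>
The reservation and later adjustment amount to simple arithmetic, sketched
below. The character width function is a stand-in for real font metrics and,
like the class and method names, is an assumption made for illustration.
</p>
<source><![CDATA[
```java
import java.util.function.IntUnaryOperator;

final class PageNumberPlaceholderSketch {

    /** Width reserved for an unresolved citation: three times the digit nine. */
    static int reservedWidth(IntUnaryOperator charWidth) {
        return 3 * charWidth.applyAsInt('9');
    }

    /** Width of the resolved text, summed character by character. */
    static int textWidth(String text, IntUnaryOperator charWidth) {
        int width = 0;
        for (int i = 0; i < text.length(); i++) {
            width += charWidth.applyAsInt(text.charAt(i));
        }
        return width;
    }

    /** Amount by which the line content grows (positive) or shrinks (negative). */
    static int widthChange(String resolved, IntUnaryOperator charWidth) {
        return textWidth(resolved, charWidth) - reservedWidth(charWidth);
    }
}
```
]]></source>
<p>
With every digit 500 units wide, 1500 units are reserved; resolving the
citation to "12" shrinks the area by 500 units, which the line area then
has to absorb when it is re-aligned.
</p>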
</section>
<section>
<title>ID and Link Areas</title>
<p>
Any formatting object that has an ID or any inline link defines an area
that will be required when rendering and resolving id references.
</p>
<p>
This area is stored in the parent area and may be a shape that exists
in more than one page, for example over a page break. This shape consists
of the boundary of all inline (or block) areas that the shape is defined
for.
</p>
</section>
<section>
<title>Inline Areas</title>
<p>
This is the definition of all inline areas that will exist in the
area.
</p>
</section>
<section>
<title>Fixed Areas</title>
<p>
instream-foreign-object, external-graphic, inline-container
</p>
<p>
These areas have a fixed width and height. They also have a viewport.
</p>
</section>
<section>
<title>Stretch Areas</title>
<p>
leader, inline space
</p>
<p>
These areas have a fixed height but the width may vary.
</p>
</section>
<section>
<title>Character Areas</title>
<p>
character
</p>
<p>
This is a simple character that has fixed properties according to
the current font. There are implicit keeps with adjacent characters.
</p>
</section>
<section>
<title>Anchor Areas</title>
<p>
float anchor, footnote anchor
</p>
<p>
This area has no size. It keeps the position for footnotes and floats
and has a keep with the associated inline area.
</p>
</section>
<section>
<title>Unresolved Page Numbers</title>
<p>
page-number-citation
</p>
<p>
A page number area that needs resolving behaves as a character and
has the space of three normal characters reserved. The size will adjust
when the value is resolved.
</p>
</section>
<section>
<title>Block Areas</title>
<p>
The block area has info about the following:
</p>
<ul>
<li>all anchors including which lines they are on</li>
<li>unresolved page references with line info</li>
<li>id and link areas</li>
<li>height (min/max/optimum) of area including floats</li>
<li>space before/after and keep information</li>
<li>widows and orphans</li>
</ul>
<p>
Once the layout has been finalised then this information can be
discarded.
</p>
</section>
<section>
<title>Page Areas</title>
<p>
Contains information about all the block areas in the body,
before area and footer area.
</p>
<p>
Has a list of the unresolved page references and a list of id references
that can be used to obtain the area associated with that id.
</p>
</section>
<section>
<title>Test Cases</title>
<p>
Here a few layout possibilities are explored to determine how the
layout process will handle these situations.
</p>
<section>
<title>Simple Pages</title>
<p>
All blocks (including nested) are placed on the page with minimum spacing
and the last block has the minimum number of lines past the page end.
The lowest keep value is then found within the body area limits. Then the next
equally low keep is found to determine if the spacing will be closer to
the optimum values.
</p>
</section>
<section>
<title>Before Floats/Footnotes</title>
<p>
After filling the page with the block areas, the new body height
is used to find the best position to break. Before each line area or block
area is removed, any associated before floats and footnotes are removed.
This will then adjust the available space on the page and may allow
for a different breaking point. Areas are removed towards the new
breaking point until the areas fit on the page. When finding the
optimum spacing the removal of before floats and footnotes must also
be considered.
</p>
</section>
<section>
<title>Multicolumn</title>
<p>
First the page is filled with all columns for the initial page area.
Then each column is adjusted for the new height starting from the
first column. The best break for the column is found, then the next
column is considered; any leftover areas are prepended to the next
column. Once all the columns are finished, they are all
adjusted to fit in columns of the same height. This handles the situation
where before floats or footnotes may have been removed.
</p>
</section>
<section>
<title>Last Page</title>
<p>
If in the process of adding areas to a page it is found that there
are no more areas in the flow then this page will need to be changed to
the last page (if applicable). The areas are then placed on a last
page.
</p>
</section>
</section>
</section>
</body>
</document>