- <?xml version="1.0" encoding="UTF-8" standalone="no"?>
- <!--
- Licensed to the Apache Software Foundation (ASF) under one or more
- contributor license agreements. See the NOTICE file distributed with
- this work for additional information regarding copyright ownership.
- The ASF licenses this file to You under the Apache License, Version 2.0
- (the "License"); you may not use this file except in compliance with
- the License. You may obtain a copy of the License at
-
- http://www.apache.org/licenses/LICENSE-2.0
-
- Unless required by applicable law or agreed to in writing, software
- distributed under the License is distributed on an "AS IS" BASIS,
- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- See the License for the specific language governing permissions and
- limitations under the License.
- -->
- <!-- $Id$ -->
- <!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V1.3//EN" "http://forrest.apache.org/dtd/document-v13.dtd">
-
- <document>
- <header>
- <title>Apache™ FOP Design: Layout Managers</title>
- <subtitle>Break Possibility Proposal</subtitle>
- <version>$Revision$</version>
- <authors>
- <person name="Karen Lease" email="klease@club-internet.fr"/>
- </authors>
- </header>
-
- <body>
- <section id="intro">
- <title>Introduction</title>
- <p>
- As explained in <link href="layout.html">Layout</link>,
- the hierarchy of Layout Managers is responsible for building and placing
- areas. Each Layout Manager is responsible for creating and filling
- areas of a particular type, either inline or block. This document
- explains one potential algorithm for this process. It is based on
- the generation of <em>break possibilities</em> (BP for short). The
- Layout Managers (LM for short) will generate one or more BP and
- choose the best one. The BP is then used to generate the corresponding
- areas.
- </p>
- </section>
- <section>
- <title>Anatomy of a Break Possibility</title>
- <p>A break possibility is represented by the BreakPoss class. A
- BreakPoss contains size information in the stacking direction and in
- the non-stacking direction (at least for inline areas, it must have
- both). It also carries flags indicating various conditions (ISFIRST,
- ISLAST, CAN_BREAK_AFTER, FORCE_BREAK_AFTER, ANCHORS etc.) and a
- reference to the top-level LayoutManager which generated it.
- </p>
- <p>A BreakPoss contains an object implementing
- the BreakPoss.Position interface. This object is specific to the layout
- manager which created the BreakPoss. It should indicate where the
- break occurs and allow the LM to
- create an area corresponding to the BP. A higher level LM Position
- must somehow reference or wrap the Position returned by its child LM in its
- BreakPoss object. The layout manager modifies the flags and dimension
- information in the BP to reflect its own requirements. For example, an
- inline FO layout manager might add space-start, space-end, border and
- padding values to the stacking or non-stacking dimensions. It might also
- modify the flags based on its keep properties.</p>
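- <p>The structure described above can be sketched roughly as follows.
- This is only an illustration, not the actual class: BreakPoss, Position
- and the flag names come from the text, while TextPosition,
- WrappedPosition, wrappedBy, the field names and the bit values are
- assumptions made for the example. The stacking size is simplified to a
- single number.</p>

```java
// Illustrative sketch only. BreakPoss, Position and the flag names come from
// the text; every other name and all bit values are assumptions.
final class BreakPoss {
    static final int ISFIRST = 1;
    static final int ISLAST = 2;
    static final int CAN_BREAK_AFTER = 4;
    static final int FORCE_BREAK_AFTER = 8;
    static final int ANCHORS = 16;

    /** LM-specific description of where the break occurs. */
    interface Position { }

    /** Hypothetical leaf position: an offset into a text run. */
    static final class TextPosition implements Position {
        final int charIndex;
        TextPosition(int charIndex) { this.charIndex = charIndex; }
    }

    /** Hypothetical position of a higher-level LM, wrapping its child's position. */
    static final class WrappedPosition implements Position {
        final Position child;
        WrappedPosition(Position child) { this.child = child; }
    }

    final Object layoutManager; // stand-in for the LM that generated this BP
    final Position position;
    final int stackSize;        // simplified to one value; unit is arbitrary
    final int nonStackSize;
    final int flags;

    BreakPoss(Object layoutManager, Position position,
              int stackSize, int nonStackSize, int flags) {
        this.layoutManager = layoutManager;
        this.position = position;
        this.stackSize = stackSize;
        this.nonStackSize = nonStackSize;
        this.flags = flags;
    }

    boolean isSet(int flag) { return (flags & flag) != 0; }

    /** Builds the BP a parent LM would return: wrapped position, added size, cleared flags. */
    BreakPoss wrappedBy(Object parentLM, int extraStackSize, int flagsToClear) {
        return new BreakPoss(parentLM, new WrappedPosition(position),
                stackSize + extraStackSize, nonStackSize, flags & ~flagsToClear);
    }
}
```

- <p>In this sketch an inline LM that adds border and padding and has a
- keep condition would call wrappedBy with the extra size and with
- CAN_BREAK_AFTER as the flag to clear.</p>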
- </section>
- <section>
- <title>Turning Break Possibilities into Areas</title>
- <p>Once break possibilities have been generated, the galley-level
- layout manager selects the best one
- and passes it back to the LayoutManager which generated it to create
- the area. A LayoutManager is responsible for
- storing enough information in its Position objects to be able to
- create the corresponding areas.</p>
- </section>
- <section>
- <title>A walk-through</title>
- <p>Layout Managers are created from the top down. First the
- page sequence creates a PageLM and a FlowLM. The PageLM will manage
- finding the right page model (with help from the PageSequenceMaster)
- and managing the balancing act between before-floats, footnotes and
- the normal text flow. The FlowLM will
- manage the normal content in the main flow. We can think of it as a
- <em>galley</em> manager.
- </p>
- <p>In general, each LM asks its child LMs to return successive
- break possibilities. It passes some
- information to the child in a flags object and it gets back
- a break possibility which contains the size in
- the stacking direction as well as information about such things as
- anchors, break conditions and span conditions which can change the
- reference area environment. This process continues down to the lowest
- level of the layout manager hierarchy which corresponds to atomic
- inline-level FOs such as characters or graphics.
- </p>
- <p>
- Each layout manager will repeatedly call getNextBreakPoss on its current
- child LM until the child returns a BP with the ISLAST
- flag set. Then the layout manager moves on to its next child LM (i.e.,
- it asks the next child FO to generate a layout manager). Galley-level
- layout managers, which are Line and Flow, will return to their parent
- layout managers either when they have finished their content or when
- they encounter a BP which will fill one of their areas.
- </p>
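- <p>The iteration over child LMs might look something like the sketch
- below. Only the method name getNextBreakPoss and the ISLAST flag are
- taken from the text; Bp, ChildLM, FixedChildLM and ParentLM are
- hypothetical names, and returning null at the end of content is an
- assumption of the example.</p>

```java
// Sketch only: Bp, ChildLM, FixedChildLM and ParentLM are hypothetical names.
final class Bp {
    final int size;
    final boolean isLast; // plays the role of the ISLAST flag
    Bp(int size, boolean isLast) { this.size = size; this.isLast = isLast; }
}

interface ChildLM {
    Bp getNextBreakPoss();
}

/** Child that returns one BP per entry of a non-empty size list. */
final class FixedChildLM implements ChildLM {
    private final int[] sizes;
    private int next = 0;
    FixedChildLM(int... sizes) { this.sizes = sizes; }

    public Bp getNextBreakPoss() {
        int i = next++;
        return new Bp(sizes[i], i == sizes.length - 1);
    }
}

final class ParentLM {
    private final ChildLM[] children;
    private int current = 0;
    ParentLM(ChildLM... children) { this.children = children; }

    /** Next BP from the current child, or null once every child is finished. */
    Bp getNextBreakPoss() {
        if (current >= children.length) {
            return null;
        }
        Bp bp = children[current].getNextBreakPoss();
        boolean onLastChild = current == children.length - 1;
        if (bp.isLast) {
            current++; // this child is done; move on to the next one
        }
        // The parent's own BP is last only if the last child has finished.
        return new Bp(bp.size, bp.isLast && onLastChild);
    }
}
```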
- <p>The break possibilities are generated from the bottom up.
- All inline content must first be broken into
- lines which are then stacked into block areas. This is done by the
- LineLayoutManager, which creates line areas.
- The LineLM asks its child LM to generate a break possibility, which
- represents a place where the line can end. This
- initially means each potential line-end (primarily spaces or forced
- linefeeds and a few other potential line-end characters such as hard
- hyphens). The text LM returns an object which stores the size in the
- stacking direction as a MinOptMax triplet
- and a <em>cost</em>, which is based on how well this break
- would satisfy the constraints. The Text LM keeps track of its position in
- the text content and returns the total size of the text area it would
- create if it were to break at a given point. The returned BP
- object also contains information about whether the break is forced
- (linefeed) or whether this is the last area which can be generated by
- the LM (ISLAST flag). If a textFO ends on a non-break character, the
- ISLAST flag is set, but the CAN_BREAK_AFTER flag isn't, since we don't
- know if there is any following text in another inline object for
- example.
- </p>
- <p>Variable size content is taken into account from
- the bottom up. Each LM returns a range of sizes in the stacking
- direction, based on property values. For text, this comes from
- variable word-space values or letter-space values. For other inline
- objects, it may include variable space-start and space-end values
- (after calculation of the entire sequence of space specifiers at a
- particular break possibility).</p>
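- <p>As a minimal illustration of how such ranges accumulate, the sketch
- below adds min/opt/max triplets component by component. It is a
- standalone example written for this document; the method names fixed
- and plus are assumptions, and the units are arbitrary.</p>

```java
// Standalone sketch written for illustration. Units are arbitrary.
final class MinOptMax {
    final int min;
    final int opt;
    final int max;

    MinOptMax(int min, int opt, int max) {
        if (min > opt || opt > max) {
            throw new IllegalArgumentException("need min <= opt <= max");
        }
        this.min = min;
        this.opt = opt;
        this.max = max;
    }

    /** A size with no stretch or shrink, such as a word of fixed width. */
    static MinOptMax fixed(int size) {
        return new MinOptMax(size, size, size);
    }

    /** Component-wise sum: the range of a sequence is the sum of its parts. */
    MinOptMax plus(MinOptMax other) {
        return new MinOptMax(min + other.min, opt + other.opt, max + other.max);
    }
}
```

- <p>For instance, two words of fixed width 4000 and 6000 separated by a
- word space of (2000, 3000, 5000) give a total of (12000, 13000,
- 15000).</p>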
- <p>The main constraint for laying out
- lines is the available inline-progression-dimension (IPD) for the line
- area to be created. This
- depends on the IPD of the reference area ancestor, on the indents of the
- containing fo:block, and on any side-floats which may be intruding on
- this line.</p>
- <note>See below <link href="#getRefIPD">Getting the Reference
- IPD</link>
- for discussion of how the reference area IPD is
- transmitted to the Line LM.</note>
- <p>For now, let's assume that only the LineLM knows about the IPD
- available to it. Therefore only it can make a decision about which BP
- is the best one; the lower level inline layout managers can only
- return potential break points.</p>
- <note>There are certainly optimizations to this model which can be
- examined later.</note>
- <p>So the Line LM will ask its child LM(s) for break possibilities until
- it gets back a BP whose stacking dimension <em>could</em> fill the
- line. This means that BP.stackdim.max >= LineIPD.min. It can look
- for further BP, perhaps one whose stackdim.opt is closer to the
- LineIPD.opt. If it isn't happy with the choice of break possibilities,
- it can go past the end of the line to the next one, and then try to
- find a hyphenation point between the last one which fits and the first
- one which doesn't. If no possibility is found whose min/max values
- enclose the available IPD, some constraint will be violated (and
- reported in the log). The actual strategy is up to the Line LM and
- should be easy to replace without changing the architecture
- (Strategy pattern).
- </p>
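- <p>One possible, deliberately simple, strategy is sketched below to
- show how the choice could be isolated behind an interface. The names
- LineBreakStrategy and ClosestOptStrategy are hypothetical, the line IPD
- is reduced to a single value, and the fallback rules (prefer a short
- line, otherwise overflow) are assumptions of the example rather than
- part of the proposal.</p>

```java
// Sketch only: LineBreakStrategy and ClosestOptStrategy are hypothetical names.
final class MinOptMax {
    final int min;
    final int opt;
    final int max;
    MinOptMax(int min, int opt, int max) {
        this.min = min;
        this.opt = opt;
        this.max = max;
    }
}

/** Pluggable policy: picks the index of the break to use for one line. */
interface LineBreakStrategy {
    int choose(MinOptMax[] candidates, int lineIpd);
}

/** Prefers the candidate whose range encloses lineIpd and whose opt is closest to it. */
final class ClosestOptStrategy implements LineBreakStrategy {
    public int choose(MinOptMax[] candidates, int lineIpd) {
        if (candidates.length == 0) {
            throw new IllegalArgumentException("no break possibilities");
        }
        int best = -1;
        int bestDistance = Integer.MAX_VALUE;
        int lastFitting = -1;
        for (int i = 0; i < candidates.length; i++) {
            MinOptMax c = candidates[i];
            if (c.min <= lineIpd) {
                lastFitting = i; // fits, possibly leaving the line short
            }
            boolean encloses = c.min <= lineIpd && lineIpd <= c.max;
            int distance = Math.abs(c.opt - lineIpd);
            if (encloses && distance < bestDistance) {
                best = i;
                bestDistance = distance;
            }
        }
        if (best >= 0) {
            return best;
        }
        // No range encloses the IPD, so a constraint is violated:
        // accept a short line if possible, otherwise let the first BP overflow.
        return lastFitting >= 0 ? lastFitting : 0;
    }
}
```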
- <p>The definition of a good break possibility depends on the
- properties at the block and inline level which govern things such as
- wrapping behavior and justification mode. For example, if lines are
- not to be wrapped, only an explicit linefeed can serve as a BP. If
- lines are wrapped but not justified then there is no requirement to
- completely fill the IPD on each line, but a sophisticated layout
- manager will try to achieve "aesthetic rag".
- </p>
- <p>Note that no areas have actually been created yet. Once the LineLM
- has found a potential break point for the inline content, it can
- calculate the total size of the line area which would be created. The
- size in the IPD is determined by the Line LM based on the chosen BP.
- The size of the line area in the block-progression-dimension
- depends on the size of the text (or other inline content). These
- values are set by the inline-level LMs
- in their returned BPs (in terms of ascender and descender heights with
- respect to the baseline). The LineLM adds spacing implied by the
- current line-stacking strategy and line-height property values. It
- stores a reference to the chosen inline BP and "wraps" that in its own
- Position object which it stores in the BP it returns to its parent LM
- (the block layout manager).
- </p><p>The block LM now has a potential break position after its
- first line. It assigns that possibility a cost, based on widow, orphan
- and keep properties. It can also calculate the total size of the block
- area it would create, were it to end the area after this line. It does
- this by adding any padding and border (taking into account
- conditionality). It also calculates space-before and space-after
- values, or contributes to building up a sequence of such values.
- With this information, the block LM creates a new BP (or
- updates the existing one). It stores a Position object in this
- BP which wraps the returned BP from its child Line LM.
- It returns the new BP to its parent and so on, back up to the
- FlowLM.</p>
- <p>Obviously there is more complicated logic involved when dealing
- with lists and tables. These cases need to be walked through in detail.</p>
- <p>The FlowLM sees if the returned stacking dimension will still
- fit in its available block-progression-dimension (BPD). It repeatedly calls
- getNextBreakPoss on its
- child LMs until it reaches the maximum BPD for the flow reference area
- or until there is no more content to lay out. If one child LM is
- finished, it moves on to the next until the last child LM has returned
- a BP with the ISLAST flag set. If any child LM returns a
- BP with a FORCE_BREAK_BEFORE or SPAN flag set, the FlowLM will
- force layout of any pending break possibilities and return to its
- parent (the PageLM) in order to handle the break or span condition.</p>
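- <p>The stopping conditions of this loop can be illustrated with a small
- sketch. BlockBp, FlowFill and countAccepted are hypothetical names. The
- example assumes that each BP carries the BPD it adds to the flow and
- that a single boolean stands in for the FORCE_BREAK_BEFORE and SPAN
- flags; neither assumption is stated in the proposal.</p>

```java
// Sketch only: BlockBp, FlowFill and countAccepted are hypothetical names.
final class BlockBp {
    final int bpd;                  // BPD this BP adds to the flow (assumed incremental)
    final boolean forceBreakBefore; // stands in for FORCE_BREAK_BEFORE or SPAN
    BlockBp(int bpd, boolean forceBreakBefore) {
        this.bpd = bpd;
        this.forceBreakBefore = forceBreakBefore;
    }
}

final class FlowFill {
    /** Returns how many leading BPs are accepted before the area is full or a break is forced. */
    static int countAccepted(BlockBp[] bps, int maxBpd) {
        int used = 0;
        int count = 0;
        for (BlockBp bp : bps) {
            // A forced break ends the area if something is already pending;
            // on the very first BP the break is assumed to be handled already.
            if (bp.forceBreakBefore && count > 0) {
                break;
            }
            if (used + bp.bpd > maxBpd) {
                break; // the flow reference area is full
            }
            used += bp.bpd;
            count++;
        }
        return count;
    }
}
```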
- <p>If the returned BP has any new before-float or footnote anchors in
- it (ANCHOR flag in the
- BP), the FlowLM will also return to the PageLM. The PageLM must then
- try to find space to place the floats, possibly asking the FlowLM for
- help if the body contains multiple columns.</p>
- </section>
- <section>
- <title>Some issues</title>
- <p>Following are a few remarks on specific issues.</p>
- <section>
- <title>Where Line Layout Managers are created</title>
- <p>If the first child FO in a block FO is an inline-level FO
- such as text, the block LM creates an intermediate level LineLM
- to lay out the
- sequence of inline content into Lines. Note that the whole sequence of
- inline FOs is managed by a single instance of LineLM. The LineLM
- becomes the parent to the various inline-level LM created by each
- individual inline FO.
- Since an fo:block can have both block and inline content, its LM
- may create a sequence of intermixed BlockLM and LineLM.</p>
- </section>
- <section id="getRefIPD">
- <title>Getting the reference IPD</title>
- <p>When the layout process starts, with the FlowLM asking its first
- child LM for a break possibility, the IPD isn't known, since we don't
- know whether
- the first FO might be spanning, or on which page it might start. (Of
- course, if all page masters in the sequence have the same region-body IPD
- and all have only a single column, the IPD will never change
- and could already be calculated before starting layout.)
- The FlowLM gets its
- first child LM and calls its getNextBreakPoss method. That is a child LM for
- some block-level FO. For now, suppose it's an fo:block. The BlockLM
- will create its first child LM, which may be another block-level LM in
- the case of nested blocks or a LineLM as explained above. (Question:
- do we need a START flag for layout status?)
- </p>
- <p>We keep calling getNextBreakPoss on lower level layout managers until we
- get down to the inline level or to a level which cannot have break-before
- properties, such as a list-item-label. At that point, we assume we are
- going to have to lay out some actual content. But we can't do that yet
- since we don't know the inline-progression-dimension. So we return a
- BP object which has 0 size in the stacking dimension, but which
- has flags set to signal to
- higher-level layout managers what needs to be done. If it has a break-before
- property or a span property, it stores these in the BP. If
- no reference IPD is yet defined, it sets a flag to get that. It then
- returns to its parent. The parent LM will inspect the BP object
- returned. In general, it "wraps" it with information about its own
- needs. If the returned BP is not actually returning any potential
- areas, the LM can still add information about its own break or span
- requirements. This return path continues back up to the PageLM. It
- will then check break and span requirements and create a new page
- if necessary using the appropriate page-master. At that point, the
- reference IPD for the main
- flow is known and is set in the flags object used for
- the next getNextBreakPoss call to the lower level LM.
- </p><p>Using this information, the BlockLM parent can now calculate
- the available IPD for its LineLM child, based on its indents.
- (If there are any
- side-floats, information about the intrusion must be passed down by the
- FlowLM to lower level managers.) The LineLM can now generate a series
- of BreakPoss objects, which it passes back to its parent LM.
- </p>
- </section>
- <section>
- <title>Hyphenation</title>
- <p>
- The LineLM is responsible for initiating hyphenation if it is allowed
- by the properties and if no satisfactory BP can be found without
- hyphenating. The hyphenation manager is passed two break
- possibilities, one whose IPD is less than the desired line area IPD
- and one whose IPD is greater. These break possibilities might have
- been generated by different inline-level layout managers (text + a
- wrapper with a color change for example), though
- frequently they represent two positions in a single text run.
- If hyphenation is successful, a new BP is
- returned. The LineLM may look for several intermediate BP
- based on the "cost" of the returned possibilities. If no intermediate
- BP is found, the line will be "short", the white-space stretch will be
- exceeded, or perhaps the content will be overflowed or clipped,
- depending on various property settings.</p>
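- <p>The selection of an intermediate BP can be sketched as below. The
- sketch does not find hyphenation points itself; it assumes they have
- already been determined and are given as the line IPD that would result
- from breaking at each point (hyphen width included). HyphenationSketch
- and bestIntermediate are hypothetical names, and returning -1 when
- nothing fits is a convention of the example.</p>

```java
// Sketch only: HyphenationSketch and bestIntermediate are hypothetical names.
final class HyphenationSketch {
    /**
     * shortIpd and longIpd are the IPDs of the two BPs passed in;
     * candidateIpds holds the IPD resulting from each hyphenation point.
     * Returns the widest candidate between the two BPs that still fits
     * lineIpd, or -1 if no candidate fits.
     */
    static int bestIntermediate(int shortIpd, int longIpd,
                                int[] candidateIpds, int lineIpd) {
        if (!(shortIpd < lineIpd && lineIpd < longIpd)) {
            throw new IllegalArgumentException("line IPD must lie between the two BPs");
        }
        int best = -1;
        for (int ipd : candidateIpds) {
            boolean between = ipd > shortIpd && ipd < longIpd;
            if (between && ipd <= lineIpd && ipd > best) {
                best = ipd;
            }
        }
        return best;
    }
}
```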
- </section>
- <section>
- <title>Optimizing</title>
- <p>It seems inefficient to go down to the lowest level
- LM and back up to the FlowLM for every possible line-break
- decision. It seems like it would be possible to optimize by letting
- the lower level layout managers run until they had exceeded the
- current limit in
- the stacking direction. They would then return control to the "galley"
- level (LineLM or FlowLM) which would fine-tune the break decision by
- asking the lower level LM to find a previous BP which would fit. At
- the inline level, this means hyphenation as described above.</p>
- <p>Another interesting question is at what point pending break
- possibilities can be turned into areas. The idea is to wait until we
- are sure we won't have to redo the breaking. This depends on the
- sophistication of the layout strategy. For example, if a
- linebreak can be considered final once the line is full and there are no
- anchors on the line, we could create the LineArea at that point. But
- if we are willing to change a previous line-end decision to get a
- better overall composition of a whole group of lines (to prevent multiple
- hyphens for example), we might wait until the LineLM had finished
- laying out all its material and then make all the Lines at once.</p>
- </section>
- </section>
- </body>
- </document>