Style only: corrected indentation

git-svn-id: https://svn.apache.org/repos/asf/xmlgraphics/fop/branches/Temp_Interleaved_Page_Line_Breaking@707045 13f79535-47bb-0310-9956-ffa450edef68
author: Vincent Hennebert <vhennebert@apache.org> 2008-10-22 11:53:11 +0000
committer: Vincent Hennebert <vhennebert@apache.org> 2008-10-22 11:53:11 +0000
commit: 1a1ff8d4477f42fc77e03ccf9fa100816edcc6aa (patch)
tree: 45c520035af796d28348f4beec2205c6340f4fd8
parent: 6fd33a0ca40a436f66a03fe0a05bec465e20d3fc (diff)
download: xmlgraphics-fop-1a1ff8d4477f42fc77e03ccf9fa100816edcc6aa.tar.gz
xmlgraphics-fop-1a1ff8d4477f42fc77e03ccf9fa100816edcc6aa.zip
1 files changed, 278 insertions, 273 deletions
diff --git a/prototype/doc/src/docbook/prototype.xml b/prototype/doc/src/docbook/prototype.xml
index ba578f6bb..6f1867bfb 100644
--- a/prototype/doc/src/docbook/prototype.xml
+++ b/prototype/doc/src/docbook/prototype.xml
@@ -184,148 +184,153 @@ it shone in her face.</literallayout>
         how it behaves in tables.</para>
       <section>
         <title>The Spring Paradigm</title>
-      <para>Glues can be seen as springs that can be both compressed and stretched (in the general 
-        case). Only, instead of having a <ulink 
-          url="http://en.wikipedia.org/wiki/Spring_(device)">spring constant</ulink> 
-        <inlineequation><mathphrase>k</mathphrase></inlineequation>, where the force needed to 
-        compress (resp. extend) the spring is proportional to the amount of compression (resp. 
-        extension), the relation is more complex: the force is a polynomial function of the 
-        displacement. The value of the force actually is the demerits function defined by Knuth. It 
-        was designed in a way that small displacements would be considered as almost equivalent to 
-        the natural length, while big displacements would quickly become unacceptable; <xref 
-          linkend="fig_demerits"/> shows an example where the natural length is 10, and the content 
-        can be shrunk down to 8 and stretched up to 14.</para>
-      <figure id="fig_demerits">
-        <title>Demerits, functions of the size of the part (line, page, etc.)</title>
-        <titleabbrev>Demerits function</titleabbrev>
-        <mediaobject>
-          <imageobject>
-            <imagedata fileref="demerits"/>
-          </imageobject>
-        </mediaobject>
-      </figure>
-      <para>The spring image still applies though; we just have to imagine that it will be even more 
-        difficult than usual to compress/extend the spring to its maximum. Then we can see pages or 
-        tables as frames to which we attach springs representing the region body or the columns. In 
-        the case of a table we can imagine to start with the first column, and successively attach a 
-        spring for every column. Depending on what the column contains we will have to compress or 
-        extend the spring, which will have an effect on the other columns as well as the possible 
-        text preceding the table on the page (see <xref linkend="fig_springs"/>).</para>
-      <figure id="fig_springs">
-        <title>Seeing content as springs</title>
-        <mediaobject>
-          <imageobject>
-            <imagedata fileref="springs"/>
-          </imageobject>
-        </mediaobject>
-      </figure>
-      <para>By using this representation it is relatively easy to ‘feel’ what happens: the springs 
-        will adjust such that the total force applied on them and the frame will be minimal. From an 
-        analytical point of view the adjustment ratio for every spring will be computed such that 
-        the sum of the demerits will be minimal. This is not easy with the demerits function as it 
-        is used currently, but it is possible to choose another function that will behave similarly 
-        to Knuth’s original one, yet will simplify the computation of the minimum.</para>
-      </section>
+        <para>Glues can be seen as springs that can be both compressed and stretched (in the general 
+          case). Only, instead of having a <ulink 
+            url="http://en.wikipedia.org/wiki/Spring_(device)">spring constant</ulink> 
+          <inlineequation><mathphrase>k</mathphrase></inlineequation>, where the force needed to 
+          compress (resp. extend) the spring is proportional to the amount of compression (resp. 
+          extension), the relation is more complex: the force is a polynomial function of the 
+          displacement. The value of the force actually is the demerits function defined by Knuth. 
+          It was designed in a way that small displacements would be considered as almost equivalent 
+          to the natural length, while big displacements would quickly become unacceptable; <xref 
+            linkend="fig_demerits"/> shows an example where the natural length is 10, and the 
+          content can be shrunk down to 8 and stretched up to 14.</para>
+        <figure id="fig_demerits">
+          <title>Demerits, functions of the size of the part (line, page, etc.)</title>
+          <titleabbrev>Demerits function</titleabbrev>
+          <mediaobject>
+            <imageobject>
+              <imagedata fileref="demerits"/>
+            </imageobject>
+          </mediaobject>
+        </figure>
+        <para>The spring image still applies though; we just have to imagine that it will be even 
+          more difficult than usual to compress/extend the spring to its maximum. Then we can see 
+          pages or tables as frames to which we attach springs representing the region body or the 
+          columns. In the case of a table we can imagine to start with the first column, and 
+          successively attach a spring for every column. Depending on what the column contains we 
+          will have to compress or extend the spring, which will have an effect on the other columns 
+          as well as the possible text preceding the table on the page (see <xref 
+            linkend="fig_springs"/>).</para>
+        <figure id="fig_springs">
+          <title>Seeing content as springs</title>
+          <mediaobject>
+            <imageobject>
+              <imagedata fileref="springs"/>
+            </imageobject>
+          </mediaobject>
+        </figure>
+        <para>By using this representation it is relatively easy to ‘feel’ what happens: the springs 
+          will adjust such that the total force applied on them and the frame will be minimal. From 
+          an analytical point of view the adjustment ratio for every spring will be computed such 
+          that the sum of the demerits will be minimal. This is not easy with the demerits function 
+          as it is used currently, but it is possible to choose another function that will behave 
+          similarly to Knuth’s original one, yet will simplify the computation of the 
+          minimum.</para>
+        </section>
       <section>
         <title>Challenges</title>
-      <para>The problem is that there may be many ways to break every column, corresponding to more 
-        or less content on the first page. In other words, there may be many different springs to 
-        play with for each column, and trying every combination of them obviously leads to 
-        combinatorial explosion. Some combinations are obviously nonsensical and should not even be 
-        tested; for example, when there is only one line of content in the first column, three lines 
-        in the second and it’s obvious that the first column may also be filled with three lines 
-        (see <xref linkend="fig_table_silly-combination"/>).</para>
-      <figure id="fig_table_silly-combination">
-        <title>An obviously wrong combination of table columns</title>
-        <mediaobject>
-          <imageobject>
-            <imagedata fileref="table_silly-combination"/>
-          </imageobject>
-        </mediaobject>
-      </figure>
-      <para>A user’s reaction to such a combination is easily predictable… The problem is to detect 
-        that this combination is unacceptable and that a better one must be found. One could play 
-        with demerits, that is, count additional demerits for the empty space at the bottom of the 
-        first column on the first page, and the second column on the second page. That way a 
-        solution with less empty space would be preferred. But this won’t prevent the first one from 
-        being chosen if this leads to a better overall layout (more evenly filled pages, less widows 
-        and orphans, etc.). And even if the overall layout is better, the user is most likely to not 
-        notice it and complain about the awkward way the table was broken.</para>
-      <para>So such a combination must simply be forbidden, and not considered to make a feasible 
-        layout in any case. Now, it is likely that many combinations of the columns will be similar 
-        to that one, therefore not acceptable. It would be good if we could skip them all and 
-        consider only the combination of layouts that <emphasis>will</emphasis> lead to a feasible 
-        result. The number of combinations to consider should then become reasonable (see <xref 
-          linkend="fig_table_acceptable-combinations"/>).</para>
-      <figure id="fig_table_acceptable-combinations">
-        <title>Acceptable combinations of columns: three ways of breaking each column, leading to 
-          only three combinations</title>
-        <titleabbrev>Acceptable combinations of columns</titleabbrev>
-        <mediaobject>
-          <imageobject>
-            <imagedata fileref="table_acceptable-combinations"/>
-          </imageobject>
-        </mediaobject>
-      </figure>
-      <para>Another problem is to create a list of elements that will represent all the columns. In 
-        other words, replace a set of springs with a single one that will behave the same way. 
-        That’s probably not possible: either the resulting spring will fail to represent all the 
-        flexibility offered by the individual springs; or the demerits won’t match the sum of the 
-        individual column demerits. On top of that there is the desynchronization that can happen 
-        inside a row (see the <ulink 
-          url="http://wiki.apache.org/xmlgraphics-fop/TableLayout/KnownProblems/">wiki page</ulink> 
-        about known issues in the current approach). Several lists will probably have to be computed 
-        depending on the breaks chosen for the preceding pages.</para>
-      <para>So maybe that the box/glue/penalty model simply needs to be dropped, and that another 
-        model can be found that will be able to represent tables in an elegant way and allow to 
-        solve the problem easily. Needless to say that such a model has yet to be discovered…</para>
+        <para>The problem is that there may be many ways to break every column, corresponding to 
+          more or less content on the first page. In other words, there may be many different 
+          springs to play with for each column, and trying every combination of them obviously leads 
+          to combinatorial explosion. Some combinations are obviously nonsensical and should not 
+          even be tested; for example, when there is only one line of content in the first column, 
+          three lines in the second and it’s obvious that the first column may also be filled with 
+          three lines (see <xref linkend="fig_table_silly-combination"/>).</para>
+        <figure id="fig_table_silly-combination">
+          <title>An obviously wrong combination of table columns</title>
+          <mediaobject>
+            <imageobject>
+              <imagedata fileref="table_silly-combination"/>
+            </imageobject>
+          </mediaobject>
+        </figure>
+        <para>A user’s reaction to such a combination is easily predictable… The problem is to 
+          detect that this combination is unacceptable and that a better one must be found. One 
+          could play with demerits, that is, count additional demerits for the empty space at the 
+          bottom of the first column on the first page, and the second column on the second page. 
+          That way a solution with less empty space would be preferred. But this won’t prevent the 
+          first one from being chosen if this leads to a better overall layout (more evenly filled 
+          pages, less widows and orphans, etc.). And even if the overall layout is better, the user 
+          is most likely to not notice it and complain about the awkward way the table was 
+          broken.</para>
+        <para>So such a combination must simply be forbidden, and not considered to make a feasible 
+          layout in any case. Now, it is likely that many combinations of the columns will be 
+          similar to that one, therefore not acceptable. It would be good if we could skip them all 
+          and consider only the combination of layouts that <emphasis>will</emphasis> lead to a 
+          feasible result. The number of combinations to consider should then become reasonable (see 
+          <xref linkend="fig_table_acceptable-combinations"/>).</para>
+        <figure id="fig_table_acceptable-combinations">
+          <title>Acceptable combinations of columns: three ways of breaking each column, leading to 
+            only three combinations</title>
+          <titleabbrev>Acceptable combinations of columns</titleabbrev>
+          <mediaobject>
+            <imageobject>
+              <imagedata fileref="table_acceptable-combinations"/>
+            </imageobject>
+          </mediaobject>
+        </figure>
+        <para>Another problem is to create a list of elements that will represent all the columns. 
+          In other words, replace a set of springs with a single one that will behave the same way. 
+          That’s probably not possible: either the resulting spring will fail to represent all the 
+          flexibility offered by the individual springs; or the demerits won’t match the sum of the 
+          individual column demerits. On top of that there is the desynchronization that can happen 
+          inside a row (see the <ulink 
+            url="http://wiki.apache.org/xmlgraphics-fop/TableLayout/KnownProblems/">wiki 
+            page</ulink> about known issues in the current approach). Several lists will probably 
+          have to be computed depending on the breaks chosen for the preceding pages.</para>
+        <para>So maybe that the box/glue/penalty model simply needs to be dropped, and that another 
+          model can be found that will be able to represent tables in an elegant way and allow to 
+          solve the problem easily. Needless to say that such a model has yet to be 
+          discovered…</para>
       </section>
     </section>
     <section>
       <title>Breaking Paragraphs and Merging Columns</title>
       <section>
         <title>Multiple Layouts for a Paragraph</title>
-      <para>With the new approach it is possible to take into account several ways of typesetting a 
-        paragraph, differing by the number of lines that it occupies. For example, the optimal way 
-        will be five lines, but there are also a four-line possibility and a six-line possibility.  
-        In the four-line version spaces are likely to be shrunk a lot, while on the six-line they 
-        will be much stretched. While those possibilities look less good than the five-line version, 
-        they may be preferred in the final layout because they would allow to avoid a widow or 
-        orphan line somewhere else, leading to less global demerits; or even lead to a feasible 
-        global layout while the optimal version wouldn’t.</para>
-      <para>In practice, this is as if we were dealing with several lists of boxes, glues and 
-        penalties, one for each version of the paragraph (note that there is a <ulink 
-          url="http://wiki.apache.org/xmlgraphics-fop/WhitespaceManagement">wiki page</ulink> on a 
-        similar topic). This can be handled transparently by the breaking algorithm: new layouts 
-        will simply be added for each possibility (obviously the number of active nodes will be 
-        multiplied by the number of versions in which a paragraph can be typeset). The principle 
-        upon which the algorithm relies remains the same, that is: “the best way to end part (page) 
-        <inlineequation><mathphrase>n</mathphrase></inlineequation> on <emphasis>this</emphasis> 
-        element is by going <emphasis>that</emphasis> route”. Suppose there is an image after the 
-        paragraph (see <xref linkend="fig_paragraph_different-numbers-of-lines"/>); when considering 
-        a page break after the image, there will be (at least) three available layouts: one made of 
-        the four-line version of the paragraph, one made of the five-line version, and one of the 
-        six-line version. The best of them will be chosen; maybe that the four-line version leads to 
-        an unfinished page (there is not enough elasticity to stretch the content up to the end of 
-        the page); maybe the six-line version makes the content perfectly fit in the page (no 
-        shrinking or stretching). Then it would be chosen over the five-line version, because the 
-        absence of stretching at the page level would compensate for the slightly sub-optimal layout 
-        of that paragraph.</para>
-      <figure id="fig_paragraph_different-numbers-of-lines">
-        <title>Three different paragraph layouts to consider</title>
-        <mediaobject>
-          <imageobject>
-            <imagedata fileref="paragraph_different-numbers-of-lines"/>
-          </imageobject>
-        </mediaobject>
-      </figure>
-      <para>However, weird things may happen; for instance, let’s suppose we have a sequence of 
-        three paragraphs separated by elastic spaces. The first two paragraphs may be typeset on 
-        either four or five lines; the first space has a value of 1+1−1 (natural length 1, 
-        stretchable up to 2 and shrinkable down to 0); the second has a value of 1.5+0−1. Since the 
-        four-line versions are shorter this is possible to add the second space and one line of the 
-        third paragraph on the page. Let’s see what are the resulting min-opt-max:</para>
-      <literallayout class="monospaced">       5 lines:    4 lines:
+        <para>With the new approach it is possible to take into account several ways of typesetting 
+          a paragraph, differing by the number of lines that it occupies. For example, the optimal 
+          way will be five lines, but there are also a four-line possibility and a six-line 
+          possibility.  In the four-line version spaces are likely to be shrunk a lot, while on the 
+          six-line they will be much stretched. While those possibilities look less good than the 
+          five-line version, they may be preferred in the final layout because they would allow to 
+          avoid a widow or orphan line somewhere else, leading to less global demerits; or even lead 
+          to a feasible global layout while the optimal version wouldn’t.</para>
+        <para>In practice, this is as if we were dealing with several lists of boxes, glues and 
+          penalties, one for each version of the paragraph (note that there is a <ulink 
+            url="http://wiki.apache.org/xmlgraphics-fop/WhitespaceManagement">wiki page</ulink> on a 
+          similar topic). This can be handled transparently by the breaking algorithm: new layouts 
+          will simply be added for each possibility (obviously the number of active nodes will be 
+          multiplied by the number of versions in which a paragraph can be typeset). The principle 
+          upon which the algorithm relies remains the same, that is: “the best way to end part 
+          (page) <inlineequation><mathphrase>n</mathphrase></inlineequation> on 
+          <emphasis>this</emphasis> element is by going <emphasis>that</emphasis> route”. Suppose 
+          there is an image after the paragraph (see <xref 
+            linkend="fig_paragraph_different-numbers-of-lines"/>); when considering a page break 
+          after the image, there will be (at least) three available layouts: one made of the 
+          four-line version of the paragraph, one made of the five-line version, and one of the 
+          six-line version. The best of them will be chosen; maybe that the four-line version leads 
+          to an unfinished page (there is not enough elasticity to stretch the content up to the end 
+          of the page); maybe the six-line version makes the content perfectly fit in the page (no 
+          shrinking or stretching). Then it would be chosen over the five-line version, because the 
+          absence of stretching at the page level would compensate for the slightly sub-optimal 
+          layout of that paragraph.</para>
+        <figure id="fig_paragraph_different-numbers-of-lines">
+          <title>Three different paragraph layouts to consider</title>
+          <mediaobject>
+            <imageobject>
+              <imagedata fileref="paragraph_different-numbers-of-lines"/>
+            </imageobject>
+          </mediaobject>
+        </figure>
+        <para>However, weird things may happen; for instance, let’s suppose we have a sequence of 
+          three paragraphs separated by elastic spaces. The first two paragraphs may be typeset on 
+          either four or five lines; the first space has a value of 1+1−1 (natural length 1, 
+          stretchable up to 2 and shrinkable down to 0); the second has a value of 1.5+0−1. Since 
+          the four-line versions are shorter this is possible to add the second space and one line 
+          of the third paragraph on the page. Let’s see what are the resulting min-opt-max:</para>
+        <literallayout class="monospaced">       5 lines:    4 lines:
         5           4
         1+1−1       1+1−1
         5           4
@@ -333,149 +338,149 @@ it shone in her face.</literallayout>
         1
        ────────    ──────────
 Total:  11+1−1      11.5+1−2</literallayout>
-      <para>The two possibilities are overlapping, and would both fit in a page with a height of 12; 
-        how to order them? By the total optimum value? By the minimum value? This may be a problem 
-        for the merging algorithm.</para>
-      <para>Maybe there’s a possibility to find a merging process for which this wouldn’t be a 
-        problem, but a simpler way to go is to simply disable this feature when we are inside 
-        tables, and always work with the optimal paragraph layout. Along with restriction 1 of 
-        Knuth’s algorithm (the value <inlineequation><mathphrase>total length − total 
-            shrink</mathphrase></inlineequation> increases when iterating over the elements) this 
-        gives a total ordering of the layouts. Then it’s probably possible to re-use the merging 
-        method currently used in FOP.</para>
+        <para>The two possibilities are overlapping, and would both fit in a page with a height of 
+          12; how to order them? By the total optimum value? By the minimum value? This may be a 
+          problem for the merging algorithm.</para>
+        <para>Maybe there’s a possibility to find a merging process for which this wouldn’t be a 
+          problem, but a simpler way to go is to simply disable this feature when we are inside 
+          tables, and always work with the optimal paragraph layout. Along with restriction 1 of 
+          Knuth’s algorithm (the value <inlineequation><mathphrase>total length − total 
+              shrink</mathphrase></inlineequation> increases when iterating over the elements) this 
+          gives a total ordering of the layouts. Then it’s probably possible to re-use the merging 
+          method currently used in FOP.</para>
       </section>
       <section>
         <title>Adapting the Current Merging Algorithm</title>
-      <para>As a short and simplified reminder, the merging algorithm works the following way:
-        <orderedlist>
-          <listitem>
-            <para>Compute the smallest step for every column (the length of the content before the 
-              first legal break).</para>
-          </listitem>
-          <listitem>
-            <para>Select the maximum of those steps (it has been agreed that every column should 
-              contain some content before a break).</para>
-          </listitem>
-          <listitem>
-            <para>In the other columns, progress along the legal breaks to get as near as this 
-              maximum step without overtaking it. That gives the first legal break in the 
-              row.</para>
-          </listitem>
-          <listitem>
-            <para>From there on, go to the smallest next legal break.</para>
-          </listitem>
-          <listitem>
-            <para>Select every column that has such a break; they make part of the next legal break 
-              inside the table.</para>
+        <para>As a short and simplified reminder, the merging algorithm works the following way:
+          <orderedlist>
+            <listitem>
+              <para>Compute the smallest step for every column (the length of the content before the 
+                first legal break).</para>
             </listitem>
-          <listitem>
-            <para>Go on like this, until the end of the row is reached…</para>
-          </listitem>
-        </orderedlist>
-      </para>
-      <para>Again, this approach works because the elements for the paragraphs are known in advance. 
-        Plus it doesn’t take elastic spaces into account; while this might look like a reasonable 
-        trade-off for tables, the same algorithm is also applied to lists where it becomes an 
-        annoying limitation.</para>
-      <para>When introducing elastic spaces, the legal breaks are no longer at a given length but 
-        may cover an interval. Then a layout from one column may be compatible with several layouts 
-        from another column. The difference will be in the amount of content after the break, which 
-        means that the row may end at different distances from the top of the following page (<xref 
-          linkend="fig_table_distances-from-top"/>). To respect the dynamic programming approach, 
-        all of the layouts must be kept, because the layout that will be part of the optimal 
-        solution may not be the optimal one at the table level (the optimal layout, in the 
-        <emphasis>table context</emphasis>, may not be part of the optimal solution, in the 
-        <emphasis>whole document context</emphasis>).</para>
-      <figure id="fig_table_distances-from-top">
-        <title>Combining one layout on the first column with two different layouts on the second 
-          one, making the table end at different positions</title>
-        <titleabbrev>Multiple column combinations leading to multiple table layouts</titleabbrev>
-        <mediaobject>
-          <imageobject>
-            <imagedata fileref="table_distances-from-top"/>
-          </imageobject>
-        </mediaobject>
-      </figure>
-      <para>So two different sets will be available: layouts that will be said 
-        <emphasis>compatible</emphasis>; that is, it’s possible to play with their elastic spaces so 
-        that all of the columns reach the bottom of the page before the break; and the other ones, 
-        for which at least one column is <quote>underfull</quote>: either it doesn’t provide 
-        stretchability and its height is inferior to the height of the row, or it doesn’t have 
-        enough of stretchability to reach the bottom of the page. Compatible layouts are perfect; 
-        incompatible ones are ok, provided that this is not a situation like in <xref 
-          linkend="fig_table_silly-combination"/>, where a better alternative is available. Let’s 
-        call those latter <emphasis>unacceptable</emphasis>. The whole challenge is to make the 
-        difference between layouts that are incompatible but acceptable, and those that are both 
-        incompatible <emphasis>and</emphasis> unacceptable. Open question: if compatible layouts are 
-        available, should incompatible ones even be considered?</para>
-      <para>Since we want to keep only the best layout for a given paragraph, the column combination 
-        process can’t be started before the ends of the column contents are reached. Indeed, when we 
-        are in the middle of the paragraph, some of the currently active layouts may be specific to 
-        a non-optimal version of the paragraph. If they are selected as parts of a certain column 
-        combination, and then removed because they aren’t part of the optimal paragraph layout, that 
-        will invalidate the corresponding column combination.</para>
-      <para>So we would wait that the paragraphs are typeset before merging the columns. In fact, 
-        the column contents would be typeset separately up to the end, and then the merging process 
-        would be started. The process would be similar to the usual page-breaking mechanism: every 
-        column would be considered as a page from the point of view of the column content. Only, in 
-        the general case the first <quote>page</quote> would not have a set height, because of the 
-        content preceding the table that may have elastic spaces (see the springs from <xref 
-          linkend="fig_springs"/>). This is not really a problem, though, simply a generalization of  
-        the test: check that two intervals intersect instead of that a point belongs to an interval 
-        (actually a point may be seen as a singleton interval). Note that the point of view taken 
-        here is the opposite of the usual one: instead of testing if a layout fits in a page, i.e.,  
-        that it can be shrunk or stretched enough so that its actual dimension matches the page 
-        height, we check that the page height belongs to the interval [min, max] formed by the 
-        layout. In the end, this is the same test. In the generalized case the <quote>page</quote> 
-        height would be an interval instead of a single point.</para>
-      <para>Another modification is that underfull layouts must be considered: indeed, a column is 
-        allowed to be underfull if there’s another one that fills the page. The only thing to pay 
-        attention to is to not build a column combination out of underfull layouts only: at least 
-        one of them must be complete. Underfull layouts will therefore have to be properly 
-        flagged.</para>
-      <para>A potential issue with that approach is that too much work may be performed. In general 
-        several layouts will be available for every column, and it’s possible that some layouts will 
-        not be parts of any final combination. For example, if paragraphs in the first column have 
-        an orphan setting of 1, and 2 in all other columns. See <xref 
-          linkend="fig_table-unused-column-layout"/>: using the first possible layout for column 1 
-        will lead to an unacceptable combination.</para>
-      <figure id="fig_table-unused-column-layout">
-        <title>One of the layouts in the first column will never be used in any final 
-          combination</title>
-        <titleabbrev>Unused column layouts</titleabbrev>
-        <mediaobject>
-          <imageobject>
-            <imagedata fileref="table_unused-column-layout"/>
-          </imageobject>
-        </mediaobject>
-      </figure>
-      <para>So it may be better to start the combining process as soon as possible, but as seen 
-        above paragraphs must be typeset first. Or maybe not? The problem was to define an ordering 
-        on the layouts. When a page break can occur in the middle of a paragraph, is it really a 
-        problem if that paragraph is not finished yet? As long as the previous paragraphs have their 
-        optimal layouts, the total order should be available. All that can happen inside the 
-        paragraph is that there may be a few lines more or less on the page; that means a few more 
-        boxes added to the set of layouts that’s available at the beginning of the paragraph. If 
-        that set was totally ordered, it should remain so when adding the boxes corresponding to the 
-        lines of the current paragraph. That said, it’s not totally clear what happens when not all 
-        the lines have the same height. Let’s imagine that a paragraph contains two inline images 
-        whose heights are 1.5 times the normal line height. Let’s assume there’s one layout in which 
-        both images are on the same line; and another one where spaces are slightly more shrunk so 
-        that the first image can be placed on the preceding line (<xref 
-          linkend="fig_paragraph-with-images"/>). The first layout will be 4.5 units high while the 
-        second one will be 5 units high. Maybe that won’t be a problem, but…</para>
-      <figure id="fig_paragraph-with-images">
-        <title>Layouts of a paragraph containing two inline images</title>
-        <mediaobject>
-          <imageobject>
-            <imagedata fileref="paragraph-with-images"/>
-          </imageobject>
-        </mediaobject>
-      </figure>
-      <para>Another thing to keep in mind that may complicate things is that not all columns may end 
-        on the same page; some with fewer content may end on earlier pages. They must be considered 
-        in the combination, but they won’t participate in the computation of the resulting 
-        min-opt-max.</para>
+            <listitem>
+              <para>Select the maximum of those steps (it has been agreed that every column should 
+                contain some content before a break).</para>
+            </listitem>
+            <listitem>
+              <para>In the other columns, progress along the legal breaks to get as near as this 
+                maximum step without overtaking it. That gives the first legal break in the 
+                row.</para>
+            </listitem>
+            <listitem>
+              <para>From there on, go to the smallest next legal break.</para>
+            </listitem>
+            <listitem>
+              <para>Select every column that has such a break; they make part of the next legal 
+                break inside the table.</para>
+              </listitem>
+            <listitem>
+              <para>Go on like this, until the end of the row is reached…</para>
+            </listitem>
+          </orderedlist>
+        </para>
+        <para>Again, this approach works because the elements for the paragraphs are known in 
+          advance. Plus it doesn’t take elastic spaces into account; while this might look like a 
+          reasonable trade-off for tables, the same algorithm is also applied to lists where it 
+          becomes an annoying limitation.</para>
+        <para>When introducing elastic spaces, the legal breaks are no longer at a given length but 
+          may cover an interval. Then a layout from one column may be compatible with several 
+          layouts from another column. The difference will be in the amount of content after the 
+          break, which means that the row may end at different distances from the top of the 
+          following page (<xref linkend="fig_table_distances-from-top"/>). To respect the dynamic 
+          programming approach, all of the layouts must be kept, because the layout that will be 
+          part of the optimal solution may not be the optimal one at the table level (the optimal 
+          layout, in the <emphasis>table context</emphasis>, may not be part of the optimal 
+          solution, in the <emphasis>whole document context</emphasis>).</para>
+        <figure id="fig_table_distances-from-top">
+          <title>Combining one layout on the first column with two different layouts on the second 
+            one, making the table end at different positions</title>
+          <titleabbrev>Multiple column combinations leading to multiple table layouts</titleabbrev>
+          <mediaobject>
+            <imageobject>
+              <imagedata fileref="table_distances-from-top"/>
+            </imageobject>
+          </mediaobject>
+        </figure>
+        <para>So two different sets will be available: layouts that will be said 
+          <emphasis>compatible</emphasis>; that is, it’s possible to play with their elastic spaces 
+          so that all of the columns reach the bottom of the page before the break; and the other 
+          ones, for which at least one column is <quote>underfull</quote>: either it doesn’t provide 
+          stretchability and its height is inferior to the height of the row, or it doesn’t have 
+          enough of stretchability to reach the bottom of the page. Compatible layouts are perfect; 
+          incompatible ones are ok, provided that this is not a situation like in <xref 
+            linkend="fig_table_silly-combination"/>, where a better alternative is available. Let’s 
+          call those latter <emphasis>unacceptable</emphasis>. The whole challenge is to make the 
+          difference between layouts that are incompatible but acceptable, and those that are both 
+          incompatible <emphasis>and</emphasis> unacceptable. Open question: if compatible layouts 
+          are available, should incompatible ones even be considered?</para>
+        <para>Since we want to keep only the best layout for a given paragraph, the column 
+          combination process can’t be started before the ends of the column contents are reached. 
+          Indeed, when we are in the middle of the paragraph, some of the currently active layouts 
+          may be specific to a non-optimal version of the paragraph. If they are selected as parts 
+          of a certain column combination, and then removed because they aren’t part of the optimal 
+          paragraph layout, that will invalidate the corresponding column combination.</para>
+        <para>So we would wait that the paragraphs are typeset before merging the columns. In fact, 
+          the column contents would be typeset separately up to the end, and then the merging 
+          process would be started. The process would be similar to the usual page-breaking 
+          mechanism: every column would be considered as a page from the point of view of the column 
+          content. Only, in the general case the first <quote>page</quote> would not have a set 
+          height, because of the content preceding the table that may have elastic spaces (see the 
+          springs from <xref linkend="fig_springs"/>). This is not really a problem, though, simply 
+          a generalization of  the test: check that two intervals intersect instead of that a point 
+          belongs to an interval (actually a point may be seen as a singleton interval). Note that 
+          the point of view taken here is the opposite of the usual one: instead of testing if a 
+          layout fits in a page, i.e.,  that it can be shrunk or stretched enough so that its actual 
+          dimension matches the page height, we check that the page height belongs to the interval 
+          [min, max] formed by the layout. In the end, this is the same test. In the generalized 
+          case the <quote>page</quote> height would be an interval instead of a single point.</para>
+        <para>Another modification is that underfull layouts must be considered: indeed, a column is 
+          allowed to be underfull if there’s another one that fills the page. The only thing to pay 
+          attention to is to not build a column combination out of underfull layouts only: at least 
+          one of them must be complete. Underfull layouts will therefore have to be properly 
+          flagged.</para>
+        <para>A potential issue with that approach is that too much work may be performed. In 
+          general several layouts will be available for every column, and it’s possible that some 
+          layouts will not be parts of any final combination. For example, if paragraphs in the 
+          first column have an orphan setting of 1, and 2 in all other columns. See <xref 
+            linkend="fig_table-unused-column-layout"/>: using the first possible layout for column 1 
+          will lead to an unacceptable combination.</para>
+        <figure id="fig_table-unused-column-layout">
+          <title>One of the layouts in the first column will never be used in any final 
+            combination</title>
+          <titleabbrev>Unused column layouts</titleabbrev>
+          <mediaobject>
+            <imageobject>
+              <imagedata fileref="table_unused-column-layout"/>
+            </imageobject>
+          </mediaobject>
+        </figure>
+        <para>So it may be better to start the combining process as soon as possible, but as seen 
+          above paragraphs must be typeset first. Or maybe not? The problem was to define an 
+          ordering on the layouts. When a page break can occur in the middle of a paragraph, is it 
+          really a problem if that paragraph is not finished yet? As long as the previous paragraphs 
+          have their optimal layouts, the total order should be available. All that can happen 
+          inside the paragraph is that there may be a few lines more or less on the page; that means 
+          a few more boxes added to the set of layouts that’s available at the beginning of the 
+          paragraph. If that set was totally ordered, it should remain so when adding the boxes 
+          corresponding to the lines of the current paragraph. That said, it’s not totally clear 
+          what happens when not all the lines have the same height. Let’s imagine that a paragraph 
+          contains two inline images whose heights are 1.5 times the normal line height. Let’s 
+          assume there’s one layout in which both images are on the same line; and another one where 
+          spaces are slightly more shrunk so that the first image can be placed on the preceding 
+          line (<xref linkend="fig_paragraph-with-images"/>). The first layout will be 4.5 units 
+          high while the second one will be 5 units high. Maybe that won’t be a problem, but…</para>
+        <figure id="fig_paragraph-with-images">
+          <title>Layouts of a paragraph containing two inline images</title>
+          <mediaobject>
+            <imageobject>
+              <imagedata fileref="paragraph-with-images"/>
+            </imageobject>
+          </mediaobject>
+        </figure>
+        <para>Another thing to keep in mind that may complicate things is that not all columns may 
+          end on the same page; some with fewer content may end on earlier pages. They must be 
+          considered in the combination, but they won’t participate in the computation of the 
+          resulting min-opt-max.</para>
       </section>
     </section>
   </section>
author	Vincent Hennebert <vhennebert@apache.org>	2008-10-22 11:53:11 +0000
committer	Vincent Hennebert <vhennebert@apache.org>	2008-10-22 11:53:11 +0000
commit	1a1ff8d4477f42fc77e03ccf9fa100816edcc6aa (patch)
tree	45c520035af796d28348f4beec2205c6340f4fd8
parent	6fd33a0ca40a436f66a03fe0a05bec465e20d3fc (diff)
download	xmlgraphics-fop-1a1ff8d4477f42fc77e03ccf9fa100816edcc6aa.tar.gz xmlgraphics-fop-1a1ff8d4477f42fc77e03ccf9fa100816edcc6aa.zip