path: root/src/documentation/content/xdocs/0.20.5/hyphenation.xml
Diffstat (limited to 'src/documentation/content/xdocs/0.20.5/hyphenation.xml')
-rw-r--r-- src/documentation/content/xdocs/0.20.5/hyphenation.xml | 75
1 files changed, 74 insertions, 1 deletions
diff --git a/src/documentation/content/xdocs/0.20.5/hyphenation.xml b/src/documentation/content/xdocs/0.20.5/hyphenation.xml
index 61c510ce6..e4669853f 100644
--- a/src/documentation/content/xdocs/0.20.5/hyphenation.xml
+++ b/src/documentation/content/xdocs/0.20.5/hyphenation.xml
@@ -28,7 +28,7 @@
<section id="intro">
<title>Introduction</title>
<p>FOP uses Liang's hyphenation algorithm, well known from TeX. It needs
- language specific pattern and other data for operation.</p>
+ language specific patterns and other data for operation.</p>
<p>Because of <link href="#license-issues">licensing issues</link> (and for
convenience), all hyphenation patterns for FOP are made available through
the <fork href="http://offo.sourceforge.net/hyphenation/index.html">Objects For
@@ -39,6 +39,79 @@
Please inquire on the <link href="../maillist.html#fop-user">FOP User
mailing list</link>.</note>
</section>
+ <section id="using">
+ <title>Using Hyphenation</title>
+ <p>
+ In order to get words hyphenated, hyphenation has to be
+ enabled explicitly (set property hyphenate="true") and a
+ language has to be defined (e.g. language="en"). Optionally, a
+ country can be specified (e.g. country="GB").
+ </p>
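+ <p>
+ As a minimal sketch, these properties could be set on an
+ <code>fo:block</code> as shown here. The XSL-FO property that
+ enables hyphenation is named <code>hyphenate</code>; the block
+ content is made up for illustration.
+ </p>
+ <source><![CDATA[<fo:block hyphenate="true" language="en" country="GB">
+ Some text that is long enough to need a break at the end of a line.
+ </fo:block>]]></source>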
+ <p>
+ If hyphenation is requested, FOP first looks in the classpath
+ for a serialized instance containing precompiled hyphenation
+ patterns. If only a language is specified, a resource
+ named <code>hyph/&lt;language>.hyp</code> is loaded. If both
+ language and country are specified, the resource
+ <code>hyph/&lt;language>_&lt;country>.hyp</code> is looked up,
+ and if this fails, the loader also looks for
+ <code>hyph/&lt;language>.hyp</code>.
+ </p>
+ <p>
+ If no precompiled patterns are found, FOP tries to load raw
+ patterns from an XML file named
+ <code>/hyph/&lt;language>.xml</code> or, respectively,
+ <code>/hyph/&lt;language>_&lt;country>.xml</code>. The /hyph
+ prefix is hardcoded and can't be configured. Note that this
+ usually constitutes an absolute file path. FOP can't load raw
+ patterns from sources other than files.
+ </p>
+ <p>
+ If you think hyphenation is enabled but words aren't
+ hyphenated, check whether FOP finds the relevant hyphenation
+ patterns:
+ </p>
+ <ol>
+ <li>Did you download and install the hyphenation patterns
+ properly? In case you downloaded the files from OFFO, check
+ whether you have downloaded the patterns for the correct FOP
+ version (0.20.5 or the development version), and check whether
+ you followed the installation instructions.</li>
+ <li>Check whether you have spelled the language code and
+ optionally the country code correctly. Note that the country
+ codes are in uppercase, by convention. This matters.</li>
+ </ol>
+ <p>
+ If hyphenation works in general, but specific words aren't
+ hyphenated, or aren't hyphenated as expected, you may have one
+ of the following problems:
+ </p>
+ <ol>
+ <li>The patterns contain a bug, or simply won't do as you
+ expect. In order to reduce the number of patterns, they are
+ usually simplified at the cost of some accuracy.</li>
+ <li>The patterns may be for an unexpected, unofficial or
+ outdated dialect of the language. For example, the Turkish
+ patterns were (and maybe still are) made for 17th-century
+ Ottoman Turkish rather than modern Turkish.</li>
+ <li>The word may contain invisible characters which prevent it
+ from being parsed properly from the content stream, or from
+ being properly matched. Examples of such characters are the
+ soft hyphen (U+00AD) and the zero width joiner (U+200D). You
+ have to remove them in order to get the words hyphenated
+ properly. On the other hand, you can use them in order to prevent
+ certain (unwanted, spurious or incorrect) hyphenations.</li>
+ <li>If the word contains characters which can be composed from
+ other Unicode characters, or vice versa (e.g. U+00E4 "latin
+ small letter a with diaeresis" and U+0061 U+0308 "latin small
+ letter a" "combining diaeresis"), the patterns may just contain the
+ opposite form. FOP doesn't run <link
+ href="http://www.unicode.org/reports/tr15/">Unicode
+ normalization</link> on either the content or the
+ patterns. You have no choice but to check which form the
+ patterns use and adapt your FO source.</li>
+ </ol>
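+ <p>
+ To illustrate the last two points, the following sketch spells
+ the same word in three ways in FO source. The German word is
+ only an example and is not taken from any pattern file; which
+ of the first two forms is matched depends on the form used in
+ the patterns you have installed.
+ </p>
+ <source><![CDATA[<!-- precomposed form: U+00E4 -->
+ <fo:block hyphenate="true" language="de">M&#x00E4;dchen</fo:block>
+ <!-- decomposed form: U+0061 followed by U+0308 -->
+ <fo:block hyphenate="true" language="de">Ma&#x0308;dchen</fo:block>
+ <!-- soft hyphen U+00AD inside the word -->
+ <fo:block hyphenate="true" language="de">M&#x00E4;d&#x00AD;chen</fo:block>]]></source>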
+ </section>
<section id="license-issues">
<title>License Issues</title>
<p>Many of the hyphenation files distributed with TeX and its offspring are