Diffstat (limited to 'src/documentation/content/xdocs/0.20.5/hyphenation.xml')
-rw-r--r-- | src/documentation/content/xdocs/0.20.5/hyphenation.xml | 75 |
1 files changed, 74 insertions, 1 deletions
diff --git a/src/documentation/content/xdocs/0.20.5/hyphenation.xml b/src/documentation/content/xdocs/0.20.5/hyphenation.xml
index 61c510ce6..e4669853f 100644
--- a/src/documentation/content/xdocs/0.20.5/hyphenation.xml
+++ b/src/documentation/content/xdocs/0.20.5/hyphenation.xml
@@ -28,7 +28,7 @@
  <section id="intro">
  <title>Introduction</title>
  <p>FOP uses Liang's hyphenation algorithm, well known from TeX. It needs
- language specific pattern and other data for operation.</p>
+ language specific patterns and other data for operation.</p>
  <p>Because of <link href="#license-issues">licensing issues</link> (and for
  convenience), all hyphenation patterns for FOP are made available through
  the <fork href="http://offo.sourceforge.net/hyphenation/index.html">Objects For
@@ -39,6 +39,79 @@
  Please inquire on the <link href="../maillist.html#fop-user">FOP User
  mailing list</link>.</note>
  </section>
+ <section id="using">
+ <title>Using Hyphenation</title>
+ <p>
+ In order to get words hyphenated, hyphenation has to be
+ enabled explicitly (set property hyphenate="true") and a
+ language has to be defined (e.g. language="en"). Optionally, a
+ country can be specified (e.g. country="GB").
+ </p>
+ <p>
+ If hyphenation is requested, a serialized instance
+ containing precompiled hyphenation patterns is first looked up in
+ the classpath. If only a language is specified, a resource
+ named <code>hyph/&lt;language&gt;.hyp</code> is loaded. If both
+ language and country are specified, the resource
+ <code>hyph/&lt;language&gt;_&lt;country&gt;.hyp</code> is looked up,
+ and if this fails, the loader also looks for
+ <code>hyph/&lt;language&gt;.hyp</code>.
+ </p>
+ <p>
+ If no precompiled patterns are found, FOP tries to load raw
+ patterns from an XML file named
+ <code>/hyph/&lt;language&gt;.xml</code> or
+ <code>/hyph/&lt;language&gt;_&lt;country&gt;.xml</code>, respectively. The /hyph
+ prefix is hardcoded and can't be configured. Note that this
+ usually constitutes an absolute file path. FOP can't load raw
+ patterns from sources other than files.
+ </p>
+ <p>
+ If you think hyphenation is enabled but words aren't
+ hyphenated, check whether FOP finds the relevant hyphenation
+ patterns:
+ </p>
+ <ol>
+ <li>Did you download and install the hyphenation patterns
+ properly? In case you downloaded the files from OFFO, check
+ whether you have downloaded the patterns for the correct FOP
+ version (0.20.5 or the development version), and check whether
+ you followed the installation instructions.</li>
+ <li>Check whether you have spelled the language code and
+ optionally the country code correctly. Note that the country
+ codes are in uppercase, by convention. This matters.</li>
+ </ol>
+ <p>
+ If hyphenation works in general, but specific words aren't
+ hyphenated, or aren't hyphenated as expected, you may have one
+ of the following problems:
+ </p>
+ <ol>
+ <li>The patterns contain a bug, or simply won't do as you
+ expect. To keep the number of patterns small, they usually
+ trade away some accuracy.</li>
+ <li>The patterns may be for an unexpected, unofficial or
+ outdated dialect of the language. For example, the Turkish
+ patterns were (and maybe still are) made for 17th-century Ottoman
+ Turkish rather than modern Turkish.</li>
+ <li>The word may contain invisible characters which prevent it
+ from being parsed properly from the content stream, or from
+ being properly matched. Examples of such characters are the
+ soft hyphen (U+00AD) and the zero width joiner (U+200D). You
+ have to remove them in order to get the words hyphenated
+ properly. On the other hand, you can use them in order to prevent
+ certain (unwanted, spurious or incorrect) hyphenations.</li>
+ <li>If the word contains characters which can be composed from
+ other Unicode characters, or vice versa (e.g. U+00E4 "latin
+ small letter a with diaeresis" and U+0061 U+0308 "latin small letter a"
+ "combining diaeresis"), the patterns may just contain the
+ opposite form. FOP doesn't run <link
+ href="http://www.unicode.org/reports/tr15/">Unicode
+ normalization</link> on either the content or the
+ patterns. You have no choice but to check which form the
+ patterns use and adapt your FO source.</li>
+ </ol>
+ </section>
  <section id="license-issues">
  <title>License Issues</title>
  <p>Many of the hyphenation files distributed with TeX and its offspring are
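Note on the added "Using Hyphenation" section: the lookup order and the two character-level pitfalls it describes can be sketched in Java roughly as follows. This is a minimal illustration and not FOP code. The class `HyphenationChecks` and all of its method names are hypothetical, and the candidate list only mirrors what the documentation text states (in particular, no fallback from the country-specific XML file to the language-only one is assumed, because the text does not mention one). The normalization helpers use the standard `java.text.Normalizer` class.

```java
import java.text.Normalizer;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of FOP.
final class HyphenationChecks {

    // Names FOP is described as trying, in order: precompiled .hyp
    // resources on the classpath first, then one raw XML pattern file.
    static List<String> candidates(String language, String country) {
        boolean hasCountry = country != null && !country.isEmpty();
        List<String> names = new ArrayList<>();
        if (hasCountry) {
            names.add("hyph/" + language + "_" + country + ".hyp");
        }
        names.add("hyph/" + language + ".hyp");
        // Raw patterns: the text names one file per case (assumption: no fallback).
        names.add(hasCountry
                ? "/hyph/" + language + "_" + country + ".xml"
                : "/hyph/" + language + ".xml");
        return names;
    }

    // Removes the soft hyphen (U+00AD) and the zero width joiner (U+200D).
    static String stripInvisible(String word) {
        return word.replace("\u00AD", "").replace("\u200D", "");
    }

    // Composed form, e.g. U+0061 U+0308 becomes U+00E4.
    static String toComposed(String word) {
        return Normalizer.normalize(word, Normalizer.Form.NFC);
    }

    // Decomposed form, e.g. U+00E4 becomes U+0061 U+0308.
    static String toDecomposed(String word) {
        return Normalizer.normalize(word, Normalizer.Form.NFD);
    }

    public static void main(String[] args) {
        for (String name : candidates("en", "GB")) {
            System.out.println(name);
        }
        // True when the input is already in composed (NFC) form.
        System.out.println(Normalizer.isNormalized("a\u0308", Normalizer.Form.NFC));
    }
}
```

Since FOP is described as normalizing neither the content nor the patterns, a helper like `toComposed` or `toDecomposed` would have to be applied while preparing the FO source, using whichever form the installed patterns actually contain.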