From 44d5c5697c33c5bf6571b8efb32f0050b04a9151 Mon Sep 17 00:00:00 2001
From: Joerg Pietschmann
Date: Mon, 7 Nov 2005 15:14:01 +0000
Subject: [PATCH] some additions to the FAQ and hyphenation usage.

git-svn-id: https://svn.apache.org/repos/asf/xmlgraphics/fop/trunk@331278 13f79535-47bb-0310-9956-ffa450edef68
---
 .../content/xdocs/0.20.5/hyphenation.xml | 75 ++++++++++++++++++-
 src/documentation/content/xdocs/faq.xml | 40 ++++++----
 2 files changed, 98 insertions(+), 17 deletions(-)

diff --git a/src/documentation/content/xdocs/0.20.5/hyphenation.xml b/src/documentation/content/xdocs/0.20.5/hyphenation.xml
index 61c510ce6..e4669853f 100644
--- a/src/documentation/content/xdocs/0.20.5/hyphenation.xml
+++ b/src/documentation/content/xdocs/0.20.5/hyphenation.xml
@@ -28,7 +28,7 @@
Introduction

FOP uses Liang's hyphenation algorithm, well known from TeX. It needs - language specific pattern and other data for operation.

+ language specific patterns and other data for operation.

Because of licensing issues (and for convenience), all hyphenation patterns for FOP are made available through the Objects For
@@ -39,6 +39,79 @@
Please inquire on the FOP User mailing list.

+
+ Using Hyphenation +

+ In order to get words hyphenated, hyphenation has to be + enabled explicitly (set property hyphenate="true") and a + language has to be defined (e.g. language="en"). Optionally, a + country can be specified (e.g. country="GB"). +
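
As a minimal sketch of the properties just described, the following XSL-FO fragment enables hyphenation for British English. XSL-FO spells the enabling property `hyphenate`. The block text and the choice to set the properties directly on `fo:block` are illustrative assumptions; all three properties are inherited, so they could equally be set on an enclosing element.

```xml
<!-- Illustrative fragment only; the text content is made up. -->
<fo:block xmlns:fo="http://www.w3.org/1999/XSL/Format"
          hyphenate="true" language="en" country="GB">
  Text in this block is a candidate for hyphenation, provided FOP
  finds hyphenation patterns for the requested language and country.
</fo:block>
```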

+

+ If hyphenation is requested, a serialized instance + containing precompiled hyphenation patterns is first looked up in + the classpath. If only a language is specified, a resource + named hyph/<language>.hyp is loaded. If both + language and country are specified, the resource + hyph/<language>_<country>.hyp is looked up, + and if this fails, the loader also looks for + hyph/<language>.hyp. +

+

+ If no precompiled patterns are found, FOP tries to load raw + patterns from an XML file named + /hyph/<language>.xml or + /hyph/<language>_<country>.xml, respectively. The /hyph + prefix is hardcoded and can't be configured. Note that this + usually constitutes an absolute file path. FOP can't load raw + patterns from sources other than files. +

+

+ If you think hyphenation is enabled but words aren't + hyphenated, check whether FOP finds the relevant hyphenation + patterns: +

+
    +
  1. Did you download and install the hyphenation patterns + properly? In case you downloaded the files from OFFO, check + whether you have downloaded the patterns for the correct FOP + version (0.20.5 or the development version), and check whether + you followed the installation instructions.
  2. +
  3. Check whether you have spelled the language code and + optionally the country code correctly. Note that the country + codes are in uppercase, by convention. This matters.
  4. +
+

+ If hyphenation works in general, but specific words aren't + hyphenated, or aren't hyphenated as expected, you may have one + of the following problems: +

+
    +
  1. The patterns contain a bug, or simply won't do what you + expect. To keep the number of patterns small, some + imprecision is usually tolerated.
  2. +
  3. The patterns may be for an unexpected, unofficial or + outdated dialect of the language. For example, the Turkish + patterns were (and maybe still are) made for 17th-century Ottoman rather + than modern Turkish.
  4. +
  5. The word may contain invisible characters which prevent it + from being parsed properly from the content stream, or from + being properly matched. Examples of such characters are the + soft hyphen (U+00AD) and the zero width joiner (U+200D). You + have to remove them in order to get the words hyphenated + properly. On the other hand, you can use them to prevent certain + (unwanted, spurious or incorrect) hyphenations.
  6. +
  7. If the word contains characters which can be composed from + other Unicode characters, or vice versa (e.g. U+00E4 "latin + small letter a with diaeresis" and U+0061 U+0308 "latin small letter a" + "combining diaeresis"), the patterns may just contain the + opposite form. FOP doesn't run Unicode + normalization on either the content or the + patterns. You have no choice but to check which form the + patterns use and adapt your FO source.
  8. +
+
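
The last two points can be made concrete with numeric character references. The fragment below is a sketch only: the German word and the `de` language code are illustrative choices, and which form a given pattern file expects has to be checked in that file.

```xml
<!-- Illustrative fragment only; word and language are assumptions. -->
<fo:block xmlns:fo="http://www.w3.org/1999/XSL/Format"
          hyphenate="true" language="de">
  <!-- Precomposed form: U+00E4 as a single character. -->
  M&#xE4;dchenname
  <!-- Decomposed form: U+0061 followed by combining U+0308.
       Meant to look the same, but it is a different character
       sequence, and FOP does not normalize before matching. -->
  Ma&#x308;dchenname
  <!-- A soft hyphen (U+00AD) inside the word: an invisible
       character of the kind described in the list above. -->
  M&#xE4;dchen&#xAD;name
</fo:block>
```

Writing such characters as references rather than literally makes it easier to see which form the FO source actually contains.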
License Issues

Many of the hyphenation files distributed with TeX and its offspring are
diff --git a/src/documentation/content/xdocs/faq.xml b/src/documentation/content/xdocs/faq.xml
index f9f5c6c0d..4c5dd8a78 100644
--- a/src/documentation/content/xdocs/faq.xml
+++ b/src/documentation/content/xdocs/faq.xml
@@ -194,7 +194,7 @@

If you are running FOP from the command line: