From 99c898112dda5525429c8d235cb56fbfb9fc8000 Mon Sep 17 00:00:00 2001
From: Glenn Adams
Date: Tue, 3 Apr 2012 04:33:34 +0000
Subject: add complex script feature documentation

git-svn-id: https://svn.apache.org/repos/asf/xmlgraphics/fop/trunk@1308679 13f79535-47bb-0310-9956-ffa450edef68
---
 src/documentation/content/xdocs/site.xml | 25 +-
 .../content/xdocs/trunk/complexscripts.xml | 628 +++++++++++++++++++++
 src/documentation/content/xdocs/trunk/running.xml | 3 +-
 3 files changed, 646 insertions(+), 10 deletions(-)
 create mode 100644 src/documentation/content/xdocs/trunk/complexscripts.xml

(limited to 'src/documentation/content/xdocs')

diff --git a/src/documentation/content/xdocs/site.xml b/src/documentation/content/xdocs/site.xml
index d7ca6f5e1..322f5e52c 100644
--- a/src/documentation/content/xdocs/site.xml
+++ b/src/documentation/content/xdocs/site.xml
@@ -151,19 +151,26 @@
+ - - - - - - + + + + + - - + - + + + +
diff --git a/src/documentation/content/xdocs/trunk/complexscripts.xml b/src/documentation/content/xdocs/trunk/complexscripts.xml
new file mode 100644
index 000000000..255bdc5ae
--- /dev/null
+++ b/src/documentation/content/xdocs/trunk/complexscripts.xml
@@ -0,0 +1,628 @@
+ + + + + +
+ Apache™ FOP: Complex Scripts +
+ +
+ Overview +

+ This page describes the + complex scripts + features of Apache™ FOP, which include: +

+
    +
  • Support for languages written with right-to-left scripts, such as Arabic and Hebrew scripts.
  • Support for languages written with South Asian and Southeast Asian scripts, such as Devanagari, Khmer, Tamil, Thai, and others.
  • Support for advanced substitution, reordering, and positioning of glyphs according to language and script sensitive rules.
  • Support for advanced number to string formatting.
+
+
+ Disabling complex scripts +

Complex script features are enabled by default. If some application of FOP does not + require this support, then it can be disabled in three ways:

+
    +
  1. Command line: The command line option -nocs turns off complex script features: fop -nocs -fo mydocument.fo -pdf mydocument.pdf
  2. Embedding: userAgent.setComplexScriptFeaturesEnabled(false);
  3. Optional setting in fop.xconf file:
    +<fop version="1.0">
    +  <complex-scripts disabled="true"/>
    +  ...
    +</fop>
+

+ When complex scripts features are enabled, additional information related to bidirectional level resolution, the association between characters and glyphs, and glyph position adjustments is added to the internal, parsed representation of the XSL-FO tree and its corresponding formatted area tree. This additional information somewhat increases the memory requirements for processing documents that use these features.

+ A document author need not make explicit use of any complex scripts feature in order + for this additional information to be created. For example, if the author makes use of a font + that contains OpenType GSUB and/or GPOS tables, then those tables will be automatically used + unless complex scripts features are disabled. +
+
+ Changes to your XSL-FO input files +

+ In most circumstances, XSL-FO content does not need to change in order to make use of + complex scripts features; however, in certain contexts, fully automatic processing is not + sufficient. In these cases, an author may make use of the following XSL-FO constructs: +

+
    +
  • The script property.
  • The language property.
  • The writing-mode property.
  • The number to string conversion properties: format, grouping-separator, grouping-size, letter-value, and fox:number-conversion-features.
  • The fo:bidi-override element.
  • Explicit bidirectional control characters: U+200E LRM, U+200F RLM, U+202A LRE, U+202B RLE, U+202C PDF, U+202D LRO, U+202E RLO.
  • Explicit join control characters: U+200C ZWNJ and U+200D ZWJ.
+
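The constructs listed above might be combined as in the following XSL-FO fragment. This is a minimal sketch, not taken from the FOP sources: the page master name "page" is hypothetical and is assumed to be defined elsewhere in the document, and the sample text is illustrative only.

```xml
<!-- Sketch only: "page" is a hypothetical page master name defined elsewhere. -->
<fo:page-sequence xmlns:fo="http://www.w3.org/1999/XSL/Format"
                  master-reference="page" writing-mode="rl-tb">
  <fo:flow flow-name="xsl-region-body">
    <fo:block>
      <!-- Force a left-to-right run inside right-to-left content -->
      <fo:bidi-override direction="ltr" unicode-bidi="bidi-override">ABC 123</fo:bidi-override>
    </fo:block>
    <fo:block>
      <!-- U+200C ZWNJ written as a numeric character reference -->
      می&#x200C;خواهم
    </fo:block>
  </fo:flow>
</fo:page-sequence>
```

Writing the control characters as numeric character references keeps them visible in the source, which can be easier to review than invisible literal characters.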
+
+ Authoring Details +

The complex scripts related effects of the above enumerated XSL-FO constructs are more + fully described in the following sub-sections.

+
+ Script Property +

In order to apply font specific complex script features, it is necessary to know + the script that applies to the text undergoing layout processing. This script is determined + using the following algorithm: +

+
    +
  1. If the FO element that governs the text specifies a script property and its value is not the empty string or "auto", then that script is used.
  2. Otherwise, the dominant script of the text is determined automatically by finding the script whose constituent characters appear most frequently in the text.
+

In case the automatic algorithm does not produce the desired results, an author may explicitly specify a script property with the desired script. If specified, it must be one of the four-letter script codes specified in the ISO 15924 Code List or in the Extended Script Codes table. Comparison of script codes is performed in a case-insensitive manner, so it does not matter what case is used when specifying these codes in an XSL-FO document.
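
As an illustration of overriding automatic script determination, the fragment below sets the script property explicitly on the governing fo:block. It is a sketch under stated assumptions: the font family "Lateef" is assumed to be configured for FOP, and the sample text is illustrative only.

```xml
<!-- Sketch only: assumes a font family named "Lateef" is available to FOP. -->
<fo:block xmlns:fo="http://www.w3.org/1999/XSL/Format"
          font-family="Lateef" script="arab">
  <!-- With an explicit script value, the dominant-script heuristic is not consulted -->
  مرحبا بالعالم
</fo:block>
```

Because script codes are compared case-insensitively, script="Arab" or script="ARAB" would be expected to behave the same way.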

+
+ Standard Script Codes +

The following table enumerates the standard ISO 15924 4-letter codes recognized by FOP.

Code  Script
arab  Arabic
beng  Bengali
bopo  Bopomofo
cyrl  Cyrillic
deva  Devanagari
ethi  Ethiopic
geor  Georgian
grek  Greek
gujr  Gujarati
guru  Gurmukhi
hang  Hangul
hani  Han
hebr  Hebrew
hira  Hiragana
kana  Katakana
knda  Kannada
khmr  Khmer
laoo  Lao
latn  Latin
mlym  Malayalam
mymr  Burmese
mong  Mongolian
orya  Oriya
sinh  Sinhalese
taml  Tamil
telu  Telugu
thai  Thai
tibt  Tibetan
zmth  Math
zsym  Symbol
zyyy  Undetermined
zzzz  Uncoded
+
+
+ Extended Script Codes +

The following table enumerates a number of non-standard extended script codes recognized by FOP.

Code  Script      Comments
bng2  Bengali     OpenType Indic Version 2 (May 2008 and following) behavior.
dev2  Devanagari  OpenType Indic Version 2 (May 2008 and following) behavior.
gur2  Gurmukhi    OpenType Indic Version 2 (May 2008 and following) behavior.
gjr2  Gujarati    OpenType Indic Version 2 (May 2008 and following) behavior.
knd2  Kannada     OpenType Indic Version 2 (May 2008 and following) behavior.
mlm2  Malayalam   OpenType Indic Version 2 (May 2008 and following) behavior.
ory2  Oriya       OpenType Indic Version 2 (May 2008 and following) behavior.
tml2  Tamil       OpenType Indic Version 2 (May 2008 and following) behavior.
tel2  Telugu      OpenType Indic Version 2 (May 2008 and following) behavior.

+ Explicit use of one of the above extended script codes is not portable, and should be limited to use with FOP only.

+ When performing automatic script determination, FOP selects the OpenType Indic Version 2 script codes by default. If the author requires Version 1 behavior, then an explicit, non-extension script code should be specified in a governing script property.
+
+
+ Language Property +

Certain fonts that support complex script features can make use of language information in order for + language specific processing rules to be applied. For example, a font designed for the Arabic script may support + typographic variations according to whether the written language is Arabic, Farsi (Persian), Sindhi, Urdu, or + another language written with the Arabic script. In order to apply these language specific features, the author + may explicitly mark the text with a language + property.

+

When specifying the language property, the value of the property must be either an ISO 639-2 3-letter code or an ISO 639-1 2-letter code. Comparison of language codes is performed in a case-insensitive manner, so it does not matter what case is used when specifying these codes in an XSL-FO document.
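
A language-tagged block might look like the sketch below, which marks Urdu text using the ISO 639-2 code "urd". The font family "Scheherazade" is assumed to be configured for FOP; whether language specific glyph variants are actually applied depends on the font's own tables.

```xml
<!-- Sketch only: assumes a font family named "Scheherazade" is available to FOP. -->
<fo:block xmlns:fo="http://www.w3.org/1999/XSL/Format"
          font-family="Scheherazade" script="arab" language="urd">
  <!-- Urdu text; the ISO 639-1 code "ur" could be used in place of "urd" -->
  اردو
</fo:block>
```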

+
+
+ Writing Mode Property +
+
+ Number Conversion Properties +
+
+ Bidi Override Element +
+
+ Bidi Control Characters +
+
+ Join Control Characters +
+
+
+ Supported Scripts +

Support for specific complex scripts is enumerated in the following table. Support + for those marked as not being supported is expected to be added in future revisions.

Script      Support  Tested   Comments
Arabic      full     full
Bengali     none     none
Burmese     none     none
Devanagari  partial  partial  join controls (ZWJ, ZWNJ) not yet supported
Khmer       none     none
Gujarati    partial  none     pre-alpha
Gurmukhi    partial  none     pre-alpha
Hebrew      full     partial
Kannada     none     none
Lao         none     none
Malayalam   none     none
Mongolian   none     none
Oriya       none     none
Tamil       none     none
Telugu      none     none
Tibetan     none     none
Thai        none     none
+
+
+ Supported Fonts +

Support for specific fonts is enumerated in the following sub-sections. If a given + font is not listed, then it has not been tested with these complex scripts features.

+
+ Arabic Fonts

Font                Version  Glyphs  Comments
Arial Unicode MS    1.01     50377   limited GPOS support
Lateef              1.0      1147    language features for Kurdish (KUR), Sindhi (SND), Urdu (URD)
Scheherazade        1.0      1197    language features for Kurdish (KUR), Sindhi (SND), Urdu (URD)
Simplified Arabic   1.01             contains invalid, out of order coverage table entries
Simplified Arabic   5.00     414     lacks GPOS support
Simplified Arabic   5.92     473     includes GPOS for advanced position adjustment
Traditional Arabic  1.01     530     lacks GPOS support
Traditional Arabic  5.00     530     lacks GPOS support
Traditional Arabic  5.92     589     includes GPOS for advanced position adjustment
+
+
+ Devanagari Fonts

Font       Version  Glyphs  Comments
Aparajita  1.00     706
Kokila     1.00     706
Mangal     5.01     885     designed for use in user interfaces
Utsaah     1.00     706
+
+
+
+ Other Limitations +

+ Complex scripts support in Apache FOP is relatively new, so there are certain + limitations. Please help us identify and close any gaps. +

+
    +
  • Only the PDF output format fully supports complex scripts features at the present time.
  • Shaping context does not extend across an element boundary. This limitation prevents the use of fo:character, fo:inline or fo:wrapper in order to colorize individual Arabic letters without affecting shaping behavior across the element boundary.
+
+ + +
diff --git a/src/documentation/content/xdocs/trunk/running.xml b/src/documentation/content/xdocs/trunk/running.xml
index 6bfa6bb3f..5e30bb25e 100644
--- a/src/documentation/content/xdocs/trunk/running.xml
+++ b/src/documentation/content/xdocs/trunk/running.xml
@@ -117,6 +117,7 @@ Fop [options] [-fo|-xml] infile [-xsl file] [-awt|-pdf|-mif|-rtf|-tiff|-png|-pcl
   -q            quiet mode
   -c cfg.xml    use additional configuration file cfg.xml
   -l lang       the language to use for user information
+  -nocs         disable complex script features
   -r            relaxed/less strict validation (where available)
   -dpi xxx      target resolution in dots per inch (dpi) where xxx is a number
   -s            for area tree XML, down to block areas only
@@ -366,4 +367,4 @@ Fop [options] [-fo|-xml] infile [-xsl file] [-awt|-pdf|-mif|-rtf|-tiff|-png|-pcl

If you have problems running FOP, please see the "How to get Help" page.

- \ No newline at end of file + -- cgit v1.2.3