<?xml version="1.0"?>
<document>
<header>
<title>XML Parsing</title>
<subtitle>All you wanted to know about XML Parsing!</subtitle>
<authors> <person name="Keiron Liddle" email="keiron@aftexsw.com"/>
</authors>
</header>
<body>
<s1 title="XML Parsing"><p>Since everyone knows the basics, we can get
straight into the various stages, starting with the XML handling.</p>
<s2 title="XML Input"><p>FOP can take the input XML in a number of ways:
</p>
<ul>
<li>SAX events delivered directly to a SAX handler
<ul>
<li>
<code>FOTreeBuilder</code> is the SAX handler. It is
obtained by calling <code>getContentHandler</code> on
<code>Driver</code>.
</li>
</ul>
</li>
<li>
A DOM tree, which is converted into SAX events
<ul>
<li>
The conversion of a DOM tree is done by the
<code>render(Document)</code> method on
<code>Driver</code>.
</li>
</ul>
</li>
<li>
A data source, which is parsed and converted into SAX events
<ul>
<li>
The <code>Driver</code> can take an
<code>InputSource</code> as input. This can use a
<code>Stream</code>, <code>String</code> etc.
</li>
</ul>
</li>
<li>
XML+XSLT, which is transformed using an XSLT processor, with
the result fired as SAX events
<ul>
<li>
<code>XSLTInputHandler</code> is used as an
<code>InputSource</code> in the
render(<code>XMLReader</code>,
<code>InputSource</code>) method on
<code>Driver</code>.
</li>
</ul>
</li>
</ul>
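<p>The common thread in the list above is that every input route ends up as
SAX events on a single handler. The following is a minimal sketch of that idea
using only the JDK's JAXP classes. It does not use <code>Driver</code> or
<code>FOTreeBuilder</code>; the class <code>InputRoutesSketch</code>, its
<code>Recorder</code> handler and the sample documents are hypothetical
stand-ins chosen for illustration.</p>
<source><![CDATA[
```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.sax.SAXResult;
import javax.xml.transform.stream.StreamSource;
import org.w3c.dom.Document;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical sketch: three input routes feeding one SAX handler (JAXP only, no FOP classes).
class InputRoutesSketch {
    static final String FO_NS = "http://www.w3.org/1999/XSL/Format";
    static final String FO_DOC =
        "<fo:root xmlns:fo='" + FO_NS + "'><fo:layout-master-set/></fo:root>";
    static final String XSLT =
        "<xsl:stylesheet version='1.0'"
        + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform' xmlns:fo='" + FO_NS + "'>"
        + "<xsl:template match='/'><fo:root><fo:layout-master-set/></fo:root></xsl:template>"
        + "</xsl:stylesheet>";

    /** Stand-in for the tree-building handler: records the name of each start element. */
    static class Recorder extends DefaultHandler {
        final List<String> names = new ArrayList<>();
        @Override
        public void startElement(String uri, String local, String qName, Attributes atts) {
            names.add(local.isEmpty() ? qName : local);
        }
    }

    /** Data source route: a parser reads the InputSource and fires SAX events. */
    static List<String> fromInputSource(String xml) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            Recorder handler = new Recorder();
            factory.newSAXParser().parse(new InputSource(new StringReader(xml)), handler);
            return handler.names;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    /** DOM route: an identity transform replays the DOM tree as SAX events. */
    static List<String> fromDom(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document dom = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            Recorder handler = new Recorder();
            TransformerFactory.newInstance().newTransformer()
                .transform(new DOMSource(dom), new SAXResult(handler));
            return handler.names;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    /** XML+XSLT route: the transformation result is fired as SAX events. */
    static List<String> fromXslt(String xml, String stylesheet) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(stylesheet)));
            Recorder handler = new Recorder();
            transformer.transform(new StreamSource(new StringReader(xml)),
                new SAXResult(handler));
            return handler.names;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(fromInputSource(FO_DOC));
        System.out.println(fromDom(FO_DOC));
        System.out.println(fromXslt("<greeting>hi</greeting>", XSLT));
    }
}
```
]]></source>
<p>In this sketch all three methods hand the same kind of handler to a different
event producer, so the handler never needs to know where the XML came from.</p>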
<p>The SAX events fired on the SAX handler, class
<code>FOTreeBuilder</code>, must represent an XSL-FO document. If they do not, an
error is raised. Any problems with the XML not being well formed are also handled here.</p></s2>
<s2 title="Element Mappings"><p>The element mapping is a hashmap of all
the elements in a particular namespace. This makes it easy to create a
different object for each element. Element mappings are static to save
memory.</p><p>To add an extension, a developer can put a jar on the classpath
that contains the file <code>/META-INF/services/org.apache.fop.fo.ElementMapping</code>.
This file must contain a line with the fully qualified name of a class that
implements the <em>org.apache.fop.fo.ElementMapping</em> interface. The class is then
loaded automatically at startup. The internal mappings are FO, SVG and Extension
(PDF bookmarks).</p></s2>
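<p>The extension mechanism described under Element Mappings relies on a plain
text file that names implementation classes. The sketch below shows one way
such a file body could be read and the named classes instantiated. It is an
illustration only, not FOP's loading code: <code>ServiceFileSketch</code>, its
<code>Mapping</code> interface (a stand-in for
<code>org.apache.fop.fo.ElementMapping</code>) and
<code>BookmarkMapping</code> are hypothetical, and the comment syntax handled
here is the one used by the JDK's <code>java.util.ServiceLoader</code>, which
reads files of the same shape.</p>
<source><![CDATA[
```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: turn the text of a META-INF/services file into mapping objects.
class ServiceFileSketch {
    /** Stand-in for the real mapping interface; the method is an assumption. */
    interface Mapping {
        String namespace();
    }

    /** Example extension class that a third-party jar might contribute. */
    static class BookmarkMapping implements Mapping {
        public String namespace() {
            return "urn:example:bookmarks";
        }
    }

    /** One fully qualified class name per line; '#' starts a comment; blank lines are skipped. */
    static List<Mapping> load(String servicesFileText) {
        List<Mapping> found = new ArrayList<>();
        try (BufferedReader in = new BufferedReader(new StringReader(servicesFileText))) {
            String line;
            while ((line = in.readLine()) != null) {
                int hash = line.indexOf('#');
                if (hash >= 0) {
                    line = line.substring(0, hash);
                }
                line = line.trim();
                if (line.isEmpty()) {
                    continue;
                }
                // Instantiate the named class through its no-argument constructor.
                Object instance = Class.forName(line).getDeclaredConstructor().newInstance();
                found.add((Mapping) instance);
            }
        } catch (IOException | ReflectiveOperationException e) {
            throw new IllegalStateException("cannot load mapping", e);
        }
        return found;
    }

    public static void main(String[] args) {
        String text = "# extension mappings\n" + BookmarkMapping.class.getName() + "\n";
        for (Mapping m : load(text)) {
            System.out.println(m.namespace());
        }
    }
}
```
]]></source>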
<s2 title="Tree Building"><p>The SAX events carry all the information
for the document: start element, end element, text data etc. This
information is used to build up a representation of the FO document. To do this,
there is a set of element mappings for each namespace. When a mapping is found
for an element and namespace, an object can be created for that element. If the element
is not found, a dummy object is created, or a generic DOM for unknown
namespaces.</p>
<p>The object is then set up and given the attributes for the element.
For the FO Tree the attributes are converted into properties: the FO objects
use a property list mapping to convert the attributes into a list of properties
for the element. For other XML, for example SVG, a DOM of the XML is
constructed. This DOM can then be passed through to the renderer. Other element
mappings can be used in different ways, for example to create elements that
create areas during the layout process or that set up information for the
renderer.</p>
<p>
While tree building is mainly about creating the FO Tree,
some stages can propagate to the renderer. At
the end of a page sequence we know that all pages in the
page sequence can be laid out without being affected by any
further XML. The significance of this is that the FO Tree
for the page sequence may then be disposed of. The
end of the XML document also tells us that we can finalise
the output document. (The layout of individual pages is
accomplished by the layout managers a page at a time;
i.e. they do not need to wait for the end of the page
sequence. A page may not yet be complete, however, if it
contains forward page number references, for example.)
</p>
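<p>The two ideas in this section, attributes becoming properties and the end of a
page sequence acting as a release point, can be sketched with a plain SAX
handler. This is a minimal illustration under stated assumptions, not FOP's
tree builder: <code>TreeBuilderSketch</code>, <code>Node</code>, the log
messages and the sample document are hypothetical, and properties are kept as
raw strings rather than going through a property list mapping.</p>
<source><![CDATA[
```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical sketch: build a tree from SAX events and release each finished page sequence.
class TreeBuilderSketch extends DefaultHandler {
    static final String SAMPLE =
        "<fo:root xmlns:fo='http://www.w3.org/1999/XSL/Format'>"
        + "<fo:layout-master-set/>"
        + "<fo:page-sequence master-reference='a'><fo:flow flow-name='body'/></fo:page-sequence>"
        + "<fo:page-sequence master-reference='b'/>"
        + "</fo:root>";

    /** Minimal tree object: a name, its properties and its children. */
    static class Node {
        final String name;
        final Map<String, String> properties = new LinkedHashMap<>();
        final List<Node> children = new ArrayList<>();
        Node(String name) {
            this.name = name;
        }
    }

    final List<String> log = new ArrayList<>();
    int retainedUnderRoot = -1;
    private final Deque<Node> open = new ArrayDeque<>();

    @Override
    public void startElement(String uri, String local, String qName, Attributes atts) {
        Node node = new Node(local);
        // Each attribute becomes a property of the new object.
        for (int i = 0; i < atts.getLength(); i++) {
            node.properties.put(atts.getLocalName(i), atts.getValue(i));
        }
        if (!open.isEmpty()) {
            open.peek().children.add(node);
        }
        open.push(node);
    }

    @Override
    public void endElement(String uri, String local, String qName) {
        Node done = open.pop();
        if ("page-sequence".equals(local)) {
            // No further XML can affect this subtree: hand it on, then detach it.
            log.add("page-sequence complete " + done.properties);
            if (!open.isEmpty()) {
                open.peek().children.remove(done);
            }
        }
        if (open.isEmpty()) {
            retainedUnderRoot = done.children.size();
        }
    }

    @Override
    public void endDocument() {
        log.add("finalise output");
    }

    static TreeBuilderSketch build(String xml) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            TreeBuilderSketch builder = new TreeBuilderSketch();
            factory.newSAXParser().parse(new InputSource(new StringReader(xml)), builder);
            return builder;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        TreeBuilderSketch builder = build(SAMPLE);
        builder.log.forEach(System.out::println);
        System.out.println("children still held by root: " + builder.retainedUnderRoot);
    }
}
```
]]></source>
<p>Detaching the finished subtree from its parent is what lets it be garbage
collected in this sketch. A real implementation would first pass it to whatever
performs the layout.</p>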
</s2>
<s2 title="Associated Tasks">
<ul><li>Error handling for XML that is not well formed.</li>
<li>Error handling for other XML parsing errors.</li><li>Developer
info for adding namespace handlers.</li></ul></s2></s1>
</body></document>