From 4e5ca23d62d184bb8967c98a4a71d2263566c926 Mon Sep 17 00:00:00 2001 From: Jeremias Maerki Date: Mon, 25 Aug 2008 08:23:31 +0000 Subject: [PATCH] Added a page on metadata (partly ported from the FOP Wiki). git-svn-id: https://svn.apache.org/repos/asf/xmlgraphics/fop/trunk@688653 13f79535-47bb-0310-9956-ffa450edef68 --- .../content/xdocs/0.95/metadata.xml | 243 ++++++++++++++++++ src/documentation/content/xdocs/0.95/pdfa.xml | 3 - src/documentation/content/xdocs/site.xml | 2 + .../content/xdocs/trunk/metadata.xml | 243 ++++++++++++++++++ .../content/xdocs/trunk/pdfa.xml | 3 - 5 files changed, 488 insertions(+), 6 deletions(-) create mode 100644 src/documentation/content/xdocs/0.95/metadata.xml create mode 100644 src/documentation/content/xdocs/trunk/metadata.xml diff --git a/src/documentation/content/xdocs/0.95/metadata.xml b/src/documentation/content/xdocs/0.95/metadata.xml new file mode 100644 index 000000000..8e831e800 --- /dev/null +++ b/src/documentation/content/xdocs/0.95/metadata.xml @@ -0,0 +1,243 @@ + + + + + +
+ Metadata +
+ +
+ Overview +

+ Document metadata is an important tool for categorizing and finding documents. + Various formats support different kinds of metadata representation, and to + different degrees. One of the more popular and flexible means of representing + document or object metadata is + XMP (eXtensible Metadata Platform, specified by Adobe). + PDF 1.4 introduced the use of XMP. The XMP specification lists recommendations for + embedding XMP metadata in other document and image formats. Given its flexibility, it makes + sense to make use of this approach in the XSL-FO context. Unfortunately, unlike SVG, which + also refers to XMP, XSL-FO doesn't recommend a preferred way of specifying document and + object metadata. Therefore, there's no portable way to represent metadata in XSL-FO + documents. Each implementation does it differently.

+
+
+ Embedding XMP in an XSL-FO document +

+ As noted above, there's no officially recommended way to embed metadata in XSL-FO. + Apache FOP supports embedding XMP in XSL-FO. Currently, only document-level + metadata is supported. Object-level metadata will be implemented when there's + interest.

+

+ Document-level metadata can be specified in the fo:declarations element. + The XMP specification recommends using x:xmpmeta, rdf:RDF, and + rdf:Description elements as shown in the example below. Both + x:xmpmeta and rdf:RDF elements are recognized as the top-level + element introducing an XMP fragment (as per the XMP specification).

+
+ Example + + + + + + + Document title + Document author + Document subject + + + + Tool used to make the PDF + + + + + + + fo:declarations must be declared after + fo:layout-master-set and before the first page-sequence. + +
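A minimal sketch of such a declaration, assuming the element names given in the paragraph above and the namespace URIs listed in the Namespaces section. The pairing of each value with dc:title, dc:creator, dc:description and xmp:CreatorTool is inferred from the mapping table and is an assumption, as is the use of plain text values; strict XMP serializes dc:title as a language alternative and dc:creator as an ordered list.

```xml
<!-- Sketch only: property assignments are assumptions based on the mapping table. -->
<fo:declarations xmlns:fo="http://www.w3.org/1999/XSL/Format">
  <x:xmpmeta xmlns:x="adobe:ns:meta/">
    <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
      <!-- Dublin Core properties -->
      <rdf:Description rdf:about=""
          xmlns:dc="http://purl.org/dc/elements/1.1/">
        <dc:title>Document title</dc:title>
        <dc:creator>Document author</dc:creator>
        <dc:description>Document subject</dc:description>
      </rdf:Description>
      <!-- XMP Basic properties -->
      <rdf:Description rdf:about=""
          xmlns:xmp="http://ns.adobe.com/xap/1.0/">
        <xmp:CreatorTool>Tool used to make the PDF</xmp:CreatorTool>
      </rdf:Description>
    </rdf:RDF>
  </x:xmpmeta>
</fo:declarations>
```

In a real document the fo prefix is normally declared once on fo:root; it is repeated here only so the fragment is well-formed on its own.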
+
+
+ Implementation in Apache FOP +

+ Currently, XMP support is only available for PDF output. +

+

+ Originally, you could set some metadata information through FOP's FOUserAgent by + using its set*() methods (like setTitle(String) or setAuthor(String)). These values are + directly used to set values in the PDF Info object. Since PDF 1.4, adding metadata as an + XMP document to a PDF is possible. That means that there are now two mechanisms in PDF + that hold metadata.

+

+ Apache FOP now synchronizes the Info and the Metadata object in PDF, i.e. when you + set the title and the author through the FOUserAgent, the two values will end up in + the (old) Info object and in the new Metadata object as XMP content. If, instead of using + FOUserAgent, you embed XMP metadata in the XSL-FO document (as shown above), the + XMP metadata will be used as-is in the PDF Metadata object and some values from the + XMP metadata will be copied to the Info object to maintain backwards compatibility + for PDF readers that don't support XMP metadata.

+

+ The mapping between the Info and the Metadata object used by Apache FOP comes from + the PDF/A-1 specification. + For convenience, here's the mapping table: +

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Document information dictionary (Entry, PDF type) | XMP (Property, XMP type, Category)
Entry | PDF type | Property | XMP type | Category
Title | text string | dc:title | Text | External
Author | text string | dc:creator | seq Text | External
Subject | text string | dc:description["x-default"] | Text | External
Keywords | text string | pdf:Keywords | Text | External
Creator | text string | xmp:CreatorTool | Text | External
Producer | text string | pdf:Producer | Text | Internal
CreationDate | date | xmp:CreateDate | Date | Internal
ModDate | date | xmp:ModifyDate | Date | Internal
+ + "Internal" in the Category column means that the user should not set this value. + It is set by the application. + + + The "Subject" used to be mapped to dc:subject in the initial publication of + PDF/A-1 (ISO 19005-1). In the + Technical Corrigendum 1 + this was changed to map to dc:description["x-default"]. + +
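To illustrate what the dc:description["x-default"] notation refers to, here is a small sketch of the property written as an XMP language alternative, where the entry tagged x-default is the one mapped to the Subject entry. The sample text is a placeholder, and whether a given FOP version also accepts a plain text value here is not stated in this page.

```xml
<!-- Sketch only: dc:description as a language alternative with an x-default entry. -->
<rdf:Description rdf:about=""
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:description>
    <rdf:Alt>
      <rdf:li xml:lang="x-default">Document subject</rdf:li>
    </rdf:Alt>
  </dc:description>
</rdf:Description>
```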
+ Namespaces +

+ Metadata is made up of property sets, each of which uses a different namespace URI.

+

+ The following is a listing of namespaces that Apache FOP recognizes and acts upon, + mostly to synchronize the XMP metadata with the PDF Info dictionary: +

+ + + + + + + + + + + + + + + + + + + + + +
Set/Schema | Namespace Prefix | Namespace URI
Dublin Core | dc | http://purl.org/dc/elements/1.1/
XMP Basic | xmp | http://ns.adobe.com/xap/1.0/
Adobe PDF Schema | pdf | http://ns.adobe.com/pdf/1.3/
+

+ Please refer to the XMP Specification + for information on other metadata namespaces. +

+

+ Property sets (Namespaces) not listed here are simply passed through to the final + document (if supported). That is useful if you want to specify a custom metadata + schema. +
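As a rough illustration of a custom schema being passed through, the sketch below adds one property in a made-up namespace. The prefix ex, the URI http://example.com/ns/custom/1.0/, the property name ProjectCode and its value are all hypothetical and not defined by FOP or XMP.

```xml
<!-- Sketch only: the ex namespace and ProjectCode property are hypothetical. -->
<fo:declarations xmlns:fo="http://www.w3.org/1999/XSL/Format">
  <x:xmpmeta xmlns:x="adobe:ns:meta/">
    <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
      <rdf:Description rdf:about=""
          xmlns:ex="http://example.com/ns/custom/1.0/">
        <ex:ProjectCode>ABC-123</ex:ProjectCode>
      </rdf:Description>
    </rdf:RDF>
  </x:xmpmeta>
</fo:declarations>
```

Because the namespace is not one of those listed above, no value from it would be copied into the PDF Info dictionary.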

+
+
+ + +
diff --git a/src/documentation/content/xdocs/0.95/pdfa.xml b/src/documentation/content/xdocs/0.95/pdfa.xml index 1b3b75561..bfa1ae33e 100644 --- a/src/documentation/content/xdocs/0.95/pdfa.xml +++ b/src/documentation/content/xdocs/0.95/pdfa.xml @@ -28,9 +28,6 @@
Overview - - Support for PDF/A is available beginning with version 0.92. -

PDF/A is a standard which turns PDF into an "electronic document file format for long-term preservation". PDF/A-1 is the first part of the diff --git a/src/documentation/content/xdocs/site.xml b/src/documentation/content/xdocs/site.xml index 5ebdef322..d87b68aab 100644 --- a/src/documentation/content/xdocs/site.xml +++ b/src/documentation/content/xdocs/site.xml @@ -123,6 +123,7 @@ + @@ -157,6 +158,7 @@ + diff --git a/src/documentation/content/xdocs/trunk/metadata.xml b/src/documentation/content/xdocs/trunk/metadata.xml new file mode 100644 index 000000000..8e831e800 --- /dev/null +++ b/src/documentation/content/xdocs/trunk/metadata.xml @@ -0,0 +1,243 @@ + + + + + +

+ Metadata +
+ +
+ Overview +

+ Document metadata is an important tool for categorizing and finding documents. + Various formats support different kinds of metadata representation, and to + different degrees. One of the more popular and flexible means of representing + document or object metadata is + XMP (eXtensible Metadata Platform, specified by Adobe). + PDF 1.4 introduced the use of XMP. The XMP specification lists recommendations for + embedding XMP metadata in other document and image formats. Given its flexibility, it makes + sense to make use of this approach in the XSL-FO context. Unfortunately, unlike SVG, which + also refers to XMP, XSL-FO doesn't recommend a preferred way of specifying document and + object metadata. Therefore, there's no portable way to represent metadata in XSL-FO + documents. Each implementation does it differently.

+
+
+ Embedding XMP in an XSL-FO document +

+ As noted above, there's no officially recommended way to embed metadata in XSL-FO. + Apache FOP supports embedding XMP in XSL-FO. Currently, only document-level + metadata is supported. Object-level metadata will be implemented when there's + interest.

+

+ Document-level metadata can be specified in the fo:declarations element. + The XMP specification recommends using x:xmpmeta, rdf:RDF, and + rdf:Description elements as shown in the example below. Both + x:xmpmeta and rdf:RDF elements are recognized as the top-level + element introducing an XMP fragment (as per the XMP specification).

+
+ Example + + + + + + + Document title + Document author + Document subject + + + + Tool used to make the PDF + + + + + + + fo:declarations must be declared after + fo:layout-master-set and before the first page-sequence. + +
+
+
+ Implementation in Apache FOP +

+ Currently, XMP support is only available for PDF output. +

+

+ Originally, you could set some metadata information through FOP's FOUserAgent by + using its set*() methods (like setTitle(String) or setAuthor(String)). These values are + directly used to set values in the PDF Info object. Since PDF 1.4, adding metadata as an + XMP document to a PDF is possible. That means that there are now two mechanisms in PDF + that hold metadata.

+

+ Apache FOP now synchronizes the Info and the Metadata object in PDF, i.e. when you + set the title and the author through the FOUserAgent, the two values will end up in + the (old) Info object and in the new Metadata object as XMP content. If, instead of using + FOUserAgent, you embed XMP metadata in the XSL-FO document (as shown above), the + XMP metadata will be used as-is in the PDF Metadata object and some values from the + XMP metadata will be copied to the Info object to maintain backwards compatibility + for PDF readers that don't support XMP metadata.

+

+ The mapping between the Info and the Metadata object used by Apache FOP comes from + the PDF/A-1 specification. + For convenience, here's the mapping table: +

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Document information dictionary (Entry, PDF type) | XMP (Property, XMP type, Category)
Entry | PDF type | Property | XMP type | Category
Title | text string | dc:title | Text | External
Author | text string | dc:creator | seq Text | External
Subject | text string | dc:description["x-default"] | Text | External
Keywords | text string | pdf:Keywords | Text | External
Creator | text string | xmp:CreatorTool | Text | External
Producer | text string | pdf:Producer | Text | Internal
CreationDate | date | xmp:CreateDate | Date | Internal
ModDate | date | xmp:ModifyDate | Date | Internal
+ + "Internal" in the Category column means that the user should not set this value. + It is set by the application. + + + The "Subject" used to be mapped to dc:subject in the initial publication of + PDF/A-1 (ISO 19005-1). In the + Technical Corrigendum 1 + this was changed to map to dc:description["x-default"]. + +
+ Namespaces +

+ Metadata is made up of property sets, each of which uses a different namespace URI.

+

+ The following is a listing of namespaces that Apache FOP recognizes and acts upon, + mostly to synchronize the XMP metadata with the PDF Info dictionary: +

+ + + + + + + + + + + + + + + + + + + + + +
Set/Schema | Namespace Prefix | Namespace URI
Dublin Core | dc | http://purl.org/dc/elements/1.1/
XMP Basic | xmp | http://ns.adobe.com/xap/1.0/
Adobe PDF Schema | pdf | http://ns.adobe.com/pdf/1.3/
+

+ Please refer to the XMP Specification + for information on other metadata namespaces. +

+

+ Property sets (Namespaces) not listed here are simply passed through to the final + document (if supported). That is useful if you want to specify a custom metadata + schema. +

+
+
+ + + diff --git a/src/documentation/content/xdocs/trunk/pdfa.xml b/src/documentation/content/xdocs/trunk/pdfa.xml index 1b3b75561..bfa1ae33e 100644 --- a/src/documentation/content/xdocs/trunk/pdfa.xml +++ b/src/documentation/content/xdocs/trunk/pdfa.xml @@ -28,9 +28,6 @@
Overview - - Support for PDF/A is available beginning with version 0.92. -

PDF/A is a standard which turns PDF into an "electronic document file format for long-term preservation". PDF/A-1 is the first part of the -- 2.39.5