<?xml version="1.0" standalone="no"?>
<!-- Overview -->
<document> 
  <header> 
	 <title>Images</title> 
	 <subtitle>All you wanted to know about images in FOP!</subtitle> 
	 <authors> <person name="Keiron Liddle" email="keiron@aftexsw.com"/> 
	 </authors> 
  </header> 
  <body>
 

  <s1 title="Images in FOP"> <note>This is still in progress; input on the code is welcome. The supported formats still need documenting and testing,
  so anyone interested in images should get involved.</note>
		<p>An image only needs to be loaded when it is rendered to the output or
when its dimensions are required.<br/>
An image URL may be invalid, and finding that out can be costly, so we need to
keep a list of invalid image URLs.</p> 
<p>A number of different caching schemes are possible.</p>
<p>All images are referred to by the URL given in the XSL-FO, after
removing the "url('')" wrapping. No resolving is applied at this point,
such as converting a relative URL to an absolute one. The
external graphic in the FO Tree and the image area in the Area Tree hold
only the URL as a reference.
Images are handled through a static interface in ImageFactory.<br/></p>
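<p>As a rough illustration of the unwrapping step, the sketch below strips a
"url(...)" wrapper and any surrounding quotes from a value and does no
resolving. The class and method names (UrlUnwrap, unwrap) are hypothetical
and are not taken from the FOP code.</p>
<source><![CDATA[
```java
/** Hypothetical helper: reduces an XSL-FO image value to the bare URL used as the key. */
final class UrlUnwrap {
    private UrlUnwrap() { }

    /** Returns the bare URL from a value such as url('a/b.png'); other values are returned trimmed. */
    static String unwrap(String value) {
        String s = value.trim();
        if (s.startsWith("url(") && s.endsWith(")")) {
            // Drop the leading "url(" and the trailing ")".
            s = s.substring(4, s.length() - 1).trim();
            if (s.length() >= 2) {
                char first = s.charAt(0);
                char last = s.charAt(s.length() - 1);
                // Drop one pair of matching single or double quotes.
                if ((first == '\'' || first == '"') && first == last) {
                    s = s.substring(1, s.length() - 1);
                }
            }
        }
        // No relative-to-absolute resolving is done here.
        return s;
    }
}
```
]]></source>
<p>With this sketch, url('images/logo.gif') and images/logo.gif map to the same key.</p>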


<p>(insert image)</p>		


<s2 title="Threading">


<p>In the single-threaded case with one document, the image should be released
as soon as the renderer caches it. If there are multiple documents, the
images could be held in a weak cache in case another document needs to
load the same image.</p>


<p>In the multi-threaded case, many threads may attempt to get the same
image at once. We need to make sure that an image is loaded only once at
any particular time. Once a document is finished, all of its images can be
moved to a common weak cache.</p>
</s2>

<s2 title="Caches">
<s3 title="LRU">
<p>All images are in a common cache regardless of context. To limit the size
of the cache, the least recently used (LRU) image is removed, which keeps
memory use low. Each image can report the amount of data it holds in memory.</p>
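<p>A minimal sketch of such a cache follows, assuming each image reports a
stable size in bytes. SizedImage and LruImageCache are hypothetical names
introduced for illustration; the sketch relies on the access-order mode of
java.util.LinkedHashMap to find the least recently used entry.</p>
<source><![CDATA[
```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical stand-in for an image that can report how much data it holds. */
interface SizedImage {
    long memorySize();
}

/** LRU cache bounded by the total number of bytes its images report. */
final class LruImageCache {
    private final long maxBytes;
    private long usedBytes;
    // accessOrder = true: iteration runs from least to most recently used.
    private final LinkedHashMap<String, SizedImage> images =
            new LinkedHashMap<>(16, 0.75f, true);

    LruImageCache(long maxBytes) {
        this.maxBytes = maxBytes;
    }

    synchronized SizedImage get(String url) {
        return images.get(url); // a lookup counts as a use
    }

    synchronized void put(String url, SizedImage image) {
        SizedImage old = images.put(url, image);
        if (old != null) {
            usedBytes -= old.memorySize();
        }
        usedBytes += image.memorySize();
        // Evict from the least recently used end, but always keep the newest entry.
        Iterator<Map.Entry<String, SizedImage>> it = images.entrySet().iterator();
        while (usedBytes > maxBytes && images.size() > 1 && it.hasNext()) {
            Map.Entry<String, SizedImage> eldest = it.next();
            usedBytes -= eldest.getValue().memorySize();
            it.remove();
        }
    }

    synchronized boolean contains(String url) {
        return images.containsKey(url); // does not change the access order
    }

    synchronized long usedBytes() {
        return usedBytes;
    }
}
```
]]></source>
<p>Bounding the cache by reported bytes rather than by entry count means that one
large bitmap can displace several small images.</p>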
</s3>

<s3 title="Context">
<p>Images are cached according to the context, using the FOUserAgent as a key.
Once the context is finished, its images are added to a common weak hashmap
so that other contexts can load them, or so that the data can be garbage
collected if required.</p>
<p>If images are to be shared, we cannot dispose of the data in the
FopImage once it has been cached by the renderer. Also, if different contexts
have different base directories for resolving relative URLs, then the loading
and caching must be kept separate. We can have a cache that shares images among
all contexts, or one that only loads an image for a single context.</p>
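<p>The hand-over from a per-context cache to a common weak cache might be
sketched as follows. ContextImageCache is a hypothetical name, the context
key is any object standing in for the FOUserAgent, and the common cache here
is a map of WeakReference values rather than a java.util.WeakHashMap (which
holds its keys weakly, not its values); that substitution is an assumption
made for this example.</p>
<source><![CDATA[
```java
import java.lang.ref.WeakReference;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical cache: strong references per context, weak references once a context is done. */
final class ContextImageCache<I> {
    private final Map<Object, Map<String, I>> byContext = new HashMap<>();
    private final Map<String, WeakReference<I>> common = new HashMap<>();

    synchronized void put(Object context, String url, I image) {
        byContext.computeIfAbsent(context, k -> new HashMap<>()).put(url, image);
    }

    /** Looks in the caller's own context first, then in the common weak cache. */
    synchronized I get(Object context, String url) {
        Map<String, I> own = byContext.get(context);
        I image = (own == null) ? null : own.get(url);
        if (image == null) {
            WeakReference<I> ref = common.get(url);
            image = (ref == null) ? null : ref.get(); // null if already collected
        }
        return image;
    }

    /** Moves all images of a finished context into the common weak cache. */
    synchronized void finish(Object context) {
        Map<String, I> own = byContext.remove(context);
        if (own != null) {
            for (Map.Entry<String, I> e : own.entrySet()) {
                common.put(e.getKey(), new WeakReference<>(e.getValue()));
            }
        }
    }
}
```
]]></source>
<p>This sketch does not purge entries whose image has been collected, and it
ignores the base directory question raised above.</p>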
</s3>

<p>The cache uses an image loader so that it can synchronize the loading of
each image individually. Finding an image loader in the cache, or adding
one to it, is also synchronized to avoid threading problems.</p>
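<p>One way to picture this two-level locking is sketched below: a short lock on
the loader map, then a lock on the individual loader while its image is
loaded. LoadOnceCache and its nested Loader are hypothetical names, and the
actual loading is passed in as a function.</p>
<source><![CDATA[
```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical cache in which each URL has its own loader and is loaded at most once. */
final class LoadOnceCache<I> {
    /** One loader per URL; its monitor guards the loading of that image only. */
    private final class Loader {
        private final String url;
        private boolean attempted;
        private I image;

        Loader(String url) {
            this.url = url;
        }

        synchronized I load() {
            if (!attempted) {
                attempted = true; // set first, so a failed load is not retried
                image = source.apply(url);
            }
            return image;
        }
    }

    private final Function<String, I> source;
    private final Map<String, Loader> loaders = new HashMap<>();

    LoadOnceCache(Function<String, I> source) {
        this.source = source;
    }

    I get(String url) {
        Loader loader;
        // Finding or adding the loader is synchronized on the map, and is quick.
        synchronized (loaders) {
            loader = loaders.computeIfAbsent(url, u -> new Loader(u));
        }
        // The slow part blocks only threads that want this same image.
        return loader.load();
    }
}
```
]]></source>
<p>Threads asking for different images do not wait on each other while loading;
they only contend briefly for the loader map.</p>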
</s2>

<s2 title="Invalid Images">


<p>
If an image cannot be loaded for some reason, for example because the URL is
invalid, the image data is corrupt, or the type is unknown, then loading
should be attempted only once. All other attempts to get the image
should return null so that the failure can be handled easily.<br/>
This prevents any extra processing or waiting.</p>
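<p>A minimal sketch of this behaviour, assuming a failed load is reported
either as null or as a runtime exception, could look like this.
InvalidImageGuard is a hypothetical name; it only remembers failures and
does not cache successful loads or serialize concurrent first attempts.</p>
<source><![CDATA[
```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;
import java.util.function.Function;

/** Hypothetical guard that remembers URLs which failed to load. */
final class InvalidImageGuard<I> {
    private final Set<String> invalid = Collections.synchronizedSet(new HashSet<>());
    private final Function<String, I> loader;

    InvalidImageGuard(Function<String, I> loader) {
        this.loader = loader;
    }

    /** Returns the image, or null if it cannot be loaded now or could not be loaded before. */
    I load(String url) {
        if (invalid.contains(url)) {
            return null; // known to be bad: no second attempt
        }
        I image;
        try {
            image = loader.apply(url);
        } catch (RuntimeException e) {
            image = null; // treat any loading failure as "invalid"
        }
        if (image == null) {
            invalid.add(url);
        }
        return image;
    }
}
```
]]></source>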
</s2>


<s2 title="Reading">
<p>Once a stream has been opened for the image URL, a set of image readers is
used to determine what type of image it is. A reader can peek at the
image header or, if necessary, load the image. The reader can also get the
image size at this stage.
The reader can then provide the MIME type, which is used to create the image
object that loads the rest of the information.<br/></p></s2>
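<p>The header check performed by an image reader during the reading step could
look roughly like the following. HeaderSniffer is a hypothetical name; the
sketch only covers the well-known leading bytes of GIF, PNG, JPEG, BMP and
TIFF files, does not read the image size, and requires a stream that
supports mark and reset so that the header can be read again later.</p>
<source><![CDATA[
```java
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical reader that peeks at the first bytes of a stream to guess the image type. */
final class HeaderSniffer {
    private static final int PEEK = 8;

    private HeaderSniffer() { }

    /** Returns a MIME type, or null if the header is not recognised. The stream is rewound. */
    static String detectMimeType(InputStream in) throws IOException {
        if (!in.markSupported()) {
            throw new IllegalArgumentException("stream must support mark/reset");
        }
        in.mark(PEEK);
        byte[] h = new byte[PEEK];
        int n = 0;
        try {
            int r;
            while (n < PEEK && (r = in.read(h, n, PEEK - n)) > 0) {
                n += r;
            }
        } finally {
            in.reset(); // leave the stream where we found it
        }
        if (startsWith(h, n, 'G', 'I', 'F', '8')) return "image/gif";
        if (startsWith(h, n, 0x89, 'P', 'N', 'G', 0x0D, 0x0A, 0x1A, 0x0A)) return "image/png";
        if (startsWith(h, n, 0xFF, 0xD8, 0xFF)) return "image/jpeg";
        if (startsWith(h, n, 'B', 'M')) return "image/bmp";
        if (startsWith(h, n, 'I', 'I', 42, 0) || startsWith(h, n, 'M', 'M', 0, 42)) return "image/tiff";
        return null;
    }

    private static boolean startsWith(byte[] data, int len, int... magic) {
        if (len < magic.length) {
            return false;
        }
        for (int i = 0; i < magic.length; i++) {
            if ((data[i] & 0xFF) != magic[i]) {
                return false;
            }
        }
        return true;
    }
}
```
]]></source>
<p>A stream that does not support mark and reset can be wrapped in a
java.io.BufferedInputStream before it is passed in.</p>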


<s2 title="Data">


<p>The data usually needed for an image is its size and either a bitmap or the
original data. Images such as JPEG and EPS can be embedded into the
document using the original data. SVG images are converted into a DOM, which
needs to be rendered to the PDF. Other images, such as GIF and TIFF, are
converted into a bitmap.
The FopImage loads data when load(type) is called, where type is the kind of data to load.<br/></p></s2>


<s2 title="Rendering">

<p>Different renderers need to have the information in different forms.</p>


<s3 title="PDF">
<dl><dt>original data</dt>  <dd>JPG, EPS</dd>
<dt>bitmap</dt>  <dd>gif, tiff, bmp, png</dd>
<dt>other</dt>  <dd>SVG</dd></dl>
</s3>

<s3 title="PS">
<dl><dt>bitmap</dt>  <dd>JPG, gif, tiff, bmp, png</dd>
<dt>other</dt> <dd>SVG</dd></dl>
</s3>

<s3 title="awt">
<dl><dt>bitmap</dt> <dd>JPG, gif, tiff, bmp, png</dd>
<dt>other</dt>  <dd>SVG</dd></dl></s3>
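<p>The three lists above can be captured in a small lookup, sketched here for
illustration. DataForm and RenderDataForms are hypothetical names, and any
combination that the lists do not mention (for example EPS with the PS or
AWT renderer) yields null instead of a guess.</p>
<source><![CDATA[
```java
import java.util.Locale;

/** The form in which a renderer needs the image data. */
enum DataForm { ORIGINAL_DATA, BITMAP, OTHER }

/** Hypothetical lookup mirroring the PDF, PS and AWT lists. */
final class RenderDataForms {
    private RenderDataForms() { }

    /** Returns the data form for a renderer and image format, or null if the pair is not listed. */
    static DataForm formFor(String renderer, String format) {
        String r = renderer.toLowerCase(Locale.ROOT);
        String f = format.toLowerCase(Locale.ROOT);
        if (!r.equals("pdf") && !r.equals("ps") && !r.equals("awt")) {
            return null; // unknown renderer
        }
        switch (f) {
            case "svg":
                return DataForm.OTHER;
            case "gif":
            case "tiff":
            case "bmp":
            case "png":
                return DataForm.BITMAP;
            case "jpg":
                // Only the PDF list embeds JPG as original data.
                return r.equals("pdf") ? DataForm.ORIGINAL_DATA : DataForm.BITMAP;
            case "eps":
                // EPS appears in the PDF list only.
                return r.equals("pdf") ? DataForm.ORIGINAL_DATA : null;
            default:
                return null;
        }
    }
}
```
]]></source>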


<p>The renderer uses the URL to retrieve the image from the ImageFactory and
then loads the required data, depending on the image MIME type. If the
renderer can insert the image into the document and use that data for all
future references to the same image, then it can cache the reference in the
renderer, and the image can be released from the image cache.</p></s2>
</s1> 
  </body></document>