<?xml version="1.0" encoding="UTF-8"?>
<!--
   ====================================================================
   Licensed to the Apache Software Foundation (ASF) under one or more
   contributor license agreements.  See the NOTICE file distributed with
   this work for additional information regarding copyright ownership.
   The ASF licenses this file to You under the Apache License, Version 2.0
   (the "License"); you may not use this file except in compliance with
   the License.  You may obtain a copy of the License at

       http://www.apache.org/licenses/LICENSE-2.0

   Unless required by applicable law or agreed to in writing, software
   distributed under the License is distributed on an "AS IS" BASIS,
   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
   See the License for the specific language governing permissions and
   limitations under the License.
   ====================================================================
-->
<!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V2.0//EN" "document-v20.dtd">

<document>
    <header>
        <title>POI-HMEF - Java API To Access Microsoft Transport Neutral Encapsulation Format (TNEF) Files</title>
        <subtitle>Overview</subtitle>
        <authors>
            <person name="Nick Burch" email="nick at apache dot org"/>
        </authors>
    </header>

    <body>
       <section>
         <title>Overview</title>

         <p>HMEF is the POI Project's pure Java implementation of Microsoft's
            TNEF (Transport Neutral Encapsulation Format), also known as
            winmail.dat, which is used by Outlook and Exchange in some situations.</p>
          <p>Currently, HMEF provides a read-only API for accessing common
            message and attachment attributes, including the message body
            and attachment files. In addition, read-only access is available
            to all of the underlying TNEF and MAPI attributes of the message
            and attachments.</p>
          <p>HMEF also provides a command line tool for extracting
            the message body and attachment files from a TNEF (winmail.dat)
            file.</p>
          <p>Write support, both for saving changes and for creating new
            files, is currently unavailable. Anyone interested in working
            on these areas is advised to read the
            <a href="site:guidelines">Contribution Guidelines</a> then
            <a href="site:mailinglists">join the dev list</a>!</p>

         <note>
            This code currently lives in the
            <a href="https://github.com/apache/poi/tree/trunk/poi-scratchpad/">scratchpad area</a>
            of the POI Git repository. To use this component, ensure
            you have the Scratchpad Jar on your classpath, or a dependency
            defined on the <em>poi-scratchpad</em> artifact - the main POI
            jar is not enough! See the
            <a href="site:components">POI Components Map</a>
            for more details.
         </note>
       </section>

       <section>
         <title>Using HMEF to access TNEF (winmail.dat) files</title>

         <section>
           <title>Easy extraction of message body and attachment files</title>

           <p>The class <em>org.apache.poi.hmef.extractor.HMEFContentsExtractor</em>
             provides both command line and Java extraction. It allows the
             saving of the message body (an RTF file), and all of the
             attachment files, to a single directory as specified.</p>

           <p>From the command line, call the class, specifying the
             TNEF file to extract and the directory to place the extracted
             files into, for example:</p>
           <source>
              java -classpath poi-5.4.1.jar:poi-scratchpad-5.4.1.jar org.apache.poi.hmef.extractor.HMEFContentsExtractor winmail.dat /tmp/extracted/
           </source>

           <p>From Java, there are two method calls on the class, one to
             extract the message body RTF to a file, and the other to extract
             all the attachments to a directory. A typical use would be:</p>
           <source>
public void extract(String winmailFilename, String directoryName) throws Exception {
   HMEFContentsExtractor ext = new HMEFContentsExtractor(new File(winmailFilename));

   File dir = new File(directoryName);
   File rtf = new File(dir, "message.rtf");
   if(! dir.exists()) {
       throw new FileNotFoundException("Output directory " + dir.getName() + " not found");
   }

   System.out.println("Extracting...");
   ext.extractMessageBody(rtf);
   ext.extractAttachments(dir);
   System.out.println("Extraction completed");
}
           </source>
         </section>

         <section>
           <title>Attachment attributes and contents</title>

           <p>To get at your attachments, simply call the
             <em>getAttachments()</em> method on a <em>HMEFMessage</em>
             instance, and you'll receive a list of all the attachments.</p>
           <p>When you have a <em>org.apache.poi.hmef.Attachment</em> object,
             there are several helper methods available. These will all
             return the value of the appropriate underlying attachment
             attributes, or null if for some reason the attribute isn't
             present in your file.</p>
           <ul>
            <li><em>getFilename()</em> - returns the name of the attachment
              file, possibly in 8.3 format</li>
            <li><em>getLongFilename()</em> - returns the full name of the
              attachment file</li>
            <li><em>getExtension()</em> - returns the extension of the
              attachment file, including the "."</li>
            <li><em>getModifiedDate()</em> - returns the date that the
              attachment file was last edited on</li>
            <li><em>getContents()</em> - returns a byte array of the contents
              of the attached file</li>
            <li><em>getRenderedMetaFile()</em> - returns a byte array of
              a Windows Metafile representation of the attached file</li>
           </ul>
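           <p>Because any of these helpers may return null, code that saves
             attachments to disk usually needs a fallback name. The following
             is a minimal sketch of one way to choose one, not something
             provided by POI: the class <em>AttachmentNames</em>, the method
             <em>choose</em> and the <em>attachment-N.bin</em> fallback
             scheme are all hypothetical. The two string arguments stand for
             the results of <em>getLongFilename()</em> and
             <em>getFilename()</em>.</p>
           <source>
// Hypothetical helper, not part of POI: picks an output name for an attachment
final class AttachmentNames {
   private AttachmentNames() {}

   // longName and shortName stand for the values returned by
   // getLongFilename() and getFilename(); either may be null
   static String choose(String longName, String shortName, int index) {
      String name = clean(longName);
      if (name == null) {
         name = clean(shortName);
      }
      if (name == null) {
         // Assumed fallback naming scheme, purely illustrative
         name = "attachment-" + index + ".bin";
      }
      return name;
   }

   // Returns the last path component, or null if nothing usable remains
   private static String clean(String raw) {
      if (raw == null) {
         return null;
      }
      // Drop everything up to the last slash or backslash, so a stored
      // name cannot point outside the chosen output directory
      int cut = Math.max(raw.lastIndexOf('/'), raw.lastIndexOf('\\'));
      String name = raw.substring(cut + 1).trim();
      if (name.isEmpty() || name.equals(".") || name.equals("..")) {
         return null;
      }
      return name;
   }
}
           </source>
           <p>The chosen name can then be combined with an output directory,
             and the bytes from <em>getContents()</em> written to that file.</p>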
         </section>

         <section>
           <title>Message attributes and message body</title>

           <p>A <em>org.apache.poi.hmef.HMEFMessage</em> instance is created
             from an <em>InputStream</em> of the underlying TNEF (winmail.dat)
             file.</p>
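           <p>TNEF data begins with the four-byte signature 0x223E9F78,
             stored little-endian. When input may come from arbitrary mail
             attachments, it can help to check for that signature before
             constructing a <em>HMEFMessage</em>. The following is a minimal
             sketch of such a check, not part of the POI API: the class
             <em>TnefSniffer</em> and the method <em>looksLikeTnef</em> are
             hypothetical names, and the stream is assumed to support
             mark/reset (for example a <em>BufferedInputStream</em>).</p>
           <source>
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper, not part of POI
final class TnefSniffer {
   // The 32-bit TNEF signature 0x223E9F78 as it appears on disk (little-endian)
   private static final int[] SIGNATURE = { 0x78, 0x9F, 0x3E, 0x22 };

   private TnefSniffer() {}

   // Peeks at the first four bytes, then rewinds the stream to where it was
   static boolean looksLikeTnef(InputStream in) throws IOException {
      if (!in.markSupported()) {
         throw new IllegalArgumentException("Stream must support mark/reset");
      }
      in.mark(SIGNATURE.length);
      try {
         for (int expected : SIGNATURE) {
            // read() returns -1 at end of stream, which never matches
            if (in.read() != expected) {
               return false;
            }
         }
         return true;
      } finally {
         in.reset();
      }
   }
}
           </source>
           <p>Because the stream is reset afterwards, the same stream can be
             passed on to the <em>HMEFMessage</em> constructor if the check
             succeeds.</p>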
           <p>From a <em>HMEFMessage</em>, there are three main methods of
            interest to call:</p>
           <ul>
             <li><em>getBody()</em> - returns a String containing the RTF
               contents of the message body. </li>
             <li><em>getSubject()</em> - returns the message subject</li>
             <li><em>getAttachments()</em> - returns the list of
               <em>Attachment</em> objects for the message</li>
           </ul>
         </section>

         <section>
           <title>Low level attribute access</title>

           <p>Both Messages and Attachments contain two kinds of attributes.
             These are <em>TNEFAttribute</em> and <em>MAPIAttribute</em>.</p>
           <p>TNEFAttribute is specific to TNEF files in terms of the
             available types and properties. In general, Attachments have a
             few more useful ones of these than Messages.</p>
           <p>MAPIAttributes hold standard MAPI properties and values, and
             work in a similar way to those in <a href="../hsmf/">HSMF
             (Outlook)</a>. There are typically many of these on both
             Messages and Attachments. <em>Note - see limitations</em></p>
           <p>Both <em>HMEFMessage</em> and <em>Attachment</em>
             support two different ways of getting to attributes of interest.
             Firstly, they support list getters, to return all attributes
             (either TNEF or MAPI). Secondly, they support specific getters by
             TNEF or MAPI property.</p>
           <source>
HMEFMessage msg;
try (InputStream in = new FileInputStream(file)) {
   // Close the file as soon as the message has been parsed
   msg = new HMEFMessage(in);
}
for(TNEFAttribute attr : msg.getMessageAttributes()) {
   System.out.println("TNEF : " + attr);
}
for(MAPIAttribute attr : msg.getMessageMAPIAttributes()) {
   System.out.println("MAPI : " + attr);
}
System.out.println("Subject is " + msg.getMessageMAPIAttribute(MAPIProperty.CONVERSATION_TOPIC));

for(Attachment attach : msg.getAttachments()) {
   for(TNEFAttribute attr : attach.getAttributes()) {
      System.out.println("A.TNEF : " + attr);
   }
   for(MAPIAttribute attr : attach.getMAPIAttributes()) {
      System.out.println("A.MAPI : " + attr);
   }
   System.out.println("Filename is " + attach.getAttribute(TNEFProperty.ID_ATTACHTITLE));
   System.out.println("Extension is " + attach.getMAPIAttribute(MAPIProperty.ATTACH_EXTENSION));
}
           </source>
         </section>
       </section>

       <section>
         <title>Investigating a TNEF file</title>

         <p>To get a feel for the contents of a file, and to track down
            where data of interest is stored, HMEF comes with
            <a href="https://github.com/apache/poi/tree/trunk/poi-scratchpad/src/main/java/org/apache/poi/hmef/dev/">HMEFDumper</a>
            to print out the contents of the file.</p>
       </section>

       <section>
         <title>Limitations</title>

          <p>HMEF is currently a work-in-progress, and not everything
            works yet. The current limitations are:</p>
          <ul>
            <li>Non-standard MAPI properties from the range 0x8000 to 0x8fff
              may not be converted into attributes entirely correctly.
              The values show up, but the name and type may not always
              be correct.</li>
            <li>All testing so far has been performed on a small number of
              English documents. We think we're correctly turning bytes into
              Java Unicode strings, but we need a few non-English sample
              files in the test suite to verify this!</li>
            <li>There is no support for saving changes, nor for creating new
              files</li>
          </ul>
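          <p>Given the first limitation, callers may want to flag attributes
            whose property id falls in the affected range and treat their
            name and type with caution. The following is a minimal sketch of
            such a check, not part of POI: the class <em>MapiRanges</em> and
            the method <em>inCustomRange</em> are hypothetical names, and
            obtaining the numeric id from a <em>MAPIAttribute</em> is left
            to the caller.</p>
          <source>
// Hypothetical helper, not part of POI
final class MapiRanges {
   private MapiRanges() {}

   // True if the id lies in 0x8000 to 0x8fff inclusive, the range the
   // limitations list describes as possibly mis-named or mis-typed
   static boolean inCustomRange(int propertyId) {
      return !(0x8000 > propertyId || propertyId > 0x8fff);
   }
}
          </source>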
       </section>
    </body>
</document>