Recently I needed to parse Outlook emails to extract some values so that automated tests could pass multifactor authentication. I was hoping for a simple implementation in JavaScript but could not find a reliable solution there, so I searched for a good library in Java. I was not even surprised that there were several solutions for parsing Outlook msg files. Java truly has a library for everything.
I chose the Auxilii msgparser library, as it seemed like the easiest one to use.
I added it via Maven:
<dependency>
<groupId>com.auxilii.msgparser</groupId>
<artifactId>msgparser</artifactId>
<version>1.1.15</version>
</dependency>
Usage is then straightforward:
Message parsedMessage = new MsgParser().parseMsg(msgFile.getInputStream());
String body = parsedMessage.getBodyText();
List<Attachment> attachments = parsedMessage.getAttachments();
Please be aware that Outlook on macOS does not use the msg format for its emails. Emails exported on a Mac are eml files. Those are exported in plain text, so they can be parsed via regex just by reading the file.
The whole code supporting both formats would look like this:
String body = "";
if (file.getName().endsWith(".msg")) {
    // Outlook on Windows: binary msg file, parsed by msgparser
    Message parsedMessage = new MsgParser().parseMsg(file);
    body = parsedMessage.getBodyText();
} else if (file.getName().endsWith(".eml")) {
    // Outlook on macOS: plain-text eml file, read as-is
    body = new String(Files.readAllBytes(file.toPath()), StandardCharsets.UTF_8);
}
// here parse your body
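For the "parse your body" step, here is a minimal sketch of how a value could be pulled out with a regex. The class name, the method name and the wording "verification code" followed by six digits are my assumptions for illustration only, so adjust the pattern to whatever your email really contains. Also keep in mind that an eml file read as raw text still includes headers, and its body may be encoded (for example quoted-printable or base64), in which case a plain regex may not match.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class VerificationCodeExtractor {

    // Assumed email wording: "verification code", a few non-digit
    // characters, then exactly six digits. Hypothetical, not from a real email.
    private static final Pattern CODE_PATTERN =
            Pattern.compile("(?i)verification code[^0-9]{0,20}(\\d{6})\\b");

    // Returns the first six-digit code that follows the assumed wording, if any.
    public static Optional<String> extractCode(String body) {
        Matcher matcher = CODE_PATTERN.matcher(body);
        return matcher.find() ? Optional.of(matcher.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        String body = "Hello,\r\nYour verification code is: 482913\r\nIt expires in 10 minutes.";
        System.out.println(extractCode(body).orElse("no code found"));
    }
}
```

Returning an Optional lets the test decide what to do when the email does not contain the expected text, instead of failing later on an empty string.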
If this is interesting to you, you can follow me on Twitter.