Approach 1: Mailbox Archiving (Part 2 of 4)
Posted by Rick Dales, VP Product Management
In a previous post, I introduced the idea that there are multiple approaches to archiving. In this post, I will dive more deeply into one of the two most common approaches, known as mailbox archiving, including how it works and what problems it is best suited to address.
Mailbox archiving is the process of periodically connecting to a user’s mailbox, looking for content that matches some criteria (an archiving policy), and adding it to the archive. While a mailbox archiving process might run nightly, the archiving policies are typically set to store only messages that are older than a certain age (commonly 30-90 days).
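To make the idea of an age-based archiving policy concrete, here is a minimal sketch in Python. It is an illustration only: the `Message` record, the `select_for_archive` function, and the 30-day default are assumptions made for this example, not the interface of any particular archiving product or mail server.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Message:
    # Hypothetical, simplified message record; real mail systems expose far more fields.
    message_id: str
    folder: str
    received: datetime


def select_for_archive(mailbox, now, min_age_days=30):
    """Return the messages that are at least min_age_days old as of `now`."""
    cutoff = now - timedelta(days=min_age_days)
    return [m for m in mailbox if m.received <= cutoff]


# One nightly run over a tiny example mailbox.
now = datetime(2024, 6, 1)
mailbox = [
    Message("a", "Inbox", datetime(2024, 3, 1)),      # about three months old
    Message("b", "Projects", datetime(2024, 5, 25)),  # one week old
]
archived = select_for_archive(mailbox, now)
print([m.message_id for m in archived])  # ['a']
```

Note that the policy can only select from what is still in the mailbox when the job runs, and that the folder is available to the archive because the message is read in place.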
Strengths
- Visibility into all content and state information in the mailbox
By connecting directly to the user’s mailbox, the archiving system can see (and choose to capture) any type of content, including items such as calendar events that wouldn’t be sent to another user. Similarly, it can capture which folder the user has put the item into.
- Ability to modify messages in the mailbox
With direct access to the user’s mailbox, the original message can be modified (flagged), deleted, or replaced with a pointer to the copy in the archive.
- Easy to provide end-user access
As the archive knows which mailbox it found a message in, it can easily apply the appropriate security controls to give users access to the messages in their own mailbox without granting access to other messages.
Weaknesses
- An incomplete set of messages is captured
Similar to backups, any periodic snapshot activity cannot record things that arrived and were subsequently deleted between capture cycles. Given that users read and then delete over 50% of messages on the day they receive them, periodic capture will miss the majority of mail, even if the archiving policy is set to capture messages immediately.
- Incomplete picture of each message’s recipients
When a user receives a message, they have no visibility into the set of recipients that were BCC’d. In addition, if the message was sent to a distribution list, the actual set of recipients isn’t stored with the message. In the period between message receipt and capture, the membership of the distribution list can change materially (or the distribution list can be deleted from the mail system entirely).
- Duplicate message removal is very difficult
While digital signatures can be used to find and remove duplication of message bodies and attachments to optimize storage within the archive, removing duplication of the messages themselves is difficult because the set of recipients may be different and the metadata about when a message was received will vary from mailbox to mailbox. When performing legal discovery across a set of users, duplicate copies of messages from different users’ mailboxes dramatically increase the cost of reviewing messages to be produced for opposing counsel.
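The duplicate-removal problem can be sketched in a few lines of Python. This is an illustration under stated assumptions: I am treating the "digital signature" as a plain SHA-256 content hash, and the record fields (`body`, `received`, `mailbox`) and both helper functions are hypothetical names chosen for the example.

```python
import hashlib


def body_digest(body: str) -> str:
    """Hash only the message body, which is identical in every copy."""
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def record_digest(record: dict) -> str:
    """Hash every captured field, including per-mailbox metadata."""
    canonical = "\n".join(f"{key}={record[key]}" for key in sorted(record))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# The same message as captured from two different mailboxes (made-up values).
copy_in_alice = {
    "body": "Q3 numbers attached.",
    "received": "2024-06-01T09:00:03",
    "mailbox": "alice",
}
copy_in_bob = {
    "body": "Q3 numbers attached.",
    "received": "2024-06-01T09:00:41",
    "mailbox": "bob",
}

same_body = body_digest(copy_in_alice["body"]) == body_digest(copy_in_bob["body"])
same_record = record_digest(copy_in_alice) == record_digest(copy_in_bob)
print(same_body, same_record)  # True False
```

The body hashes match, so the body can be stored once. The full records do not match, so the archive still holds two distinct messages that a reviewer may end up reading twice.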
Appropriate Uses of Mailbox Archiving
Best suited for: Mailbox Storage Management
Mailbox archiving is appropriate for active mailbox storage management. A significant advantage is that mailbox archiving systems can “stub” or “shortcut” messages so that users don’t need to change their behavior to access historical mail. It is important to note, however, that without an active process that removes content from users’ mailboxes, an archive only aids in storage management if combined with tight mailbox quotas, which require users to spend hours each month on manual cleanup tasks.
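As a rough picture of what “stubbing” means, here is a minimal Python sketch. It assumes the mailbox and the archive are simple dictionaries keyed by message ID; the `stub_message` and `open_message` functions and the field names are hypothetical and do not represent how any specific product implements shortcuts.

```python
def stub_message(mailbox: dict, archive: dict, message_id: str) -> None:
    """Copy the full message to the archive and leave a small pointer behind."""
    original = mailbox[message_id]
    archive[message_id] = original
    mailbox[message_id] = {
        "subject": original["subject"],  # keep enough for the user to recognize it
        "stub": True,
        "archive_ref": message_id,       # pointer to the archived copy
    }


def open_message(mailbox: dict, archive: dict, message_id: str) -> dict:
    """Return the full message, following the pointer if the item is a stub."""
    item = mailbox[message_id]
    if item.get("stub"):
        return archive[item["archive_ref"]]
    return item


mailbox = {"m1": {"subject": "Budget", "body": "A very large message body..."}}
archive = {}

stub_message(mailbox, archive, "m1")
print(mailbox["m1"]["stub"])                         # True
print(open_message(mailbox, archive, "m1")["body"])  # A very large message body...
```

The user still opens the same item in the same folder. Only the storage location of the body has changed.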
Not appropriate for: Legal Discovery or Regulatory Compliance
Since mailbox archiving does not ensure that all messages are archived, nor does it provide a complete view of all message traffic, it is not suitable for addressing legal discovery or regulatory compliance requirements.
Click here to read Part 1 of Different Approaches to Archiving Email
