To Stub or Not to Stub (Part 1 of 2)
Posted by Rick Dales, VP Product Management
In the world of email archiving, there is an ongoing argument about the value of stubbing, a process designed to help manage storage in Exchange by replacing messages or attachments on an email server with a link to a copy of the file in an archive. I thought I’d weigh in on this topic, first by explaining the concept and looking at the pros and cons, and then, in a second post, by providing a list of four best practices that businesses should follow if they’re relying on stubbing in their organization.
With the growth of email volume outpacing the decline in the total cost of storage ownership, it comes as no surprise that IT is struggling to manage Exchange storage. The real frustration for most Exchange administrators is that the vast majority of their storage is occupied by content that people almost never read. For performance and reliability reasons, Exchange is usually implemented on the most expensive storage platforms, which makes this usage pattern extremely costly. Furthermore, because Exchange is a transaction system, every piece of data is open for modification, which means that every piece of data needs to be backed up on a regular basis.
Introducing Stubbing – How it works
All of these factors have led IT to investigate archiving as a means to address their storage challenges. The idea is simple: focus the Exchange server on the delivery and management of current mail, and push the older mail to another repository that can be managed on less expensive infrastructure. That repository can then use archival storage management processes that allow for incremental backup of only newly added information, rather than the entire set.
Moving the data to another location (the archive) benefits IT; however, training users to change their behavior and look for this information in a new application (often with its own user interface and workflows) is usually too cumbersome for broad adoption. To address this concern, archiving vendors introduced features known as stubbing or shortcutting. This involves replacing the messages or attachments in users’ mailboxes with a pointer to the copy in the archive. From an end-user’s perspective, the email data is still accessible from Outlook, and yet they run into their mailbox quota less often.
Stubbing Drawbacks
Stubbing isn’t without its drawbacks, however. To understand the impact on storage, you need a solid understanding of Exchange’s single instance storage model. When a message is delivered to multiple recipients within the same mailbox database (storage group), the message body and attachments are only stored once, and the message entry in each mailbox simply references the single copy of this data.
When a user modifies a message in their mailbox, Exchange creates a unique copy of the content and points the message in the user’s mailbox to that copy. As Exchange doesn’t provide any way to access the single-instance store of content directly, stubbing processes behave like end-user edits, modifying messages on a mailbox-by-mailbox basis. If a message was sent to multiple recipients on the same mailbox database, but you only stub content for some of them, you actually increase storage rather than decrease it: the shared copy has to stay in place for the recipients who were not stubbed, and each stubbed mailbox gains a stub of its own. Furthermore, even though stubs may be small (typically under 2 KB), as the stubbing process works through each mailbox, it is creating separate items in the single-instance store.
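The storage effect described above can be illustrated with a toy model. This is only a minimal sketch under stated assumptions, not Exchange's storage engine or any real API: the class `ToyMailboxDatabase`, its methods, the `simulate` helper, the 1,000,000-byte message, the five recipients, and the 2,000-byte stub size are all hypothetical values chosen for illustration.

```python
STUB_BYTES = 2_000  # assumed stub size, loosely based on the "under 2K" figure


class ToyMailboxDatabase:
    """Hypothetical model of single-instance storage; not an Exchange API."""

    def __init__(self):
        self.content = {}    # content_id -> size in bytes (each item stored once)
        self.mailboxes = {}  # user -> {message_id: content_id}

    def deliver(self, message_id, size, recipients):
        """Store the content once and point every recipient's entry at it."""
        self.content[message_id] = size
        for user in recipients:
            self.mailboxes.setdefault(user, {})[message_id] = message_id

    def stub(self, user, message_id):
        """Give one mailbox its own stub item, the way a user edit would."""
        stub_id = f"{message_id}/stub/{user}"
        self.content[stub_id] = STUB_BYTES
        self.mailboxes[user][message_id] = stub_id
        self._drop_unreferenced()

    def _drop_unreferenced(self):
        """Free any content that no mailbox entry points to any more."""
        referenced = {cid for box in self.mailboxes.values() for cid in box.values()}
        self.content = {cid: size for cid, size in self.content.items()
                        if cid in referenced}

    def usage(self):
        """Return (total bytes stored, number of stored items)."""
        return sum(self.content.values()), len(self.content)


def simulate(stubbed_count, recipients=5, message_bytes=1_000_000):
    """Deliver one message to all users, then stub it for the first few."""
    db = ToyMailboxDatabase()
    users = [f"user{i}" for i in range(recipients)]
    db.deliver("msg1", message_bytes, users)
    for user in users[:stubbed_count]:
        db.stub(user, "msg1")
    return db.usage()


if __name__ == "__main__":
    for count in (0, 3, 5):
        total, items = simulate(count)
        print(f"stubbed {count} of 5: {total:,} bytes in {items} item(s)")
```

In this model, stubbing three of the five recipients leaves the shared copy in place and adds three stub items, so total usage goes up. Only when all five are stubbed is the shared copy released, and even then the database holds five items where it previously held one.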
Because many Exchange and data management processes are affected by the number of entries in the tables, not just their total size, the unwinding of single-instance storage in Exchange can be problematic. As it happens, however, Microsoft Office has a habit of updating attachment metadata when a user views the item, which in most environments means that single-instance storage is pretty much non-existent within Exchange anyway. The more of these changes that are made in Exchange between backups, the longer an incremental backup of the mail system will take.
Microsoft’s answer to the storage management problem is to change Exchange 2007 to support dramatically larger mailboxes and to change the way backup processes work so that managing these larger mailbox databases becomes more practical. While most firms that I’ve talked to plan to increase mailbox sizes with their conversion to Exchange 2007, few are creating the 1 GB mailboxes that Microsoft touts.
Conclusion
Clearly, stubbing is not the straightforward Exchange storage management solution that some vendors would have you believe. That having been said, when implemented properly, it can be a valuable tool to manage the growth of Exchange storage with minimal impact on end-user behavior. In my next post, I’ll talk about four best practices to make the most of stubbing in your organization.




