Archive and Backup

Last week I looked at how archiving data can help take the strain off backup systems. The theory is pretty simple: if you have less data to back up, then it is easier to meet backup windows with fewer resources.

It should not come as a shock that this basic idea also helps in a disaster recovery event. I am going to stick with the same example as last week, which is a customer with 100 TB of data. I am doing this both to be consistent for those of you who read my article last week, and also because I am lazy and have already done most of the work.

Before I get too deep into this extremely interesting conversation, I want to make sure we are all thinking about the same scenario with respect to archive. Many people think of archives as a place where data goes to die. Today's Active Archive systems are very different. Most file-based data that is fairly static can easily reside in the archive. When needed, the data is read from the archive directly into the application requesting it. Active Archiving is as much about how we manage data as it is about how we store it.

It is this direct accessibility of the data that has the biggest impact on disaster recovery. All good archive software can create multiple copies of the data in the archive. These copies can be cross-platform. In last week’s blog, I assumed three total copies of the archived data: one on disk and two on tape. One of the tape copies is kept off site.



Before implementing an archive solution, this 100 TB organization needed to plan to recover 100 TB of data. This includes moving the data, having storage to recover it to and, most important, the time to recover 100 TB.

After moving 80 TB into the archive, recovery looks different.  All that needs to be done to access the 80 TB in the archive is to ensure the archive application is running.  The data is directly accessible from the archive medium, either disk or tape.  Actual data recovery is now reduced to the 20 TB that are not archived. 
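To put rough numbers on that difference, here is a minimal Python sketch. The 100 TB and 80 TB figures come from the example above; the restore rate of 2 TB per hour is purely an assumption I picked for illustration, and the function name `recovery_hours` is hypothetical. Plug in whatever sustained rate your own environment actually delivers.

```python
def recovery_hours(data_tb: float, rate_tb_per_hour: float) -> float:
    """Hours needed to restore data_tb at a sustained restore rate."""
    if rate_tb_per_hour <= 0:
        raise ValueError("restore rate must be positive")
    return data_tb / rate_tb_per_hour


TOTAL_TB = 100.0      # total data in the example
ARCHIVED_TB = 80.0    # data moved into the archive in the example
ASSUMED_RATE = 2.0    # ASSUMPTION: sustained restore rate in TB/hour

# Without the archive, everything has to be restored.
hours_before = recovery_hours(TOTAL_TB, ASSUMED_RATE)

# With the archive, only the non-archived remainder is restored;
# the archived data is read in place once the archive application is running.
hours_after = recovery_hours(TOTAL_TB - ARCHIVED_TB, ASSUMED_RATE)

print(f"Restore without archive: {hours_before:.0f} hours")
print(f"Restore with archive:    {hours_after:.0f} hours")
```

Whatever rate you assume, the restore time scales with the amount of data left outside the archive, so in this example it drops to one fifth of the original. The sketch ignores the time to bring up the archive application itself, which would need to be added for a real plan.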

I think of this as a double win. Day to day, an offsite DR copy gets created and moved. Instead of 100 TB of data to keep constantly up to date at the remote site, we now only have to keep 20 TB up to date. The archive application keeps the 80 TB of archive data automatically updated. Since that data does not change much, it does not require large amounts of data to move. Then, when recovery is necessary, only 20 TB must be recovered rather than the full 100 TB, because we can access the archive without recovery.

This just might make it possible for people to implement better DR plans.

How are you using archive in relation to disaster recovery? We’d love to hear your stories and best practices.
