Backup Is Broken


Every night, millions of terabytes of data are read and written. In my view, this is wasted IO. This graph should look familiar; it’s what I’m speaking of when I say that backup is broken: every time I need to protect or replicate data, I have to read and write data, in many cases to multiple locations and multiple media. In the beginning, this graph would have shown equal reads and writes. As backup software became more intelligent, it allowed us to write out only the deltas, the changes that occurred since the last time we protected the data.
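To make that read/write asymmetry concrete, here is a minimal sketch of a block-level incremental backup. It is my own illustration, not how any particular product works: the tiny block size, the `incremental_backup` helper, and the sample data are all assumptions picked to keep the example readable. Notice that every block still gets read, even though only the changed one gets written.

```python
import hashlib

BLOCK_SIZE = 4  # tiny block size, chosen only to keep the example readable


def block_hashes(data: bytes, block_size: int = BLOCK_SIZE) -> list[str]:
    """Split data into fixed-size blocks and hash each one."""
    return [
        hashlib.sha256(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    ]


def incremental_backup(previous: list[str], data: bytes,
                       block_size: int = BLOCK_SIZE) -> dict:
    """Compare against the last run's hashes and keep only changed blocks."""
    current = block_hashes(data, block_size)  # this reads every block
    changed = {
        index: data[index * block_size:(index + 1) * block_size]
        for index, digest in enumerate(current)
        if index >= len(previous) or previous[index] != digest
    }
    return {
        "hashes": current,
        "changed": changed,
        "blocks_read": len(current),
        "blocks_written": len(changed),
    }


monday = b"AAAABBBBCCCCDDDD"
tuesday = b"AAAABBBBXXXXDDDD"  # only the third block changed

full = incremental_backup([], monday)
delta = incremental_backup(full["hashes"], tuesday)
print(full["blocks_read"], full["blocks_written"])    # 4 4
print(delta["blocks_read"], delta["blocks_written"])  # 4 1
```

The first run is a full backup (four reads, four writes). The second run writes a single block but still reads all four to find it, which is the wasted IO I’m complaining about.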

You’re Doing It Wrong

As a former TSM user (and Backup Exec, NetBackup, Veeam, vRanger, ArcServe, Legato), the idea of incremental forever was always appealing to me. Yet there is still a host of data protection programs today that subscribe to schemes cooked up in the stone age of IT. I’m not the only person who believes this; over the last few years I’ve started seeing more focus placed on intelligent backup and data protection. Still, the underlying fact remains that a significant number of organizations follow traditional data protection schemes designed in the 1990s. Things get further complicated at the storage layer, where snapshots are being leveraged and then called “backups”. Repeat after me:

Storage Layer Data Protection Isn’t Cutting It.

One of the primary tools used at the storage level is the snapshot. We use them for data protection, as well as with virtual machines to protect data (to an extent). As I’m fond of screaming on Twitter: Snapshots are Not Backups! In my view, snapshots are helpful when I want a quick rollback position for something I’m changing temporarily and testing. If a platform leverages snapshots today and calls it data protection, run away, and run fast, as if the fast zombies from 28 Days Later were chasing you. Thankfully that’s probably not going to happen.

Now, for the sake of argument, let’s say that snapshots are a valid form of data protection. At the storage level, what happens if I want to protect just one VM (the green one in the image to the left) in a datastore or LUN? In general terms, storage arrays see only blocks and have no inherent understanding of the data within them. Thus the concept of VM-centricity is alien: the array does not know what constitutes the VM object in the LUN, so it cannot segment or isolate a single virtual machine at that layer. And that’s where we get into the situation of having to protect and replicate large chunks of data we don’t need.

Now, backup software itself can address the limitations I’ve discussed above. But, along with my “What If” question in the last post: what if VM-centric protection was built into your infrastructure as a native function, so that you did not have to leverage a third-party software layer?

Now that’s just local data protection, and I’ve yet to even touch on moving data from one location to another for offsite protection. The same scenario plays out when we do replication at the array level. The lack of data awareness means that we can and do replicate a good amount of data we don’t need or want. Swap files, temp files? All included, even when we have no need of them. To go further, if I want to get granular down to the individual VM level, the storage systems we have today cannot provide replication at that level. We have to leave that functionality up to software solutions.
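Here is a rough back-of-the-envelope sketch of that difference. Everything in it is hypothetical: the VM names, the file sizes, and the `SKIP_SUFFIXES` list are made up for illustration. It compares what an array replicates when it can only see the LUN against what a VM-aware layer would replicate for one VM.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StoredFile:
    vm: str
    name: str
    size_gb: int


# Hypothetical datastore contents; names and sizes are made up.
DATASTORE = [
    StoredFile("green-vm", "green-vm.vmdk", 40),
    StoredFile("green-vm", "green-vm.vswp", 8),
    StoredFile("blue-vm", "blue-vm.vmdk", 100),
    StoredFile("blue-vm", "blue-vm.vswp", 16),
    StoredFile("red-vm", "red-vm.vmdk", 60),
]

SKIP_SUFFIXES = (".vswp", ".tmp")  # assumed "don't need it" file types


def lun_level_gb(files):
    """Array-level replication: the whole LUN goes, whatever is in it."""
    return sum(f.size_gb for f in files)


def vm_level_gb(files, vm):
    """VM-aware replication: one VM's files, minus swap and temp files."""
    return sum(
        f.size_gb for f in files
        if f.vm == vm and not f.name.endswith(SKIP_SUFFIXES)
    )


print(lun_level_gb(DATASTORE))             # 224
print(vm_level_gb(DATASTORE, "green-vm"))  # 40
```

In this made-up datastore, protecting the green VM at the LUN level moves 224 GB, while a VM-aware approach moves 40 GB.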

Commercial Time!

So my counterpart Damian Bowersock has written a bit about our New Approach to Data Protection: Part 1 and Part 2. I wanted to add my own take on it and describe some of the benefits I see with OmniCube.

One functional aspect of the SimpliVity OmniCube platform that sets it apart from other converged systems is the built-in backup and replication capabilities. These are native operations that come with the platform. They are not à la carte or pay-for services; it’s all included. These are independent, crash-consistent, point-in-time backups of the virtual machines that run on the OmniCube. They are also VM-centric in nature, meaning we address data protection at the VM object level for local data protection as well as off-site replication. Furthermore, they are policy driven and automated. Create a policy with your data protection scheme and apply it to the VMs you wish. Need to back up that VM every 10 minutes? Simply create a policy and set the rules in place to do so. From that point on, it’s an automated process with all your standard aging and data retention rules. Oh, and the best part: a local backup takes 3 seconds, consumes no space, and has zero impact on the running virtual machine. That backup window in the first image presented in this blog post becomes a thing of the past.
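For anyone who hasn’t worked with policy-driven protection, here is a minimal sketch of the idea: a schedule plus an aging rule. To be clear, this is not the OmniCube policy format or API. The `BackupPolicy` class, its fields, and the one-hour retention are assumptions I’ve made up purely to show how frequency and retention interact.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class BackupPolicy:
    # Hypothetical policy shape; not the OmniCube policy format.
    interval: timedelta   # how often to take a backup
    retention: timedelta  # how long to keep each backup


def due_backups(policy, start, now):
    """Timestamps at which backups would have been taken from start to now."""
    times = []
    current = start
    while current <= now:
        times.append(current)
        current += policy.interval
    return times


def retained(policy, backups, now):
    """Aging rule: keep backups no older than the retention period."""
    return [taken for taken in backups if now - taken <= policy.retention]


policy = BackupPolicy(interval=timedelta(minutes=10),
                      retention=timedelta(hours=1))
start = datetime(2013, 12, 19, 18, 0)
now = datetime(2013, 12, 19, 20, 0)

taken = due_backups(policy, start, now)
kept = retained(policy, taken, now)
print(len(taken), len(kept))  # 13 7
```

With a 10-minute interval over two hours, 13 backups are taken, and the one-hour retention rule keeps the most recent 7. Once the policy exists, nobody has to think about it per VM.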

RPO/RTO can be met very quickly, and replication schedules can be crafted at the per-VM level based on data change rates instead of trying to meet a one-size-fits-all scheme. Now, that is for local VMs running on a High Availability enabled OmniCube platform (two OmniCubes). One thing to keep in mind is that when the system scales, there is high availability not just for the running VMs but at the storage level as well. So if I have two OmniCube systems connected together and one fails, not only does standard VMware-based HA kick off to bring the VM up on the surviving system, but the backup data and VM storage are protected as well.



Blah Blah Blah

So yet again, I find myself being a bit long-winded and prattling on. What started out as a screengrab from a system in the lab moved into a full blog post, which migrated into a second blog post, which looks like it might move into a third. Hopefully you have not fallen asleep yet and can see that there is something to this hyper-convergence stuff that you were not aware of. Next up, VM Mobility. Oh, and if you are looking for additional information on OmniCube, this is a good starting point.

Edited to add: because my Veeam friends asked, what about granular file-level recovery and application-integrated backup/recovery? Application-consistent backup is available currently, but single-file-level recovery is not. For the larger part of this discussion I’ve focused on machine-level protection rather than app/file. Hope that helps.
