This year at VMworld, during my “Enterprise Protection and Reliability for VMware Cloud Foundation with Cohesity” presentation in both Las Vegas and Barcelona, I discussed consistency models and the different implementation options applied in distributed computing and storage systems. One of the reasons I included this topic in my presentation was that I wanted to raise awareness around the importance of consistency model implementations in general from an architecture and technology perspective. I feel this is an important concept to understand when evaluating or procuring any data protection and recovery solution based on a distributed storage architecture.

Consistency models are resiliency and availability techniques implemented in distributed systems to keep data accessible and available. I’m going to cover two consistency models in particular – Eventual Consistency and Strict Consistency. I will describe and demonstrate their concepts and behavior, and their applicability in the context of VMware vSphere, VMware Cloud Foundation, and Cohesity. While I will cover the logical implementation and behavior of the consistency models from the Cohesity perspective, it is crucial to consider this a general technological approach that applies to any traditional or modern data protection and recovery product sold by any vendor.

For specific details on Eventual and Strict consistency for the Cohesity DataPlatform, check out the article Strict vs. Eventual Consistency written by Apurv Gupta, Chief Architect at Cohesity. Now, to set the context, I will begin by providing an abbreviated description of both consistency models, and then I will present a logical illustration of their functions before mapping them to what they mean for VMware vSphere and VMware Cloud Foundation environments.

Eventual Consistency – implemented to achieve high availability that is only informally guaranteed. Data updates are typically written to and acknowledged by a single node within a cluster. This model doesn’t assure that data writes and their updates have been distributed throughout the cluster before they are acknowledged.

Logical Illustration and Workflow Animation of Eventual Consistency
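
To make the eventual consistency behavior described above concrete, here is a minimal, purely illustrative sketch in Python. It assumes a generic three-node cluster and hypothetical keys; it is not any vendor's actual implementation. The key point: the write is acknowledged as soon as one node has it, and replication to the rest of the cluster happens later.

```python
# Toy model of eventual consistency (illustrative only, not vendor code).
class EventualCluster:
    def __init__(self, node_count=3):
        self.nodes = [dict() for _ in range(node_count)]  # per-node key/value stores
        self.pending = []                                  # replication backlog

    def write(self, key, value):
        self.nodes[0][key] = value            # data lands on a single node
        self.pending.append((key, value))     # replication happens "eventually"
        return "ACK"                          # acknowledged before replication

    def replicate(self):
        for key, value in self.pending:       # background distribution to peers
            for node in self.nodes[1:]:
                if node is not None:
                    node[key] = value
        self.pending.clear()

    def fail_node(self, index):
        self.nodes[index] = None              # everything held only on this node is gone


cluster = EventualCluster()
cluster.write("vm-block-42", "new data")      # acknowledged immediately
cluster.fail_node(0)                          # node fails before replicate() runs
# The acknowledged write never reached the other nodes, so it is lost.
```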

Strict Consistency – the most reliable model to implement, as it guarantees that data writes and updates are distributed throughout the cluster before they are acknowledged. All data is accessible from any node in the cluster.

Logical Illustration and Workflow Animation of Strict Consistency
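
Here is the counterpart sketch for strict consistency, under the same assumptions (a generic three-node cluster, illustrative only, not any vendor's code): the write is acknowledged only after every surviving node has received it, so losing a single node never loses acknowledged data.

```python
# Toy model of strict consistency (illustrative only, not vendor code).
class StrictCluster:
    def __init__(self, node_count=3):
        self.nodes = [dict() for _ in range(node_count)]  # per-node key/value stores

    def write(self, key, value):
        for node in self.nodes:               # synchronous distribution to every node
            if node is not None:
                node[key] = value
        return "ACK"                          # acknowledged only after distribution

    def read(self, key):
        for node in self.nodes:               # any surviving node can serve the data
            if node is not None and key in node:
                return node[key]
        raise KeyError(key)

    def fail_node(self, index):
        self.nodes[index] = None


cluster = StrictCluster()
cluster.write("vm-block-42", "new data")      # acknowledged after all nodes have it
cluster.fail_node(0)                          # any single node can fail
assert cluster.read("vm-block-42") == "new data"   # data remains accessible
```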

Now that I have provided a brief description of the two consistency models, let me move on to the points I want to highlight and map them to the VMware vSphere and VMware Cloud Foundation context. I will zoom in on the risks and problems that organizations using traditional and even modern data protection and recovery solutions are exposed to, based on the consistency model implemented by their solution.

My goal is to present and illustrate this in the most practical way possible so that everyone can understand the risks they may be exposed to because of a lack of awareness and understanding of this topic. Traditional and modern data protection and recovery solutions offered by many vendors provide the ability to quickly restore a VM or data with a feature often called Instant Recovery.

The restoration workflow and implementation details differ by vendor and product. From a vSphere environment perspective, there is a series of recovery functions that are performed (manually or automatically) to restore the necessary VM. Typically, the data protection and recovery solution where a copy of the VM or data is stored provides some form of storage abstraction that is mounted onto vSphere. This is part of the reason the VM is recovered instantaneously. At this point, vSphere provides the compute resources to run the VM if necessary.
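
As a rough sketch of those generic mechanics, the recovery boils down to mounting the exported NFS volume as a datastore and registering the restored VM directly from it. The pyVmomi example below is a simplified illustration, not any vendor's actual workflow; the vCenter address, credentials, object names, export path, and IP are all hypothetical placeholders.

```python
# Minimal pyVmomi sketch of generic instant-recovery plumbing (hypothetical names).
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim


def find_obj(content, vimtype, name):
    """Locate a managed object by name using a container view."""
    view = content.viewManager.CreateContainerView(content.rootFolder, [vimtype], True)
    return next(obj for obj in view.view if obj.name == name)


ssl_ctx = ssl._create_unverified_context()        # lab use only
si = SmartConnect(host="vcenter.example.com", user="administrator@vsphere.local",
                  pwd="changeme", sslContext=ssl_ctx)
content = si.RetrieveContent()

esxi_host = find_obj(content, vim.HostSystem, "esxi-01.example.com")
cluster = find_obj(content, vim.ClusterComputeResource, "SDDC-Cluster")
datacenter = find_obj(content, vim.Datacenter, "Datacenter")

# 1. Mount the NFS export presented by the backup cluster as a vSphere datastore.
nas_spec = vim.host.NasVolume.Specification(
    remoteHost="10.0.0.100",                      # backup cluster address (VIP or single node)
    remotePath="/exports/restored-vm",            # hypothetical export path
    localPath="instant-recovery-ds",
    accessMode="readWrite",
)
esxi_host.configManager.datastoreSystem.CreateNasDatastore(nas_spec)

# 2. Register the restored VM straight off that datastore. No bulk data copy happens
#    yet, which is why the VM can be powered on almost immediately.
datacenter.vmFolder.RegisterVM_Task(
    path="[instant-recovery-ds] restored-vm/restored-vm.vmx",
    name="restored-vm", asTemplate=False,
    pool=cluster.resourcePool, host=esxi_host)

Disconnect(si)
```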

After the VM is recovered, there comes a time when the VM has to be migrated back to the primary storage platform where it formerly resided.

In vSphere, Storage vMotion is used to migrate the data over the network. Now, while there are technologies and features that make it possible to recover and instantiate a VM in minutes, those capabilities don’t exist when it comes to moving potentially hundreds of gigabytes across a network. Depending on the size and capacity being transferred, the process can take a long time to complete based on network bandwidth, interface saturation, etc.
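
A quick back-of-the-envelope estimate shows why. The VM size and throughput figures below are assumptions chosen purely for illustration, not measurements of any product or environment:

```python
# Rough Storage vMotion transfer-time estimate (all numbers are illustrative assumptions).
vm_size_gb = 500                                   # hypothetical recovered VM size

for link_gbps, label in [(10, "10 GbE"), (1, "1 GbE")]:
    effective_gbps = link_gbps * 0.7               # assume ~70% usable throughput
    seconds = vm_size_gb * 8 / effective_gbps      # GB to gigabits, divided by rate
    print(f"{label}: ~{seconds / 60:.0f} minutes")

# Roughly 10 minutes on 10 GbE and 95 minutes on 1 GbE, and longer still on a shared
# or saturated link. The recovered VM depends on the backup cluster's NFS export for
# that entire migration window.
```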

The logical illustrations below describe the behavior and procedure of a recovery scenario similar to the one described above, and the impact of each consistency model’s implementation.

Impact of Data Protection and Recovery Solutions with Eventual Consistency in VMware vSphere and VMware Cloud Foundation environments when a VM needs to be restored:
  1. Prepare and restore the VM locally onto a storage abstraction that is presented to vSphere in the form of an NFS volume. The abstraction is presented from a single node based on eventual consistency.
  2. Automatically present and mount the storage abstraction to vSphere (NFS) from one of the nodes in the data protection and recovery cluster. The VM is instantiated and accessible on vSphere. Read and write I/O are directed to the VM stored on the storage abstraction (NFS) presented from a single node.
  3. New data being created is neither protected nor distributed across the other nodes in the data protection and recovery cluster.
  4. SvMotion starts the migration of the VM back to the primary storage platform – this can take a long time.
  5. If the node in the data protection and recovery cluster from which the storage abstraction (NFS) is being presented to vSphere fails, the following happens (sketched in the code after this list):
  • The storage abstraction (NFS) becomes inaccessible to vSphere
  • The VM is no longer available or accessible
  • SvMotion fails
  • Any newly created data can be lost
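
Here is a toy model of that step 5 failure, illustrative only and with a hypothetical node address, not any vendor's code: because the NFS export is served from a single node's address, that node's failure takes the datastore, the VM, and the in-flight SvMotion down with it.

```python
# Toy model: NFS presented from a single node under eventual consistency.
class SingleNodeMount:
    def __init__(self, node_ip):
        self.node_ip = node_ip                # the only address serving the export
        self.node_up = True

    def fail_node(self):
        self.node_up = False                  # no other node can take over the export

    def serve_io(self):
        if not self.node_up:
            raise ConnectionError(
                "NFS export unreachable: datastore and VM inaccessible, "
                "SvMotion fails, unreplicated new writes are lost")
        return "I/O OK"


mount = SingleNodeMount("10.0.0.11")          # hypothetical node address
mount.fail_node()                             # the presenting node fails mid-migration
mount.serve_io()                              # raises ConnectionError
```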

That is not an acceptable outcome when you depend on a data protection and recovery solution as your insurance policy. Depending on the magnitude of the failure, the results can put a company out of business or, at the very least, cost someone their job.

Impact of Cohesity with Strict Consistency in VMware vSphere and VMware Cloud Foundation environments when a VM needs to be restored:
  1. Prepare and restore the VM locally onto a storage abstraction that is presented to vSphere in the form of an NFS volume. The abstraction is presented from the Cohesity cluster via a virtual IP, enforcing a strict consistency model.
  2. Automatically present and mount the storage abstraction to vSphere (NFS) via a virtual IP of the Cohesity cluster. The VM is instantiated and accessible on vSphere. Read and write I/O are directed to the VM stored on the storage abstraction (NFS) presented from the virtual IP of the Cohesity cluster.
  3. New data being created is distributed and acknowledged across the other nodes in the Cohesity cluster.
  4. SvMotion starts the migration of the VM back to the primary storage platform – this can take a long time.
  5. If a node in the Cohesity cluster fails, the storage abstraction (NFS) being presented to vSphere remains available and the SvMotion continues until it completes, because the use of virtual IPs and strict consistency together mitigate the risk of data loss (sketched in the code after this list).
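
And here is the counterpart toy model for step 5 in this workflow, again illustrative only with hypothetical addresses and not Cohesity's actual code: the export is reached through a cluster virtual IP, so when a node fails the VIP moves to a surviving node and I/O continues, while every acknowledged write already exists on the remaining nodes.

```python
# Toy model: NFS presented via a cluster virtual IP under strict consistency.
class VipMount:
    def __init__(self, vip, nodes):
        self.vip = vip                         # the address vSphere mounts
        self.nodes = set(nodes)                # nodes able to host the VIP
        self.owner = sorted(self.nodes)[0]     # current VIP owner

    def fail_node(self, node):
        self.nodes.discard(node)
        if self.owner == node and self.nodes:
            self.owner = sorted(self.nodes)[0]  # VIP fails over to a survivor

    def serve_io(self):
        if not self.nodes:
            raise ConnectionError("entire cluster lost")
        return f"I/O OK via {self.vip} (served by {self.owner})"


mount = VipMount("10.0.0.100", ["node-a", "node-b", "node-c"])  # hypothetical names
mount.fail_node("node-a")                      # the VIP owner fails mid-migration
print(mount.serve_io())                        # datastore stays up; SvMotion keeps running
```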

That is the outcome enterprise organizations should expect and demand from their data protection and recovery solution when looking to leverage features such as instant recovery.

To mitigate the risks of data inaccessibility and potential data loss in the event of a node failure, Cohesity’s data resiliency and high availability implementation is based on strict consistency. All of Cohesity’s features, capabilities, and modern architecture were designed to meet today’s business requirements, without any affinity to business requirements collected two to three decades ago. I hope this information is useful and educational. Now take a look at these concepts in action in the demonstration below. Get some popcorn and sit back; this is a long one, but you’re going to love it!!

Cohesity Instant Recovery of VMware Cloud Foundation Management Stack with Strict Consistency validation

-Enjoy

For future updates about Cohesity, Hyperconverged Secondary Storage, Cloud Computing, Networking, VMware vSAN, vSphere Virtual Volumes (VVol), vSphere Integrated OpenStack (VIO), and Cloud-Native Applications (CNA), and anything in our wonderful world of technology be sure to follow me on Twitter: @PunchingClouds.
