I was having a conversation with a colleague of mine about a client request. We discussed disaster recovery. In the new cloud-based computing architectures, it has become clear that this is one of the many new benefits from moving to the cloud. The following sections in this blog lay out a few key elements that support this conclusion.
“Disaster recovery involves a set of policies, tools and procedures to enable the recovery or continuation of vital technology infrastructure and systems following a natural or human-induced disaster.” – Wikipedia
“Disaster recovery is a set of policies and procedures which focus on protecting an organization from any significant effects in case of a negative event, which may include cyberattacks, natural disasters or building or device failures.” – Techopedia
The basic underlying assumption is that some technology failure has to occur that creates a disaster in business terms. The recovery aspect assumes that it takes some intentional effort, based on a documented procedure, to restore the failed technology to normal operations. With the cloud, we will see that we are able to avoid the failure in the first place. This comes at a much lower cost than traditional backup/recovery environments did in the past.
Most recovery plans assume a “primary” and “secondary/recovery” type of environment. This will normally involve data centers or cohosting environments with some significant number of miles of separation. There is usually some kind of direct communications link between these two locations. This is set up to allow for movement of application code and data on a regular basis. Another option that some opt for is a manual offsite storage of some kind that will be used to restore the application and data in the event of disaster.
Of course, there is always the need to test this backup location on a regular schedule. This is done to be sure it is fully functional and can quickly be activated by following the documented procedures. There are literally hundreds of other details that must be considered within the context of these larger activities. All of these details consume large amounts of labor and material cost every year for the average organization with multiple large mission-critical applications.
The cost of these efforts and systems can be in the hundreds of thousands of dollars a year. As expected, the cost depends on the size of the business and its dependency on technology for core daily operations.
Introducing Continuous Operations
With the introduction of large public-cloud providers, both local and global companies gained a new way to avoid the cost and headaches associated with the disaster recovery approach of the past. Yes, I can hear the objectors to this claim screaming now, but give me a chance to provide some supporting evidence. With all of the hype we have experienced around cloud computing for the last 10 years, it’s hard to think clearly about some of the basic technology requirements in this area.
Here are a few key reasons why public cloud allows us to eliminate the need for formal disaster recovery plans and their cost:
- Most large public-cloud providers have built or purchased many different data centers all across the globe. They therefore can easily provide physical geo separation to support any distance requirement for continued operations. They are seldom subject to complete failure in any single data center, much less across their complete portfolio of data centers.
- Cloud providers have already built in all the required redundancy in each data center for all critical operational components. This redundancy crosses all critical systems including power, cooling, security, multiple telecom providers, multiple connections, etc. These redundant capabilities include failing over complete data centers with all workloads and only minimal interruption to transaction times.
- Most cloud providers have inexpensive solutions for real-time replication of data between data centers. This replication can also include user session data as needed.
- In a hot-warm configuration, the providers will reduce the cost of the “warm” environment, which improves the cost model.
- The ability to set up completely redundant hot-hot production environments allows a company to have zero downtime when configured properly. This continuous uptime support can even be provided on a global scale to support the most demanding requirements.
- Automated monitoring tools, along with the AI capabilities built into many cloud platforms, provide all the real-time notifications required to support the client’s tech and security teams. The intelligence of these monitoring tools is now providing the ability to fix many problems without any human intervention.
- Many cloud providers have “built-in” redundancy for most, if not all, IaaS and PaaS services at no or minimal additional cost. This means that the first level of most common failures is already eliminated by simply utilizing these services. This benefit comes even before any redundancy is added in secondary data centers.
- The ability to “fully” automate these application environments from the power level through the infrastructure to the application level facilitates the full creation of any environment within minutes to a few hours versus days or weeks.
- These large global cloud providers spend billions of dollars every year securing their operations from bad guys in the cyber world. This automatically provides a first level of defense for all applications and data located in their data centers. This is usually a lot more security than the average company is able to provide in their own data centers. This reduces the possibility of having any kind of disaster event to recover from in the first place.
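To put rough numbers on the hot-hot point above, here is a minimal sketch of the standard availability arithmetic for redundant sites. The 99.9% per-site figure is a hypothetical value chosen for illustration, not a number from any provider, and the function names are my own. The calculation also assumes the sites fail independently and that failover itself is instantaneous, which is optimistic in practice.

```python
HOURS_PER_YEAR = 24 * 365  # ignoring leap years for simplicity


def parallel_availability(site_availability: float, sites: int) -> float:
    """Chance that at least one of `sites` independent sites is up."""
    return 1.0 - (1.0 - site_availability) ** sites


def expected_downtime_hours(availability: float) -> float:
    """Expected hours per year during which the service is unavailable."""
    return (1.0 - availability) * HOURS_PER_YEAR


if __name__ == "__main__":
    single_site = 0.999  # hypothetical availability of one site
    for sites in (1, 2, 3):
        combined = parallel_availability(single_site, sites)
        hours = expected_downtime_hours(combined)
        print(f"{sites} site(s): availability={combined:.9f}, "
              f"expected downtime={hours:.6f} h/year")
```

Under these assumptions, each additional active site multiplies the remaining downtime by the failure probability of a single site, which is why a second active environment changes the picture so dramatically.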
Of course, the author wants to provide some balance to this claim and make sure the reader understands a few assumptions that go along with it:
- There are many applications that might require significant changes to operate in this kind of environment.
- The client’s tech organization or vendor partner will need to be sure the architecture design for the solution in the cloud meets the cloud provider’s requirements to take advantage of all the redundant capabilities.
- All solutions are only as redundant and stable as their weakest part. If you have a solution in the cloud that is dependent on some kind of component or data coming from outside the cloud provider’s data center, then you will need to be sure that component or data has all the same capabilities you have in your cloud-based solution.
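The weakest-part caveat can be made concrete with a short sketch. When a solution needs every one of its components to be up at the same time, the overall availability is the product of the individual availabilities, so it can never be better than the worst component. The availability figures and the function name below are hypothetical and chosen only for illustration, and the components are assumed to fail independently.

```python
from math import prod
from typing import Iterable


def serial_availability(component_availabilities: Iterable[float]) -> float:
    """Availability of a chain in which every component must be up."""
    return prod(component_availabilities)


if __name__ == "__main__":
    cloud_solution = 0.99999  # hypothetical: highly redundant cloud deployment
    external_feed = 0.99      # hypothetical: dependency hosted outside the cloud
    overall = serial_availability([cloud_solution, external_feed])
    print(f"cloud only:         {cloud_solution:.5f}")
    print(f"with external feed: {overall:.7f}")
```

In this example the external dependency dominates the result, so the investment in the cloud-side redundancy is largely wasted until that dependency is brought up to the same standard.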
The statement that disaster recovery is a thing of the past is true for many applications, and the author is willing to say so plainly. It will not be long before we will be able to say it is true for all applications. More and more companies continue to rework their application portfolios into more modern application architectures and patterns. The days of large standalone applications and databases are quickly disappearing. Today, we are producing more digitally agile solutions, which are part of a more nimble and flexible “App Warehouse”.