Disaster recovery – Critical to the ‘Always-On’ business

In the age of the ‘Always-On’ business, companies are facing new demands from end users, who expect 24/7 access to data and applications and have no patience for downtime or data loss. All this comes while companies grapple with exponential data growth of 30-50% per year, writes Gregg Petersen, Regional Director, Middle East and SAARC, Veeam Software.

To cope with these demands, many companies today are building modern data centres and investing in server virtualisation, modern storage applications and cloud-based services, in pursuit of higher speeds, more efficient use of existing resources and possible cost savings.

While this is definitely a step in the right direction, the irony is that many areas of the data centre still rely on legacy infrastructure and technologies, which inhibit the network from functioning at optimal levels and lead to data loss, longer recovery times, unreliable data protection, a lack of transparency and the inability to analyse IT traffic. According to the Veeam Data Center Availability Report 2014, companies risk losing between $4.4 million and $7.9 million each year to lost data and application failures; downtime can cost between $1.4 million and $2.3 million a year in lost revenue, reduced productivity and lost opportunities; and the total annual cost of downtime and data loss can reach more than $10 million.

In addition to these financial losses, there is the risk of damage to an organisation’s reputation; something that is very difficult to measure in pure financial terms but can cripple an organisation, setting it back years and putting it at a decided disadvantage vis-a-vis the competition.

There is clearly an availability gap between the requirements of an Always-On infrastructure and IT’s ability to deliver availability effectively. In fact, 82% of CIOs say there is a gap between the level of availability they provide and what end users demand. One of the keys to bridging this availability gap is modernising the data centre, paying special attention to Business Continuity/Disaster Recovery (BC/DR) planning, that is, how to carry on operating even after a disaster. Unfortunately, this is an area that is often overlooked, in large part because of the misconception that it is very expensive. According to the 2014 report on ‘The State of Global Disaster Recovery Preparedness’ published by the Disaster Recovery Preparedness Council, three out of four organisations are at risk of failing to recover from a disaster.

Evaluate data protection needs

The first step in implementing a successful BC/DR plan is evaluation: organisations need to conduct a thorough risk assessment of the entire IT infrastructure and all services that support business-critical applications. The next step is to define the Recovery Time Objectives (RTOs) and Recovery Point Objectives (RPOs) for the various business-critical applications. For businesses to be considered ‘Always-On’, it is recommended that they target recovery time and point objectives (RTPOs) of less than 15 minutes for all applications and data.
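As a rough illustration of this evaluation step, the sketch below compares each application's objectives with the 15-minute target mentioned above. The application names and their figures are hypothetical placeholders, not data from any report.

```python
from dataclasses import dataclass
from datetime import timedelta

# Target taken from the text: objectives of less than 15 minutes.
ALWAYS_ON_TARGET = timedelta(minutes=15)


@dataclass
class AppObjectives:
    name: str
    rto: timedelta  # Recovery Time Objective: how long the service may be down
    rpo: timedelta  # Recovery Point Objective: how much recent data may be lost


def missed_objectives(app, target=ALWAYS_ON_TARGET):
    """Return the objectives ('RTO', 'RPO') that are not under the target."""
    missed = []
    if app.rto >= target:
        missed.append("RTO")
    if app.rpo >= target:
        missed.append("RPO")
    return missed


# Hypothetical inventory, for illustration only.
apps = [
    AppObjectives("email", rto=timedelta(minutes=10), rpo=timedelta(minutes=5)),
    AppObjectives("reporting", rto=timedelta(hours=4), rpo=timedelta(hours=24)),
]

# Map each application that falls short to the objectives it misses.
gaps = {app.name: missed_objectives(app) for app in apps if missed_objectives(app)}
```

A listing like `gaps` gives the risk assessment a concrete output: the applications whose current objectives would need to be tightened before the business could call itself ‘Always-On’.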

A good business continuity plan should include a ‘runbook’ or script that sets out exactly what needs to be done, by whom and in what order. For example, an Exchange server won’t connect unless Active Directory is running, so the organization knows that it will need Active Directory before it can get email back. Once the runbook is set up, much of the process can be automated so that key staff don’t have to make important decisions in the heat and pressure of the moment. Organizations will always want to ensure that the actual decision to fail over to the disaster recovery plan is made by an actual human, preferably a C-level executive, because once the decision is made to fail over it’s difficult to rewind. That being said, once the big red button is pushed, automation, except at a few key points, is extremely helpful.
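One way to picture such a runbook is as a dependency graph that is walked in order once a person has approved the failover. The following is a minimal sketch using Python's standard `graphlib` module; the service names, the `webmail` entry and the `run_failover` helper are illustrative assumptions rather than part of any particular product.

```python
from graphlib import TopologicalSorter

# Each service maps to the set of services that must be running before it.
# Exchange depending on Active Directory comes from the text; "webmail" is
# a hypothetical extra service added for illustration.
dependencies = {
    "active_directory": set(),
    "exchange": {"active_directory"},
    "webmail": {"exchange"},
}


def recovery_order(deps):
    """Return the services in an order that respects every dependency."""
    return list(TopologicalSorter(deps).static_order())


def run_failover(deps, approved_by, start):
    """Start every service in dependency order, but only after a named
    person has approved the failover (the 'big red button')."""
    if not approved_by:
        raise PermissionError("failover must be approved by a named person")
    for service in recovery_order(deps):
        start(service)


# Usage: record the start order instead of starting real services.
started = []
run_failover(dependencies, approved_by="CIO", start=started.append)
```

The human decision stays outside the automation: the code refuses to run without an approver, and everything after that point follows the order the runbook defines.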

Embrace DRaaS – the next generation in disaster recovery solutions

For comprehensive data protection and recovery, particularly in case of disasters, organizations should follow the 3-2-1 rule; they should have three copies of the data, stored on two different kinds of media, with one of them stored offsite.

This means that in addition to the primary data, organisations should have at least two more backups, as having more copies of the data reduces the risk of losing it during a disaster. In terms of storing the data, organisations should keep the copies on at least two different storage types (such as internal hard disk drives and removable storage media like tapes, external hard drives, CDs, etc.) or on two internal hard disk drives in different locations.

Finally, while storing the data on different media is important and a good start, it really isn’t a good idea to keep the external storage device in the same room as the production storage in case of a catastrophe like a fire. It is prudent to physically separate the copies and keep at least one offsite. Specifically applied to Disaster Recovery as a Service (DRaaS), the offsite workload should be ready to go in a usable form. One way to meet that requirement is with replicated workloads.
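The 3-2-1 rule is simple enough to express as a check. The sketch below is one possible reading of it, assuming each copy is described by a media type and an offsite flag; the `DataCopy` type and the sample inventory are hypothetical. It applies the strict "two kinds of media" form of the rule and does not model the alternative of two internal drives in different locations.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataCopy:
    label: str
    media: str     # e.g. "disk", "tape", "cloud"
    offsite: bool  # True if stored away from the production site


def satisfies_3_2_1(copies):
    """Check for three copies, two kinds of media and one copy offsite."""
    enough_copies = len(copies) >= 3
    enough_media = len({c.media for c in copies}) >= 2
    has_offsite = any(c.offsite for c in copies)
    return enough_copies and enough_media and has_offsite


# Hypothetical inventory: production data plus two backups.
compliant = [
    DataCopy("production", media="disk", offsite=False),
    DataCopy("local backup", media="tape", offsite=False),
    DataCopy("replica", media="cloud", offsite=True),
]

# Same number of copies, but everything sits in one place on one media type.
non_compliant = [
    DataCopy("production", media="disk", offsite=False),
    DataCopy("backup A", media="disk", offsite=False),
    DataCopy("backup B", media="disk", offsite=False),
]
```

Here `satisfies_3_2_1(compliant)` returns `True`, while `satisfies_3_2_1(non_compliant)` returns `False` because all three copies share one media type and none is offsite.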

There are a number of options available to organisations when it comes to deployment of DR systems: they can either deploy a secondary physical site (owned and managed by the organisation or hosted by a service provider) or adopt a DRaaS model. An overwhelming majority of organisations still prefer to use secondary sites, in large part because of prevailing misconceptions about the cloud: lack of control, security and compatibility with existing infrastructure. However, as we continue to debunk the myths around the cloud and organisations begin to understand the value of DRaaS, adoption levels will only rise. In fact, according to the ‘Disaster Recovery as a Service Market by Solution (Disaster Planning & Testing, Real-Time Replication, Backup Solution, Data Security & Compliance), by Service Provider (Disaster Recovery, Cloud, Telecom & Communication) – Global Forecast to 2020’ report by MarketsandMarkets, the DRaaS market is expected to grow at a CAGR of 52.9% from $1.42bn in 2015 to $11.92bn in 2020.
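For readers who want to sanity-check the growth figure, a compound annual growth rate can be recomputed from the two endpoints. The sketch below uses the rounded market sizes quoted above; the `cagr` helper is an illustrative function, not something taken from the report.

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate between two values over a number of years."""
    return (end_value / start_value) ** (1 / years) - 1


# Market size in billions of dollars, 2015 to 2020 (five years of growth).
rate = cagr(1.42, 11.92, years=5)
print(f"{rate:.1%}")
```

With these rounded endpoints the result comes out at roughly 53%, in line with the reported 52.9%; the small difference presumably stems from rounding in the published market sizes.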

There are four key arguments in favour of DRaaS:

1. Increased flexibility

If an organisation chooses to have a physical secondary site for its DR services, then it is essential that this site be a carbon copy of the primary production site. This can be an extremely daunting task in and of itself, but even if it is done properly, there is the burden of having to continuously monitor the systems to ensure that they are synchronised. The advantage of DRaaS is that this synchronisation can be abstracted. Another advantage of DRaaS is that, depending on the disaster mode, the organisation can select from a variety of options for how to handle the different business systems.

Since most of the processes are automated, opting for a DRaaS solution also frees up IT resources and gives them the flexibility to focus on business-critical applications that can yield tangible business benefits, rather than on support functions.

2. Reduced costs

Organisations that choose to deploy physical secondary sites to support DR services have to make a significant CapEx outlay associated with the physical infrastructure, hardware and software licenses and regular maintenance. DRaaS, on the other hand, works on a subscription or ‘pay-as-you-go’ model. Organisations only have to pay for the services they use, which works out to be extremely cost effective, particularly for smaller organisations that are cash strapped or for organisations whose DR requirements might frequently scale up or down.

3. More robust testing

Regularly testing the DR system is a key part of any BC/DR strategy. Unfortunately, given the synchronisation issues with traditional DR systems, testing is both extremely expensive and time consuming, which is why most organisations opt to test their DR systems annually, if at all. DRaaS, on the other hand, gives organisations the luxury of conducting more frequent testing (be it quarterly or half-yearly), which increases the likelihood of efficient and successful recovery in case of a disaster.

4. Rapid recovery

As stated earlier, in the era of the ‘Always-On’ business, any downtime, even if it is a result of a natural disaster, can be catastrophic for an organisation. If all the servers are still physical, or if organisations are using data protection tools designed for physical environments, the recovery process is still going to be long and complex (often taking days). With DRaaS on the other hand, organisations can recover data in a matter of hours if not minutes.

As businesses begin to make the transition to ‘Always-On’, it is paramount that they modernise the data centre and have a robust disaster recovery plan and infrastructure in place. While this is a daunting task for many, the good news is that organisations now have access to comprehensive cloud-based DR solutions designed to empower the Always-On Business and keep critical apps up-and-running, all while ensuring that complete visibility and control remains at IT’s fingertips.