
Oracle’s Recovery Appliance Reduces Complexity Through Automation

Premise:

Traditionally, the best practice for mission-critical Oracle Database backup and recovery was to use storage-led, purpose-built backup appliances (PBBAs) such as Data Domain, integrated with RMAN, Oracle’s automated backup and recovery utility. This disk-based backup approach solved two problems: 1) it enabled faster recovery (from disk versus tape); and 2) it increased recovery flexibility by keeping many more backups online, which could be used both to restore production databases and to provision copies for test/dev.

At its core, however, this approach remains a batch process that involves many dozens of complicated steps for backups and even more steps for recovery.

Oracle’s Zero Data Loss Recovery Appliance (RA) customers report that total cost of ownership (TCO) and downtime costs (e.g. lost revenue due to database or application downtime) are significantly reduced due to the simplification and, where possible, the automation of the backup and recovery process.

Database-led Backup and Recovery

New TCO and cost of downtime KPIs emerge

In 2014, Oracle changed the entire model for protecting Oracle Database instances with its Recovery Appliance (RA). By continuously backing up Oracle databases, the system is able to take an end-to-end view of data protection and provide rapid recovery to virtually an instant in time. Our initial research and data from Recovery Appliance customers showed dramatic improvements in both TCO (~30%) and business impact (i.e. Cost of Downtime – measured as reduction in lost revenue and productivity) relative to a PBBA-based batch approach (and other conventional methods). New research confirms our initial projections and we’ve seen even greater improvements in many cases. Our latest data shows that Oracle customers using Recovery Appliance have lowered their total cost of ownership by 30% – 50%, in some cases saving several million dollars in operational costs.

Much more importantly, our latest data shows that digital business initiatives are putting greater pressure on organizations to deliver always-on services. Specifically, the business impact of moving to an application-led data protection architecture such as Recovery Appliance can result in many tens or even hundreds of millions of dollars of benefit from improved productivity and avoided revenue loss. Actual results will depend on three main factors: 1) the size of the organization; 2) mission-criticality of the applications being protected; 3) the data maturity of the organization and the degree to which it embraces a mindset that puts data value at its core; and further, evolves its systems to support emerging digital services.  

Key Findings

  1. On average a G2000 organization will have 50-80 steps and substeps associated with its backup and recovery processes. These steps are generally implemented with scripts which can be fragile and prone to failure. Leading organizations are using the Recovery Appliance to reduce these steps by up to 5X.
  2. Because of the number and complexity of these steps, there’s a roughly 1 in 4 chance (25%) of encountering an error at some point during the recovery process, leading to longer outages. The goal should be to reduce this to approximately 1 in 100 (~1%). Customers using Recovery Appliance are reporting this type of reduction in complexity.
  3. The impact of a simplified Recovery Appliance approach is a reduced TCO of between 30% – 50% relative to PBBAs (see Chart 1).
  4. The average cost of downtime at a typical G2000 company is between $75,000 and $215,000 per hour. The cost is much higher for mission-critical data loss.
  5. On average, unplanned outages and data loss cost G2000 companies between 5-8% of annual revenues (lost revenue and productivity).
  6. Organizations that are driving digital initiatives are connecting the dots between data value and monetization. For Oracle Database customers, Recovery Appliance is an important new platform on which to build data protection strategies. For example, our research shows that over a four-year period a $2B company is on track to reduce lost revenue by $140M, a $5B organization by $370M and a $15B company by $790M (see Table 2).

Note on the research: Oracle’s Recovery Appliance is designed to protect Oracle Databases only and the data in this post applies exclusively to protecting Oracle Databases.

Lowering TCO is Table Stakes

Oracle’s marketing will cite many examples of why TCO is lower with its Recovery Appliance as compared to PBBAs. Our research generally confirms these claims. Several factors can be noted here, including greater consolidation of physical boxes (so fewer boxes to manage overall) and the elimination of backup agents, which helps improve resource utilization. This all ripples down to better environmentals (e.g. power and cooling). As well, Exadata customers using RA will cite more effective use of Oracle Database licenses, which directly affects maintenance costs.

In our research, we wanted to isolate the business impact items most directly associated with the Oracle Recovery Appliance, and while the above factors are important, the big dollar benefits come from automating backup and recovery processes. As such, in our initial quantitative work we made a simplifying assumption: The acquisition cost (hardware and software) of a PBBA-based batch infrastructure is comparable to that required for an Oracle Recovery Appliance approach.

Chart 1: Total Cost of Ownership Comparing a Batch to Recovery Appliance Approach

The results of our research are shown in Chart 1. Despite our simplifying assumption on initial price, the data show that the 4-year hardware, software and maintenance cost (blue area) for protecting mission-critical Oracle Databases in a $5B enterprise is 21% higher for PBBAs than for the Oracle RA. This is mainly due to environmentals and other savings from maintenance.

The real story from a TCO standpoint, however, is operational costs. Our findings indicate that operational costs (red area) are 68% higher for PBBAs relative to RA for a typical Global 2000 enterprise running Oracle Databases. The primary reason relates directly to the complexity of the backup process (see Key Findings points 1-3 and Table 1).

The bottom line of our research is shown in the Chart 1 example of a $5B company in the financial services sector. Over a four-year period, on balance, a PBBA disk-based approach is 45% more expensive than the Oracle Recovery Appliance for a typical Global 2000 organization running Oracle Databases ($11M vs. $7.6M).
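The 45% figure follows directly from the two four-year totals. A minimal Python sketch of that arithmetic is below; the function and variable names are ours, and the inputs are the rounded totals quoted above rather than the underlying model data.

```python
def premium_pct(higher_cost: float, lower_cost: float) -> float:
    """Return how much more expensive higher_cost is than lower_cost, in percent."""
    return (higher_cost - lower_cost) / lower_cost * 100.0


def savings_pct(higher_cost: float, lower_cost: float) -> float:
    """Return the saving from moving to lower_cost, as a percent of higher_cost."""
    return (higher_cost - lower_cost) / higher_cost * 100.0


# Rounded four-year TCO totals from the Chart 1 example, in millions of dollars.
pbba_tco_m = 11.0
ra_tco_m = 7.6

print(f"PBBA premium over RA: {premium_pct(pbba_tco_m, ra_tco_m):.0f}%")
print(f"RA saving versus PBBA: {savings_pct(pbba_tco_m, ra_tco_m):.0f}%")
```

Note that the same gap reads differently depending on the baseline: a premium of roughly 45% measured against RA corresponds to a saving of roughly 31% measured against the PBBA total, which sits at the low end of the 30% – 50% TCO range cited in this research.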

Complexity Kills – Why Recovery Appliance Lowers TCO

Reducing the number of steps in the backup & recovery process

IT executives we spoke with repeatedly tell us that they can’t keep throwing IT labor at the complexity problem. Rather, as they pursue digital initiatives they realize that automation is a key to minimizing complexity and cutting costs. Moreover, there is a strong link between complexity of processes, error rates (e.g. failed backups and incomplete recoveries) and IT costs. Mistakes cost money.

The real epiphany, as we’ll see below, is the linkage between complexity and other business value, namely lost revenue and productivity as a result of planned or unplanned downtime. Table 1 below shows the number of steps in the backup and recovery process and how they affect the likelihood of errors and data loss.

 

Table 1: Reducing Complexity of Backup & Recovery Steps

Table 1 shows an organization with $5B in annual revenues. The first row indicates the average number of steps needed to perform backup and recovery in Oracle Database environments. The second row shows the average probability of a recovery error over a period of time. The third row shows the probability that data has been lost and manual recovery is required (which of course takes longer).

Note: Customers should understand that these RA benefits are not achieved overnight. Rather they are accomplished over a period of time, making steady improvements toward a best practice end state.

There are literally dozens of steps and micro-steps associated with traditional backup and recovery systems and they have a ripple effect on data loss. While there is a relatively low probability of experiencing an error in any single step, the sheer number and complexity of steps in the PBBA case add up, resulting in a 25X higher total error rate.
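To illustrate how small per-step error rates compound, here is a hedged Python sketch. It assumes every step fails independently with the same probability, which is a simplification of ours and not part of the research model. The step count of 65 is simply the midpoint of the 50-80 range in Key Findings, and the function names are hypothetical.

```python
def total_error_rate(per_step_error: float, steps: int) -> float:
    """Probability that at least one of `steps` independent steps fails."""
    return 1.0 - (1.0 - per_step_error) ** steps


def implied_per_step_error(total_error: float, steps: int) -> float:
    """Per-step error rate that would produce `total_error` over `steps` steps."""
    return 1.0 - (1.0 - total_error) ** (1.0 / steps)


# Illustrative inputs only: 65 steps is the midpoint of the 50-80 range,
# and 25% is the overall recovery error rate quoted in Key Findings.
batch_steps = 65
batch_total_error = 0.25

# Back out the per-step error rate implied by those two numbers.
p_step = implied_per_step_error(batch_total_error, batch_steps)

# Apply the "up to 5X" step reduction while holding per-step reliability fixed.
reduced_steps = batch_steps // 5
fewer_steps_only = total_error_rate(p_step, reduced_steps)

print(f"Implied per-step error rate: {p_step:.2%}")
print(f"Total error rate with {reduced_steps} steps: {fewer_steps_only:.1%}")
```

Under these assumptions, a per-step error rate of well under half a percent is enough to produce a 25% overall rate across 65 steps. Cutting the step count by 5X at the same per-step rate brings the total down to the region of 5-6%, so reaching roughly 1% would also require each remaining step to be more reliable, which is consistent with replacing fragile scripts with automation.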

Importantly, as described in the next section, our research shows that the complexity of backup steps has a direct domino effect on business value (measured in terms of downtime costs).

Business Impact: The Real Money is in Reducing Lost Revenue and Productivity

Virtually every CEO and board of directors is trying to “get digital right.” What does that mean? In discussions with customers several factors emerge, including:

  • Digital means data and putting data at the core of an organization is critical to success;
  • CxOs understand that data contributes to monetization;
  • In this digital, cloud world, the costs of data loss and downtime have never been higher (see Table 2);
  • Organizations and their customers, partners and broader ecosystem expect data will be accessible and systems will always be available. Typically, IT systems must be modernized to achieve this goal;
  • Security threats, malware, ransomware and data loss exposures are more costly than ever;
  • Organizations are moving to a cloud-first model. Even if companies aren’t putting all their data into a public cloud, they want to bring the cloud experience to their data. This means when there’s a problem they expect super fast recovery with virtually no data loss;
  • The regulatory climate has never been more strict and the impact of not having appropriate systems in place will be costly (e.g. GDPR).

Table 2 shows the typical risk profile of three organization types: companies with $2B, $5B and $15B in revenue, respectively. The green area shows our estimates of: 1) the percent of revenue lost to downtime; 2) the average cost per hour of downtime; and 3) the average time to recover from a mission-critical data loss. The bottom two rows show the potential business impact of moving from PBBAs to RA in two areas: 1) the TCO premium paid for using PBBA relative to RA at these organizations; and 2) the reduction in downtime costs as a result of moving to RA.

 

Table 2: Cost of Downtime for Global 2000 Customers

Despite these realities, several high-profile outages have been widely reported in 2018. Here’s a short list of outages that have occurred this year: IRS, Visa, National Australia Bank, Microsoft, Slack, AWS, Hulu, Apple, Sutter Health, London Stock Exchange. It’s not difficult to find many more examples.

One has to dig into each of these outages to better understand their causes, and we’re not suggesting that Recovery Appliance could have eliminated the problems completely. Rather, these high-profile failures underscore the degree to which many valuable organizations with significant resources are exposed. Organizations running Oracle Databases report that RA is delivering a level of effectiveness that they have not previously been able to achieve with traditional backup and recovery approaches.

Reducing the Cost of Downtime is the #1 Business Value Factor

Chart 2 shows the results of our analysis and business value modeling. Again, we’re evaluating Oracle shops running applications using Oracle Databases. Results may vary significantly in situations where the applications and infrastructure are more heterogeneous.

Chart 2: Comparing Cost of Downtime with a Recovery Appliance vs. PBBA Approach

 

For Oracle Database-intensive workloads, Recovery Appliance demonstrates a significant reduction in downtime costs relative to PBBAs (47% by our estimates). This translates to a $370M reduction in potentially lost revenue and productivity (from $783M to $413M). This is a direct result of simplifying and automating the steps required to back up and recover data, and of eliminating fragile, error-prone scripts.
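The downtime figures above can be checked with a few lines of Python. This is only a sketch of the arithmetic on the two quoted totals, not the business value model itself. The variable names are ours, and the per-year figure is a simple average that assumes the totals cover the four-year period used elsewhere in this research.

```python
# Downtime cost totals quoted in the text for the $5B example, in millions of dollars.
pbba_downtime_m = 783.0
ra_downtime_m = 413.0

# Assumption: the totals span the four-year period used elsewhere in this research.
years = 4

# Absolute and relative reduction, measured against the PBBA baseline.
reduction_m = pbba_downtime_m - ra_downtime_m
reduction_pct = reduction_m / pbba_downtime_m * 100.0

# Simple average per year; real outage costs would not be spread evenly.
reduction_per_year_m = reduction_m / years

print(f"Reduction in downtime cost: ${reduction_m:.0f}M ({reduction_pct:.0f}%)")
print(f"Simple average per year: ${reduction_per_year_m:.1f}M")
```

The percentage here is measured against the PBBA baseline, which is why a $370M reduction on $783M comes out at roughly 47%.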

Conclusions

Downtime costs are high and trending even higher. Throwing more IT labor at the problem won’t resolve the issue, but automation will help dramatically. CxOs we speak with are having conversations about changing the operating model and bringing a cloud-like automated experience to their data, wherever it resides.

As it relates to backup and recovery, organizations should focus on eliminating, reducing and automating the large number of manual steps and sub-steps necessary to adequately protect data. While a change in one step will not make a huge difference, taking a holistic view of data protection will dramatically reduce business exposures.

For Oracle customers running mission-critical databases, Recovery Appliance is the logical direction for data protection because it allows the software to ensure end-to-end recoverability in a very granular manner, reducing data loss exposure to significantly lower levels than traditional backup approaches.

Despite our positive sentiments around Recovery Appliance, customers must be aware that the technology, in and of itself, is not a complete answer. Rather, organizations must put in place high-quality operational procedures that are evaluated and tested regularly. We also advise creating a set of KPIs that quantify the number of steps required to protect data, and evaluating these steps regularly with the explicit goal of getting rid of them. This is especially important as data becomes more distributed.

Finally, Oracle’s Recovery Appliance is not a general-purpose system. It is designed specifically to protect Oracle databases in heavily Oracle-oriented shops. As the number of clouds within organizations grows (private, public, hybrid, SaaS…), organizations must consider holistic data protection strategies that transcend Oracle databases and, by definition, the Recovery Appliance.

Action Item

We believe customers must rethink how they protect mission-critical Oracle Database environments. Specifically, by reducing the number of manual or scripted steps in the backup and recovery process, organizations can cut costs by 30-50% and reduce downtime costs by 40-50%+ relative to traditional backup approaches. Recovery Appliance is currently the most viable data protection platform to achieve these goals.  

 

 
