Formerly known as Wikibon

Breaking Analysis: Rethinking Data Protection in the 2020s

Techniques to protect sensitive data have evolved over thousands of years, literally. The pace of modern data protection is rapidly accelerating and presents both opportunities and threats for organizations. In particular, the amount of data stored in the cloud, combined with hybrid work models, the clear and present threat of cyber crime, regulatory edicts and ever-expanding edge use cases should put CxOs on notice that the time is now to rethink your data protection strategies. 

In this Breaking Analysis, we’re going to explore the evolving world of data protection and share some data on how we see the market evolving and the competitive landscape for some of the top players. 

The Evolving World of Data Protection

Steve Kenniston, AKA the Storage Alchemist, shared a story with us and it was pretty clever. Way back in 4,000 B.C., the Sumerians invented the first system of writing. They used clay tokens to represent transactions. To prevent tampering with these tokens, they sealed them in clay jars to ensure that the tokens – i.e. the data – would remain secure with an accurate record that was quasi-immutable and lived in a clay vault.  

Since that time we’ve seen quite an evolution in data protection. Tape of course was the main means of data protection during most of the mainframe era and that carried into client/server computing, which really accentuated the issues around backup windows and the challenges with RTO, RPO and recovery nightmares. 

Then in the 2000s, data reduction made disk-based backup more popular and pushed tape into the role of an archive, last-resort medium. Data Domain, then EMC and now Dell, still sells many purpose-built backup appliances as primary backup targets. 

The rise of virtualization brought more changes in backup and recovery strategies, as a reduction in physical resources squeezed the one application that wasn’t underutilizing compute, namely backup. And we saw the rise of Veeam, the cleverly named company that became synonymous with data protection for VMs. 

The cloud has created new challenges related to data sovereignty, governance, latency, copy creep, expense, etc. 

But more recently, cyber threats have elevated data protection to become a critical adjacency to information security. Cyber resilience to specifically protect against ransomware attacks is the new trend being pushed by the vendor community as organizations are urgently looking for help with this insidious threat. 

Cloud & Cyber as Disruptors

The two major disruptors we’re going to discuss today are the rapid adoption of cloud and the escalating threats in cybercrime, especially as it relates to ransoming your data. 

Every customer is using cloud, and 76% are using multiple clouds, according to a recent study by HashiCorp. 

We’ve extensively covered the digital skills gap and the challenges this brings to organizations. It’s especially acute in the complicated world of cybersecurity and that is bleeding into backup, recovery and related data protection strategies.

Customers are building (or buying) abstraction layers to hide the underlying cloud complexity and essentially build out their own clouds. This is good in that it creates standards and simplifies provisioning and management. However, the downside is that such a layer makes things less transparent and creates other problems.  

We see these challenges as fundamentally data problems that are accentuated by distributed cloud architectures. Examples include ensuring fast, accurate and complete backup and recovery; adhering to compliance and data sovereignty edicts; facilitating safe data sharing; managing copy creep; ensuring cyber resiliency; and protecting privacy. These are just some of the issues these disruptors bring to organizations.  

As it relates to cybersecurity, we’re all learning how remote workers are especially vulnerable and as clouds expand rapidly, data protection technologies are struggling to keep pace. 

Public Cloud is Becoming the Standard Architectural Construct

The chart below quantifies the worldwide revenue and growth of the big four hyperscale cloud vendors and underscores the rapid adoption of these platforms. 

The so-called Big 4 will surpass $115B in revenue this year. That’s around 35% growth relative to 2019, when they generated a combined $86B. Notably, last year these four spent more than $100B in CAPEX building out their clouds.  
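As a quick sanity check on that growth figure, the arithmetic can be sketched in a few lines of Python. The two revenue values are the rounded numbers quoted above; the function and variable names are purely illustrative.

```python
def growth_rate(previous: float, current: float) -> float:
    """Return fractional growth from `previous` to `current`."""
    if previous <= 0:
        raise ValueError("previous must be positive")
    return (current - previous) / previous


# Rounded combined Big 4 revenue figures quoted in the text, in $B.
baseline_revenue = 86.0
current_revenue = 115.0

rate = growth_rate(baseline_revenue, current_revenue)
print(f"Growth: {rate:.1%}")
```

With these rounded inputs the result is a little under 34%. Because the text says revenue will surpass $115B, that is a floor, which is consistent with the "around 35%" cited.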

We see this as a gift to the rest of the industry. 

To date, the legacy vendor community has been defensive but that narrative is starting to change as large tech companies like Dell, IBM, Cisco, HPE and others see opportunities to build on top of this infrastructure. 

Listen to how Michael Dell is thinking about this opportunity when questioned on theCUBE by John Furrier about the cloud. 

“Clouds are infrastructure, right? So you can have a public cloud, you can have an edge cloud, a private cloud, a telco cloud, a hybrid cloud, or multi-cloud, here cloud, there cloud, everywhere cloud cloud. Yes, they’ll all be there, but it’s basically infrastructure. And how do you make that as easy to consume and create the flexibility that enables…everything.”  

Michael nailed it in our view, and it is exactly the right message. The cloud is everywhere. You have to make it easy. And you have to admire the scope of his comments. We know this is an individual who thinks big, right? “Enables everything.” He’s basically saying that technology is at the point where it has the potential to touch virtually every industry, every person, every problem…everything. 

The Rise of the Data Protection Cloud

Let’s discuss how this informs the changing world of data protection. 

The digital mandate has dragged a data protection imperative along with it. A digital business is a data business and no longer can backup, recovery, and data management as it relates to data protection be bolted on as an afterthought. Rather, it must be architected as a fundamental component of an overall cloud strategy. 

For this segment, we’ve purposely borrowed the title of a recent book written by Snowflake CEO Frank Slootman called the Rise of the Data Cloud. In this book, Slootman lays out his vision for building value on top of the hyperscaler gift and leveraging network effects to create new value for customers at massive scale. Snowflake is executing on this vision for database and data management and while much of this vision has yet to be delivered, we believe it’s one of the most powerful “North Stars” in the industry today.

Snowflake’s vision is an elegant and easy-to-understand application of Michael Dell’s cloud-everywhere comments: an on-prem/hybrid to cross-cloud to edge strategy. We believe it serves as an excellent example that can be applied to the data protection space. 

This vision hides the underlying complexity of the clouds and creates the same experience across all estates with automation and orchestration built into the data protection cloud. 

The data protection cloud provides a variety of services across any cloud (as well as on-prem) including backup/recovery for virtualized and bare metal, any OS, container data protection and a variety of other services.

It includes analytics that not only report but use machine intelligence to anticipate problems or anomalous behavior.  And possibly includes protections for personally identifiable information (PII).

The attributes of the data protection cloud are that it manages the underlying cloud primitives, exploits cloud native technologies for performance, machine intelligence and lowest cost. It has a distributed metadata capability to track files, volumes and any organizational data, irrespective of location. And fundamentally enables sets of services to intelligently govern data, in a federated manner, while ensuring integrity. 

And it’s automated to help with the skills gap.

Connection to Cyber Recovery

As it relates to cyber recovery, air gapped solutions must be part of the portfolio but managed outside of the data protection cloud. In other words, the orchestration and management of the air gapped data must also be gapped and disconnected. 

This strategy is a cohort to and a complementary piece of cybersecurity regimes. But that is a complicated world and one in which technologies and processes can become messy. 

The Fragmented World of Cybersecurity

In other words, you don’t want your data protection strategy to get lost in this mess. 

This is a chart we often use to describe the complexity and sea of point products that has permeated the industry. So try to think about data protection strategy as a cohort or an overlay to your cybersecurity approach. Yes this may create some overheads and integration challenges, which is why you’ll likely need a partner.

We see the rise of MSPs and specialist service providers. Not the public cloud providers and not your technology arms dealer, but rather managed service providers that have intimate relationships with customers, understand their business and specialize in architecting solutions to these difficult challenges. 

Quantifying the Data Threat

A closer look at the risk factors that organizations face is a useful exercise. The chart below was shared with us by the Storage Alchemist. It’s based on a study that IBM funds with the Ponemon Institute, a firm that has researched things like the cost of breaches for years. 

The chart shows the total cost of a typical breach within each dot and on the vertical axis, and the frequency in percentage terms on the horizontal axis. 

The two most frequent types of breaches are compromised credentials and phishing, which once again proves bad user behavior trumps good security every time. 

The point here is that the adversary’s attack vectors are many. And specific technology companies specialize in solving these problems, often with point products which is why the slide we showed earlier from Optiv looks so cluttered.  

But this problem is top of mind today and that’s why we’ve seen the emergence of cyber recovery solutions from virtually all the major players. 

Zero Trust is a Mega Trend

Ransomware and the SolarWinds hack have made trust the #1 issue for CIOs and CISOs. 

We see major shifts in CISO spending patterns toward endpoint, identity and cloud. We see this in the ETR data and in the stock price momentum of disruptors like Okta, CrowdStrike and Zscaler. 

Cyber resilience is top of mind and robust solutions are required. Several companies, including Dell, IBM, Veeam and virtually every other major player, are building cyber recovery solutions. It’s common of course for backup and recovery vendors to focus their solutions on the backup corpus as that is often a prime hacker target.

We believe there is an opportunity to expand the scope from just the backup corpus to all data in a more comprehensive data management strategy. 

Many companies use a 3, 2, 1 or 3, 2, 1, 1 strategy. Three copies, 2 backups, 1 in the cloud and 1 air-gapped. This strategy can be extended to primary storage, copies, snaps, containers, data in motion, etc. 
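To make the rule concrete, here is a minimal sketch of a policy check that follows the wording above (three copies, two backups, one in the cloud, one air-gapped). The `DataCopy` type and `check_3_2_1_1` function are hypothetical names we made up for illustration, not any vendor’s API, and other formulations of the rule (for example, two media types and one offsite copy) would need different fields.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class DataCopy:
    """One copy of a data set (hypothetical model for illustration)."""
    name: str
    is_backup: bool = False
    in_cloud: bool = False
    air_gapped: bool = False


def check_3_2_1_1(copies: List[DataCopy]) -> Dict[str, bool]:
    """Evaluate each part of the rule as worded in the text."""
    return {
        "three_copies": len(copies) >= 3,
        "two_backups": sum(c.is_backup for c in copies) >= 2,
        "one_in_cloud": any(c.in_cloud for c in copies),
        "one_air_gapped": any(c.air_gapped for c in copies),
    }


# Example inventory: primary data plus a cloud backup and an air-gapped vault.
copies = [
    DataCopy("primary"),
    DataCopy("cloud-backup", is_backup=True, in_cloud=True),
    DataCopy("vault-backup", is_backup=True, air_gapped=True),
]

result = check_3_2_1_1(copies)
print(result)
print("Compliant:", all(result.values()))
```

The same check could in principle be run against primary storage, snaps or container volumes by describing each as a `DataCopy`, which is the kind of extension beyond the backup corpus discussed here.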

As we said earlier, many customers are increasingly looking to MSPs & specialists to help solve this problem due to skills gaps.

And the best practice is to physically and logically separate the orchestration and automation of the air gapped solution.

Sizing up Some of the Major Players

Let’s look at some of the ETR data on the competition. The chart below is a two-dimensional view with Net Score or spending velocity on the vertical axis and Market Share or pervasiveness in the ETR data set on the horizontal axis. Market Share is an indicator of response presence in the data, not revenue share. 

This chart is a cut of the storage sector in the ETR taxonomy and isolates pure-play backup and recovery / data protection vendors. The 40% red line is our subjective view of excellence; in other words, anything over that line is considered elevated. 

Note that only Rubrik is above the 40% line. Also note that the red highlight shows the position of Rubrik and Cohesity in the January 2020 survey.

Veeam, although it’s below the 40% mark has been impressive and steady over the last several quarters and years. 

Commvault is moving steadily up. Sanjay Mirchandani is making moves and the Metallic offering appears to be driving cloud affinity within Commvault’s large customer base. The company is a good example of a legacy player evolving its strategy and staying relevant. 

Veritas continues to underperform relative to the other players in the ETR data set as does Barracuda. 

Large Portfolio Companies Also Play

The ETR taxonomy includes a total storage sector, not a backup and recovery view. So let’s add IBM and Dell to the chart, noting this comprises their entire respective storage portfolios, not just backup and recovery/data protection. 

In this view we’ve also inserted the data table that shows the actual Net Score and Shared N data that inform the plot position. While Rubrik and Cohesity, for example, have smaller Ns, we feel there is enough data to track trends over time.

Veeam is impressive. Its Net Score has always been respectable in the mid-to-high 30% range over the last several quarters and years. It has solid spending momentum and a consistent presence in the data.  

Simplivity has a small N but has improved its position relative to previous surveys.  

The Compute Renaissance

We now want to emphasize something we’ve been hitting on for quite some time now and that is the renaissance that’s coming in compute.

We’re all familiar with Moore’s Law, the doubling of transistor density every 18-24 months, which leads to a 2x performance boost in that timeframe. The blue line represents the x86 curve. The math averages out to around 40% per annum performance improvement as measured in trillions of operations per second. That figure is moderating for x86 and is now down to around 30% or so. 
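The conversion from a doubling period to an annual improvement rate is simple compounding. The sketch below shows the calculation; the function name is ours, and it assumes performance scales directly with the doubling period, as the paragraph above does.

```python
def annual_improvement(doubling_months: float) -> float:
    """Annual fractional improvement implied by doubling every `doubling_months`."""
    if doubling_months <= 0:
        raise ValueError("doubling_months must be positive")
    return 2 ** (12 / doubling_months) - 1


# The two ends of the 18-24 month range quoted in the text.
for months in (18, 24):
    rate = annual_improvement(months)
    print(f"{months}-month doubling: {rate:.1%} per year")
```

A 24-month doubling works out to about 41% per year, in line with the roughly 40% figure cited, while an 18-month doubling implies about 59% per year.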

The orange line represents the Arm ecosystem improvements calculated from Apple’s custom-designed A series chips, culminating most recently in the A15, which is the basis for the M1 chip that replaced Intel in Apple’s laptops.

That orange line is accelerating at a pace of more than 100% per annum when you include the processing power of the CPU, GPU, NPU and other alternative processors included in the chip.

The point is there’s a new performance improvement curve and it’s being led by the Arm ecosystem.
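To see why the gap between these curves matters, it helps to compound the annual rates over several years. The sketch below uses the approximate rates quoted above; the five-year horizon and the labels are our assumptions for illustration only.

```python
def compounded_gain(annual_rate: float, years: int) -> float:
    """Total performance multiple after compounding `annual_rate` for `years`."""
    return (1 + annual_rate) ** years


# Approximate annual rates from the text; 100% is a floor for the Arm curve.
curves = {
    "x86, historical (~40%)": 0.40,
    "x86, moderating (~30%)": 0.30,
    "Arm-based SoC (~100%)": 1.00,
}

horizon_years = 5  # assumed horizon, not from the source
for label, rate in curves.items():
    gain = compounded_gain(rate, horizon_years)
    print(f"{label}: {gain:.1f}x over {horizon_years} years")
```

Under these assumptions, five years at 100% per annum compounds to 32x, versus roughly 5x at 40% and under 4x at 30%.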

Data Protection Must Exploit Future Architectures

So what’s the tie to data protection? We’ll leave you with this chart below which shows Arm’s Confidential Compute Architecture. 

We believe this architecture is ushering in a new era of security and data protection using the concept of realms. 

Zero Trust is the new mandate and what realms do is create separation of vulnerable components by creating a physical bucket to deposit code and data, away from the OS. Remember, the OS is one of the most valuable entry points for hackers because it contains privileged access. It’s also a weak link because of things like memory leakages and other vulnerabilities. Malicious code can be placed by bad guys within data inside the OS and appear benign – even though it’s anything but. 

So in this architecture, all the OS does is create API calls to the realm controller – that’s the only interaction with the data, which makes it much harder for bad actors to get access to the code and data. And it’s an end-to-end architecture so there’s protection throughout. 

The link to data protection is that backup needs to be the most trusted of applications because it’s one of the most targeted areas in a cyber attack. Realms provide an end-to-end separation of data and code from the OS and are a better architectural construct to support zero trust and confidential computing in critical use cases like data protection/backup and other digital business applications. 

Our call to action is to backup software vendors: you can lead the charge. Arm is several years ahead at the moment in our view, so pay attention to that and use your relationships with Intel to accelerate its version of this architecture.

Or ideally, agree on common standards for the industry and solve this problem together. Pat Gelsinger told us on theCUBE that if it’s the last thing he’s going to do in his life, he’s going to solve this security problem. Well Pat, you don’t have to solve it yourself. You can’t and you know that. So while you’re going about your business saving Intel, look to partner with Arm to use these published APIs and push to collaborate on and open source an architecture that addresses the cyber problem.

If anyone can do it you can. 

Keep in Touch

Remember these episodes are all available as podcasts wherever you listen.

Email david.vellante@siliconangle.com | DM @dvellante on Twitter | Comment on our LinkedIn posts.

Also, check out this ETR Tutorial we created, which explains the spending methodology in more detail.

Image credit: TarikVision

Note: ETR is a separate company from Wikibon/SiliconANGLE.  If you would like to cite or republish any of the company’s data, or inquire about its services, please contact ETR at legal@etr.ai.
