
Oracle Exadata vs. Roll Your Own: Wikibon's Take

Executive Summary – Oracle Exadata vs. Roll Your Own (RYO)*

Premise

IT is labor intensive. Much of that labor is allocated to infrastructure management, which is an undifferentiated investment. Integrated systems like Oracle Exadata have changed the game by lowering operational costs by at least 30% relative to bespoke systems supporting Oracle databases. But historically this benefit has come at a higher acquisition cost. New research shows that nearly ten years of R&D have resulted in acquisition cost parity between Exadata and RYO* systems to support Oracle databases (see “Definition of Terms” in the paragraph below). This research explains why and what it means to IT professionals.

*Definition of Terms: “Roll your Own (RYO)” is defined in this research as an enterprise IT installation’s purchase of systems hardware and software components separately. Enterprise IT is responsible for the integration of the components during planning, installation, maintenance and upgrade of systems. In this research the components are assumed to be commodity (volume) x86 servers, SAN storage, and one or more separate networks. The original advantages were lower cost components and reduction of vendor lock-in. However, the increasingly important disadvantage is that every system is a unique combination of components at different service levels. As datacenters grow in size and complexity, the economics of volume, pre-integrated, pretested, and continuously updated Full Stack solutions win out against RYO solutions, which are always unique but have neither the degree of integration with the Oracle database nor the ability to improve functionality and cost across a large volume of instances.

Overview

Most IT infrastructure is still purchased in a bespoke fashion. Best-of-breed components are assembled by an integrator or end customer to support applications that drive business value. There are four main reasons buyers take this approach:

  1. It allows the flexibility to pick and choose server, storage and networking components;
  2. Acquisition costs have been historically lower because you can shop for the best deal on components;
  3. Many buyers feel it lowers lock-in and subsequent price gouging risk;
  4. Buyers are stove-piped and buying in piece parts reflects their organizational structure.

Does this thinking deliver the best deal for customers? Our research shows that it depends on workloads. The simpler and more diverse the workloads, the more attractive RYO systems become. But for Tier-1 systems of record running an Oracle Database, our research shows that the lifetime costs of traditional RYO approaches are more than 50% higher than those of Full Stack systems such as Exadata. In addition, our data suggests that Full Stack systems offer greater business value, increased availability, better security and faster time-to-value.

Exadata Full Stack vs. RYO Executive Summary Analysis
Figure 1 – Analysis of Operational Costs for Roll-your-Own Infrastructure vs. Full-stack System running Systems of Record Workloads.
Source © Wikibon 2018. Infrastructure costs exclude Database and Application License Costs. Operational Costs include infrastructure and database support.

Note: We differentiate Full Stack systems from other converged infrastructure (CI) in that they integrate technology components above the operating system including the database, middleware and often application layers. The Full Stack examples in this research are based solely on Exadata, not other Oracle Full Stack offerings which include applications.

Wikibon research shows that historically, the acquisition costs of converged systems have been about 25% higher than RYO. Today, however, Full Stack systems of integrated compute, storage, networking, networking software, infrastructure software, database and middleware have acquisition costs comparable to those of RYO infrastructure. At the same time, lifetime operational costs are much lower because of reduced IT labor complexity. In addition, responsibility for simplifying the installation and maintenance of these systems migrates from the data center to the vendor. As a business bonus, availability is improved, time-to-value is reduced, business project risk is reduced and maintaining security is simpler.

Figure 1 above shows a summary analysis of the 3-year costs of three main components for a Tier-1 System of Record workload using the Oracle Database. Those components are:

  • Infrastructure Acquisition Costs (bottom purple component, about equal);
  • Environmentals (middle blue component, RYO 20% higher than a Full Stack System);
  • Operational Support Costs (top orange component, over three times higher for RYO than a Full Stack System).

Note: The Operational Support Costs line includes installation, integration, maintenance (including security patching), system tuning and upgrades. This analysis excludes database license costs but does include the cost of integrating the database and middleware components. 

Overall, the Roll-your-Own (RYO) infrastructure has 53% higher costs over 3-years than the Full Stack System.

To many readers, the statement that RYO and Full Stack systems are comparably priced at acquisition will sound absurd. However, our research with customers shows Oracle’s Full Stack systems consistently deliver better utilization of resources, more efficient offload data handling, tighter software integration and a more integrated top-to-bottom architecture. These attributes offset a 25% higher price with 25% better throughput. Oracle has invested in R&D to lower the amount of resources required to achieve comparable performance relative to RYO systems. Customers in our research sample take the benefit by either spending less to get comparable performance or spending more and increasing application performance. Wikibon’s assessment is that 46% of the cost of RYO is operational costs, compared to 21% for Oracle Exadata systems.
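The figures quoted above can be cross-checked with some simple arithmetic. The following is a minimal sketch, not Wikibon's model: it normalizes the Full Stack 3-year total to 1.0 and treats acquisition cost as exactly equal on both sides (the text says "about equal"). Both choices are assumptions made here for illustration, and the function and variable names are hypothetical.

```python
# Back out the implied 3-year cost breakdown from the percentages quoted in the text.
# Assumptions (not from the source): Full Stack total is normalized to 1.0 and
# acquisition cost is treated as exactly equal for RYO and Full Stack.

FULL_STACK_TOTAL = 1.0        # normalization chosen for this sketch
RYO_TOTAL = 1.53              # "53% higher costs over 3-years"
FULL_STACK_OPS_SHARE = 0.21   # operational share of Full Stack total
RYO_OPS_SHARE = 0.46          # operational share of RYO total
ENV_RATIO = 1.20              # RYO environmentals "20% higher"


def implied_breakdown():
    """Return the cost components implied by the quoted figures."""
    fs_ops = FULL_STACK_OPS_SHARE * FULL_STACK_TOTAL
    ryo_ops = RYO_OPS_SHARE * RYO_TOTAL

    # What is left after operations is acquisition plus environmentals.
    fs_rest = FULL_STACK_TOTAL - fs_ops
    ryo_rest = RYO_TOTAL - ryo_ops

    # With equal acquisition A and Full Stack environmentals E:
    #   A + E = fs_rest   and   A + ENV_RATIO * E = ryo_rest
    fs_env = (ryo_rest - fs_rest) / (ENV_RATIO - 1.0)
    acquisition = fs_rest - fs_env

    return {
        "acquisition": acquisition,
        "full_stack_env": fs_env,
        "ryo_env": ENV_RATIO * fs_env,
        "full_stack_ops": fs_ops,
        "ryo_ops": ryo_ops,
        "ops_ratio": ryo_ops / fs_ops,
    }


if __name__ == "__main__":
    for name, value in implied_breakdown().items():
        print(f"{name}: {value:.3f}")
```

Under these assumptions the RYO operational cost comes out at roughly 3.35 times the Full Stack figure, which agrees with the "over three times higher" statement, and the remaining components stay positive and mutually consistent.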

Regardless of approach, most organizations report that they’ve been able to shift IT labor costs toward differentiable activities that drive competitive edge, such as DevOps, AI, IoT, analytics and other digital initiatives.

This research deals with the cost elements of the above argument. The discussion of business value (beyond IT cost savings) requires a much longer narrative and is dependent on a number of factors, including organization size and the value of the application portfolio. For simplicity, we’re isolating the infrastructure and operational costs in this research to test the assertion that “RYO is cheaper.”

Levels of Converged Systems

Wikibon research with customers over the past five years shows two key findings:

  1. Business value increases as integration levels incorporate more business data (e.g. database and application content).
  2. Infrastructure and operational costs decrease as the levels of integration increase. This is a direct function of reducing IT labor complexity and risk.

One of the earliest examples of so-called converged infrastructure (CI) came in 1984, in the form of a specialized database machine from Teradata. The modern era of CI began with Exadata in 2008. This system was originally based on HP infrastructure.  Shortly thereafter, with Oracle’s acquisition of Sun, it became an all-Oracle hardware and software solution — what Oracle now calls Engineered Systems. The following year, EMC, Cisco and VMware created Acadia (which eventually became VCE), and delivered Vblock. Others joined the market shortly thereafter, which is today a multi-billion dollar business.

In the early part of this decade, the term hyper-converged became common. This described a “software-led” converged system, where the functional elements of the infrastructure are implemented in software running on commodity hardware. Nutanix is an early example of this approach as are Pivot3, Simplivity (now HPE) and many others.

The difference between these types of converged systems can be substantial and customer value will vary based on a number of factors including:

  • The degree of virtualization and automation currently in place;
  • The maturity of existing processes;
  • Organizational size, complexity and skill levels;
  • The organizational structure and willingness to manage a single entity of compute, storage and networking, versus a set of bespoke components;
  • The degree to which the system is integrated with database, middleware and business applications, and the availability of integration skills;
  • The desired level of infrastructure asset leverage versus a greater business level integration.

These last two points are particularly relevant. Systems that are infrastructure-centric (i.e. include primarily compute, storage, networking and infrastructure management software) have the advantage that they can better serve as a horizontal layer across the application portfolio; in other words, they are more versatile across application vendors. Systems that include Full Stack technologies (e.g. database, middleware and applications) are narrower, but more deeply integrated into the business processes. Our research shows Full Stack technologies deliver greater value at a much lower cost of operational and integration support.

The Scope of Infrastructure for Oracle Workloads

Classic IT Infrastructure Management – Roll-your-own (RYO)

In our discussions with Oracle customers, many shops continue to favor purchasing best-of-breed bespoke x86 server, off-the-shelf storage and commodity network components (RYO) for the reasons cited above. The result is a unique set of hardware and infrastructure components that need to be integrated, tested, maintained, and sustained. This unique solution results in a wide variety of configurations and limits economies of scale.

Our research shows that Oracle workloads are often isolated, and run on specially tuned hardware that primarily supports Oracle systems. The reason for this choice is that the cost of the Oracle Database software is many times higher than the cost of the hardware. As a result, optimizing the system to run without virtualization, with much faster IO and much more processor memory, reduces the number of Oracle Database licenses required and allows RYO to be more competitive. Nonetheless, as Oracle matures its engineered systems approach, it becomes more challenging for RYO to keep up for Oracle-only workloads, especially for Tier-1 systems of record.

As such, we have encountered many customers that prefer to use RYO for mixed workloads, which often include their Oracle systems. This use case, in our view, is the more logical for RYO systems; especially in situations where non-Oracle workloads are of comparable or greater value than Oracle workloads within the shop.

Converged Infrastructure from Non-Oracle Vendors

Converged Systems integrate key hardware components (servers, SAN storage, & networking subsystems) and then bundle in added virtualization, operating system and infrastructure software. These are packaged, delivered, maintained and upgraded as a single managed entity, with the same vendor responsible for all aspects of the hardware and software. These systems may be tuned for Oracle workloads but the hardware and Oracle software are not engineered together to the degree that Oracle is capable of delivering.

Oracle has made a strategic decision to limit integration depth with its former hardware partners who have become competitors since the acquisition of Sun. This leaves competitors to do API-level, versus source code-level integrations. Oracle uses this to its marketing advantage with customers because it can demonstrate unique capabilities that it claims competitors can’t replicate in full (see examples in the Footnotes section below).

As indicated above, hyper-converged systems are another form of converged infrastructure that use software to deliver storage and networking services on a commodity server infrastructure. This approach attempts to replicate public cloud service offerings from the likes of Amazon, Facebook, Google and Microsoft. It has become popular in the marketplace for customers that want to try and replicate cloud-like attributes on premises. Like converged systems, hyper-converged systems on the market are not integrated through the full stack and require an integrator or customer to bear the responsibility of integrating database and middleware.

Oracle Exadata Full Stack Systems

As noted, the reference Full Stack system for this research is Oracle Exadata. The hardware and software components of this are:

  • Standard commodity Intel servers;
  • CPUs to accelerate data intensive workloads such as real-time analytics and AI;
  • High speed standard flash disks, using low-latency NVMe protocols with NVMe capable SSDs and built-in storage management software;
  • Standard high-capacity magnetic disks (HDD) with built-in software management;
  • High speed point-to-point InfiniBand networking, enabling RDMA protocol to allow direct low-latency communication between different nodes;
  • OVM (Oracle VM) as the low-overhead virtualization layer;
  • Oracle Linux as the OS;
  • Oracle Database as the Middleware;
  • Automation and Orchestration Infrastructure Management Software.

The standardization of this solution allows increased operational savings, since database integration is automated and little tuning is required. There is also improved availability and time to value over the life of the solution.

Oracle Exadata Drill-down

A list of features and functions Oracle touts for Exadata can be found in the Footnotes section at the end of this research. The assessment, financial analysis, and conclusions are based on a detailed analysis of the features and functions.

General Assessment of Exadata 

Oracle’s Exadata X7 offers significant offload capabilities, moving IO processing from the application servers to the storage subsystem. In addition, the latency for IO on the Exadata X7 has been reduced to 250 microseconds with flash caching. DRAM caching in the storage servers can reduce IO latency to about 100 microseconds. The low-latency and high-bandwidth InfiniBand networks within and between instances also significantly reduce IO and other application latencies. Lower latencies again reduce the amount of processing required in the application servers, especially for systems of record applications with high CPU wait times.
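The effect of the two cache tiers on average IO latency depends on how often reads are served from each tier. The sketch below is a simple weighted-average model, not an Oracle formula. The 100 and 250 microsecond figures come from the text; the hit fractions and the miss latency are hypothetical inputs the reader must supply.

```python
def expected_read_latency_us(dram_hit, flash_hit, miss_latency_us,
                             dram_us=100.0, flash_us=250.0):
    """Weighted-average read latency in microseconds for a two-tier cache.

    dram_hit / flash_hit are the fractions of reads served from storage-server
    DRAM and from flash cache. The remainder is assumed to miss both tiers and
    cost miss_latency_us (a hypothetical value, e.g. a disk read).
    """
    if not (0.0 <= dram_hit <= 1.0 and 0.0 <= flash_hit <= 1.0):
        raise ValueError("hit fractions must be between 0 and 1")
    if dram_hit + flash_hit > 1.0:
        raise ValueError("hit fractions must not sum to more than 1")

    miss = 1.0 - dram_hit - flash_hit
    return dram_hit * dram_us + flash_hit * flash_us + miss * miss_latency_us


if __name__ == "__main__":
    # Hypothetical workload: half of reads hit DRAM, half hit flash, no misses.
    print(expected_read_latency_us(0.5, 0.5, miss_latency_us=5000.0))
```

The model makes the dependence on locality of reference explicit: as the miss fraction grows, the average is quickly dominated by the slowest tier.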

Exadata has the ability to significantly speed up analytic systems so that results are available in real time and can then be applied to the systems of record. Exadata X7 has the bandwidth, low-latency IO and future GPU support to enable real-time analytics, and enable Systems of Intelligence.

Most of the functions listed represent an optimization between the Oracle software and the specific Exadata hardware and firmware. This optimization is the reason the infrastructure costs are now similar between RYO systems and Full Stack systems. Full Stack systems run more efficiently and at higher utilization rates. Consequently, this reduces the amount of time processors are waiting to do work. All processors wait at the same speed.

Summary Conclusions

Most of the functions listed above represent an optimization between the Oracle software and the specific Exadata hardware and firmware. In our assessment, this functionality yields a 25% improvement in performance and throughput, offsetting roughly 25% higher cost of acquisition, relative to RYO.
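The offset described above is a ratio calculation: a system that costs more per unit but delivers proportionally more throughput per unit needs fewer units for the same work. A minimal sketch follows, with a hypothetical function name and under the simplifying assumption that cost scales linearly with the capacity purchased.

```python
def relative_cost_for_equal_throughput(price_premium, throughput_gain):
    """Cost of the premium system relative to the baseline for the same work.

    price_premium and throughput_gain are fractions, e.g. 0.25 for 25%.
    Assumes cost scales linearly with purchased capacity (a simplification).
    """
    if price_premium <= -1.0 or throughput_gain <= -1.0:
        raise ValueError("inputs must be greater than -1")
    return (1.0 + price_premium) / (1.0 + throughput_gain)


if __name__ == "__main__":
    # The figures quoted in the text: 25% higher price, 25% better throughput.
    print(relative_cost_for_equal_throughput(0.25, 0.25))
```

With the 25% and 25% figures from the text the ratio is 1.0, which is the acquisition cost parity claim. A smaller throughput gain for a given workload would leave a residual premium.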

There are two bottom line conclusions:

  1. The higher cost of Exadata technology is offset by needing less of it. The Exadata optimizations are the reason the infrastructure acquisition costs are now similar for RYO systems and Full Stack systems. Full Stack systems run more efficiently and at higher utilization rates. Consequently, this reduces the amount of time processors are waiting to do work. All processors wait at the same speed.
  2. The combination of Exadata and Oracle Database automates two thirds of the operational work that still has to be done in RYO systems. Forty-six percent (46%) of the cost of RYO is operational costs, compared to 21% for Exadata systems. This conclusion is analyzed in more detail in the Financial sections below.

Wikibon expects that Oracle’s volume strategy of using Exadata technologies in the Oracle Cloud, Oracle Cloud at Customer, and Exadata True Private Clouds will increase the level of automation over the next few years, and increase the operational cost differences between RYO and Full Stack systems. Oracle is in a better position to introduce operational AI for the Red Full Stack than other system providers.

Detailed Financial Analysis Results

The methodology behind the financial analysis is detailed in Footnotes 1 below. The technical details of the Exadata functions that are evaluated in the financial analysis are detailed in Footnotes 2 below. Figure 2 below shows additional detail for the infrastructure costs that are shown in Figure 1, i.e. the totals are the same but Figure 2 has more detail. A traditional unique Roll-your-Own system demonstrates more than 50% higher costs compared to a Full Stack system.

Overall, the infrastructure server costs are higher in the Full Stack solution, because the storage and networking services require server resources that previously would be supplied as embedded servers in the storage and networking appliances. Storage and networking costs decrease as storage and networking become software-led in the Full Stack solution. The infrastructure software is 25-30% more expensive in RYO, specifically because storage and networking software on traditional proprietary appliances is more expensive. Again, Figure 2 shows clearly that forty-six percent (46%) of the cost of RYO is operational costs, compared to 21% for Exadata systems.

Note: The workload model is Tier-1 stateful systems of record running Oracle.

Detailed Analysis RYO Infrastructure vs. Exadata
Figure 2 – Analysis of Full Operational Costs for unique Roll-your-Own Infrastructure vs. Full-stack System running Systems of Record Workloads. 
Source © Wikibon 2018

Future Trends

The end of the cloud era is here. Virtually all organizations want to experience cloud-like simplicity and economics regardless of where their data resides. This means wherever possible, replicating cloud functions and capabilities on-prem as well as in the public cloud. Increasingly technology suppliers are finding ways to substantially mimic the cloud experience regardless of physical location. The degree to which this is possible depends on a number of factors and will vary widely by vendor, platform and level of integration.

The world is moving beyond the cloud era. Every decade or so the names change and the center of industry gravity evolves. We’re moving to a digital world where horizontal technology services become the backbone for delivering consumer-like capabilities for customers. As artificial intelligence becomes operationalized and cloud economics become commonplace at scale, integration will increasingly be seen at the application level. Enterprises will, we believe, increasingly shift resources from infrastructure management into these emerging areas.

Oracle potentially makes this shifting of Tier-1 Oracle Database application resources easier by offering different deployment models for Exadata: on-premises, in the Oracle Cloud with Exadata Cloud Service, and Exadata Cloud at Customer. Data, dev/test, applications, etc. can be moved between these environments with no changes and little testing. The deployment of thousands of closely related engineered systems across these environments can lead to sufficient volume for significant investments in AI operational automation and security advances.

Wikibon has predicted that over the next ten years, more than $150B will shift from non-differentiated IT infrastructure management into these new areas of growth.

Overall Conclusions

The analysis shows clearly that Roll-your-Own systems have about the same acquisition costs as Full Stack systems, and have much higher operational support, database integration, and application integration costs (3 times higher). This is especially true for Tier-1 class workloads. Figure 2 above shows that server costs will be lower on RYO, as they are not running the storage and networking software. However, the additional separate SAN and network hardware costs make up the difference and offset the advantage presumed with RYO. As is almost always the case with total lifetime cost, the key difference is in operational support expenses, which are over three times higher on RYO compared to Full Stack Systems.

In addition, Wikibon believes that the move to standardized systems will enable Oracle to reduce its own support costs through increased volume, and improve the operational automation through artificial intelligence methods with data from an increasing number of customers. These higher availability, greater flexibility, faster time-to-market, and risk reduction factors will be analyzed in future Wikibon research.

Bottom Line: Our research shows that for Oracle-heavy workloads, Exadata’s acquisition costs are achieving parity with RYO systems. For these workloads, Exadata is almost always worth the investment and in our view offsets the benefits of RYO systems in most cases. The combination of Exadata and Oracle Database automates two thirds of the operational work that still has to be done in RYO systems. 46% of the cost of RYO is operational costs, compared to 21% for Exadata Full Stack systems. This conclusion is analyzed in more detail in the Financial Methodology section below.

Exceptions occur when the infrastructure is supporting both Oracle and non-Oracle workloads and the Oracle workloads contribute a minority of the value to the portfolio. Wikibon strongly recommends that enterprise IT should aggressively move the majority of its Oracle workloads to Full Stack systems and manage lock-in risk through better negotiation strategies.

Action Item

Wikibon strongly recommends that enterprise IT should not use traditional RYO infrastructure for Tier-1 Oracle workloads. Instead, these workloads should be aggressively moved to Full Stack systems, with a single throat to choke for the total system from hardware through database.

Footnotes 1: Financial Methodology

The data in this report is based on five years of researching converged systems with input from over 300 customers, including:

  • Forty (40) in-depth interviews (IDIs) with Exadata customers running exclusively Oracle workloads;
  • Approximately 175 IDIs with CI customers running non-Oracle hardware with both Oracle and non-Oracle workloads;
  • More than 100 interviews on theCUBE, SiliconANGLE Media’s digital TV production, with CI customers;
  • Extensive business value and economic modeling designed to evaluate the “before and after” conditions using normalization and other statistical techniques to quantify the value of systems across a variety of workloads, industries and company sizes.

The workload in this study is assumed Tier-1 class systems of record applications, including transaction processing, intense analytic processing, hybrid transactions with analytics and large data warehouse systems. The applications modeled in this study are running on Oracle Database and are stateful.

The baseline for this analysis is the unique Roll-your-Own approach, which optimizes each component of the solution. This analysis compares the same level of technology optimization (e.g., same level of flash storage) in each of the first four levels of convergence defined above. The focus of this analysis is the operational cost for purchase, deployment, maintenance, and operational support for a system of record over a 3-year period. The analysis is based on detailed Wikibon models developed over the last decade of financial analyses. The underlying components in our model for the unique RYO approach assume best-of-breed compute, network and storage components.

The full-stack system cost is based on Oracle’s Exadata infrastructure, and does not include the database and middleware. The savings in operational support includes the database and middleware. The savings are significantly higher the further up the stack the platform supports, and the greater the offload of service support to the vendor.

Note: Wikibon’s Economic Analysis Methodology differs from other methods in that it uses data from interviews and normalizes that data through modeling work. Most TCO or ROI studies capture data from customers and then simply report that data in some type of average or aggregate form. There are several flaws in this approach. Often the sample is not random; rather, it is provided by a vendor, which can “stack the deck” with customers showing the best results. In addition, the data may not be reflective of a specific workload, scope or organization’s size. Wikibon never relies solely on vendor-supplied customers or data. Rather, we ingest data from customers and calibrate our models to reflect real world use cases across a variety of applications, workloads, organizational sizes, business processes and other factors.

Footnotes 2: Oracle Exadata Drill-down

The following is a list of features and functions Oracle touts for Exadata. Note: Not all of these functions can be leveraged for all workloads, and some are available from competitors. The capabilities that differentiate the system from competitive converged platforms are those that involve cooperation between the database and other layers. The order reflects Wikibon’s view of potential value to systems of record.

  • Smart Flash Cache
    • Exadata automatically caches data into low-latency flash on every read. This significantly reduces the latency for reads found in cache. Effectiveness depends on the degree of locality of reference of the data. Potential to reduce IO latency to 250 microseconds.
  • DRAM Cache at the Storage Layer
    • Smart Flash Cache helps keep the warmest data in low-latency flash. It uses DRAM as an additional layer of caching in the storage layer servers. This can be 2.5 times faster than the Flash layer. Has potential to reduce latency to about 100 microseconds. Effectiveness depends on the degree of locality of reference of the data.
  • NVMe Flash Technology
    • NVMe flash SSDs are deployed in Exadata X7. These reduce latency and increase bandwidth. This technology is also available from, and has been adopted by, other storage providers.
  • Storage & Interconnect Networks based on InfiniBand.
    • Provides a point-to-point, high-bandwidth, low-latency network. Offloads network processing from application processors. Uses RDMA protocol for node-to-node communication.
  • Storage Indexes
    • Storage Indexes are automatically created by the storage systems. Can reduce the number of additional indices.
  • Smart Fusion Block Transfer – Fast Clustered OLTP
    • Exadata transfers database blocks across clustered nodes without waiting for completion of redo log writes. Can improve performance of OLTP applications running across clustered database nodes within Exadata.
  • Smart Flash Redo Logging
    • Smart Flash Redo logging reduces any delay from periodic spikes in latency from the SSD write mechanisms.
  • Bloom Filters
    • Bloom Filters can offload data filtration in SQL JOIN processing to Active Storage. More useful for data warehouse workloads.
  • Write-Back Flash Cache
    • Write-back Flash Cache can improve redo and large writes. Effectiveness depends on the degree of locality of reference of the data.
  • Columnar Flash Cache
    • Organizes HCC data into a pure columnar format. Can increase performance by up to 5X on queries that access data in the Flash Cache. Helps data warehouse workloads rather than row-oriented systems of record.
  • All-Flash Exadata
    • The all-flash Exadata can reduce latency, especially for workloads with low locality of reference. Reduces DBA effort. All-flash arrays are available from many other storage providers.
  • Active Storage – Cell Offloads 
    • Offloads are available within the Exadata X7. These include SQL, XML & JSON, RMAN Backup (BCT) Filtering, Data file vs. REDO I/O Segregation, Encryption/Decryption Offload, and Fast Data File Creation. Moves some work from application processors to storage processors.
  • Fast Node Death Detection
    • Can reduce detection time from about 30 seconds to a few seconds.
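To illustrate the general idea behind the Bloom Filters entry in the list above, the following is a minimal sketch of a Bloom-filter-assisted join. It is not Oracle's implementation: the class, the function names, the filter sizing and the hashing scheme are all hypothetical choices made for this example. The point is that a small probabilistic filter built from the join keys can be evaluated next to the data, so that most non-matching rows never travel to the database server.

```python
import hashlib


class BloomFilter:
    """Small Bloom filter: may return false positives, never false negatives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key):
        # Derive several bit positions per key by salting a SHA-256 hash.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key):
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(key))


def storage_side_filter(fact_rows, key, bloom):
    """Stand-in for the storage tier: drop rows that cannot possibly join."""
    return [row for row in fact_rows if bloom.might_contain(row[key])]


def bloom_join(dim_rows, fact_rows, key):
    """Join fact rows to dimension rows, pre-filtering with a Bloom filter."""
    bloom = BloomFilter()
    for dim in dim_rows:
        bloom.add(dim[key])

    candidates = storage_side_filter(fact_rows, key, bloom)

    # The exact join on the database side removes any false positives.
    dim_by_key = {dim[key]: dim for dim in dim_rows}
    return [{**fact, **dim_by_key[fact[key]]}
            for fact in candidates if fact[key] in dim_by_key]


if __name__ == "__main__":
    dims = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]
    facts = [{"id": i, "amount": i * 10} for i in range(100)]
    print(bloom_join(dims, facts, "id"))
```

Because the filter can only produce false positives, the final exact match on the database side keeps the result correct while the pre-filter reduces the volume of rows transferred.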

 
