Co-Authored with David Floyer and Ralph Finos
The Big Data market continued its maturation in 2014, experiencing both significant growth as measured by vendor revenue associated with the sale of Big Data products & services and increased adoption of Big Data tools and technologies by large enterprises across vertical markets.
For the calendar year 2014, the Big Data market – as measured by revenue associated with the sale of Big Data-related hardware, software and professional services – reached $27.36 billion, up from $19.6 billion in 2013. While growing significantly faster than other enterprise IT markets, the Big Data market’s overall growth rate slowed year-over-year from 60% in 2013 to 40% in 2014. This is to be expected in an emerging but quickly maturing market such as Big Data, and Wikibon does not believe this slightly slower growth rate indicates any structural market issues.
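The 40% growth figure follows directly from the two revenue totals quoted above. A minimal sketch of that arithmetic, assuming only the $19.6 billion and $27.36 billion figures from the text (the function name `yoy_growth` is illustrative, not part of Wikibon's model):

```python
def yoy_growth(previous: float, current: float) -> float:
    """Return year-over-year growth as a fraction (0.40 means 40%)."""
    return (current - previous) / previous


# Revenue totals in $ billions, taken from the paragraph above.
revenue_2013 = 19.6
revenue_2014 = 27.36

growth_2014 = yoy_growth(revenue_2013, revenue_2014)
print(f"2014 growth: {growth_2014:.1%}")  # about 39.6%, reported as 40%
```

The same function can be applied to any pair of consecutive yearly totals in the forecast.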
Wikibon also extended its Big Data market forecast from 2017 through 2026 (Figure 1). Wikibon expects the Big Data market to top $84 billion in 2026, which represents a 17% compound annual growth rate over the 15-year period beginning in 2011, the first year Wikibon sized the Big Data market. After a several-year period of intense growth, Big Data market growth will slow considerably in the outlying years of the forecast. This growth pattern, represented by the well-known Ogive curve, is common to disruptive technology markets as they mature.
Below is a breakdown of the Big Data market by hardware, software and services (Figure 2). Note that Wikibon expects this breakdown to shift dramatically over the outlying years of the forecast. Specifically, Wikibon expects a significant shift in revenue from professional services to software.
Wikibon further breaks down its Big Data market forecast by segment (Figure 3).
Below is further analysis of the various sub-segments making up the larger Big Data market.
Hardware. Hardware (a roll-up of compute, storage and networking) represents 37% of the total Big Data market revenue generated in 2014. A fundamental premise of Big Data is to leverage a linear scale-out (versus scale-up) approach to the underlying hardware architecture supporting Big Data storage, processing and analysis. As such, the majority of hardware-related revenue in the Big Data market is associated with the sale of commodity servers with direct attached storage to support scale-out Hadoop and MPP analytics database clusters. However, interest in virtualizing Big Data deployments to create private cloud environments began to emerge from very large enterprise practitioners, particularly in the financial services sector. Wikibon expects this interest to continue in 2015, with early adopter Hadoop and other Big Data practitioners looking to leverage the elasticity of the public cloud, but within the confines of their own data centers.
Vendors include: IBM, HP, Dell, Cisco, Intel
Data Management. Data management software includes tooling such as data integration, data transformation, data quality and data governance software as applied to Big Data workloads. The data management market segment reached $930 million in 2014. This segment benefits from the emergence of the concept of the data lake as a foundational approach to Big Data platform deployments. Data management tools are used to “fill” the lake (data integration), transform data into an appropriate format for further analysis (data transformation), ensure the veracity of the data (data quality), and devise and apply governance and compliance policies (data governance). Data management tools are critical to ensuring that data lakes don’t turn into data swamps. Wikibon expects the data management software segment of the Big Data market to grow significantly as Big Data pilot projects and proofs of concept mature into production deployments supporting operational analytic applications, with regulated industries (financial services and healthcare) leading the way in terms of overall investment.
Vendors include: IBM, SAP, Informatica, Oracle, Red Hat, Trifacta, Paxata
Hadoop. The Hadoop software market segment reached $190 million in 2014. This segment includes Hadoop distribution software revenue and associated Hadoop monitoring and management software (note: for the purposes of the Big Data forecast, Wikibon does not include cloud-based Hadoop software in the Hadoop sub-segment but in the cloud services sub-segment.) Hadoop is a core foundational technology that enables large-scale data storage, processing and, ultimately, analytics. For a detailed analysis of this market segment, please see Hadoop-NoSQL Software and Services Market Forecast, 2014-2017.
Vendors include: IBM, Cloudera, Hortonworks, MapR
NoSQL. The NoSQL software segment, which includes revenue associated with NoSQL database management software as well as the core database software itself, topped $220 million in 2014 (note: for the purposes of the Big Data forecast, Wikibon does not include cloud-based NoSQL software in the NoSQL sub-segment but in the cloud services sub-segment). Like Hadoop, NoSQL database technologies are critical components of Big Data workloads, enabling operational applications across large volumes of multi-structured data. For a detailed analysis of this market segment, please see Hadoop-NoSQL Software and Services Market Forecast, 2014-2017.
Vendors include: Amazon, MarkLogic, DataStax, MongoDB, Couchbase, Basho, Aerospike
SQL. Despite the emphasis on new approaches to data analytics associated with Big Data (namely Hadoop), SQL-based relational technologies also play an important role. This includes both relational database management systems, which both feed Big Data technologies such as Hadoop and in some cases are used to operationalize Big Data insights, as well as data warehouses that often are used as environments to blend Big Data-driven insights with more traditional data and reporting. According to Wikibon’s 2014-2015 Big Data Adoption Survey, fully 51% of Big Data early adopters include data warehousing technology as part of their deployments. There is also a big emphasis on applying SQL techniques to Hadoop, with all the Hadoop distribution vendors developing SQL-on-Hadoop technologies. Wikibon expects the Hadoop and SQL segments of the Big Data market to increasingly overlap in the coming years. For 2014, the SQL sub-segment of the Big Data market reached $1.95 billion.
Vendors include: IBM, HP, SAP, Teradata, Pivotal, Oracle, Microsoft, Amazon
Applications & Analytics. This market segment includes data science, analytics, business intelligence and data visualization tools and applications as well as packaged analytics-focused applications. The applications and analytics segment of the Big Data market reached $2.08 billion in 2014. The vast majority of that revenue is associated with business intelligence and data visualization tooling as opposed to packaged, operational Big Data applications. Wikibon believes the latter, however, will have a bigger impact on value delivered from Big Data than business intelligence. Applications that automate business processes based on Big Data insights – in other words, operationalized Big Data – hold significant promise to transform a number of vertical markets. But maturation of this space will lag other parts of the Big Data market due to the complexity of the technology (artificial intelligence, machine learning, cognitive computing) involved. However, over time Wikibon expects this segment of the market to grow significantly as vendors mature packaged application offerings that leverage and deliver the results of advanced analytics, data science and cognitive computing to end-users. This growth will come at the expense of professional services, which will become less critical to Big Data projects as software matures.
Vendors include: IBM, SAP, Palantir, SAS Institute, Splunk, Tableau Software, Qlik, Tresata
Professional Services. The professional services segment is the single largest segment of the Big Data market, representing 43% of all revenue generated in 2014 and topping $10.4 billion. Professional services play a crucial role in helping Big Data practitioners efficiently and effectively apply the technology to real-world business problems. This includes identifying initial use cases, designing and deploying the supporting infrastructure, architecting data flows and transformations to enable so-called data lakes, practicing analytics and data science to derive insights from Big Data, and services to operationalize Big Data as applied to specific business challenges. The market is made up of thousands of small and mid-sized SIs, consultancies and professional services firms, with the large SIs beginning to staff up their own Big Data practices. While the Big Data professional services market makes up the largest slice of the overall Big Data market today, Wikibon expects professional services to become less critical to Big Data projects over the long term (5-10+ years out) as software (both platforms and applications) matures, making Big Data more accessible to less sophisticated practitioners and organizations without the assistance of armies of consultants.
Vendors include: IBM, Accenture, Cap Gemini, Deloitte, Think Big (Teradata)
Research Methodology
Regarding methodology, the Big Data market size, forecast, and related market-share data was determined based on extensive research of public revenue figures, media reports, interviews with vendors, venture capitalists and resellers regarding customer pipelines, product roadmaps, and feedback from the Wikibon community of IT practitioners.
Many vendors were not able or willing to provide exact figures regarding their Big Data revenue, and because many of the vendors are privately held, Wikibon had to triangulate many types of information to determine its final figures. We also held extensive discussions with former employees of Big Data companies to further calibrate our models.
Information types used to estimate revenue of private Big Data vendors included supply-side data collection, number of employees, number of customers, size of average customer engagement, amount of venture capital raised, and age of vendor.
We also conducted extensive albeit somewhat bespoke demand-side research from three sources:
- Web-based surveys of Big Data practitioners;
- Many dozens of in-depth interviews (IDIs of approximately 1 hour in length) with both Big Data technical and business practitioners; and
- Hundreds of shorter interviews on theCUBE with Big Data practitioners, executives, buyers and strategists.
Finally, we also talked to channel participants to get a real-world sense of market momentum.
Regarding the forecast, we cite the following methodological approaches:
- We began conducting this research in 2012 for the years 2011 through 2016. This report is our fourth pass at quantifying the Big Data market;
- We used the supply-side research and consequent demand-side calibration to set a baseline of the market size.
- We then built a model broken down by vendor and by major area of emphasis (hardware, software and services, further broken into professional services, apps and analytics, NoSQL, SQL, Hadoop, data management, networking, storage and compute).
- The model calibrates against publicly available financial data for high-level categories (e.g. a vendor’s services business) and is further calibrated by public statements regarding participation in various Big Data segments. Importantly, we endeavor to map any such statements into our definitions.
- From this baseline we perform a bottom-up analysis of each component part of the market and view the history of that individual component. We then map this historical data to an Ogive curve (i.e. a cumulative normal distribution) for each element. Each Ogive curve is defined by the mean, standard deviation and max value of each individual component, where the mean is the start of the time series, the standard deviation is the normal description of the curve and the max is the endpoint of the curve.
- Finally, in 2015, we chose to extend our five-year forecast view to a ten-year window because we feel there are significant market forces, particularly in the software and application area, that will contribute to volume and affect share shifts in the market. By extending our window we are able to better describe these trends.
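The Ogive mapping described in the list above can be pictured with a short sketch. This is an illustration only, assuming the curve is a cumulative normal distribution scaled by a maximum value. The function name `ogive`, the parameter values, and the choice to treat the mean as the year in which a component reaches half of its maximum are all assumptions made for the example, not Wikibon's actual model or figures.

```python
import math


def ogive(year: float, mean: float, std_dev: float, max_value: float) -> float:
    """Scaled cumulative normal curve (S-curve).

    Returns the modeled value for `year`, rising from near 0 toward
    `max_value`, and equal to half of `max_value` when year == mean.
    """
    z = (year - mean) / (std_dev * math.sqrt(2.0))
    return max_value * 0.5 * (1.0 + math.erf(z))


# Hypothetical parameters for one market component (illustrative only).
MEAN_YEAR = 2018.0   # assumed midpoint of the curve
STD_DEV = 3.0        # assumed spread, in years
MAX_VALUE = 10.0     # assumed saturation level, in $ billions

for year in range(2011, 2027, 3):
    value = ogive(year, MEAN_YEAR, STD_DEV, MAX_VALUE)
    print(f"{year}: {value:.2f}")
```

A curve of this shape grows quickly around its midpoint and flattens as it nears the maximum, which matches the slowing growth the forecast describes for the outlying years.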
Big Data Definitions
It is critically important to understand how Wikibon defines Big Data as it relates to the market size overall and to revenue estimates for specific vendors in particular. Wikibon’s definition of Big Data contains two equally important parts.
First, from a technology perspective, Wikibon defines Big Data as those data sets whose size, type, and speed-of-creation make them impractical to process and analyze with traditional database technologies and related tools in a cost- or time-effective way.
Second, Wikibon believes Big Data requires practitioners to embrace an exploratory and experimental mindset regarding data and analytics, one that replaces gut instinct with data-driven decision-making, and exchanges stubbornness for a willingness to question long-held assumptions. Projects whose processes are informed by this mindset meet Wikibon’s definition of Big Data, even in cases where some of the tools and technology involved may not.
Based on the above definition, Wikibon includes the following products and services under the umbrella of Big Data:
- Hadoop software and related hardware and services;
- NoSQL database software and related hardware and services;
- Next-generation data warehouses/analytic database software and related hardware and services;
- Non-Hadoop Big Data platforms, software, and related hardware and services;
- In-memory – both DRAM and flash – databases as applied to Big Data workloads;
- Data integration and data quality platforms, tools and services as applied to Big Data deployments;
- Advanced analytics and data science platforms, tools and services;
- Application development platforms, tools and services as applied to Big Data use cases;
- Business intelligence and data visualization platforms, tools and services as applied to Big Data use cases;
- Analytic and transactional applications and services as applied to Big Data use cases;
- Other Big Data support, training, and professional services.
Big Data Revenue By Vendor
As part of its market-sizing efforts, Wikibon tracked and/or modeled the 2014 Big Data revenue of more than 60 vendors. This list includes both Big Data pure-plays – those vendors that derive close to if not all their revenue from the sale of Big Data products and services – and vendors for whom Big Data sales is just one of multiple revenue streams.
Vendor | 2014 Big Data Revenue ($ millions) | % Big Data Hardware Revenue | % Big Data Software Revenue | % Big Data Services Revenue |
IBM | $1,601 | 26% | 35% | 39% |
HP | $932 | 43% | 14% | 43% |
SAP | $923 | 0% | 79% | 21% |
Teradata | $687 | 29% | 40% | 31% |
Dell | $685 | 85% | 0% | 15% |
Palantir | $544 | 0% | 35% | 65% |
SAS Institute | $533 | 0% | 67% | 33% |
Microsoft | $532 | 0% | 70% | 30% |
Accenture | $498 | 0% | 0% | 100% |
Oracle | $493 | 29% | 40% | 31% |
Splunk | $451 | 0% | 74% | 26% |
Amazon | $440 | 0% | 0% | 100% |
PwC | $406 | 0% | 0% | 100% |
Deloitte | $375 | 0% | 0% | 100% |
Informatica | $353 | 0% | 87% | 13% |
Cisco Systems | $321 | 85% | 0% | 15% |
EMC | $315 | 71% | 0% | 29% |
Intel | $268 | 81% | 4% | 15% |
$225 | 0% | 0% | 100% | |
Mu Sigma | $225 | 0% | 0% | 100% |
CSC | $210 | 0% | 0% | 100% |
Microstrategy | $192 | 0% | 75% | 25% |
NetApp | $184 | 73% | 0% | 27% |
Red Hat | $169 | 0% | 74% | 26% |
Pivotal | $159 | 0% | 77% | 23% |
Cap Gemini | $145 | 0% | 0% | 100% |
Opera Solutions | $130 | 0% | 0% | 100% |
TCS | $124 | 0% | 0% | 100% |
VMware | $106 | 0% | 79% | 21% |
MarkLogic | $102 | 0% | 80% | 20% |
Qlik | $101 | 0% | 89% | 11% |
Rackspace | $95 | 0% | 0% | 100% |
Actian | $94 | 0% | 89% | 11% |
Cloudera | $91 | 0% | 53% | 47% |
Tableau Software | $90 | 0% | 89% | 11% |
DDN | $86 | 84% | 0% | 16% |
TIBCO | $64 | 0% | 67% | 33% |
Guavus | $62 | 0% | 68% | 32% |
Alteryx | $55 | 0% | 87% | 13% |
1010data | $53 | 0% | 89% | 11% |
Hortonworks | $43 | 0% | 63% | 37% |
MapR | $42 | 0% | 83% | 17% |
Syncsort | $35 | 0% | 86% | 14% |
MongoDB | $35 | 0% | 71% | 29% |
DataStax | $34 | 0% | 76% | 24% |
Attivio | $32 | 0% | 63% | 38% |
GoodData | $27 | 0% | 78% | 22% |
Fractal Analytics | $25 | 0% | 0% | 100% |
Datameer | $25 | 0% | 80% | 20% |
Sumo Logic | $25 | 0% | 0% | 100% |
Talend | $25 | 0% | 68% | 32% |
Attunity | $24 | 0% | 83% | 17% |
Pentaho | $24 | 0% | 79% | 21% |
Couchbase | $18 | 0% | 78% | 22% |
SiSense | $15 | 0% | 67% | 33% |
Basho | $14 | 0% | 79% | 21% |
Aerospike | $13 | 0% | 85% | 15% |
Neo Technology | $13 | 0% | 85% | 15% |
Revolution Analytics | $12 | 0% | 67% | 33% |
Think Big Analytics | $12 | 0% | 0% | 100% |
Digital Reasoning | $12 | 0% | 67% | 33% |
Paxata | $11 | 0% | 82% | 18% |
Tresata | $12 | 0% | 83% | 17% |
Trifacta | $10 | 0% | 90% | 10% |
ODM | $5,814 | 100% | 0% | 0% |
Other | $7,891 | 22% | 6% | 72% |
Total | $27,361 | 37% | 20% | 43% |
Big Data Market Drivers and Headwinds
There were a number of factors driving growth of the Big Data market in 2014. They include:
- The continued explosion of multi-structured data at ever-increasing speed. This factor should go without saying, but predictions that data volumes will grow exponentially year-over-year are bearing out. Driving this explosion of data is the ever-increasing datafication of inanimate objects via sensors, the phenomenon also known as the Internet of Things. From commercial devices such as health trackers and watches to industrial equipment such as power generators and jet engines, more and more previously offline objects and devices are being brought online. While the application of data science and analytics to the IoT is in its infancy, Wikibon believes IoT-based applications will drive significant societal value over time (See Defining and Sizing the Industrial Internet.)
- The emergence and maturation of data warehouse optimization as a definitive, initial Big Data use case applicable across vertical markets. Typical deployment scenarios involve off-loading data and specific data transformation workloads from existing enterprise data warehouses to less-expensive Hadoop environments. A by-product of this deployment scenario is the creation of an enterprise data lake. As such, this use case has the advantage of an easily calculable return on investment, the savings of which can then be used to fund revenue-generating Big Data analytics use cases supported by the now-existing enterprise data lake. A number of technology providers as well as professional services firms developed data warehouse optimization and data lake creation products and service practices in 2014, and Wikibon expects this use case to serve as the typical initial deployment scenario among new Big Data practitioners in 2015.
- The establishment of Big Data-driven decision making as a key strategic priority in board rooms and C-suites across vertical markets, but particularly in the financial services, retail, healthcare and telecommunications industries. In conversations with hundreds of Big Data decision-makers throughout 2014, it became clear to Wikibon that Big Data and the intelligent application of analytics to critical business practices have bubbled up to the highest levels of many large enterprises. Even for those enterprises that have yet to invest in Big Data technologies and services, the question is not if to do so but when and how.
There were also a number of factors inhibiting growth of the Big Data market in 2014. They include:
- The increasing appreciation of data management and data governance as must-have capabilities for Big Data analytics by enterprise Big Data practitioners. This is particularly the case in highly regulated industries, such as finance and healthcare, as well as in consumer-facing industries, namely retail, that recognize data privacy as a top consumer concern. While this appreciation will ultimately fuel further market growth, namely in the data management software sector, in 2014 such concerns caused some Big Data practitioners to postpone or reduce their investment in Big Data technologies and services in order to first get a better handle on the potential data governance and compliance implications of Big Data to their enterprises.
- The continued complexity of Big Data deployments. While the benefits of Big Data are becoming clear to the C-suite, how to de-risk Big Data deployments made up of many disparate technologies and how to increase the likelihood of successful deployments continue to cause concern for many potential Big Data practitioners. According to Wikibon’s 2014-2015 Big Data Adoption Survey, Big Data deployments usually involve multiple technologies, including modern approaches to data processing, storage and analytics such as Hadoop along with more traditional technologies including data integration tools, data warehouses, and business intelligence applications. In addition, Big Data is by its very nature a risky undertaking, with the benefits, though potentially transformational in nature, difficult to predict at the outset of projects.
- Adoption of Big Data technologies and services is concentrated in two areas. One area includes large ($1 billion plus in annual revenue) and particularly very large ($10 billion plus in annual revenue) enterprises. The second area includes born-data-driven web companies and start-ups. There was relatively little investment taking place in the small to mid-sized enterprise (SME) space in 2014. This type of adoption trend is not uncommon in emerging technology markets, with large companies that have the resources, vision and appetite for risk leading the way. Wikibon expects the SME market to slowly begin experimenting with Big Data technologies and services in 2015, but does not expect large-scale adoption by SMEs to pick up until the technologies themselves further mature and are easier to consume at less sophisticated enterprises.
Forecast
Wikibon believes the Big Data market will continue to enjoy significant growth relative to other IT markets over the next 11 years. As the market matures, however, growth will inevitably slow. Wikibon also expects an overall shift in the distribution of revenue between market sub-segments, with an overall trend of revenue moving from professional services to software (platforms/databases, analytics and packaged applications) in the long term (2020 and beyond). As software matures, making it easier for enterprises to deploy and manage Big Data platforms and making the insights of data science accessible to end-users through packaged applications that leverage cognitive computing capabilities, the requirement for large numbers of consultants and other third-party professional services will diminish.
Below (Figure 4) is Wikibon’s Big Data market forecast broken down by market component through 2020. As stated earlier, Wikibon expects the Big Data market to top $84 billion in 2026, a 17% compound annual growth rate over the 15-year period beginning in 2011.
Conclusion
The Big Data market has come a long way in just the four short years that Wikibon has been actively tracking and forecasting it. From just $7 billion in total revenue in 2011 to over $27 billion in 2014, the market, like the data itself, is exploding. Large enterprises across vertical markets are recognizing that Big Data is one of a handful of key competitive differentiators in the emerging digital economy and are aggressively investing in the related technologies and services required to tap its full potential. Wikibon expects this investment to continue apace, with the market forecast to grow by 29% in 2015, topping $35 billion. There are, however, a number of factors that could curtail market growth, namely practitioner concerns around security, governance, compliance and privacy. It is important for both practitioners and vendors to address these critical issues while continuing to develop the core skills and technology foundations necessary to derive maximum insight and value from Big Data.
Please visit http://premium.wikibon.com/big-data/ for complete access to Wikibon Big Data research, including more in-depth analysis of the opportunities and challenges facing practitioners as they look to harness the value of Big Data technologies and services.