Premise
Despite headline-grabbing statements about the death of the general-purpose SQL DBMS, it is not dying. Rather, Systems of Intelligence, created by Google, Amazon, Twitter, and others, have pioneered a new way of managing data. As in all platform transitions, the new technology doesn’t replace the old but instead augments it. In this case, these systems use a proliferation of specialized databases, each optimized for a specific workload.
This research note explains how and why enterprises can substitute specialized databases for SQL DBMSs in many workloads, after 30 years in which SQL DBMSs enjoyed a monopoly on enterprise data management.
In Systems of Intelligence, unlike traditional Systems of Record such as SAP, the master data about a customer, the customer profile, evolves and grows continually. The traditional SQL DBMS has had great difficulty accommodating that. MongoDB is an example of a database that stores these changing user profiles.
By examining the collective customer journey toward specialized databases, practitioners can determine how and when to make this choice for themselves.
It’s not just about technology
Despite the proliferation of specialized databases at consumer Internet services vendors, each optimized for a different workload, it’s not just the technology that determines the best choice. From a purely technical perspective, Oracle 12c can handle many of the new workloads.
The non-technical considerations are best summed up in Ray Ozzie’s description of how to go to market with software products for developers in the Web era: “discover, learn, try, buy, recommend.” Making a product dead simple to download, get up and running on a laptop, and then deploy to a server or cluster radically transforms everything about a product, not just its technology. Customers insist on a new approach to pricing; licensing; training and skill sets; and even how it fits into an organizational culture, which is often determined by the part of the organization driving the purchase decision.
But new technology requirements initiated reconsideration
Systems of Record and the SQL DBMS design center
While SQL databases aren’t dying, they are beginning to bleed from a thousand cuts. For 30 years they have been the foundation of Systems of Record, the heart of enterprise applications. In fact, the traditional SQL DBMS was designed specifically for these applications. So you have to compare Systems of Record with Systems of Intelligence to understand why the consumer Internet services vendors designed new databases.
Systems of Record such as SAP were designed with exquisite care over many years, typically a decade or more. They standardized business processes and the transactions that drive them. Their goal is essentially about stability, not change. As a result, vendors designed them to anticipate just about all needs in advance because changes can break so many things. In fact, SAP has in the neighborhood of 100,000 tables pre-designed in its database. But enterprises were essentially out of luck if they had to capture information that didn’t fit into those tables.
Traditional SQL DBMSs have great difficulty with evolution and flexibility. Even MySQL, the heart of the Web 2.0 LAMP (Linux, Apache, MySQL, PHP) stack at just about all consumer Internet vendors during the early 2000s, shows just how hard it was to change the structure of its data. Just adding a single field to a screen form, equivalent to adding a new column to a table, could reduce the server to thrashing and take 5-8 hours for several million records.
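A rough sketch of why such a change could be so expensive, assuming the copy-based approach that older MySQL releases generally used for ALTER TABLE: the server builds a new table with the extra column and copies every existing row into it, so the cost grows with the size of the table rather than the size of the change. SQLite stands in for MySQL here only so the example runs anywhere; the `profiles` table, the `favorite_game` column, and the helper function are hypothetical names chosen for illustration.

```python
import sqlite3


def add_column_by_rebuild(conn: sqlite3.Connection, column: str) -> int:
    """Add a TEXT column to `profiles` by copying every row into a new table.

    Returns the number of rows that had to be copied, which is the work the
    schema change costs. `column` is interpolated into the SQL text, so it
    must be a trusted identifier (this is a sketch, not production code).
    """
    rows = conn.execute("SELECT COUNT(*) FROM profiles").fetchone()[0]
    # 1. Create an empty table that has the new shape.
    conn.execute(
        f"CREATE TABLE profiles_new (id INTEGER PRIMARY KEY, name TEXT, {column} TEXT)"
    )
    # 2. Copy every existing row; the new column is left NULL.
    conn.execute("INSERT INTO profiles_new (id, name) SELECT id, name FROM profiles")
    # 3. Swap the new table in for the old one.
    conn.execute("DROP TABLE profiles")
    conn.execute("ALTER TABLE profiles_new RENAME TO profiles")
    conn.commit()
    return rows


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE profiles (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO profiles (name) VALUES (?)",
    [(f"user{i}",) for i in range(1000)],
)

copied = add_column_by_rebuild(conn, "favorite_game")
columns = [row[1] for row in conn.execute("PRAGMA table_info(profiles)")]
print(copied, columns)
```

Adding one column touched all 1,000 rows. Scale that to several million rows on early-2000s hardware and the hours-long outage described above becomes plausible.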
Systems of Intelligence and new design centers for databases
The consumer Internet services vendors created a new consensus about databases over the last 5-10 years. These Systems of Intelligence were continually evolving, keeping track of ever-changing information. In our example we’re going to look at user profiles and how new databases sprang up to address the need for scale and flexibility.
The core of Systems of Intelligence is their ability to anticipate and influence consumer interactions in real-time across distribution channels and touch points.
That capability requires profile data about those consumers’ interactions as well as ambient intelligence about what’s going on around them. In order to continually get smarter about the consumers, enterprises have to collect ever more data about them.
Unlike Systems of Record, there’s no way to anticipate all these additions. So the ability to handle constantly changing data becomes a critical requirement of the database.
MongoDB quickly became the anti-MySQL. It stores information in a JSON-like format, where each document is like a record in a SQL database. Rather than holding a fixed set of flat fields, as a row in a traditional SQL DBMS does, a document’s fields can themselves contain many other fields. What’s even more important is that each document can be different from every other document in the database. That means developers can add and remove fields as the application changes.
In reality, developers would keep some core set of fields common to all related documents. That makes it possible to retrieve them by looking up all profile documents belonging to players who made it to the top level in an online game, for example.
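The idea can be sketched with plain Python dictionaries standing in for JSON-like documents. This is an illustration of the data model only, not MongoDB’s API; the field names (`player_id`, `type`, `max_level`, and the rest) and the top level of 10 are assumptions made up for the example.

```python
# Each dict plays the role of one profile document. Only the core fields
# (player_id, type, max_level) are shared; everything else varies.
profiles = [
    {"player_id": 1, "type": "profile", "max_level": 10,
     "devices": {"phone": "android", "console": "ps4"}},  # nested fields
    {"player_id": 2, "type": "profile", "max_level": 3},
    {"player_id": 3, "type": "profile", "max_level": 10,
     "friends": [1, 2], "newsletter": True},
]

TOP_LEVEL = 10  # hypothetical highest level in the game


def players_at_top_level(docs, top_level=TOP_LEVEL):
    """Return the ids of profile documents whose player reached the top level.

    The lookup relies only on the shared core fields, so it keeps working
    no matter which optional fields an individual document carries.
    """
    return [
        doc["player_id"]
        for doc in docs
        if doc.get("type") == "profile" and doc.get("max_level") == top_level
    ]


# The application evolves: one document gains a field, another loses one.
# No other document is touched, and nothing has to be rebuilt.
profiles[1]["last_purchase"] = {"item": "shield", "price_usd": 1.99}
profiles[0].pop("devices")

top_players = players_at_top_level(profiles)
print(top_players)  # [1, 3]
```

In a real document database the loop would be replaced by a query on the shared field, typically backed by an index, but the design choice is the same: a small, stable core for retrieval and free-form fields around it.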
Action Item
The one-size-fits-all approach that suited the database industry so well in the era of Systems of Record is effectively unworkable for Systems of Intelligence. They have too many types of data to process (geeks call these data models). And they need both high-performance and high-capacity databases, where the design center for each is completely different.
The critical point is that the new workloads need databases that do a smaller number of things really well, really easily, and *really cheaply*. Wikibon has identified 10 different workloads and the types of databases required to support them. Exactly which vendors, how many databases, and how they need to be delivered depend on an enterprise’s internal skill sets and its need for optimal functionality balanced against more integrated simplicity.
The next post will review the non-technical pricing, licensing, skills, and cultural reasons driving database proliferation.