Premise
There is no consensus yet about the ultimate destination and persistence of data flowing from millions or billions of sensors. Because the sensors are distributed, the cost of transmitting everything to a central repository may seem prohibitive, leading to schemes that reduce and/or aggregate data at the edge. On the other hand, new and unforeseen applications may be waiting for that data.
Suppose your organization implements a very large sensor network that streams a great deal of data at very small intervals. For example, you produce cars and trucks (perhaps autonomous ones). That stream includes telemetry from cameras, radar, sonar, GPS and LIDAR, the latter alone about 70MB/sec. This could easily amount to four terabytes per day (per vehicle). How much of this data needs to be retained? After all, the location of a cat crossing the street isn’t very interesting a few minutes later. The initial point of all that data (and of the intelligence at the edge, on board) is to keep the vehicle running well and to keep it from crashing.
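To put rough numbers on that claim, the sketch below estimates daily per-vehicle volume from per-sensor data rates. Only the ~70MB/sec LIDAR figure comes from the text above; the other rates and the hours of operation are illustrative assumptions.

```python
# Back-of-envelope estimate of daily sensor data volume per vehicle.
# Only the LIDAR rate is taken from the text; the other rates and the
# hours of operation are illustrative assumptions.

SENSOR_RATES_MB_PER_SEC = {
    "lidar":   70.0,    # roughly 70 MB/sec, per the text
    "cameras": 40.0,    # assumed aggregate across several cameras
    "radar":    0.1,    # assumed
    "sonar":    0.01,   # assumed
    "gps":      0.001,  # assumed
}

HOURS_OF_OPERATION_PER_DAY = 8  # assumed for a commercial or fleet vehicle

def daily_volume_tb(rates_mb_s: dict, hours: float) -> float:
    """Total terabytes generated per day across all sensors (decimal units)."""
    seconds = hours * 3600
    total_mb = sum(rate * seconds for rate in rates_mb_s.values())
    return total_mb / 1_000_000  # MB -> TB

print(f"~{daily_volume_tb(SENSOR_RATES_MB_PER_SEC, HOURS_OF_OPERATION_PER_DAY):.1f} TB/day")
```

Under these assumptions the total lands at a few terabytes per day, in line with the figure cited above.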
Setting aside the question of whether data about stray animals in a neighborhood is worth retaining, the data kept beyond the instantaneous management of the vehicle could be reduced through compression or aggregation. Instead of four terabytes per day, the reduced volume (and urgency) of this type of data could easily be handled by a mid-range analytical platform, preferably in the cloud.
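As one illustration of the aggregation idea, here is a minimal sketch that downsamples a high-frequency telemetry stream into per-minute summaries before it leaves the vehicle. The column names (timestamp, speed_kph, lat, lon) and the use of pandas are assumptions for the sake of example, not a description of any particular vehicle platform.

```python
import pandas as pd

# Minimal sketch: reduce a high-frequency telemetry stream to per-minute
# aggregates before shipping it off the vehicle. The column names
# (timestamp, speed_kph, lat, lon) are hypothetical placeholders.

def aggregate_telemetry(raw: pd.DataFrame) -> pd.DataFrame:
    """Downsample raw readings (many per second) to one summary row per minute."""
    agg = (
        raw.set_index("timestamp")
           .resample("1min")
           .agg({"speed_kph": ["mean", "max", "count"], "lat": "last", "lon": "last"})
    )
    agg.columns = ["speed_mean", "speed_max", "samples", "lat_last", "lon_last"]
    return agg

# Example with synthetic 10 Hz readings: one hour shrinks from 36,000 rows to 60.
idx = pd.date_range("2017-01-01", periods=36_000, freq="100ms")
raw = pd.DataFrame({"timestamp": idx, "speed_kph": 50.0, "lat": 41.88, "lon": -87.63})
print(len(aggregate_telemetry(raw)))  # -> 60
```

The same pattern extends to richer aggregates (histograms, event counts, anomaly flags); the point is that orders-of-magnitude reduction is possible while preserving the signals a mid-range analytical platform would actually use.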
Thus, costs would suggest aggressive efforts to reduce data. But what about business quality or opportunity? Under what circumstances should businesses choose to keep larger volumes of data available for derivative purposes? Our research shows that keeping more data is better when:
- There is an “Informating” application for your IoT data.
Sometimes the data is more valuable than the activity being automated.
- You have the infrastructure to capitalize on it.
Business models capable of monetizing data will encourage keeping more data.
- You have an agile enough culture to launch something completely alien.
Realizing potential returns on data is highly sensitive to the right mix of skills, business processes, and strategic imagination.
Is There An “Informating” Application for Your IoT Data?
Shoshana Zuboff, in her landmark book, “In the Age of the Smart Machine: The Future of Work and Power,” coined the term “informating” to describe circumstances in which a technology implementation achieved its goals, but the data it generated turned out to be vastly more valuable through unforeseen applications. She discovered informating when studying the introduction of laser scanners at grocery store checkout counters (perhaps you can see the analogy to IoT here).
Grocery stores adopted scanners to speed up the checkout process, to eliminate the need to place price tags on every item, to manage inventory more effectively and even to adjust prices with a keystroke. But what the inventors didn’t foresee was the multi-billion-dollar industry of collecting the scanned data, syndicating it across many stores and reselling it at great profit to manufacturers, distributors, transportation companies and even back to the stores themselves.
Where are the syndicated data opportunities in a domain like autonomous vehicles? According to a 2016 study by AAA, the average American spends about 50 minutes per day driving. The only activities that get more time are sleeping and TV watching. What new business opportunities might be buried in that data? For insurance companies, smart cities, automobile and truck manufacturers, fleet operators, tire companies…the opportunities are there. It is here that reducing data at the edge may not be the best approach.
Do You Have the Infrastructure to Capitalize on It?
Originators create original content. Syndicators package that content for distribution, often integrating it with content from other originators. Distributors deliver the content to customers. A company can play one role in a syndication network, or it can play two or three roles simultaneously. It can also shift from one role to another over time.
Consider IMS Quintiles, the largest data syndicator in healthcare. By gathering data from originators, it provides the healthcare industry with analyses of sales-force effectiveness, compensation plans, the effectiveness of multiple distribution channels, market opportunity and penetration strategies, competitive positioning and marketing programs. IRI provides integrated big data, predictive analytics and forward-looking insights, all on a single technology platform, IRI Liquid Data®, to help CPG, over-the-counter health care, retail and media companies personalize their marketing and grow their businesses.
Obviously, Originators need to consider how to package the data they collect. Syndicators have a more complex, distributed architecture at scale, and Distributors have the most complex architecture and business.
Do You Have an Agile Enough Culture to Launch Something Completely Alien?
It takes more than a good idea to launch a successful business, especially when that business is tangential to your existing business. The simplest path is to consider whether there is a way to derive secondary value from the data you collect in your instrumented or IoT business. Is there a buyer for this data? Or can it be used internally to drive new insights beyond the original purpose of the sensors? This latter point is the central rationale for building a data lake: the data has already been used for something else, so you retain it on the assumption that it holds additional value if you can leverage it.
Even if you choose to discard IoT data, you can always turn the spigot back on. Unless you already have an informating application, it may be wise to base your data persistence decision on what you know today. In the meantime, try to envision opportunities either to create a business from the data or to engage with a syndicator that would be interested in buying it or partnering with you. Remember, companies like IMS Health, Information Resources, Acxiom and Comscore got rich on other companies’ data.
Action Item
Whether or not you choose to act on informating your data, be thorough in evaluating whether to store original sensor data. Even if you can meet your IoT requirements without persisting it, there may be opportunities down the line if you do.