At Informatica World 2019 this week in Las Vegas, we all came expecting the sponsoring company to deepen its AI story, which revolves around a portfolio of technologies that it calls “CLAIRE.” That certainly happened, but it was predominantly through incremental enhancements to the AI-driven features that now permeate Informatica’s diversified solution portfolio.
AI drives insights that automate data pipeline processes and augment human decision making throughout Informatica LLC’s product architecture. For example, the company announced a new master data management solution that uses embedded AI to automate delivery of insights from cloud data.
There’s nothing surprising about the new Informatica MDM Reference 360. This new solution evolves the core functionality in the vendor’s long-established MDM portfolio, but primarily represents a rearchitecting of the underlying platform around microservices interfaces, in order to support more flexible, distributed deployment of this functionality in future multicloud environments.
However, this rearchitecting is little more than table stakes for this and every other data management vendor to deploy in the mesh clouds that will almost certainly dominate enterprise computing in the coming decade.
Public cloud partnerships advance Informatica in hybrid clouds
In the same vein, many observers expected Informatica to advance its hybrid-cloud value proposition through deepened partnerships with the leading public-cloud providers. It certainly obliged in that respect, clearly differentiating itself as a data management partner for enterprises that have committed to Amazon Web Services, Microsoft Azure or Google Cloud Platform. Wikibon was impressed that Informatica’s public-cloud partnership announcements addressed three distinct sets of hybrid-cloud use cases:
- Cloud data migration: Informatica and AWS announced an offering that leverages Informatica Enterprise Data Catalog to identify data assets for migration to AWS’ public cloud. In addition, Informatica Intelligent Cloud Services can now load more than one trillion transactions per month to Amazon Redshift, leveraging AWS’ concurrency scaling capabilities. This follows Informatica’s recent statement that it is seeing significant joint customer momentum with AWS in cloud data warehousing.
- Cloud data stewardship: Informatica and Microsoft announced support for Microsoft’s PowerBI, Common Data Model and Azure Data Lake Storage Gen2 within Informatica’s integration platform as a service. This helps joint customers to accelerate data-driven insights by leveraging Informatica CLAIRE’s metadata-driven AI across applications running in the Azure cloud. Specifically, the integration is designed to help customers use these insights to scalably optimize data quality, harmonize data semantics and manage data across multiple enterprise applications.
- Digital transformation: Informatica and Google announced deeper connectivity across the Informatica Intelligent Data Platform for joint customers pursuing digital transformation initiatives on Google Cloud Platform. Informatica expanded its platform’s integration with all key Google Cloud Platform data stores, including Google BigQuery, Google Cloud Storage and Marketing Analytics. It added support for Google Cloud Dataproc, a cloud-based managed Spark and Hadoop service. It optimized execution on Google Cloud throughout its portfolio. And it announced that the two vendors are collaborating on bringing Informatica Intelligent Cloud Services, its enterprise integration-platform-as-a-service offering, to Google Cloud Platform as managed multitenant services.
Informatica’s multicloud strategy is still under development
Existing Informatica customers that tie their hybrid cloud initiatives to one of these leading public clouds will benefit from the added functionality being delivered through these partnerships. However, enterprises that have committed to more complex multicloud and mesh cloud deployments may still find Informatica’s value proposition lacking.
At this week’s event, the vendor gave no clear statement on whether, when or how it will execute a product strategy to serve those more complex deployments. Nor did Informatica expressly commit to implementing greater consistency in public-cloud integrations across its solution portfolio.
In lieu of a detailed multicloud roadmap, Informatica’s customers will, for now, have to settle for the promising “hybrid integration platform” vision that the vendor spelled out in this statement. This document broadly mentions “support for complex multicloud and hybrid (on-premises and cloud) environments” as a key pillar in the vendor’s future target architecture.
Where agile multicloud support is concerned, Wikibon is encouraged by Informatica’s architectural focus on the following three pillars:
- Comprehensive catalog: Informatica is placing increased reliance on its Enterprise Data Catalog as a strategic platform for unifying the management of data, metadata, APIs, machine learning models, business rules and other key assets across multiclouds.
- Consolidated control: Informatica has built a unified experience and control plane for both technical and business users to connect, manage, monitor, secure and administer policy governing enterprise data and applications in clouds of all types.
- Contextual recommendations: Informatica has embedded AI-driven “intelligent recommendations” to automate the “next best actions” that boost the productivity of data integration, quality and governance professionals using its tools.
Informatica’s data-driven intelligent recommendations boost dataops productivity
In that latter regard, the most important announcement from this week’s event may be Informatica’s integration of its Enterprise Data Catalog with solutions from partners Microsoft Corp., Databricks Inc., DataRobot Inc. and Tableau Software Inc.
These add AI-driven features to the data catalog to support intelligent automation, crowdsourced curation, and team collaboration in support of hybrid and multicloud data management. The catalog has been engineered to deliver intelligent recommendations for many data DevOps functions, ranging from fast data discovery and self-service data exploration to lineage assessment and change-impact analysis across complex data pipelines.
In addition to supporting both streaming and batch data preparation in a Spark serverless environment for AI, machine learning and data science pipelines, Informatica’s Enterprise Data Catalog now uses AI to automate creation of a multicloud data “catalog of catalogs” that spans these pipelines. Laying the groundwork for complex multiclouds, the catalog now integrates with new metadata scanners for Delta Lake, an open source data-lake governance project from Databricks, and also for Microsoft Azure Data Lake Storage Gen2.
Here is what CEO Anil Chakravarthy had to say on theCUBE this week about the critical importance of the Enterprise Data Catalog in Informatica’s hybrid and multicloud strategy:
“From an operationalization perspective, what [data managers] need is, first of all, to help your data scientists and others find the right data… through the catalog. For example, it’ll tell you what data you can access. And then what’s the metadata around the data and what you can use the data for. So maybe there’s some data that you say, but we have the data set, but we don’t have the customer opt-in to use the data…. So that’s the first step finding the right data, then getting access to the data. That’s what you get to read.… Then you prepare the data. We have a number of tools to prepare the data to make sure that the AI and machine learning models can use them. Well, then you feed the data, you run it, you get your results. But then the explainability is a big deal. Whether it’s regulators or even your own internal executives, they say, Oh, that’s the result of running the AI model. But how did it come to that decision? You know, for instance, in financial services for, let’s say, a decision on who gets to get a loan or not. Well, you got to make sure that there is no bias in that. And so, in order to explain the results, you need to know where the source data came from. And that’s what we do as well through our governance and lineage.”
Taken as a whole, these new features in Informatica Enterprise Data Catalog pave the way for what Wikibon predicts will be a more comprehensive Informatica “AIops” direction focused on managing data and application integration across multiclouds. It will need this comprehensive capability to hold off fresh challenges in data integration from IBM/Red Hat, Dell/VMware, Cisco Systems Inc., Google and other enterprise solution providers.
Takeaways for data integration, quality and governance professionals
If you’re an established Informatica customer, you’ve long ago absorbed its solutions’ AI-driven intelligent recommendations, single pane of control, hybrid-cloud integrations and catalog-based DevOps into your data management operations. Consequently, little of the news that came out this week during Informatica World 2019 is unprecedented or unexpected. These product and partner announcements simply deepen the sophisticated data operations capabilities you’ve come to rely on from Informatica’s tooling.
If you’re a user of rival data integration, quality or governance solutions, this week’s announcements point up the advantages of switching to a comprehensive, end-to-end platform that automates these capabilities and boosts the productivity of diversified teams handling new data workloads such as machine learning pipelines. Informatica remains the go-to vendor for a truly integrated enterprise-grade data management suite for the era of cloud computing.
With these and other announcements orbiting around Informatica’s “hybrid integration platform” vision, you should pay fresh attention to the vendor as a possible foundation for your increasingly complex, distributed cloud deployments of data integration and governance.