Informatica has announced version 9.1 for Big Data. I wrote previously about Informatica 9.1, the latest iteration of the company’s data integration platform, following its industry analyst summit. At that event in February, company officials alluded to plans for Hadoop and other big-data sources that had yet to be finalized. This announcement reveals those plans. Informatica will support three types of “big data”: big transaction data from relational databases and data warehouse systems; big interaction data from social media, customer interaction systems and other systems; and big data processing, which means Hadoop, the open source software framework. Let’s look at each of these types.
With respect to relational databases, Informatica adds support for additional analytic databases, so its PowerCenter connectors are now available for “traditional” databases including IBM DB2, Microsoft SQL Server, Oracle and Sybase as well as analytical databases and data warehouse systems from Aster Data, Greenplum, Netezza, ParAccel, Teradata and Vertica. While Hadoop gets a lot of attention these days, it’s important to recognize that big data also exists in these other sources. Many of the customers of these vendors probably use Informatica already and will benefit from having official support for their configurations.
Social media and other customer interaction data are important sources for companies seeking to build a complete view of the customer. My colleague Richard Snow has written about the role of social media in this context, and our firm has conducted benchmark research on other customer interaction technologies. With version 9.1, Informatica makes it easier to collect social media data and includes specific connectors for Facebook, LinkedIn and Twitter.
Informatica’s developments around big-data processing and Hadoop will come in two phases. The first phase, which the company said will be “shipping soon,” provides access to data stored in HDFS as both a source and a target for Informatica processes. A second phase, in a future release, will provide graphical, codeless development of Hadoop MapReduce jobs to support preparing and integrating data in Hadoop. While phase one begins to incorporate Hadoop, the additional features of phase two are necessary to make Hadoop a first-class citizen in the Informatica ecosystem. Smaller, more nimble vendors such as Karmasphere are offering graphical development capabilities today, and Informatica will need to offer these as well to compete.
As part of the launch, Informatica enlisted Tim Leonard, chief technology officer of U.S. Xpress, to talk publicly about his company’s use of Informatica. This transportation company has an innovative application combining large amounts of streaming real-time data, location intelligence and mobile devices. The application enables U.S. Xpress to combine driver location and other data to reduce fuel costs as well as provide better customer service through more detailed information about delivery schedules and the ability to reroute deliveries when necessary.
So although Informatica is moving more slowly than some smaller vendors on particular features such as graphical development of Hadoop jobs, the U.S. Xpress application provides an example of the value of working with a vendor that has such an extensive portfolio of products. That customer can source from a single vendor the data integration capabilities it needs to handle big data, streaming data and location-based data. This is a promising position for Informatica.