At its annual industry analyst summit last month and in a more recent announcement of enterprise support for parallelizing the R language on its Aster Discovery Platform, Teradata showed that it is adapting to changes in database and analytics technologies. The presentations at the conference revealed a unified approach to data architectures and value propositions in a variety of uses including the Internet of Things, digital marketing and ETL offloading. In particular, the company provided updates on the state of its business as well as how the latest version of its database platform, Teradata 15.0, is addressing customers’ needs for big data. My colleague Mark Smith covered these announcements in depth. The introduction of scalable R support was discussed at the conference but not announced publicly until late last month.
Teradata now has a beta release of parallelized support for R, an open source programming language that is widely used in universities and growing rapidly in enterprise use. One challenge is that R relies on a single-threaded, in-memory approach to analytics. Parallelizing R allows algorithms to run on much larger data sets because they are no longer limited to data that fits in memory. For a broader discussion of the pros and cons of R and its evolution, see my analysis. Our benchmark research shows that organizations are counting on companies such as Teradata to provide a layer of abstraction that can simplify analytics on big data architectures. More than half (54%) of advanced analytics implementations are custom built today, but that share is expected to fall to about one in three (36%).
Teradata’s R project has three parts. The first is a Teradata Aster R library, which supplies more than 100 prebuilt R functions that hide the complexity of the in-database implementation. The algorithms cover the most common big data analytic approaches in use today, which according to our big data analytics benchmark research are classification (used by 39% of organizations), clustering (37%), regression (35%), time series (32%) and affinity analysis (29%). Some of the functions use innovative approaches available in Aster such as Teradata’s patented nPath algorithm, which is useful in areas such as digital marketing. All of these functions will receive enterprise support from Teradata, likely through its professional services team.
The second part of the project is the R parallel constructor. This component gives analysts and data scientists tools to build their own parallel algorithms from the entire library of open source R algorithms. The framework follows the “split, apply and combine” paradigm, which is popular in the R community: the data is split into partitions, a function is applied to each partition independently, and the partial results are combined into a single answer. While Teradata won’t support the algorithms themselves, this tool set is a key innovation that I have not yet seen from others in the market.
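To make the “split, apply and combine” idea concrete, here is a minimal sketch of the pattern. It is written in Python rather than R, it does not use or represent any Teradata Aster API, and the function names, the sample data and the use of threads as a stand-in for database worker nodes are all assumptions made for illustration.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor
from statistics import mean


def split(rows, key):
    """Split: partition rows into groups by the value of `key`."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row)
    return dict(groups)


def apply_to_partition(item):
    """Apply: run an ordinary single-threaded function on one partition."""
    group, rows = item
    return group, mean(row["spend"] for row in rows)


def split_apply_combine(rows, key, workers=2):
    partitions = split(rows, key)
    # Partitions are independent, so they can be processed concurrently.
    # Threads are only a stand-in here for parallel workers.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(apply_to_partition, partitions.items())
    # Combine: gather the per-partition results into one structure.
    return dict(results)


# Hypothetical sample data, not taken from the article.
rows = [
    {"region": "east", "spend": 10.0},
    {"region": "west", "spend": 30.0},
    {"region": "east", "spend": 20.0},
    {"region": "west", "spend": 50.0},
]
result = split_apply_combine(rows, "region")
print(result)
```

The point of the pattern is that the function applied to each partition can stay an ordinary single-threaded routine; only the splitting and combining steps need to know that the work is distributed.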
Finally, the R engine has been integrated with Teradata’s SNAP integration framework. The framework provides unified access to multiple workload-specific engines such as relational (SQL), graph (SQL-GR), MapReduce (SQL-MR) and statistics. This is critical because the ultimate value of analytics rests in the information itself. By tying together multiple systems, Teradata enables a variety of analytic approaches. More important, the data sources that can be merged into the analysis can deliver competitive advantages. For example, the recently announced JSON integration delivers information from a plethora of connected devices as well as detailed Web data.
Teradata is participating in industry discussions about both data management and analytics. As Mark Smith discussed, its unified approach to data architecture addresses challenges brought on by competing big data platforms such as Hadoop and by other NoSQL approaches, such as the one announced with MongoDB to support JSON integration. These platforms access new information sources and help companies use analytics to indirectly increase revenues, reduce costs and improve operational efficiency. Analytics applied to big data serves a variety of uses, most often cross-selling and up-selling (for 38% of organizations), better understanding of individual customers (32%) and optimizing price (30%) and IT operations (24%). Teradata is active in these areas and is working in multiple industries such as financial services, retail, healthcare, communications, government, energy and utilities.
Current Teradata customers should evaluate the company’s broader analytic and platform portfolio, not just the database appliances. In the fragmented and diverse big data market, Teradata is sorting through the chaos to provide a roadmap for organizations ranging from the largest to midsized ones. The Aster Discovery Platform can put power into the hands of analysts and statisticians who need not be data scientists. Business users from various departments, but especially high-level marketing groups that need to integrate multiple data sources for operational use, should take a close look at the Teradata Aster approach.