Alteryx has released version 9.0 of Alteryx Analytics that provides a range of data to predictive analytics in advance of its annual user conference called Inspire 2014. I have covered the company for several years as it has emerged as a key player in providing a range of business analytics from predictive to big data analytics. The importance of this category of analytics is revealed by our latest benchmark research on big data analytics, which finds that predictive analytics is the most important type of big data analytics, ranked first by nearly half (47%) of research participants. The new version 9 includes new capabilities and integration with a range of new information sources including read and write capability to IBM SPSS and SAS for range of analytic needs.
After attending Inspire 2013 last year, I wrote about capabilities that are enabling an emerging business role, that which Alteryx calls the data artisan. The label refers to analysts who combines both art and science in using analytics to help direct business outcomes. Alteryx uses an innovative and intuitive approach to analytic tasks, using workflow and linking various data sources through in-memory computation and processing. It takes a “no code” drag and drop approach to integrate data from files and databases, prepare data for analysis, and build and score predictive models to yield relevant results. Other vendors in the advanced analytics market are also applying this approach, but few mature tools are currently available. The output of the Alteryx analytic processes can be shared automatically in numerous data formats including direct export into visualization tools such as those from Qlik (new support) and Tableau. This can help users improve their predictive analytics capabilities and take action on the outcomes of analytics, which are the two capabilities most-often cited in our research as needed to improve big data analytics.
Alteryx now works with Revolution Analytics to increase the scalability of its system to work with large data sets. The open source language R continues to gain popularity and is being embedded in many business intelligence tools, but it runs only on data that can be loaded into memory. Running only in memory does not address analytics on datasets that run into Terabytes and hundreds of millions of values, and potentially requires use of a sub-sampling approach to advanced analytics. With its RevoScaleR, Revolution Analytics rewrites parts of the R algorithm so that the processing tasks can be parallelized and run in big data architectures such as Hadoop. Such capability is important for analytic problems including recommendation engines, unsupervised anomaly detection, some classification and regression problems, and some clustering problems. These analytic techniques are appropriate for some of the top business uses of big data analytics, which according to our research are cross-selling and up-selling (important to 38%), better understanding of individual customers (32%), analyzing all data rather than a sample (30%) and price optimization (28%). Alteryx Analytics automatically detects whether to use RevoScaleR or open source R algorithms. This approach simplifies the technical complexities of scaling R by providing a layer of abstraction for the analytic professional.
Scoring – the ability to input a data record and receive the probability of a particular outcome – is an important if not well understood aspect of predictive analytics. Our research shows that companies that score models on a timely basis according to their needs get better organizational results than those that score all models the same way. Working with Revolution Analytics, Alteryx has enhanced scoring scalability for R algorithms with new capabilities that chunk data in a parallelized fashion. This approach bypasses the memory-only approach to enable a theoretically unlimited number of scores to be processed. For large-scale implementations and consumer applications in industries such as retail, an important target market for Alteryx, and these capabilities are becoming important.
Alteryx 9.0 also improves on open source R’s default approach to scoring, which is “all or nothing.” That is, if data is missing (a null value) or a new level for a categorical variable is not included in the original model, R will not score the model until the issue is addressed. This process is a particular problem for analysts who want to score data in small batches or individually. In contrast, Alteryx’s new “best effort” approach scores the records that can be run without incident, and those that cannot be run are returned with an error message. This adjustment is particularly important as companies start to deploy predictive analytics into areas such as call centers or within Web applications such as automatic quotes for insurance.
Alteryx 9.0 also has new predictive modeling tools and functionality. A spline model helps address regression and classification problems such as data reduction and nonlinear relationships and their interactions. It uses a clear box way to serve users with differing objectives and skill levels. The approach exposes the underpinnings of the model so that advanced users can modify a model, but at the same time less sophisticated users can use the model without necessarily understanding all of the intricacies of the model itself. Other capabilities include a Gamma regression tool allows data matching to model the Gamma family of distributions using the generalized linear modeling (GLM) framework. Heat plot tools for visualizing joint probability distributions, such as between customer income level and customer advocacy, and more robust A/B testing tools, which are particularly important in digital marketing analytics, are also part of the release.
At the same time, Alteryx has expanded its base of information sources. According to our research, working with all sources of data, not just one, is the most common definition for big data analytics, as stated by three-quarters (76%) of organizations. While structured data from transaction systems and so-called systems of record is still the most important, new data sources including those coming from external sources are becoming important. Our research shows that the most widely used external data sources are cloud applications (54%) and social media data (46%); five additional data sources, including Internet, consumer, market and government sources, are virtually tied in third position (with 39% to 42% each). Alteryx will need to be mindful of best practices in big data analytics as I have outlined to ensure it can stay on top of a growing set of requirements to blend big data but also apply a range of advanced analytics.
New connectors to the social media data provider Gnip give access to social media websites through a single API, and a DataSift (http://www.datasift.com) connector helps make social media more accessible and easier to analyze for any business need. Other new connectors in 9.0 include those for Foursquare, Google Analytics, Marketo, salesforce.com and Twitter. New data warehouse connectors include those for Amazon Redshift, HP Vertica, Microsoft SQL Server and Pivotal Greenplum. Access to SPSS and SAS data files also is introduced in this version; Alteryx hopes to break down the barriers to entry in accounts dominated by these advanced analytic stalwarts. With already existing connectors to major cloud and on-premises data sources, the company provides a robust integration platform for analytics.
Alteryx is on a solid growth curve as evidenced by the increasing number of inquiries and my conversations with company executives. It’s not surprising given the disruptive potential of the technology itself and its unique analytic workflow technology for data blending and advanced analytics. This data blending and workflow technology that Alteryx provides is not highlighted enough as it is one of the largest differentiators of its software and reduces the data related tasks like preparing (47%) and reviewing (43%) data that our customer analytics research finds gets in the way of analysts performing analytics. Additionally Alteryx ability to apply location analytics within its product is a key differentiation that our research found delivers exponential value from analytics than just viewing traditional visualization and tables of data. Also location analytics like Alteryx provides helps rapidly identify areas where customer experience and satisfaction can be improved and is the top benefit found in our research. The flexible platform resonates particularly well with line-of-business and especially in fast-moving, lightly regulated industries such as travel, retail and consumer goods where speed of analytics are critical to be performed. The work the company is doing with Revolution Analytics and the ability to scale is important for advanced analytic that operate on big data. The ability to seamlessly connect and blend information sources is a critical capability for Alteryx and it’s a wise move to invest further in this area but Alteryx will need to examine where collaborative technology could be used to help business work together on analytics within the software. Alteryx will need to continue to adapt to the market demand for analytics and keep focused on varying line of business areas so it can continue its growth. Just about any company involved in analytics today should evaluate Alteryx and see how it can streamline analytics in a very unique approach.