SAS Institute held its 24th annual analyst summit last week in Steamboat Springs, Colorado. The 37-year-old privately held company is a key player in big data analytics, and company executives showed off their latest developments and product roadmaps. In particular, LASR Analytical Server and Visual Analytics 6.2, which is due to be released this summer, are critical to SAS’ ability to secure and expand its role as a preeminent analytics vendor in the big data era.
For SAS, the competitive advantage in Big Data rests in predictive analytics, and according to our benchmark research into predictive analytics, 55 percent of businesses say the challenge of architectural integration is a top obstacle to rolling out predictive analytics in the organization. Integration of analytics is particularly daunting in a big-data-driven world, since analytics processing has traditionally taken place on a platform separate from where the data is stored, but now they must be combined. How data is moved into parallelized systems and how analytics are consumed by business users are key questions in the market today that SAS is looking to address with its LASR and Visual Analytics.
Jim Goodnight, the company’s founder and plainspoken CEO, says he saw the industry changing a few years ago. He speaks of a large bank doing a heavy analytical risk computation that took upwards of 18 hours, which meant that the results of the computation were not ready in time for the next trading day. To gain competitive advantage, the time window needed to be reduced, but running the analytics in a serialized fashion was a limiting factor. This led SAS to begin parallelizing the company’s workhorse procedures, some of which were first developed upwards of 30 years ago. Goodnight also discussed the fact that building these parallelizing statistical models is no easy task. One of the biggest hurdles is getting the mathematicians and data scientists that are building these elaborate models to think in terms of the new parallelized architectural paradigm.
Its Visual Analytics software is a key component of the SAS Big Data Analytics strategy. Our latest business technology innovation benchmark research [http://www.ventanaresearch.com/bti/] found that close to half (48%) of organizations present business analytics visually. Visual Analytics, which was introduced early last year, is a cloud-based offering running off of LASR and the Amazon Web Services infrastructure. This web-based approach allows SAS to iterate quickly without worrying a great deal about revision management while giving IT a simpler server management scenario. Furthermore, the web-based approach provides analysts with a sandbox environment for working with and visualizing in the cloud big data analytics; the analytic assets can then be moved into a production environment. This approach will also eventually allow SAS to combine data integration capabilities with the data analysis capabilities.
With descriptive statistics being the ante in today’s visual discovery world, SAS is focusing Visual Analytics to take advantage of the company’s predictive analytics history and capabilities. Visual Analytics 6.2 integrates predictive analytics and rapid predictive modeling (RPM) to do, among other things, segmentation, propensity modeling and forecasting. RPM makes it possible for models to be generated via sophisticated software that runs through multiple algorithms to find the best fit based on the data involved. This type of commodity modeling approach will likely gain significant traction as companies look to bring analytics into industrial processes and address the skills gap in advanced analytics. According to our BTI research, the skills gap is the biggest challenge facing big data analytics today, as participants identified staffing (79%) and training (77%) as the top two challenges.
Visual Analytics’ web-based approach is likely a good long-term bet for SAS, as it marries data integration and cloud strategies. These factors, coupled with the company’s installed base and army of loyal users, give SAS a head start in redefining the world of analytics. Its focus on integrating visual analytics for data discovery, integration and commodity modeling approaches also provides compelling time-to-value for big data analytics. In specific areas such as marketing analytics, the ability to bring analytics into the applications themselves and allow data-savvy marketers to conduct a segmentation and propensity analysis in the context of a specific campaign can be a real advantage. Many of SAS’ innovations cannibalize its own markets, but such is the dilemma of any major analytics company today.
The biggest threat to SAS today is the open source movement, which offers big data analytic approaches such as Mahout and R. For instance, the latest release of R includes facilities for building parallelized code. While academics working in R often still build their models in a non-parallelized, non-industrial fashion, the current and future releases of R promise more industrialization. As integration of Hadoop into today’s architectures becomes more common, staffing and skillsets are often a larger obstacle than the software budget. In this environment the large services companies loom larger because of their role in defining the direction of big data analytics. Currently, SAS partners with companies such as Accenture and Deloitte, but in many instances these companies have split loyalties. For this reason, the lack of a large in-house services and education arm may work against SAS.
At the same time, SAS possesses blueprints for major analytic processes across different industries as well as horizontal analytic deployments, and it is working to move these to a parallelized environment. This may prove to be a differentiator in the battle versus R, since it is unclear how quickly the open source R community, which is still primarily academic, will undertake the parallelization of R’s algorithms.
SAS partners closely with database appliance vendors such as Greenplum and Teradata, with which it has had longstanding development relationships. With Teradata, it integrates into the BYNET messaging system, allowing for optimized performance between Teradata’s relational database and the LASR Analytic Server. Hadoop is also supported in the SAS reference architecture. LASR accesses HDFS directly and can run as a thin memory layer on top of the Hadoop deployment. In this type of deployment, Hadoop takes care of everything outside the analytic processing, including memory management, job control and workload management.
These latest developments will be of keen interest to SAS customers. Non-SAS customers who are exploring advanced analytics in a big data environment should consider SAS LASR and its MPP approach. Visual Analytics follows the “freemium” model that is prevalent in the market, and since it is web-based, any instances downloaded today can be automatically upgraded when the new version arrives in the summer. For the price, the tool is certainly worth a test drive for analysts. For anyone looking into such tools and foresee the need for inclusion predictive analytics, it should be of particular interest.